Validator.nu HTML Parser 1.0.7 Released

April 5th, 2008 by Henri Sivonen

There is now a new release of the Validator.nu HTML Parser. Change highlights:

  • Adds optional support for heuristic encoding sniffing using the ICU4J sniffer, jchardet or both.
  • Adds support for rewinding and reparsing the document when the parser becomes confident about the character encoding and the tentative encoding turns out to have been wrong.
  • Performs encoding name matching per spec instead of using the JDK mechanism.
  • Implements spec changes up until just before SVG and MathML support. (Those will merit 1.1 or something.)
  • Warning: The semantics of the doctype token have changed. This only matters if you have your own token handler (unlikely).
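
To illustrate the rewind-and-reparse item above, here is a rough sketch of the general idea using only the JDK. It is not the parser's actual code or API: the class `ReparseSketch`, its methods, and the simplified `<meta>` regex are all hypothetical, and for brevity it resolves the encoding label through the JDK `Charset` lookup rather than the spec-based matching the release uses.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the Validator.nu HTML Parser.
final class ReparseSketch {

    // Simplified stand-in for a real <meta> scan (assumption: the label
    // starts with a letter or digit and uses only charset-name characters).
    private static final Pattern META_CHARSET = Pattern.compile(
            "<meta[^>]*charset\\s*=\\s*[\"']?([A-Za-z0-9][A-Za-z0-9._:-]*)",
            Pattern.CASE_INSENSITIVE);

    // Returns the charset declared in the text, or null if none is
    // declared or the JDK does not support the declared label.
    static Charset declaredCharset(String text) {
        Matcher m = META_CHARSET.matcher(text);
        if (!m.find()) {
            return null;
        }
        String label = m.group(1);
        return Charset.isSupported(label) ? Charset.forName(label) : null;
    }

    // Decodes with the tentative encoding first. If the document declares
    // a different encoding, "rewinds" to the buffered bytes and decodes
    // them again with the declared one.
    static String decode(byte[] bytes, Charset tentative) {
        String firstPass = new String(bytes, tentative);
        Charset declared = declaredCharset(firstPass);
        if (declared == null || declared.equals(tentative)) {
            return firstPass; // the tentative guess stands
        }
        return new String(bytes, declared); // reparse from the start
    }

    public static void main(String[] args) {
        byte[] bytes = "<meta charset=\"utf-8\"><p>\u00e9</p>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(bytes, StandardCharsets.ISO_8859_1));
    }
}
```

The sketch keeps the whole input in memory so that rewinding is trivial. A streaming parser would instead have to buffer the bytes it has already consumed until it is confident about the encoding.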
