Validator.nu HTML Parser 1.4 Available
A new release of the Validator.nu HTML Parser is available. The new version 1.4 contains minor adjustments to spec compliance and fixes for notable Java-specific problems (of the crash and infinite loop sort). Also, the parser is again available from the Maven Central Repository (groupId: nu.validator.htmlparser, artifactId: htmlparser, version: 1.4).
Upgrading to the newest version is recommended for all users of all previous versions.
Changelog:
- No longer crashes in setErrorHandler() in the DOM case.
- No longer crashes with ArrayIndexOutOfBoundsException in the meta prescan.
- Correctness tweaks to HTML integration point and MathML text integration point behavior.
- Slight adjustments to error and warning reporting.
- The XLink namespace is now serialized more nicely.
- Unicode decoder returning zero-length output in the middle of the file is now dealt with correctly.
- No longer goes into an infinite loop with the HotSpot workaround applied.
- Builds again with Maven.
Much as I would love to upgrade (previously using 1.2.1 because it was the latest in Maven), the XOM integration is unfortunately broken due to this code in the constructor of nu.validator.htmlparser.xom.HtmlBuilder:
this.driver = null;
this.driver.setXmlnsPolicy(XmlViolationPolicy.ALTER_INFOSET);
That’s a guaranteed NullPointerException!
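To show why those two lines cannot succeed, here is a minimal, self-contained sketch of the same pattern. The Driver and Builder classes and the string policy value are hypothetical stand-ins invented for illustration; they are not the parser's real classes. The only point is that calling a method through a field that was just set to null throws before the constructor can finish.

```java
public class NullDriverDemo {
    // Hypothetical stand-in for the parser's internal driver (not the real class).
    static class Driver {
        void setXmlnsPolicy(String policy) {
            // Intentionally empty; this method is never reached in the demo.
        }
    }

    // Hypothetical stand-in for a builder whose constructor repeats the reported pattern.
    static class Builder {
        private Driver driver;

        Builder() {
            this.driver = null;
            // Dereferencing the null field throws NullPointerException here.
            this.driver.setXmlnsPolicy("ALTER_INFOSET");
        }
    }

    // Returns true if constructing a Builder throws NullPointerException.
    static boolean constructorThrowsNpe() {
        try {
            new Builder();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("Constructor throws NPE: " + constructorThrowsNpe());
    }
}
```

If the real constructor looks the way it is quoted above, every attempt to create the builder would fail in the same way, regardless of the arguments passed.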