Description
@anthonyharrison and I had a quick discussion about validating XML against a schema before it's loaded. We've got a few places where we load XML that should conform to a known schema: the new Java package parser added in the PR above, and the SBOM parser.
It looks like xmlschema will work for what we need.
Since these schemas shouldn't change very often (unlike the NVD data), we probably want to cache them, and we may want to ship them directly in the cve-bin-tool package so they're immediately available to users.
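As a rough idea of the "ship it and cache it" option, something like the sketch below would read a bundled schema file once per run and serve it from memory afterwards. The `load_schema_text` name, the directory layout, and the `sbom.xsd` file name are all assumptions for illustration; the temporary directory only stands in for a schema folder inside the installed package.

```python
import tempfile
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=None)
def load_schema_text(schema_dir: str, name: str) -> str:
    """Read a bundled schema file once; later calls are served from memory."""
    path = Path(schema_dir) / name
    if not path.is_file():
        raise FileNotFoundError(f"No bundled schema named {name!r} in {schema_dir}")
    return path.read_text(encoding="utf-8")


# Demo: a temporary directory stands in for the packaged schema folder.
with tempfile.TemporaryDirectory() as tmp:
    schema_file = Path(tmp) / "sbom.xsd"  # hypothetical file name
    schema_file.write_text("<xs:schema/>", encoding="utf-8")

    first = load_schema_text(tmp, "sbom.xsd")
    schema_file.unlink()  # remove the file to show the second call hits the cache
    second = load_schema_text(tmp, "sbom.xsd")

print(first == second)
print(load_schema_text.cache_info().hits)
```

Bundling the files means no network access is needed at validation time; the trade-off is that a schema update would require a new cve-bin-tool release.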
Note that XML schema validation is commonly used as a defense-in-depth measure for safe XML parsing. The defusedxml parser we're using already fails on malformed data and has built-in protection against a number of known XML attacks, but adding schema validation would add another layer of protection against malformed data and improve error messages for the user.