[GR-39200] XML schema related JDK classes not fully usable (JDK javax.xml.validation.Schema and friends). #4608

Closed
michael-simons opened this issue May 31, 2022 · 14 comments · Fixed by #4649 or #4770
@michael-simons

Use case

I want to use javax.xml.validation.Schema for loading XML Schema files, assigning them to a javax.xml.parsers.DocumentBuilder with the ultimate goal of loading and validating documents.

This can be represented as such:

package simons;

import java.io.File;

import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

import org.w3c.dom.Document;

public class Main {

	public static void main(String... a) throws Exception {


		SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
		Schema schema = schemaFactory.newSchema(new File(a[0]));

		DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
		documentBuilderFactory.setSchema(schema);
		documentBuilderFactory.setExpandEntityReferences(false);
		documentBuilderFactory.setNamespaceAware(true);

		DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
		Document document = documentBuilder.parse(new File(a[1]));
		System.out.println(document.getDocumentElement().getLocalName());
	}
}

Together with a schema

<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="whatever"
		   xmlns="whatever">

	<xs:element name="migration" />
</xs:schema>

and a document

<?xml version="1.0" encoding="UTF-8"?>
<migration xmlns="whatever">
</migration>
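
For quick experiments without files on disk, the same flow can be sketched with the schema and document inlined as strings. This is only an illustration under stated assumptions: the class `InMemoryValidation` and the method `validatedRootName` are hypothetical names and not part of the attached reproducer, and the `ErrorHandler` is an addition (without one, the JDK's parser typically only reports schema violations on stderr and keeps parsing).

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper class, not part of the attached reproducer.
class InMemoryValidation {

	// Schema and document from the issue, inlined (XML declarations omitted).
	static final String SCHEMA =
		"<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" targetNamespace=\"whatever\" xmlns=\"whatever\">"
		+ "<xs:element name=\"migration\"/>"
		+ "</xs:schema>";
	static final String DOCUMENT = "<migration xmlns=\"whatever\"></migration>";

	/** Returns the root element's local name, or null if the document fails to parse or validate. */
	static String validatedRootName(String schemaXml, String documentXml) {
		Schema schema;
		try {
			SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
			schema = schemaFactory.newSchema(new StreamSource(new StringReader(schemaXml)));
		} catch (SAXException e) {
			throw new IllegalArgumentException("Schema could not be compiled", e);
		}

		DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
		documentBuilderFactory.setSchema(schema);
		documentBuilderFactory.setExpandEntityReferences(false);
		documentBuilderFactory.setNamespaceAware(true);

		try {
			DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
			// Turn validation errors into exceptions instead of log output.
			documentBuilder.setErrorHandler(new ErrorHandler() {
				@Override public void warning(SAXParseException e) { }
				@Override public void error(SAXParseException e) throws SAXException { throw e; }
				@Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
			});
			Document document = documentBuilder.parse(new InputSource(new StringReader(documentXml)));
			return document.getDocumentElement().getLocalName();
		} catch (SAXException e) {
			return null; // malformed or schema-invalid document
		} catch (IOException e) {
			throw new UncheckedIOException(e);
		} catch (ParserConfigurationException e) {
			throw new IllegalStateException(e);
		}
	}

	public static void main(String... a) {
		System.out.println(validatedRootName(SCHEMA, DOCUMENT));
	}
}
```

The sketch goes through the same `SchemaFactory` and `DocumentBuilderFactory` calls as `simons.Main`, so it should exercise the same Xerces code paths when compiled to a native image.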

Environment

I tested this under

  • OS: macOS 12.3.1
  • Architecture: x86_64

With both 22.1.0

java -version
openjdk version "17.0.3" 2022-04-19
OpenJDK Runtime Environment GraalVM CE 22.1.0 (build 17.0.3+7-jvmci-22.1-b06)
OpenJDK 64-Bit Server VM GraalVM CE 22.1.0 (build 17.0.3+7-jvmci-22.1-b06, mixed mode, sharing)
native-image --version                                                                      
GraalVM 22.1.0 Java 17 CE (Java Version 17.0.3+7-jvmci-22.1-b06)

and 21.3.2

java -version
openjdk version "17.0.3" 2022-04-19
OpenJDK Runtime Environment GraalVM CE 21.3.2 (build 17.0.3+7-jvmci-21.3-b14)
OpenJDK 64-Bit Server VM GraalVM CE 21.3.2 (build 17.0.3+7-jvmci-21.3-b14, mixed mode, sharing)
native-image --version
GraalVM 21.3.2 Java 17 CE (Java Version 17.0.3+7-jvmci-21.3-b14)

under Java 17.

I also tried various --release (8 and 11) switches to Javac before applying native-image.

So, the program above runs without issues as a single-file source program: java simons/Main.java schema.xsd document.xml prints migration.

Compilation

(I have been using -Ob below to save some build time; the issue occurs both with and without quick build mode.)

javac simons/Main.java --release 17   
native-image -Ob simons.Main main  

gives

You enabled -Ob for this image build. This will configure some optimizations to reduce image build time.
This feature should only be used during development and never for deployment.
========================================================================================================================
GraalVM Native Image: Generating 'main' (executable)...
========================================================================================================================
[1/7] Initializing...                                                                                    (3,8s @ 0,21GB)
 Version info: 'GraalVM 22.1.0 Java 17 CE'
 C compiler: cc (apple, x86_64, 13.1.6)
 Garbage collector: Serial GC
[2/7] Performing analysis...  [******]                                                                   (8,8s @ 1,34GB)
   3.877 (79,37%) of  4.885 classes reachable
   6.385 (59,41%) of 10.748 fields reachable
  18.827 (49,60%) of 37.955 methods reachable
      28 classes,     0 fields, and   361 methods registered for reflection
      58 classes,    60 fields, and    51 methods registered for JNI access
[3/7] Building universe...                                                                               (0,9s @ 1,64GB)
[4/7] Parsing methods...      [*]                                                                        (0,7s @ 0,76GB)
[5/7] Inlining methods...     [****]                                                                     (0,6s @ 2,00GB)
[6/7] Compiling methods...    [**]                                                                       (4,0s @ 0,52GB)
[7/7] Creating image...                                                                                  (1,5s @ 1,06GB)
   8,34MB (44,41%) for code area:   11.591 compilation units
   9,15MB (48,75%) for image heap:   2.437 classes and 113.393 objects
   1,28MB ( 6,84%) for other data
  18,77MB in total
------------------------------------------------------------------------------------------------------------------------
Top 10 packages in code area:                               Top 10 object types in image heap:
 730,21KB java.util                                            1,85MB byte[] for code metadata
 547,22KB c.s.org.apache.xerces.internal.impl.xs.traversers    1,11MB byte[] for general heap data
 476,62KB com.sun.org.apache.xerces.internal.impl              1,11MB java.lang.String
 411,60KB com.sun.org.apache.xerces.internal.impl.xs         913,02KB java.lang.Class
 352,43KB java.lang                                          702,18KB byte[] for java.lang.String
 291,14KB com.sun.org.apache.xerces.internal.impl.dv.xs      388,78KB java.util.HashMap$Node
 275,07KB java.text                                          302,89KB com.oracle.svm.core.hub.DynamicHubCompanion
 237,92KB java.util.regex                                    219,64KB java.lang.String[]
 233,23KB com.oracle.svm.jni                                 169,27KB java.util.HashMap$Node[]
 221,11KB c.sun.org.apache.xerces.internal.impl.xpath.regex  156,09KB java.util.concurrent.ConcurrentHashMap$Node
      ... 163 additional packages                                 ... 853 additional object types
                                           (use GraalVM Dashboard to see all)
------------------------------------------------------------------------------------------------------------------------
                        1,4s (6,0% of total time) in 20 GCs | Peak RSS: 3,48GB | CPU load: 9,80
------------------------------------------------------------------------------------------------------------------------
Produced artifacts:
 /Users/msimons/Downloads/some-client/src/main/java/main (executable)
 /Users/msimons/Downloads/some-client/src/main/java/main.build_artifacts.txt
========================================================================================================================
Finished generating 'main' in 22,0s.

1st problem: Missing classes

The following issues still seem to persist: #684 and #1387.

Running main produces

Exception in thread "main" com.sun.org.apache.xerces.internal.utils.ConfigurationError: Provider com.sun.org.apache.xerces.internal.impl.dv.xs.SchemaDVFactoryImpl not found
	at com.sun.org.apache.xerces.internal.utils.ObjectFactory.newInstance(ObjectFactory.java:168)
	at com.sun.org.apache.xerces.internal.utils.ObjectFactory.newInstance(ObjectFactory.java:148)
	at com.sun.org.apache.xerces.internal.impl.dv.SchemaDVFactory.getInstance(SchemaDVFactory.java:73)
	at com.sun.org.apache.xerces.internal.impl.dv.SchemaDVFactory.getInstance(SchemaDVFactory.java:57)
	at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.reset(XMLSchemaLoader.java:1053)
	at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.loadGrammar(XMLSchemaLoader.java:564)
	at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.loadGrammar(XMLSchemaLoader.java:543)
	at com.sun.org.apache.xerces.internal.jaxp.validation.XMLSchemaFactory.newSchema(XMLSchemaFactory.java:281)
	at javax.xml.validation.SchemaFactory.newSchema(SchemaFactory.java:612)
	at javax.xml.validation.SchemaFactory.newSchema(SchemaFactory.java:628)
	at simons.Main.main(Main.java:19)

This can be fixed with

[
{
  "name":"com.sun.org.apache.xerces.internal.impl.dv.xs.ExtendedSchemaDVFactoryImpl",
  "methods":[{"name":"<init>","parameterTypes":[] }]
},
{
  "name":"com.sun.org.apache.xerces.internal.impl.dv.xs.SchemaDVFactoryImpl",
  "methods":[{"name":"<init>","parameterTypes":[] }]
},
{
  "name":"com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl",
  "methods":[{"name":"<init>","parameterTypes":[] }]
}
]
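
As background on why such entries matter: judging by the stack trace, `ObjectFactory.newInstance` instantiates the provider reflectively from a class name. A native image can only satisfy such a lookup if the class and its no-arg constructor were registered for reflection at build time. The sketch below shows the general pattern on a regular JVM; `ReflectiveLookup` and `example.DoesNotExist` are hypothetical names, and this is not the actual Xerces code.

```java
import java.lang.reflect.Constructor;

// Hypothetical illustration of a by-name provider lookup; not the Xerces implementation.
class ReflectiveLookup {

	/** Instantiates the named class via its no-arg constructor, or returns null if the lookup fails. */
	static Object newInstanceOrNull(String className) {
		try {
			Class<?> type = Class.forName(className);
			Constructor<?> constructor = type.getDeclaredConstructor();
			return constructor.newInstance();
		} catch (ReflectiveOperationException e) {
			// Covers a missing class, a missing constructor, and failed instantiation.
			return null;
		}
	}

	public static void main(String... a) {
		System.out.println(newInstanceOrNull("java.util.ArrayList"));
		System.out.println(newInstanceOrNull("example.DoesNotExist"));
	}
}
```

Each `"methods":[{"name":"<init>","parameterTypes":[] }]` entry in the configuration above corresponds to the `getDeclaredConstructor()` step in this pattern.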

2nd problem: Missing messages

After recompiling with this configuration, the generated image works "better", but now fails with the following error:

./main schema.xsd document.xml                                                                         
Exception in thread "main" java.lang.ExceptionInInitializerError
	at java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:480)
	at com.sun.org.apache.xerces.internal.utils.ObjectFactory.newInstance(ObjectFactory.java:163)
	at com.sun.org.apache.xerces.internal.utils.ObjectFactory.newInstance(ObjectFactory.java:148)
	at com.sun.org.apache.xerces.internal.impl.dv.SchemaDVFactory.getInstance(SchemaDVFactory.java:73)
	at com.sun.org.apache.xerces.internal.impl.dv.SchemaDVFactory.getInstance(SchemaDVFactory.java:57)
	at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.reset(XMLSchemaLoader.java:1053)
	at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.loadGrammar(XMLSchemaLoader.java:564)
	at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.loadGrammar(XMLSchemaLoader.java:543)
	at com.sun.org.apache.xerces.internal.jaxp.validation.XMLSchemaFactory.newSchema(XMLSchemaFactory.java:281)
	at javax.xml.validation.SchemaFactory.newSchema(SchemaFactory.java:612)
	at javax.xml.validation.SchemaFactory.newSchema(SchemaFactory.java:628)
	at simons.Main.main(Main.java:19)
Caused by: java.lang.RuntimeException: internal error
	at com.sun.org.apache.xerces.internal.impl.dv.xs.XSSimpleTypeDecl.applyFacets1(XSSimpleTypeDecl.java:754)
	at com.sun.org.apache.xerces.internal.impl.dv.xs.BaseSchemaDVFactory.createBuiltInTypes(BaseSchemaDVFactory.java:207)
	at com.sun.org.apache.xerces.internal.impl.dv.xs.SchemaDVFactoryImpl.createBuiltInTypes(SchemaDVFactoryImpl.java:47)
	at com.sun.org.apache.xerces.internal.impl.dv.xs.SchemaDVFactoryImpl.<clinit>(SchemaDVFactoryImpl.java:42)
	... 13 more

which is really hard to understand, tbh.

This can be fixed with

{
  "resources":{
  "includes":[]},
  "bundles":[{
      "name":"com.sun.org.apache.xerces.internal.impl.xpath.regex.message",
      "locales":[
        "", 
        "de"
      ]
    }]
}
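
The "bundles" entry makes the named resource bundle available at run time. As a rough sketch of what a missing bundle looks like from the caller's side, the following helper looks up a message and falls back when the bundle or key cannot be found. `BundleLookup`, `messageOrFallback` and `example.NoSuchBundle` are hypothetical names; how Xerces itself turns the failure into "internal error" is not shown here.

```java
import java.util.Locale;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

// Hypothetical illustration of a resource bundle lookup with a fallback.
class BundleLookup {

	/** Returns the message for key from the named bundle, or fallback if bundle or key is missing. */
	static String messageOrFallback(String baseName, String key, String fallback) {
		try {
			return ResourceBundle.getBundle(baseName, Locale.ROOT).getString(key);
		} catch (MissingResourceException e) {
			// Thrown both when the bundle cannot be found and when the key is absent.
			return fallback;
		}
	}

	public static void main(String... a) {
		System.out.println(messageOrFallback("example.NoSuchBundle", "some.key", "no message available"));
	}
}
```

Code that does not guard the lookup like this surfaces the `MissingResourceException`, or whatever it is wrapped in, at the first point a message is needed.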

Conclusion

The agent produces both files as expected:

java  -agentlib:native-image-agent=config-output-dir=config simons.Main schema.xsd document.xml       

and compilation with

native-image -Ob  -H:ResourceConfigurationFiles=./config/resource-config.json -H:ReflectionConfigurationFiles=./config/reflect-config.json simons.Main main   

produces a working image

My expectation, however, is that none of this should be necessary, tbh, or that there is at least a convenient switch like --enable-https but for XML, meaning "do the right thing".

I settled on XML in my actual use case because it is dependency-free and mostly hassle-free to use from Java, so I have a bit of an expectation here on Graal, too :)

As usual, all files are attached (xmlschemaissue.zip), please ping me if there are more questions / remarks.

Thanks a lot in advance.

@fniephaus
Member

Hi @michael-simons,
Could you please try again with a recent dev build of GraalVM? I recently pushed 919c2db, which hopefully already resolves your issue.

@michael-simons
Author

michael-simons commented Jun 1, 2022

Sure, but sorry, no luck… see:

native-image -Ob simons.Main main 
You enabled -Ob for this image build. This will configure some optimizations to reduce image build time.
This feature should only be used during development and never for deployment.
========================================================================================================================
GraalVM Native Image: Generating 'main' (executable)...
========================================================================================================================
[1/7] Initializing...                                                                                    (8,6s @ 0,11GB)
 Version info: 'GraalVM 22.2.0-dev Java 17 CE'
 Java version info: '17.0.3+4-jvmci-22.2-b01'
 C compiler: cc (apple, x86_64, 13.1.6)
 Garbage collector: Serial GC
[2/7] Performing analysis...  [******]                                                                  (23,7s @ 0,78GB)
   3.850 (78,57%) of  4.900 classes reachable
   6.025 (59,94%) of 10.052 fields reachable
  18.419 (48,16%) of 38.249 methods reachable
      28 classes,     0 fields, and   335 methods registered for reflection
      59 classes,    60 fields, and    52 methods registered for JNI access
[3/7] Building universe...                                                                               (3,0s @ 1,59GB)
[4/7] Parsing methods...      [**]                                                                       (2,5s @ 0,96GB)
[5/7] Inlining methods...     [***]                                                                      (1,7s @ 1,61GB)
[6/7] Compiling methods...    [****]                                                                    (12,1s @ 1,31GB)
[7/7] Creating image...                                                                                  (2,5s @ 1,86GB)
   7,74MB (43,25%) for code area:    11.216 compilation units
   8,92MB (49,85%) for image heap:  112.411 objects and 5 resources
   1,24MB ( 6,90%) for other data
  17,90MB in total
------------------------------------------------------------------------------------------------------------------------
Top 10 packages in code area:                               Top 10 object types in image heap:
 726,99KB java.util                                            1,74MB byte[] for code metadata
 492,53KB c.s.org.apache.xerces.internal.impl.xs.traversers    1,08MB java.lang.String
 437,38KB com.sun.org.apache.xerces.internal.impl              1,07MB byte[] for general heap data
 353,21KB java.lang                                          908,88KB java.lang.Class
 341,19KB com.sun.org.apache.xerces.internal.impl.xs         674,22KB byte[] for java.lang.String
 273,46KB java.text                                          407,58KB java.util.HashMap$Node
 237,54KB com.oracle.svm.jni                                 300,78KB com.oracle.svm.core.hub.DynamicHubCompanion
 236,90KB java.util.regex                                    217,73KB java.lang.String[]
 215,06KB com.sun.org.apache.xerces.internal.impl.dtd        175,47KB java.util.HashMap$Node[]
 201,32KB java.util.concurrent                               160,18KB byte[] for reflection metadata
   4,23MB for 163 more packages                                1,64MB for 932 more object types
                                           (use GraalVM Dashboard to see all)
------------------------------------------------------------------------------------------------------------------------
                        1,1s (2,0% of total time) in 18 GCs | Peak RSS: 3,45GB | CPU load: 3,21
------------------------------------------------------------------------------------------------------------------------
Produced artifacts:
 /Users/msimons/Projects/tmp/gr4608/main (executable)
 /Users/msimons/Projects/tmp/gr4608/main.build_artifacts.txt (txt)
========================================================================================================================
Finished generating 'main' in 56,4s.

still fails:

gr4608 java simons/Main.java schema.xsd document.xml
migration
➜  gr4608 ./main schema.xsd document.xml 
Exception in thread "main" com.sun.org.apache.xerces.internal.utils.ConfigurationError: Provider com.sun.org.apache.xerces.internal.impl.dv.xs.SchemaDVFactoryImpl not found
	at com.sun.org.apache.xerces.internal.utils.ObjectFactory.newInstance(ObjectFactory.java:168)
	at com.sun.org.apache.xerces.internal.utils.ObjectFactory.newInstance(ObjectFactory.java:148)
	at com.sun.org.apache.xerces.internal.impl.dv.SchemaDVFactory.getInstance(SchemaDVFactory.java:73)
	at com.sun.org.apache.xerces.internal.impl.dv.SchemaDVFactory.getInstance(SchemaDVFactory.java:57)
	at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.reset(XMLSchemaLoader.java:1053)
	at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.loadGrammar(XMLSchemaLoader.java:564)
	at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.loadGrammar(XMLSchemaLoader.java:543)
	at com.sun.org.apache.xerces.internal.jaxp.validation.XMLSchemaFactory.newSchema(XMLSchemaFactory.java:281)
	at javax.xml.validation.SchemaFactory.newSchema(SchemaFactory.java:612)
	at javax.xml.validation.SchemaFactory.newSchema(SchemaFactory.java:628)
	at simons.Main.main(Main.java:19)

I used https://github.com/graalvm/graalvm-ce-dev-builds/releases/download/22.2.0-dev-20220531_1958/graalvm-ce-java17-darwin-amd64-dev.tar.gz

@fniephaus
Member

Thanks for checking. We'll look into this soon.

@michael-simons
Author

michael-simons commented Jun 1, 2022

I just had a quick look at your change. I don't know whether that configuration affects how Graal native image itself is built or just the resulting image, or in which order the JDK of Graal is built. What happens if you use the non-shaded name of Xerces, org.apache.xerces.internal? The "fun" created through shading is always great (I do understand that it is the JDK doing the shading here, not Graal).

Anyway, thanks for sharing the commit (I find this educational) and also for looking into it. I'm curious what it will turn out to be in the end; the process seems to work for the other com.sun.org.apache.xerces.internal.* classes, which are definitely logged in the build output.

@fniephaus fniephaus changed the title from "XML schema related JDK classes not fully usable (JDK javax.xml.validation.Schema and friends)." to "[GR-39200] XML schema related JDK classes not fully usable (JDK javax.xml.validation.Schema and friends)." Jun 16, 2022
@fniephaus fniephaus linked a pull request Jun 16, 2022 that will close this issue
@fniephaus
Member

fniephaus commented Jun 16, 2022

@michael-simons I looked into this some more and came up with #4649. Can you give that a go and verify that it works for your example (it does on my machine:tm:)? I'm not sure why, but I didn't run into "2nd problem: Missing messages".

@michael-simons
Author

Hello @fniephaus The latest dev build (10 days old, 20220606_2102) still fails.

@fniephaus
Member

#4649 hasn't been merged yet, so you'd need to build Native Image from source to try this out. If that's too much work on your end, you may need to wait until it's merged and deployed. Not sure if this can still make it into the upcoming 22.2 release.

@fniephaus
Member

The fix should land in 22.3 and will be available shortly in a dev build. Please feel free to re-open if we didn't fix this :)

@michael-simons
Author

Hey @fniephaus, I am terribly sorry to get back to you only after the release of 22.2.

Sadly, I have to report that it works in neither 22.2 nor 22.3 dev. I tested

native-image --version 
GraalVM 22.2.0 Java 17 CE (Java Version 17.0.4+8-jvmci-22.2-b06)

and

native-image --version
GraalVM 22.3.0-dev Java 17 CE (Java Version 17.0.4+7-jvmci-22.3-b02)

from two days ago.

Image builds fine, but fails to run:

./main schema.xsd document.xml                                 
Exception in thread "main" com.sun.org.apache.xerces.internal.utils.ConfigurationError: Provider com.sun.org.apache.xerces.internal.impl.dv.xs.SchemaDVFactoryImpl not found
	at java.xml@17.0.4/com.sun.org.apache.xerces.internal.utils.ObjectFactory.newInstance(ObjectFactory.java:168)
	at java.xml@17.0.4/com.sun.org.apache.xerces.internal.utils.ObjectFactory.newInstance(ObjectFactory.java:148)
	at java.xml@17.0.4/com.sun.org.apache.xerces.internal.impl.dv.SchemaDVFactory.getInstance(SchemaDVFactory.java:73)
	at java.xml@17.0.4/com.sun.org.apache.xerces.internal.impl.dv.SchemaDVFactory.getInstance(SchemaDVFactory.java:57)
	at java.xml@17.0.4/com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.reset(XMLSchemaLoader.java:1053)
	at java.xml@17.0.4/com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.loadGrammar(XMLSchemaLoader.java:564)
	at java.xml@17.0.4/com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.loadGrammar(XMLSchemaLoader.java:543)
	at java.xml@17.0.4/com.sun.org.apache.xerces.internal.jaxp.validation.XMLSchemaFactory.newSchema(XMLSchemaFactory.java:281)
	at java.xml@17.0.4/javax.xml.validation.SchemaFactory.newSchema(SchemaFactory.java:612)
	at java.xml@17.0.4/javax.xml.validation.SchemaFactory.newSchema(SchemaFactory.java:628)
	at simons.Main.main(Main.java:19)

I have updated the zip (xmlschemaissue2.zip); I had forgotten to include the Main file in the one above. To reproduce, please see:

Download:

curl -Lo xmlschemaissue2.zip https://github.com/oracle/graal/files/9220270/xmlschemaissue2.zip
unzip xmlschemaissue2.zip
javac simons/Main.java --release 17
native-image simons.Main main
./main schema.xsd document.xml    

@fniephaus fniephaus reopened this Jul 29, 2022
@fniephaus
Member

Apologies, I may have tested the fix with the wrong JDK, so thanks for checking again. I've opened #4770, which should resolve the issue.

@fniephaus
Member

Could you please try again with a dev build in a couple of days? Thanks!

@michael-simons
Author

Hello @fniephaus,

I'm happy to confirm that both these dev builds work now as expected:

GraalVM 22.3.0-dev Java 17 CE (Java Version 17.0.4+7-jvmci-22.3-b02)
GraalVM 22.3.0-dev Java 11 CE (Java Version 11.0.16+7-jvmci-22.3-b02)

(From August 5th, I think).

Thanks for taking care of what may be a somewhat "exotic" use case these days.

@fniephaus
Member

Yay! Thanks for confirming, glad it's working now :)
