Use String.intern() for Annotation and Class scanning [SPR-14862] #19428
Comments
Juergen Hoeller commented Good point! Revised in our fork of ASM for Spring Framework 4.3.4, due towards the end of this week. |
Ivan Sopov commented This change looks really suspicious, since it breaks the simple rule of thumb "never ever use the String.intern() method". More information can be found at https://youtu.be/YgGAUGC9ksk?t=1739 (the point in the video that deals with String.intern()) and at https://shipilev.net/talks/joker-Oct2014-string-catechism.pdf (slides 48-59). Let me quote it: Q: I will use String.intern just on this tiny little location. It seems that this is the first usage of String.intern() in spring-framework. How about removing it and banning such usages in the future? |
Juergen Hoeller commented
|
Oleg Poleshuk commented Hi, I'm pretty sure you have drawn your conclusions based on solid research: performance analysis before/after, bytecode analysis, tests on different JVM versions. |
Juergen Hoeller commented Oleg Poleshuk, in all fairness, that was a rather passive-aggressive comment that I don't consider justified here... neither to myself nor to the original reporter. I'll nevertheless take the bait: Potential negative effects of String interning are indeed JVM-dependent. I am not concerned about increased perm-gen consumption since we're just interning class names which end up on perm-gen in any case. Based on educated armchair reasoning, I don't see an issue here. Are we really concerned about interning the names of a few classes which might end up not getting loaded eventually, despite being in the scanned packages of the deployment unit? You do have a point that this is a tradeoff, so I'm happy to learn about any specific effects you are concerned about and reconsider the change for 4.3.5 accordingly. Let me turn the need for a proof around: Have you done solid research about the negative effects of String interning in the ASM class name parsing of Spring Framework 4.3.4, to rephrase your own words? Feel free to give Renier Roth, as the original reporter, could you elaborate on your specific motivation for String interning here? In particular if I'm not representing it accurately in my comment above? |
Juergen Hoeller commented With respect to the performance of the Generally, if we only did such fine-tuning based on "solid research" along the lines above, we'd never get around to any fine-tuning at all. There has to be some pragmatism in this, along with the willingness to reconsider if negative effects show up. This is particularly the case for concurrency fine-tuning of which we had a lot in recent years: mostly based on assumptions, always a tradeoff, never ideal for anybody, essentially just about finding a fine-tuned compromise that is good enough for the mainstream case. The only way of getting there is releasing it and then iteratively interacting with our stakeholders. |
Renier Roth commented Hi, it is as Juergen wrote; I see it the same way. I researched the fact that class names were held multiple times in memory as Strings. This was done via JProfiler and a duplicate String search on a heap dump. I did not take screenshots, but anyone can do this on a heap dump and search for duplicate class names. It is also the case (described in the issue) that the name of every annotation you put on a managed bean (parsed by this scanner) is held in memory as a String. So if you have 1000 managed beans and each has an annotation "Tag" inside the package "com.somreallylongname.models.somthing.else.another", you have the String "com.somreallylongname.models.somthing.else.another.Tag" in your heap memory 1000 times. And this is only one annotation. All class names were held in memory for managed beans, even the method return types. After switching to Spring 4.3.4, I checked again with the JProfiler duplicate String feature on a heap dump. The class names are no longer duplicated as Strings. That's why I created the other ticket, where some other Strings popped up, caused by ClassReader. |
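The duplication described above can be reproduced in isolation. The following is only a minimal sketch, not code from Spring or ASM: the class `DuplicateClassNameDemo`, the method `countDistinctInstances` and the name `com/example/models/Tag` are hypothetical. It decodes the same name once per "bean" and counts distinct String instances by identity rather than by value.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical demo class; none of these names exist in Spring or ASM.
final class DuplicateClassNameDemo {

    // Counts distinct String *instances* (compared by identity, not by value)
    // produced when the same annotation name is decoded once per bean.
    static int countDistinctInstances(int beanCount, boolean intern) {
        char[] buf = "com/example/models/Tag".toCharArray();
        Set<String> instances = Collections.newSetFromMap(new IdentityHashMap<>());
        for (int i = 0; i < beanCount; i++) {
            // Same shape as the code in the issue: a fresh String on every call.
            String name = new String(buf, 0, buf.length).replace('/', '.');
            instances.add(intern ? name.intern() : name);
        }
        return instances.size();
    }

    public static void main(String[] args) {
        System.out.println("without intern: " + countDistinctInstances(1000, false));
        System.out.println("with intern:    " + countDistinctInstances(1000, true));
    }
}
```

Without interning, every call yields its own instance; with interning, all calls share one canonical instance. Whether the saved memory matters in a real application depends on how many such names are retained, which is what the heap dump comparison above measures.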
Oleg Poleshuk commented Thanks for the quick answers. I was surprised to get a response at all; usually it can take weeks to get a response in huge projects like Spring, so I appreciate your responses. The performance argument regarding "happens only during bootstrap" is only partially valid, unfortunately. If we apply the argument regarding CPU ("that aside, the rather expensive I/O cost of classpath scanning will easily outweight any CPU cycles spent on String comparisons for interning") to memory consumption, I would say that the memory effect of using intern() is almost invisible compared to the overall memory consumption of most enterprise Spring projects (JPA, Hibernate, REST, Tomcat consume a lot of memory). String deduplication has been available since Java 8 update 20: https://blog.codecentric.de/en/2014/08/string-deduplication-new-feature-java-8-update-20-2/ |
Juergen Hoeller commented Good point about the impact on garbage collection, and thanks for the specific pointers! Given that we're interning class names only, I would not expect the String table to grow, at least not significantly... since it'll contain the class names eventually anyway. As a consequence, there shouldn't be noteworthy impact on GC either. If there turns out to be a measurable difference in GC, I'm happy to reconsider. Without that, I'm inclined to leave the arrangement as-is: It has a positive impact for the original reporter's scenario, at least, and no proven negative impact in other scenarios yet. String deduplication is indeed a nice recent JVM feature, and I expect Compact Strings in JDK 9 to have quite an impact as well. Worth a note: The author of the String deduplication article wrote about String interning as well, highlighting the tradeoff but clearly not arguing that it should never be used at all: https://blog.codecentric.de/en/2012/03/save-memory-by-using-string-intern-in-java/ ... From my perspective, it should never be used in common application code. Even in our framework case, we are using it for a very specific kind of String only, certainly not recommending it for general use across the codebase. |
Renier Roth commented As written above: I would say that every byte you can save makes a difference; I could even measure it. Saying that other frameworks are worse at memory management does not make this tuning invalid. And yes, I would not use intern() on every line of code, because of the negative effects described, but as commented before, it happens during classpath scanning and the Strings are class names and types that are most likely inside the StringTable already, or should be. The String deduplication feature from Java 8 does not conflict with this tuning, and the tuning also works on older JVMs. |
Ivan Sopov commented Just one more note: these two intern() calls place two different kinds of class names into this table, one with "." and one with "/" separators. It seems to me that only one of them is placed there by ClassLoaders. |
Juergen Hoeller commented Good point: So effectively, we'll have two representations of every class name in the String pool then. Both variants might end up there in any case (since the slash variant corresponds to the internal resource name that the JVM tracks per class). Admittedly, this is making assumptions about the JVM's default interning... but then again, the JVM and the JDK standard libraries rather aggressively intern String literals and in particular reflection artifact names (see the FWIW, checking other common libraries, there is a lot of |
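The two representations discussed above can be illustrated with a small sketch. It is an illustration only: the class `SeparatorFormsDemo` and its method are hypothetical, and the sketch says nothing about which of the two forms a particular JVM interns on its own.

```java
// Hypothetical demo class; not part of Spring or ASM.
final class SeparatorFormsDemo {

    // Interns the slash-separated (internal) form and the dot-separated
    // (binary) form of a class name, then reports whether both calls
    // resolved to the same pooled instance.
    static boolean sameInstanceAfterIntern(String internalName) {
        String slashForm = new String(internalName.toCharArray()).intern();
        String dotForm = slashForm.replace('/', '.').intern();
        return slashForm == dotForm;
    }

    public static void main(String[] args) {
        // Packaged class: the two forms differ in content, so two pool entries.
        System.out.println(sameInstanceAfterIntern("java/lang/String"));
        // Default-package class: no '/' to replace, so only one entry.
        System.out.println(sameInstanceAfterIntern("Foo"));
    }
}
```

For any class inside a package, the two forms are different Strings and therefore occupy two pool entries. Only names without a separator collapse into one entry, because `String.replace(char, char)` returns the same object when the character to replace does not occur.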
Renier Roth opened SPR-14862 and commented
Consider using String.intern() on the Type scanned by the Visitors. These Strings are always identical but are duplicated in memory because of the new String() call.
Class: org.springframework.asm.Type
Line 565 & 580
example:
{CODE:linenumbers=true}
/**
 * Returns the binary name of the class corresponding to this type. This
 * method must not be used on method types.
 * @return the binary name of the class corresponding to this type.
 */
public String getClassName() {
    switch (sort) {
    case VOID:
        return "void";
    case BOOLEAN:
        return "boolean";
    case CHAR:
        return "char";
    case BYTE:
        return "byte";
    case SHORT:
        return "short";
    case INT:
        return "int";
    case FLOAT:
        return "float";
    case LONG:
        return "long";
    case DOUBLE:
        return "double";
    case ARRAY:
        StringBuilder sb = new StringBuilder(getElementType().getClassName());
        for (int i = getDimensions(); i > 0; --i) {
            sb.append("[]");
        }
        return sb.toString();
    case OBJECT:
        return new String(buf, off, len).replace('/', '.');
    default:
        return null;
    }
}

/**
 * @return the internal name of the class corresponding to this object type.
 */
public String getInternalName() {
    return new String(buf, off, len);
}
{CODE}
Changed to:
{CODE:linenumbers=true}
/**
 * Returns the binary name of the class corresponding to this type. This
 * method must not be used on method types.
 * @return the binary name of the class corresponding to this type.
 */
public String getClassName() {
    switch (sort) {
    case VOID:
        return "void";
    case BOOLEAN:
        return "boolean";
    case CHAR:
        return "char";
    case BYTE:
        return "byte";
    case SHORT:
        return "short";
    case INT:
        return "int";
    case FLOAT:
        return "float";
    case LONG:
        return "long";
    case DOUBLE:
        return "double";
    case ARRAY:
        StringBuilder sb = new StringBuilder(getElementType().getClassName());
        for (int i = getDimensions(); i > 0; --i) {
            sb.append("[]");
        }
        return sb.toString();
    case OBJECT:
        return new String(buf, off, len).replace('/', '.').intern();
    default:
        return null;
    }
}

/**
 * @return the internal name of the class corresponding to this object type.
 */
public String getInternalName() {
    return new String(buf, off, len).intern();
}
{CODE}
The differences are in lines 34 & 49.
These methods are used by several visitors during Class/Annotation scanning. The names of these classes are then cached, but each cached name uses a new String reference in memory.
By using String.intern() we can avoid duplicated Strings.
Memory consumption and the count of duplicated Strings depend on how many annotations you have in your managed beans.
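The effect of the proposed change can be tried outside the framework with a stripped-down stand-in. This is a sketch under assumptions: `MiniType` is a hypothetical class that models only the OBJECT case shown above, and `fromObjectDescriptor` assumes a descriptor of the form `Lcom/example/Foo;`. It is not the actual `org.springframework.asm.Type`.

```java
// Hypothetical, stripped-down stand-in for org.springframework.asm.Type;
// only the OBJECT case from the issue is modeled.
final class MiniType {

    private final char[] buf;
    private final int off;
    private final int len;

    private MiniType(char[] buf, int off, int len) {
        this.buf = buf;
        this.off = off;
        this.len = len;
    }

    // Assumes an object descriptor such as "Ljava/util/List;": the leading 'L'
    // and the trailing ';' are skipped, leaving the internal name.
    static MiniType fromObjectDescriptor(String descriptor) {
        char[] chars = descriptor.toCharArray();
        return new MiniType(chars, 1, chars.length - 2);
    }

    // Interned variants, following the "Changed to" listing in the issue.
    String getInternalName() {
        return new String(buf, off, len).intern();
    }

    String getClassName() {
        return new String(buf, off, len).replace('/', '.').intern();
    }

    public static void main(String[] args) {
        // Two independent instances with separate buffers, as a scanner
        // would create when visiting two different classes.
        MiniType first = fromObjectDescriptor("Ljava/util/List;");
        MiniType second = fromObjectDescriptor("Ljava/util/List;");
        System.out.println(first.getClassName());
        System.out.println(first.getClassName() == second.getClassName());
    }
}
```

With the `intern()` calls in place, both instances return the same String object even though each one decodes its own buffer. Without those calls, the reference comparison in `main` would be false.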
Affects: 4.3.3
Issue Links:
Referenced from: commits d859826, 61d7d16