Add a cache to LaunchedURLClassloader to improve startup performance
Motivation
URLClassLoader shows awful performance when loading Classes/Resources which doesn't exist . URLClassPath takes linear time to find a resource and it will traverse all over the classpath even if the resource doesn't exist. This will obviously slow down the startup , which is even worse when running from an executable fat jar or if the application's class path is a long list.
public Resource getResource(String name, boolean check) {
if (DEBUG) {
System.err.println("URLClassPath.getResource(\"" + name + "\")");
}
Loader loader;
for (int i = 0; (loader = getLoader(i)) != null; i++) {
Resource res = loader.getResource(name, check);
if (res != null) {
return res;
}
}
return null;
}
Scenarios which may lead to this issue are as follows.
1.Find a class repeatedly which apparently doesn't exist
- java.beans.Introspector.findCustomizerClass
private static Class<?> findCustomizerClass(Class<?> type) {
String name = type.getName() + "Customizer";
try {
type = ClassFinder.findClass(name, type.getClassLoader());
// Each customizer should inherit java.awt.Component and implement java.beans.Customizer
// according to the section 9.3 of JavaBeans specification
if (Component.class.isAssignableFrom(type) && Customizer.class.isAssignableFrom(type)) {
return type;
}
}
catch (Exception exception) {
// ignore any exceptions
}
return null;
}
related issue: https://github.com/spring-projects/spring-framework/issues/13653
Here are the top 10 calls of loadClass in the startup of my application
Calls ClassName
480 java.lang.ObjectCustomizer
249 org.springframework.beans.factory.AwareCustomizer
99 org.springframework.beans.factory.InitializingBeanCustomizer
93 org.springframework.core.OrderedCustomizer
87 org.springframework.context.ApplicationContextAwareCustomizer
61 org.springframework.beans.factory.BeanFactoryAwareCustomizer
49 org.springframework.context.annotation.ConfigurationClassEnhancer$EnhancedConfigurationCustomizer
41 org.springframework.context.ApplicationListenerCustomizer
31 org.springframework.web.context.ServletContextAwareCustomizer
29 org.springframework.beans.factory.BeanNameAwareCustomizer
- org.apache.logging.log4j.core.lookup.Interpolator
if (Constants.IS_WEB_APP) {
try {
strLookupMap.put(LOOKUP_KEY_WEB,
Loader.newCheckedInstanceOf("org.apache.logging.log4j.web.WebLookup", StrLookup.class));
} catch (final Exception ignored) {
handleError(LOOKUP_KEY_WEB, ignored);
}
} else {
LOGGER.debug("Not in a ServletContext environment, thus not loading WebLookup plugin.");
}
try {
strLookupMap.put(LOOKUP_KEY_DOCKER,
Loader.newCheckedInstanceOf("org.apache.logging.log4j.docker.DockerLookup", StrLookup.class));
} catch (final Exception ignored) {
handleError(LOOKUP_KEY_DOCKER, ignored);
}
try {
strLookupMap.put(LOOKUP_KEY_SPRING,
Loader.newCheckedInstanceOf("org.apache.logging.log4j.spring.boot.SpringLookup", StrLookup.class));
} catch (final Exception ignored) {
handleError(LOOKUP_KEY_SPRING, ignored);
}
try {
strLookupMap.put(LOOKUP_KEY_KUBERNETES,
Loader.newCheckedInstanceOf("org.apache.logging.log4j.kubernetes.KubernetesLookup", StrLookup.class));
} catch (final Exception | NoClassDefFoundError error) {
handleError(LOOKUP_KEY_KUBERNETES, error);
}
Here are the top 4 calls of loadClass in the startup of my application
Calls ClassName
225 org.apache.logging.log4j.docker.DockerLookup
225 org.apache.logging.log4j.kubernetes.KubernetesLookup
225 org.apache.logging.log4j.spring.cloud.config.client.SpringLookup
225 org.apache.logging.log4j.web.WebLookup
- Many other techniques or frameworks which use loadClass to check class's existence For example,org.springframework.http.converter.json.Jackson2ObjectMapperBuilder
private void registerWellKnownModulesIfAvailable(MultiValueMap<Object, Module> modulesToRegister) {
Class kotlinModuleClass;
Module kotlinModule;
try {
kotlinModuleClass = ClassUtils.forName("com.fasterxml.jackson.datatype.jdk8.Jdk8Module", this.moduleClassLoader);
kotlinModule = (Module)BeanUtils.instantiateClass(kotlinModuleClass);
modulesToRegister.set(kotlinModule.getTypeId(), kotlinModule);
} catch (ClassNotFoundException var6) {
}
2.Find resource repeatedly which doesn't exist
Here are the top 4 calls of findResource/findResources in the startup of my application
Calls Resource
106 META-INF/services/org.apache.logging.log4j.core.util.WatchEventService
92 log4j2.system.properties
64 META-INF/services/javax.xml.parsers.DocumentBuilderFactory
29 META-INF/spring.components
28 META-INF/services/javax.xml.transform.TransformerFactory
18 java/lang/com.class
18 Throwable.class
18 Object.class
18 Class.class
17 String.class
8 META-INF/services/javax.xml.datatype.DatatypeFactory
5 config/application.yml
5 config/application.yaml
5 config/application.xml
5 application.yml
5 application.yaml
5 application.xml
3.Find resource repeatedly
For example
Calls Resource
13 org/springframework/boot/autoconfigure/dao/PersistenceExceptionTranslationAutoConfiguration.class
13 org/springframework/boot/actuate/autoconfigure/health/HealthEndpointConfiguration.class
12 org/springframework/orm/ibatis/support/SqlMapClientDaoSupport.class
12 org/springframework/dao/support/DaoSupport.class
12 org/springframework/core/io/ByteArrayResource.class
12 org/springframework/core/io/AbstractResource.class
12 com/github/javaparser/ast/visitor/VoidVisitorAdapter.class
8 config/application.properties
7 application.properties
5 config/application-local.properties
3 org/apache/http/client/version.properties
4.Reflection repeatedly triggers sun.reflect.GeneratedMethodAccessor<N> class loading in runtime
All the cases listed above happen at startup time, while there are cases in runtime. This will make runtime performance worst. Here is a case which triggers sun.reflect.GeneratedMethodAccessor<N> class loading : https://blog.actorsfit.com/a?ID=00001-e5b50c75-c8a5-4cdb-9b0c-5558fb985a60
at java.lang.ClassLoader.loadClass(ClassLoader.java:404)
- waiting to lock <0x0000000088104b60> (a java.lang.Object)
at org.springframework.boot.loader.LaunchedURLClassLoader.loadClass(LaunchedURLClassLoader.java:94)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.logging.log4j.core.impl.ThrowableProxy.loadClass(ThrowableProxy.java:539)
at org.apache.logging.log4j.core.impl.ThrowableProxy.toExtendedStackTrace(ThrowableProxy.java:660)
at org.apache.logging.log4j.core.impl.ThrowableProxy.<init>(ThrowableProxy.java:137)
at org.springframework.boot.loader.LaunchedURLClassLoader.loadClass(LaunchedURLClassLoader.java:94)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.logging.log4j.core.impl.ThrowableProxy.loadClass(ThrowableProxy.java:539)
at org.apache.logging.log4j.core.impl.ThrowableProxy.toExtendedStackTrace(ThrowableProxy.java:660)
at org.apache.logging.log4j.core.impl.ThrowableProxy.<init>(ThrowableProxy.java:137)
at org.apache.logging.log4j.core.impl.ThrowableProxy.<init>(ThrowableProxy.java:121)
at org.apache.logging.log4j.core.impl.Log4jLogEvent.getThrownProxy(Log4jLogEvent.java:555)
at org.apache.logging.log4j.core.pattern.ExtendedThrowablePatternConverter.format(ExtendedThrowablePatternConverter.java:61)
We generally solve these issues mentioned above case by case and solutions are normally tricky. Is there any final solution to solve this fundamentally?
Approach
This commit trying to fundamentally solve this issue by adding a ClassLoaderCache to LaunchedURLClassLoader .
- Cache
ClassNotFoundExceptionand fast throw it for the next time - Cache
getResourceresult directly - Cache
getResourcesNOT-EXIST result - Set system property
loader.cache.enable=trueto enable cache
Benefit
Here are the startup performance of our applications after this enhancement, which shows a 50% acceleration on average
| App | No Cache | Cache | Improvement |
|---|---|---|---|
| app1 | 75s | 39s | 48% |
| app2 | 83s | 32s | 61% |
| app3 | 152s | 87s | 43% |
| app4 | 318s | 137s | 57% |
Enviroment:2C4G , OpenJDK 1.8.0_172, springboot 2.7.0
This looks very promising. Thank you, @zjulbj. Have you done any comparisons of heap usage with and without the cache?
This looks very promising. Thank you, @zjulbj. Have you done any comparisons of heap usage with and without the cache?
Hi wilkinsona. Thanks for your reply. Talking about heap usage , there are two ways to control it
- All caches are instances of fixed size LRU cache, and cache size can be configured by
loader.cache.size(default 3000)
protected <K, V> Map<K, V> createCache(int maxSize) {
return Collections.synchronizedMap(new LinkedHashMap<K, V>(maxSize, 0.75f, true) {
@Override
protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
return size() >= maxSize;
}
});
}
- All caches are cleared after the startup, which is invoked by
org.springframework.boot.ClearCachesApplicationListener
/**
* Clear URL caches.
*/
public void clearCache() {
this.loaderCache.clearCache();
if (this.exploded) {
return;
}
for (URL url : getURLs()) {
try {
URLConnection connection = url.openConnection();
if (connection instanceof JarURLConnection) {
clearCache(connection);
}
}
catch (IOException ex) {
// Ignore
}
}
}
Here is a comparison of memory usage, and I think it's acceptable without cache
HEAP-MEMORY-USAGE init : 268435456(256.0 MiB)
[memory in bytes] used : 1181133080(1.1 GiB)
committed : 1799880704(1.7 GiB)
max : 3817865216(3.6 GiB)
NO-HEAP-MEMORY-USAGE init : 2555904(2.4 MiB)
[memory in bytes] used : 171428168(163.5 MiB)
committed : 176160768(168.0 MiB)
max : -1(-1 B)
PENDING-FINALIZE-COUNT 0
with cache
HEAP-MEMORY-USAGE init : 268435456(256.0 MiB)
[memory in bytes] used : 1036303928(988.3 MiB)
committed : 1913126912(1.8 GiB)
max : 3817865216(3.6 GiB)
NO-HEAP-MEMORY-USAGE init : 2555904(2.4 MiB)
[memory in bytes] used : 171065096(163.1 MiB)
committed : 176160768(168.0 MiB)
max : -1(-1 B)
PENDING-FINALIZE-COUNT 0
Thanks for the heap comparisons! I'm going to bring this up on our next team call.
Hey, thanks for your contribution and your work. We want to run more benchmarks on it and want to explore other paths (like indexing the classpath etc.) and don't want to commit yet on the cache solution. I'll put it on hold until we have more information / ideas.
Hey, thanks for your contribution and your work. We want to run more benchmarks on it and want to explore other paths (like indexing the classpath etc.) and don't want to commit yet on the cache solution. I'll put it on hold until we have more information / ideas.
Many thanks for your reply. If I understand correctly,you want to explore other paths to solve this issue in a more widely view? That is also what I am going to do next. My rough idea is generating a index file mapping class/resource/packages to jar urls in the compile time, a little bit like INDEX.LIST,which I believe can imporve the time complexity to O(1). I not sure whether this is what 'indexing the classpath' meant with,and whether am I on a proper path
Yes, that was meant with "indexing the classpath" :)
Yes, that was meant with "indexing the classpath" :)
OK,thanks a lot. It seems similar issues has been discussed long time ago :) http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-August/035009.html http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-August/035010.html http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-August/035011.html http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-August/035012.html http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-August/035013.html http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-September/035192.html http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-September/035210.html http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-September/035290.html
@mhalbritter Is there any progress in "indexing the classpath"?will it be realize in 2.7.x or only support in 3.x?
This will be 3.0.x at the earliest, but more likely later in the 3.x cycle. I have updated the milestone accordingly.
@wilkinsona @mhalbritter Is there any progress in "indexing the classpath"?I really want to know how this is going, because it seems to really help us shorten the startup time of the application. Because some of our big applications have thousands of dependencies
Not at the moment, unfortunately. We have other, higher-priority work at the moment. Perhaps you could build the changes in this PR and share some before and after numbers for your apps?
Those interested in this PR, it would be really useful to know if you've tried running your app using class data sharing (CDS), particularly with Java 21. We've seen some very promising results with Spring Boot apps and would like to know if you have a similar experience. You can learn more in this blog post from @sdeleuze that includes some information about a script that unpacks a Spring Boot executable jar into a format that's as CDS-friendly as possible.
I wrote about my experience with CAS here.
I did not try the script that does the unpacking, but in a nutshell, what I saw (with a WAR file) was:
jar -xf build/libs/cas.war
cd build/libs
java org.springframework.boot.loader.launch.JarLauncher
...
...
...
<Started CasWebApplication in 7.693 seconds (process running for 9.285)>
Then:
jar -xf build/libs/cas.war
cd build/libs
java -cp "WEB-INF/classes:WEB-INF/lib/*" org.apereo.cas.web.CasWebApplication
...
...
...
<Started CasWebApplication in 6.873 seconds (process running for 7.469)>
and then with CDS:
java -XX:+AutoCreateSharedArchive -XX:SharedArchiveFile=cas.jsa \
org.springframework.boot.loader.launch.JarLauncher
...
...
...
<Started CasWebApplication in 6.766 seconds (process running for 7.582)>
Not that much difference, but I can see what I might have missed with the script.
Please check the guidelines printed when using the unpack-executable-jar.sh script.
It is expected they you get no significant gain when using the JarLauncher.
To start the application from that directory:
$ java -jar run-app.jar
TIP: To improve startup performance, you can create a CDS Archive with a single training run:
$ java -XX:ArchiveClassesAtExit=application.jsa -Dspring.context.exit=onRefresh -jar run-app.jar
Then use the generated cache:
$ java -XX:SharedArchiveFile=application.jsa -jar run-app.jar
See https://docs.spring.io/spring-framework/reference/integration/class-data-sharing.html for more details.
Thanks! I made small modifications to the script so it can run against a WAR file (replaced BOOT-INF with WEB-INF). With that in mind, I am not sure if this is still decent feedback and a good test, but here we go:
unpack-executable-jar.sh -d tmp build/libs/cas.war
Then:
$ java -jar tmp/run-app.jar
.
.
- <Started CasWebApplication in 6.637 seconds (process running for 8.178)>
Then:
java -XX:ArchiveClassesAtExit=application.jsa -Dspring.context.exit=onRefresh -jar tmp/run-app.jar
java -XX:SharedArchiveFile=application.jsa -jar tmp/run-app.jar
.
.
- <Started CasWebApplication in 6.284 seconds (process running for 6.693)>
I had a deeper look, this CAS application seems to perform a lot of processing at startup. To put things in perspective, on my powerful MacBook Pro M2, the CAS application takes 5.285 s to start (after disabling key generation at startup) while Petclinic JDBC takes 1.448 s to start. CDS is improving the loading of the classes, it can't optimize custom application processing at startup.
If we analyze the data points more in details (unpacked variant is done with the unpack-executable-jar.sh script which replicate #38276 behavior):
JARexe: Started CasWebApplication in 7.854 seconds (process running for 8.709) Unpacked: Started CasWebApplication in 4.778 seconds (process running for 5.285) Unpacked with CDS:Started CasWebApplication in 4.36 seconds (process running for 4.712)
JARexe: Started PetClinicApplication in 1.586 seconds (process running for 1.842) Unpacked: Started PetClinicApplication in 1.299 seconds (process running for 1.448) Unpacked with CDS: Started PetClinicApplication in 0.859 seconds (process running for 0.978) Unpacked with CDS + AOT: Started PetClinicApplication in 0.678 seconds (process running for 0.81) Project Leyden + AOT: even more
So CDS seems to optimize pretty consistently the loading of the classes on the 2 samples (0,573 s reduction versus 0,47 s). This half of a second on a powerful machine turns to be multiple seconds when deployed on a cheap Cloud server.
I think the layout used by the script and proposed by #38276 is immune even without CDS to the performance issue discussed in this PR, since it does not use LaunchedURLClassloader. So using both system classloader + CDS provide a significant performance boost.
AOT could be used to optimize even more the Spring Boot configuration processing of this application, but I was not able to make it work with CAS due to some double bean registration (feel free to raise a bug on https://github.com/spring-projects/spring-framework if you think there is something to fix on Spring Framework side).
Cache Data Store (Project Leyden premain) will be able to optimize with a wider scope than just class loading (optionally combined with AOT). It will for example include JVM AOT cache and maybe some classpath check caches. For the rest, that will be to the application itself to make careful use of the processing steps happening before startup.
Thanks very much, @sdeleuze.
Thank you very much @sdeleuze.
(At the moment AOT processing of the CAS application is blocked due to a bug in Spring Data. I will revisit this once the fix is out, but even at the time I was able to build native images, startup time of the native image was still around 2~3 seconds)
I hope this isn't hijacking the thread here, but does the team have any recommendations on how to analyze the startup data and performance beyond the likes of VisualVM, etc? One suspicion I have (at least in the case of this CAS app) is that given it's a Spring Cloud app and uses @RefreshScope, I wonder if all those proxied beans are slowing down the startup. I don't suppose PetClinicApplication uses Spring Cloud, or put differently, have you also done CDS analysis of Spring Cloud apps? Is that not in scope of this exercise?
@mmoayyed You can probably use FlightRecorderApplicationStartup. @RefreshScope can indeed have an impact but the 3 seconds in native tends to show this is more application specific. Now let's maybe indeed focus again on the original purpose of that PR ;-)
Sure thing, thank you very much!