ggimage
Add Image Caching Mechanism to Improve Performance of geom_image
Description:
This pull request introduces an image caching mechanism to geom_image, significantly improving performance for plots with multiple images or repeated plot generations.
Let's consider the example provided:
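library(ggplot2)
library(ggimage)

# Collect the PNG files bundled with ggimage
images <- list.files(system.file("extdata", package = "ggimage"),
                     pattern = "png", full.names = TRUE)
# 20 x 20 grid of points, each assigned a randomly chosen image
df <- data.frame(x = rep(1:20, each = 20),
                 y = rep(1:20, 20),
                 image = sample(images, 400, replace = TRUE))

ggplot(df, aes(x, y)) +
  geom_image(aes(image = image), size = 0.04)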
This example creates a plot with 400 image points, potentially using multiple unique images repeatedly. Without caching:
- Each of the 400 points would require loading its image from disk, even if it's a duplicate.
- For large datasets or repeated plot generations (e.g., in Shiny apps), this could lead to significant performance issues.
With caching:
- Each unique image is loaded only once and stored in memory.
- Subsequent uses of the same image retrieve it from the cache instead of reloading from disk.
- This significantly reduces I/O operations and improves rendering speed, especially for larger datasets or interactive applications.
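The lookup described above can be sketched as an environment keyed by file path. This is a minimal illustration, not the code in this pull request: `image_cache`, `read_bytes`, and `load_cached` are hypothetical names, and the default loader only reads raw bytes so the sketch runs without extra packages. An actual image reader (for example `magick::image_read`) could be passed as `loader` instead.

```r
# Illustrative sketch only; these names are hypothetical, not ggimage's API.
image_cache <- new.env(parent = emptyenv())

# Stand-in loader: reads the file's raw bytes (assumption made so the sketch
# has no package dependencies).
read_bytes <- function(path) readBin(path, what = "raw", n = file.size(path))

load_cached <- function(path, loader = read_bytes) {
  if (!exists(path, envir = image_cache, inherits = FALSE)) {
    # Cache miss: load from disk once and store the result under its path.
    assign(path, loader(path), envir = image_cache)
  }
  # Cache hit (or freshly stored value): return the in-memory copy.
  get(path, envir = image_cache, inherits = FALSE)
}

# Usage: 400 requests over 3 unique files trigger only 3 disk reads.
paths <- replicate(3, tempfile(fileext = ".png"))
for (p in paths) writeBin(as.raw(1:10), p)

n_reads <- 0
counting_loader <- function(path) {
  n_reads <<- n_reads + 1  # count how often the disk is actually touched
  read_bytes(path)
}

requests <- rep(paths, length.out = 400)
imgs <- lapply(requests, load_cached, loader = counting_loader)
n_reads  # 3
```

Keying on the path keeps the lookup cheap, at the cost of not noticing when a file changes on disk after it has been cached.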
To demonstrate, you could run a simple benchmark:
library(microbenchmark)
> # With caching
> microbenchmark(
+ print(ggplot(df, aes(x, y)) + geom_image(aes(image=image), size=0.04)),
+ times = 10
+ )
Unit: milliseconds
expr min lq mean median uq max neval
print(ggplot(df, aes(x, y)) + geom_image(aes(image = image), size = 0.04)) 570.8385 574.692 600.3304 582.8108 597.2787 733.2289 10
> # Without caching
> microbenchmark(
+ print(ggplot(df, aes(x, y)) + geom_image(aes(image=image), size=0.04)),
+ times = 10
+ )
Unit: seconds
expr min lq mean median uq max neval
print(ggplot(df, aes(x, y)) + geom_image(aes(image = image), size = 0.04)) 6.55834 48.1042 45.05545 49.15449 49.98102 51.79424 10
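In the runs above, the median time fell from about 49.2 s without caching to about 583 ms with caching, roughly an 84-fold improvement. The gain should be most visible on repeated runs, once the cache is populated.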
Key changes:
- Implemented an internal cache using an environment to store loaded images.
- Modified imageGrob and related functions to utilize the cache.
- Added functions to manage the cache (clear cache, get cache size).
- Removed alpha; opacity is used instead.
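The cache-management functions listed above might look roughly like the following. This is a sketch under assumptions: the pull request description does not give the function names, so `cache_size` and `clear_cache` are hypothetical, and "cache size" is taken here to mean the number of cached entries rather than bytes.

```r
# Illustrative sketch only; function names are hypothetical.
image_cache <- new.env(parent = emptyenv())

# Number of entries currently held in the cache.
cache_size <- function() length(ls(image_cache, all.names = TRUE))

# Remove every entry so that images are reloaded from disk on next use.
clear_cache <- function() {
  rm(list = ls(image_cache, all.names = TRUE), envir = image_cache)
  invisible(NULL)
}

# Usage with two placeholder entries standing in for loaded images.
assign("a.png", as.raw(1:3), envir = image_cache)
assign("b.png", as.raw(4:6), envir = image_cache)
size_before <- cache_size()  # 2
clear_cache()
size_after <- cache_size()   # 0
```

Clearing the cache matters in long-running sessions such as Shiny apps, where image files may change on disk or memory use may need to be bounded.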
Benefits:
- Reduced disk I/O: Each unique image is loaded only once.
- Improved rendering speed: Subsequent uses of the same image retrieve it from memory.
- Enhanced performance for large datasets and interactive applications.
- Removed the ambiguity between alpha and opacity.
geom_subview also gains a cache, but that is not the main source of the speedup.