Diff doesn't aggregate package updates into single events
The mal diff report is hard to follow, especially when you run mal diff --image instead of extracting the images yourself. It seems that a Deleted event and an Added event for the same package should be collapsed into a single event like Changed: ... libzstd.so.1.5.6 -> libzstd.so.1.5.7 [...]. Do you have any ideas or plans for this kind of aggregation?
mal diff --image --file-risk-change ghcr.io/aquasecurity/trivy:0.65.0 ghcr.io/aquasecurity/trivy:0.66.0 | grep -e Added -e Deleted | sort
├─ 🟡 Added: ghcr.io/aquasecurity/trivy:0.66.0 ∴ /+AfcLNAJNxFxG0hH40=.post-install [MEDIUM]
├─ 🟡 Added: ghcr.io/aquasecurity/trivy:0.66.0 ∴ /+AfcLNAJNxFxG0hH40=.post-upgrade [MEDIUM]
├─ 🟡 Added: ghcr.io/aquasecurity/trivy:0.66.0 ∴ /+AfcLNAJNxFxG0hH40=.pre-install [MEDIUM]
├─ 🟡 Added: ghcr.io/aquasecurity/trivy:0.66.0 ∴ /+AfcLNAJNxFxG0hH40=.pre-upgrade [MEDIUM]
├─ 🛑 Added: ghcr.io/aquasecurity/trivy:0.66.0 ∴ /busybox-1.37.0-r18.Q1IVWNSWjzHcw3fA8n2um7DzK7JdI=.post-install [HIGH]
├─ 🟡 Added: ghcr.io/aquasecurity/trivy:0.66.0 ∴ /busybox-1.37.0-r18.Q1IVWNSWjzHcw3fA8n2um7DzK7JdI=.post-upgrade [MEDIUM]
├─ 🛑 Added: ghcr.io/aquasecurity/trivy:0.66.0 ∴ /busybox-1.37.0-r18.Q1IVWNSWjzHcw3fA8n2um7DzK7JdI=.trigger [HIGH]
├─ 🟡 Added: ghcr.io/aquasecurity/trivy:0.66.0 ∴ /ca-certificates-20250619-r0.Q1O3wy7NQ0LRAM8EyppKJ3AolkYeM=.post-deinstall [MEDIUM]
├─ 🛑 Added: ghcr.io/aquasecurity/trivy:0.66.0 ∴ /ca-certificates-20250619-r0.Q1O3wy7NQ0LRAM8EyppKJ3AolkYeM=.trigger [HIGH]
├─ 🟡 Added: ghcr.io/aquasecurity/trivy:0.66.0 ∴ /libapk.so.2.14.9 [MEDIUM]
├─ 🟡 Added: ghcr.io/aquasecurity/trivy:0.66.0 ∴ /libexpat.so.1.10.2 [MEDIUM]
├─ 🟡 Added: ghcr.io/aquasecurity/trivy:0.66.0 ∴ /libnghttp2.so.14.28.4 [MEDIUM]
├─ 🟡 Added: ghcr.io/aquasecurity/trivy:0.66.0 ∴ /libunistring.so.5.2.0 [MEDIUM]
├─ 🟡 Added: ghcr.io/aquasecurity/trivy:0.66.0 ∴ /libzstd.so.1.5.7 [MEDIUM]
├─ 🛑 Deleted: ghcr.io/aquasecurity/trivy:0.65.0 ∴ /0NTXAhIjY7Nqo=.post-install [HIGH]
├─ 🟡 Deleted: ghcr.io/aquasecurity/trivy:0.65.0 ∴ /0NTXAhIjY7Nqo=.post-upgrade [MEDIUM]
├─ 🛑 Deleted: ghcr.io/aquasecurity/trivy:0.65.0 ∴ /0NTXAhIjY7Nqo=.trigger [HIGH]
├─ 🟡 Deleted: ghcr.io/aquasecurity/trivy:0.65.0 ∴ /ca-certificates-20250619-r0.Q1xUNRT2WUrGiLIMFZ+1e2JbKz6MQ=.post-deinstall [MEDIUM]
├─ 🛑 Deleted: ghcr.io/aquasecurity/trivy:0.65.0 ∴ /ca-certificates-20250619-r0.Q1xUNRT2WUrGiLIMFZ+1e2JbKz6MQ=.trigger [HIGH]
├─ 🟡 Deleted: ghcr.io/aquasecurity/trivy:0.65.0 ∴ /iSXcJI1Vf8x0TVc9Y=.post-install [MEDIUM]
├─ 🟡 Deleted: ghcr.io/aquasecurity/trivy:0.65.0 ∴ /iSXcJI1Vf8x0TVc9Y=.post-upgrade [MEDIUM]
├─ 🟡 Deleted: ghcr.io/aquasecurity/trivy:0.65.0 ∴ /iSXcJI1Vf8x0TVc9Y=.pre-install [MEDIUM]
├─ 🟡 Deleted: ghcr.io/aquasecurity/trivy:0.65.0 ∴ /iSXcJI1Vf8x0TVc9Y=.pre-upgrade [MEDIUM]
├─ 🟡 Deleted: ghcr.io/aquasecurity/trivy:0.65.0 ∴ /libapk.so.2.14.0 [MEDIUM]
├─ 🟡 Deleted: ghcr.io/aquasecurity/trivy:0.65.0 ∴ /libexpat.so.1.10.1 [MEDIUM]
├─ 🟡 Deleted: ghcr.io/aquasecurity/trivy:0.65.0 ∴ /libnghttp2.so.14.28.3 [MEDIUM]
├─ 🟡 Deleted: ghcr.io/aquasecurity/trivy:0.65.0 ∴ /libunistring.so.5.1.0 [MEDIUM]
├─ 🟡 Deleted: ghcr.io/aquasecurity/trivy:0.65.0 ∴ /libzstd.so.1.5.6 [MEDIUM]
mal diff --file-risk-change 0.65.0-rootfs/ 0.66.0-rootfs/ | grep -e Added -e Deleted | sort
├─ 🟡 Added: 0.66.0-rootfs/lib/apk/db/scripts.tar ∴ /alpine-baselayout-3.7.0-r0.Q1KfmXSO6h/+AfcLNAJNxFxG0hH40=.post-install [MEDIUM]
├─ 🟡 Added: 0.66.0-rootfs/lib/apk/db/scripts.tar ∴ /alpine-baselayout-3.7.0-r0.Q1KfmXSO6h/+AfcLNAJNxFxG0hH40=.post-upgrade [MEDIUM]
├─ 🟡 Added: 0.66.0-rootfs/lib/apk/db/scripts.tar ∴ /alpine-baselayout-3.7.0-r0.Q1KfmXSO6h/+AfcLNAJNxFxG0hH40=.pre-install [MEDIUM]
├─ 🟡 Added: 0.66.0-rootfs/lib/apk/db/scripts.tar ∴ /alpine-baselayout-3.7.0-r0.Q1KfmXSO6h/+AfcLNAJNxFxG0hH40=.pre-upgrade [MEDIUM]
├─ 🛑 Added: 0.66.0-rootfs/lib/apk/db/scripts.tar ∴ /busybox-1.37.0-r18.Q1IVWNSWjzHcw3fA8n2um7DzK7JdI=.post-install [HIGH]
├─ 🟡 Added: 0.66.0-rootfs/lib/apk/db/scripts.tar ∴ /busybox-1.37.0-r18.Q1IVWNSWjzHcw3fA8n2um7DzK7JdI=.post-upgrade [MEDIUM]
├─ 🛑 Added: 0.66.0-rootfs/lib/apk/db/scripts.tar ∴ /busybox-1.37.0-r18.Q1IVWNSWjzHcw3fA8n2um7DzK7JdI=.trigger [HIGH]
├─ 🟡 Added: 0.66.0-rootfs/lib/apk/db/scripts.tar ∴ /ca-certificates-20250619-r0.Q1O3wy7NQ0LRAM8EyppKJ3AolkYeM=.post-deinstall [MEDIUM]
├─ 🛑 Added: 0.66.0-rootfs/lib/apk/db/scripts.tar ∴ /ca-certificates-20250619-r0.Q1O3wy7NQ0LRAM8EyppKJ3AolkYeM=.trigger [HIGH]
├─ 🟡 Added: 0.66.0-rootfs/usr/lib/libapk.so.2.14.9 [MEDIUM]
├─ 🟡 Added: 0.66.0-rootfs/usr/lib/libexpat.so.1.10.2 [MEDIUM]
├─ 🟡 Added: 0.66.0-rootfs/usr/lib/libnghttp2.so.14.28.4 [MEDIUM]
├─ 🟡 Added: 0.66.0-rootfs/usr/lib/libunistring.so.5.2.0 [MEDIUM]
├─ 🟡 Added: 0.66.0-rootfs/usr/lib/libzstd.so.1.5.7 [MEDIUM]
├─ 🟡 Deleted: 0.65.0-rootfs/lib/apk/db/scripts.tar ∴ /alpine-baselayout-3.6.8-r1.Q17OteNVXn9/iSXcJI1Vf8x0TVc9Y=.post-install [MEDIUM]
├─ 🟡 Deleted: 0.65.0-rootfs/lib/apk/db/scripts.tar ∴ /alpine-baselayout-3.6.8-r1.Q17OteNVXn9/iSXcJI1Vf8x0TVc9Y=.post-upgrade [MEDIUM]
├─ 🟡 Deleted: 0.65.0-rootfs/lib/apk/db/scripts.tar ∴ /alpine-baselayout-3.6.8-r1.Q17OteNVXn9/iSXcJI1Vf8x0TVc9Y=.pre-install [MEDIUM]
├─ 🟡 Deleted: 0.65.0-rootfs/lib/apk/db/scripts.tar ∴ /alpine-baselayout-3.6.8-r1.Q17OteNVXn9/iSXcJI1Vf8x0TVc9Y=.pre-upgrade [MEDIUM]
├─ 🛑 Deleted: 0.65.0-rootfs/lib/apk/db/scripts.tar ∴ /busybox-1.37.0-r12.Q1sSNCl4MTQ0d1V/0NTXAhIjY7Nqo=.post-install [HIGH]
├─ 🟡 Deleted: 0.65.0-rootfs/lib/apk/db/scripts.tar ∴ /busybox-1.37.0-r12.Q1sSNCl4MTQ0d1V/0NTXAhIjY7Nqo=.post-upgrade [MEDIUM]
├─ 🛑 Deleted: 0.65.0-rootfs/lib/apk/db/scripts.tar ∴ /busybox-1.37.0-r12.Q1sSNCl4MTQ0d1V/0NTXAhIjY7Nqo=.trigger [HIGH]
├─ 🟡 Deleted: 0.65.0-rootfs/lib/apk/db/scripts.tar ∴ /ca-certificates-20250619-r0.Q1xUNRT2WUrGiLIMFZ+1e2JbKz6MQ=.post-deinstall [MEDIUM]
├─ 🛑 Deleted: 0.65.0-rootfs/lib/apk/db/scripts.tar ∴ /ca-certificates-20250619-r0.Q1xUNRT2WUrGiLIMFZ+1e2JbKz6MQ=.trigger [HIGH]
├─ 🟡 Deleted: 0.65.0-rootfs/usr/lib/libapk.so.2.14.0 [MEDIUM]
├─ 🟡 Deleted: 0.65.0-rootfs/usr/lib/libexpat.so.1.10.1 [MEDIUM]
├─ 🟡 Deleted: 0.65.0-rootfs/usr/lib/libnghttp2.so.14.28.3 [MEDIUM]
├─ 🟡 Deleted: 0.65.0-rootfs/usr/lib/libunistring.so.5.1.0 [MEDIUM]
├─ 🟡 Deleted: 0.65.0-rootfs/usr/lib/libzstd.so.1.5.6 [MEDIUM]
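To make the request concrete, here is a minimal sketch of the aggregation described above, written in Python for illustration only. It is not how mal works internally; the `aggregate` function and the `SO_VERSION` pattern are hypothetical names, and it assumes shared objects are named `<stem>.so.<version>`. Names that do not pair up would stay as plain Added or Deleted events.

```python
import re
from collections import defaultdict

# Hypothetical pattern: split "libzstd.so.1.5.6" into stem "libzstd.so" and version "1.5.6".
SO_VERSION = re.compile(r"^(?P<stem>.+\.so)\.(?P<version>\d+(?:\.\d+)*)$")


def aggregate(deleted, added):
    """Pair deleted and added shared objects with the same stem into Changed events."""
    by_stem = defaultdict(dict)
    for kind, names in (("old", deleted), ("new", added)):
        for name in names:
            m = SO_VERSION.match(name)
            if m:
                by_stem[m["stem"]][kind] = m["version"]
    events = []
    for stem, versions in sorted(by_stem.items()):
        # Only emit a Changed event when both sides are present.
        if "old" in versions and "new" in versions:
            events.append(
                f"Changed: {stem}.{versions['old']} -> {stem}.{versions['new']}"
            )
    return events


events = aggregate(
    ["libzstd.so.1.5.6", "libapk.so.2.14.0"],
    ["libzstd.so.1.5.7", "libapk.so.2.14.9"],
)
```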
Oh, good call. I don't think this is something we've considered previously but I'm happy to work on improving the legibility of diffs given how much more information is displayed.
Can you try the latest release (1.18.0) and see if the new output with and without --score-all improves your experience when running diffs?
I tested with the Trivy images mentioned above and the new output is much more concise, especially when using --score-all.
Yes, it looks great. Regarding the flag, it says this mode is slow, but I didn't notice much of a difference. Is there a formula I can use to calculate/predict how long the analysis will take?
--score-all Compute the Levenshtein distance for all source and destination paths (warning: experimental and slow!) (default: false)
mal diff --score-all 0.65.0-rootfs/ 0.66.0-rootfs/
├─ 🟡 Added: /Users/maxim/images/0.66.0-rootfs ∴ 0.66.0-rootfs/usr/bin/iconv [MEDIUM]
│ ≡ anti-static [MEDIUM]
│ 🟡 binary/opaque — binary contains little text content: destination charset, source charset, write error
│
├─ 🟡 Moved: 0.65.0-rootfs/usr/bin/getent -> 0.66.0-rootfs/usr/bin/getent (score: 1.000000)
│ ≡ networking [MEDIUM]
│- 🟡 ip/host_port — connects to an arbitrary hostname:port
│
├─ 🟡 Moved: 0.65.0-rootfs/usr/lib/libcrypto.so.3 -> 0.66.0-rootfs/usr/lib/libcrypto.so.3 (score: 1.000000)
│ ≡ discovery [LOW]
│+ 🔵 user/USER — Looks up the USER name of the current user: getenv, ENV
│
├─ 🛑 Moved: 0.65.0-rootfs/usr/lib/libcurl.so.4.8.0 -> 0.66.0-rootfs/usr/lib/libcurl.so.4.8.0 (score: 1.000000)
│ ≡ networking [MEDIUM]
│+ 🟡 ip/icmp — Uses the ping tool to generate ICMP packets: ping response., ping request.
│
├─ 🟡 Moved: 0.65.0-rootfs/usr/lib/libssl.so.3 -> 0.66.0-rootfs/usr/lib/libssl.so.3 (score: 1.000000)
│ ≡ networking [MEDIUM]
│+ 🟡 ip/addr — mentions an 'IP address'
│+ 🟡 socket/pair — create a pair of connected sockets: socketpair
│- 🔵 http — Uses the HTTP protocol
│- 🟡 http/post — submits content to websites
│
├─ 🛑 Moved: 0.65.0-rootfs/usr/libexec/git-core/git-http-push -> 0.66.0-rootfs/usr/libexec/git-core/git-http-push (score: 1.000000)
│ ≡ filesystem [LOW]
│+ 🔵 mount — mounts file systems
│
├─ 🟡 Moved: 0.65.0-rootfs/usr/lib/libnghttp2.so.14.28.3 -> 0.66.0-rootfs/usr/lib/libnghttp2.so.14.28.4 (score: 0.971429)
│ ≡ filesystem [LOW]
│- 🔵 file/delete — deletes files
│
├─ 🛑 Changed (1 added, 0 removed): 0.66.0-rootfs/lib/apk/db/scripts.tar ∴ /ca-certificates-20250619-r0.Q1O3wy7NQ0LRAM8EyppKJ3AolkYeM=.trigger
│ ≡ execution [MEDIUM]
│+ 🟡 shell/ignore_output — Runs shell commands but throws output away: /usr/sbin/update-ca-certificates > /dev/null 2>&1
│
mal diff --score-all --image ghcr.io/aquasecurity/trivy:0.65.0 ghcr.io/aquasecurity/trivy:0.66.0
├─ 🟡 Added: ghcr.io/aquasecurity/trivy:0.66.0 ∴ /usr/bin/iconv [MEDIUM]
│ ≡ anti-static [MEDIUM]
│ 🟡 binary/opaque — binary contains little text content: destination charset, source charset, write error
│
├─ 🟡 Moved: ghcr.io/aquasecurity/trivy:0.65.0 ∴ /usr/bin/getent -> ghcr.io/aquasecurity/trivy:0.66.0 ∴ /usr/bin/getent (score: 1.000000)
│ ≡ networking [MEDIUM]
│- 🟡 ip/host_port — connects to an arbitrary hostname:port
│
├─ 🟡 Moved: ghcr.io/aquasecurity/trivy:0.65.0 ∴ /usr/lib/libcrypto.so.3 -> ghcr.io/aquasecurity/trivy:0.66.0 ∴ /usr/lib/libcrypto.so.3 (score: 1.000000)
│ ≡ discovery [LOW]
│+ 🔵 user/USER — Looks up the USER name of the current user: getenv, ENV
│
├─ 🛑 Moved: ghcr.io/aquasecurity/trivy:0.65.0 ∴ /usr/lib/libcurl.so.4.8.0 -> ghcr.io/aquasecurity/trivy:0.66.0 ∴ /usr/lib/libcurl.so.4.8.0 (score: 1.000000)
│ ≡ networking [MEDIUM]
│+ 🟡 ip/icmp — Uses the ping tool to generate ICMP packets: ping response., ping request.
│
├─ 🟡 Moved: ghcr.io/aquasecurity/trivy:0.65.0 ∴ /usr/lib/libssl.so.3 -> ghcr.io/aquasecurity/trivy:0.66.0 ∴ /usr/lib/libssl.so.3 (score: 1.000000)
│ ≡ networking [MEDIUM]
│+ 🟡 ip/addr — mentions an 'IP address'
│+ 🟡 socket/pair — create a pair of connected sockets: socketpair
│- 🔵 http — Uses the HTTP protocol
│- 🟡 http/post — submits content to websites
│
├─ 🛑 Moved: ghcr.io/aquasecurity/trivy:0.65.0 ∴ /usr/libexec/git-core/git-http-push -> ghcr.io/aquasecurity/trivy:0.66.0 ∴ /usr/libexec/git-core/git-http-push (score: 1.000000)
│ ≡ filesystem [LOW]
│+ 🔵 mount — mounts file systems
│
├─ 🟡 Moved: ghcr.io/aquasecurity/trivy:0.65.0 ∴ /usr/lib/libnghttp2.so.14.28.3 -> ghcr.io/aquasecurity/trivy:0.66.0 ∴ /usr/lib/libnghttp2.so.14.28.4 (score: 0.971429)
│ ≡ filesystem [LOW]
│- 🔵 file/delete — deletes files
│
├─ 🛑 Changed (1 added, 0 removed): ghcr.io/aquasecurity/trivy:0.66.0 ∴ /lib/apk/db/scripts.tar ∴ /ca-certificates-20250619-r0.Q1O3wy7NQ0LRAM8EyppKJ3AolkYeM=.trigger
│ ≡ execution [MEDIUM]
│+ 🟡 shell/ignore_output — Runs shell commands but throws output away: /usr/sbin/update-ca-certificates > /dev/null 2>&1
│
Nice! The approach is roughly quadratic (something like $$O(n^2 \log n)$$ complexity), so the runtime will become much more noticeable as the source and destination file counts increase.
I think the Trivy diffs were maybe a couple of dozen files each, so you're looking at $$24 \times 24 = 576$$ pairs, $$576 \log_2 576 \approx 5282$$ comparisons, and then $$\min(24, 24) = 24$$ matches.
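The estimate above can be sketched in Python. This is an illustration under stated assumptions, not malcontent's actual implementation: it scores every source/destination pair with a Levenshtein-based similarity, sorts the pairs best-first, and greedily keeps the best unused match. The function names and the normalization by the longer path length are assumptions, so the scores will not necessarily equal the ones mal prints.

```python
import math
from itertools import product


def levenshtein(a: str, b: str) -> int:
    """Dynamic-programming edit distance, O(len(a) * len(b)) time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Map the distance to [0, 1]; this normalization is an assumption."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def match_paths(src, dst):
    # Score every pair: len(src) * len(dst) entries.
    scored = [(similarity(s, d), s, d) for s, d in product(src, dst)]
    # Sorting the pairs is where the extra log factor comes from.
    scored.sort(key=lambda t: t[0], reverse=True)
    matches, used_src, used_dst = [], set(), set()
    for score, s, d in scored:
        if s in used_src or d in used_dst:
            continue
        matches.append((s, d, score))
        used_src.add(s)
        used_dst.add(d)
    return matches  # at most min(len(src), len(dst)) entries


def estimated_sort_comparisons(n: int, m: int) -> float:
    """Rough comparison count for sorting n * m scored pairs."""
    pairs = n * m
    return pairs * math.log2(pairs) if pairs > 1 else 0.0


matches = match_paths(
    ["/usr/lib/libzstd.so.1.5.6", "/usr/lib/libexpat.so.1.10.1"],
    ["/usr/lib/libexpat.so.1.10.2", "/usr/lib/libzstd.so.1.5.7"],
)
```

With two files on each side, `matches` pairs each library with its newer version. The estimate ignores the cost of each individual Levenshtein computation, which also grows with path length, so treat it as a lower bound on the work rather than a runtime prediction.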
Here's something else I was thinking about. What if I scan different versions of images on different days? Let's say it's a vulnerability scanner like Trivy: it pulls the image, scans it, creates a report, and then deletes the image. If I want to create a mal diff of two images, I have to pull those images again. Is it possible to create a report, aka SBOM, from mal analyze and then perform a mal diff of the two SBOM files? What do you think about that?
Ingesting existing reports into a FileReport and then diffing them isn't something we currently support, but I think it would be useful since the expensive portions of scanning would only need to happen once.
I'll create a backlog item for that and experiment with possible options. Closing the original issue for now.
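For anyone who wants to experiment with the report-diffing idea in the meantime, here is a rough Python sketch. It assumes each saved report has already been reduced to a plain mapping from file path to a set of behavior identifiers. That shape is made up for illustration and is not malcontent's report schema, and `diff_reports` is a hypothetical helper, not a mal API.

```python
def diff_reports(old, new):
    """Compare two {path: set(behavior_id)} mappings (hypothetical shape)."""
    changes = {}
    for path in sorted(old.keys() | new.keys()):
        before = old.get(path, set())
        after = new.get(path, set())
        added, removed = after - before, before - after
        if added or removed:
            changes[path] = {"added": sorted(added), "removed": sorted(removed)}
    return changes


# Behavior identifiers taken from the libssl.so.3 entry in the output above.
old_report = {"/usr/lib/libssl.so.3": {"http", "http/post"}}
new_report = {"/usr/lib/libssl.so.3": {"ip/addr", "socket/pair"}}
changes = diff_reports(old_report, new_report)
```

Because only the stored behavior sets are compared, neither image has to be pulled or scanned again.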