
Remove embedded resources when using dotnet trimmer

Open tranb3r opened this issue 2 years ago • 8 comments

I was previously using Portable.BouncyCastle in my app, and I'd like to migrate to BouncyCastle.Cryptography. However, Portable.BouncyCastle.dll 1.9.0 weighs about 3MB, while BouncyCastle.Cryptography.dll 2.1.1 weighs more than 6MB because of a few embedded resources. In my app I'm using the dotnet trimmer to remove unused code, and this works fine with Portable.BouncyCastle. However, with BouncyCastle.Cryptography, the trimmer does not remove the embedded resources, so the size of my app increases by more than 3MB.

What are those embedded resources? Is there any solution to trim them?

Thanks!

tranb3r avatar Mar 02 '23 16:03 tranb3r

They are the following .bz2 files within the PQC code:

  • pqc/crypto/picnic/lowmcL1.bz2 ("Org.BouncyCastle.pqc.crypto.picnic.lowmcL1.bz2")
  • pqc/crypto/picnic/lowmcL3.bz2 ("Org.BouncyCastle.pqc.crypto.picnic.lowmcL3.bz2")
  • pqc/crypto/picnic/lowmcL5.bz2 ("Org.BouncyCastle.pqc.crypto.picnic.lowmcL5.bz2")
  • pqc/crypto/sike/p434.bz2 ("Org.BouncyCastle.pqc.crypto.sike.p434.bz2")
  • pqc/crypto/sike/p503.bz2 ("Org.BouncyCastle.pqc.crypto.sike.p503.bz2")
  • pqc/crypto/sike/p610.bz2 ("Org.BouncyCastle.pqc.crypto.sike.p610.bz2")
  • pqc/crypto/sike/p751.bz2 ("Org.BouncyCastle.pqc.crypto.sike.p751.bz2")

with a total size of more than 3MB. Sike at least will be removed for v3.0 in any case.

If you trim these, you probably also want to explicitly trim the entire Org.BouncyCastle.Pqc namespace (there are some factory classes in Org.BouncyCastle.Pqc.Crypto.Utilities that might otherwise cause issues).

peterdettman avatar Mar 06 '23 06:03 peterdettman

Ok. However, even when the trimmer removes the entire Org.BouncyCastle.Pqc namespace, it does not remove the embedded resources. Do you know how to force it?

tranb3r avatar Mar 06 '23 10:03 tranb3r

Have you tried removing the Org.BouncyCastle.pqc namespace (lowercase pqc)?

peterdettman avatar Mar 06 '23 11:03 peterdettman

Could you please explain how you'd do that? Right now, I'm just enabling the dotnet trimmer by adding these properties to my csproj:

  <PropertyGroup>
	  <PublishTrimmed>true</PublishTrimmed>
	  <TrimMode>full</TrimMode>
  </PropertyGroup>

tranb3r avatar Mar 06 '23 11:03 tranb3r

No, I'm just learning about trimming for the first time right now. If anyone can PR a change to the library to make it work, it would be welcome.

peterdettman avatar Mar 06 '23 12:03 peterdettman

(linking #504)

Indeed I don't think there's any way to trim out embedded resources from an assembly. But I think there's a way forward:

The sike code is marked obsolete, so if (when?) this is deleted that's an easy 2.2MB gone from the assembly from those 4 bz2 files.

Then we are left with the 3 picnic bz2 files, which are a further 1.2MB. These all look like this:

...
linearMatrices = 46,CD,26,E0,D0 ...
...

So a list of byte values. But each byte value is represented by 3 UTF-8 bytes (2 hex chars and 1 comma).

And we have e.g. lowmcL5.bz2, which is 743KB compressed and 2187KB uncompressed, i.e. the data is compressed by about 3x.

In theory then, if this data were directly embedded in the assembly via the C# source e.g.

private static readonly byte[] linearMatrices = { 0x46, 0xCD, 0x26 , ...

then this would come at no cost to the size of the untrimmed assembly (because the actual data is 3x smaller than the uncompressed file size i.e. the same size as the bz2 files), but being embedded in the source would mean it could be trimmed out. Obviously, the *.cs files would be large but that does not matter for the end result.
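To make the 3-to-1 ratio concrete, here is a minimal sketch that parses the comma-separated hex text shown above into raw bytes. The class and method names (`HexListDemo`, `ParseHexList`) are hypothetical and not part of BouncyCastle; this only illustrates the size argument, not how the library actually loads the data.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, for illustration only (not a BouncyCastle API).
public static class HexListDemo
{
    // Parses a comma-separated list of two-digit hex values, e.g. "46,CD,26",
    // into the raw bytes they represent.
    public static byte[] ParseHexList(string text)
    {
        string[] parts = text.Split(',', StringSplitOptions.RemoveEmptyEntries);
        byte[] result = new byte[parts.Length];
        for (int i = 0; i < parts.Length; i++)
        {
            // Each entry is two hex characters describing one byte.
            result[i] = byte.Parse(parts[i].Trim(), NumberStyles.HexNumber,
                                   CultureInfo.InvariantCulture);
        }
        return result;
    }
}
```

For the sample above, `HexListDemo.ParseHexList("46,CD,26,E0,D0")` yields 5 bytes from 14 characters of text; for a long list the text approaches three characters per data byte.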

Rob-Hague avatar Apr 17 '24 07:04 Rob-Hague

Actually, I've figured out that embedded resources can be trimmed out from an assembly, using an ILLink substitutions xml file.

For example:

<linker>
	<assembly fullname="BouncyCastle.Cryptography">
		<resource name="Org.BouncyCastle.pqc.crypto.picnic.lowmcL1.bz2" action="remove" />
	</assembly>
</linker>
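One way to check whether a substitution like this took effect is to list the manifest resources that remain in the published assembly. The following is a small diagnostic sketch using only `System.Reflection`; `ResourceLister` is a hypothetical name, and the path in the usage note is an assumption about where your publish output lives.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical diagnostic helper (not part of BouncyCastle or the trimmer).
public static class ResourceLister
{
    // Returns one line per embedded manifest resource: "<name> (<size> bytes)".
    public static string[] Describe(Assembly assembly)
    {
        string[] names = assembly.GetManifestResourceNames();
        string[] lines = new string[names.Length];
        for (int i = 0; i < names.Length; i++)
        {
            long size = 0;
            // The stream can be null if the resource is not accessible.
            using (Stream stream = assembly.GetManifestResourceStream(names[i]))
            {
                if (stream != null)
                {
                    size = stream.Length;
                }
            }
            lines[i] = names[i] + " (" + size + " bytes)";
        }
        return lines;
    }
}
```

It could be run from a separate, untrimmed console project, for example `ResourceLister.Describe(Assembly.LoadFrom("path/to/publish/BouncyCastle.Cryptography.dll"))`, printing each returned line. A resource that was removed should no longer appear in the list.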

tranb3r avatar Apr 17 '24 19:04 tranb3r

Nice one.

I put my theory to a quick test: by embedding the resultant uint[] arrays in the source, I was able to see the assembly trimmed down to ~2.2MB (I only did the picnic data, so that's basically just the sike data left over).

Unfortunately the untrimmed assembly grew unexpectedly. The data in the files is actually only about 25% of the resultant uint[] arrays, the other 75% consisting of trailing zeroes. That should not be too difficult to tackle (by simply not embedding the trailing zeroes and sizing it correctly after).
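The idea of not embedding the trailing zeroes and sizing the array correctly afterwards can be sketched as follows. This is an illustration under my own assumptions, not necessarily what #534 or #539 do; `TrailingZeroDemo`, `StripTrailingZeros` and `Expand` are hypothetical names.

```csharp
using System;

// Hypothetical helper, for illustration only (not a BouncyCastle API).
public static class TrailingZeroDemo
{
    // Returns a copy of 'full' without its trailing zero elements.
    // This is the form that would be embedded in the source.
    public static uint[] StripTrailingZeros(uint[] full)
    {
        int length = full.Length;
        while (length > 0 && full[length - 1] == 0)
        {
            length--;
        }
        uint[] compact = new uint[length];
        Array.Copy(full, compact, length);
        return compact;
    }

    // Restores the original size at load time. New array elements default
    // to zero, so the trailing zeroes come back without being stored.
    public static uint[] Expand(uint[] compact, int fullLength)
    {
        if (fullLength < compact.Length)
        {
            throw new ArgumentException("fullLength is smaller than the data", nameof(fullLength));
        }
        uint[] full = new uint[fullLength];
        Array.Copy(compact, full, compact.Length);
        return full;
    }
}
```

Only the compact array and the intended full length need to live in the source; `Expand` rebuilds the full-size array when the data is first used.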

Rob-Hague avatar Apr 18 '24 07:04 Rob-Hague

Thanks to @Rob-Hague (#534, #539) the embedded resources have been replaced with static data that is visible to the trimmer.

peterdettman avatar May 24 '24 06:05 peterdettman