runtime icon indicating copy to clipboard operation
runtime copied to clipboard

[API Proposal]: InitialCapacity for AsnWriter

Open vcsjones opened this issue 3 years ago • 10 comments

Background and motivation

There are many places throughout the Libraries where we use AsnWriter for little writes. Consider X509KeyUsageExtension's managed implementation:

https://github.com/dotnet/runtime/blob/f6c5698cb612f3b6e56eb41d808ae7fa73c63897/src/libraries/System.Security.Cryptography/src/System/Security/Cryptography/X509Certificates/ManagedX509ExtensionProcessor.cs#L23-L25

In this case, the total encoded size of the extension is going to be at most 5 bytes.

However, right off the bat, the AsnWriter starts off with a buffer of 1024 bytes.

https://github.com/dotnet/runtime/blob/f6c5698cb612f3b6e56eb41d808ae7fa73c63897/src/libraries/System.Formats.Asn1/src/System/Formats/Asn1/AsnWriter.cs#L260

So that's 1019 bytes that we really almost never end up using.

This similar line of thought applies to

  1. Encoding basic constraints:

    https://github.com/dotnet/runtime/blob/f6c5698cb612f3b6e56eb41d808ae7fa73c63897/src/libraries/System.Security.Cryptography/src/System/Security/Cryptography/X509Certificates/ManagedX509ExtensionProcessor.cs#L82-L84

  2. Encoding extended key usages. This one is a little less of a perfect example, but the majority of certificates that get encoded have serverAuth and clientAuth, which encode to a compact 22 bytes.

  3. Converting IEEE1363 signatures to DER form, we can give a reasonable hint.

  4. On the flip side, there may be places we want to specify a larger capacity by default, such as the managed PKCS12 implementation.

The initialCapacity would just be a hint for the first internally allocated buffer. If the ends up needing to re-allocate+copy, the current behavior would continue.

API Proposal

namespace System.Formats.Asn1;

public sealed partial class AsnWriter {
+    public AsnWriter(AsnEncodingRules ruleSet, int initialCapacity);
}

API Usage

AsnWriter writer = new AsnWriter(AsnEncodingRules.DER, initialCapacity: 5);
writer.WriteNamedBitList(keyUsagesAsn);
byte[] encoded = writer.Encode();

Alternative Designs

No response

Risks

No response

vcsjones avatar May 19 '22 19:05 vcsjones

Tagging subscribers to this area: @dotnet/area-system-security, @vcsjones See info in area-owners.md if you want to be subscribed.

Issue Details

Background and motivation

There are many places throughout the Libraries where we use AsnWriter for little writes. Consider X509KeyUsageExtension's managed implementation:

https://github.com/dotnet/runtime/blob/f6c5698cb612f3b6e56eb41d808ae7fa73c63897/src/libraries/System.Security.Cryptography/src/System/Security/Cryptography/X509Certificates/ManagedX509ExtensionProcessor.cs#L23-L25

In this case, the total encoded size of the extension is almost always going to be 4 bytes.

However, right off the bat, the AsnWriter starts off with a buffer of 1024 bytes.

https://github.com/dotnet/runtime/blob/f6c5698cb612f3b6e56eb41d808ae7fa73c63897/src/libraries/System.Formats.Asn1/src/System/Formats/Asn1/AsnWriter.cs#L260

So that's 1020 bytes that we really almost never end up using.

This similar line of thought applies to

  1. Encoding basic constraints:

    https://github.com/dotnet/runtime/blob/f6c5698cb612f3b6e56eb41d808ae7fa73c63897/src/libraries/System.Security.Cryptography/src/System/Security/Cryptography/X509Certificates/ManagedX509ExtensionProcessor.cs#L82-L84

  2. Encoding extended key usages. This one is a little less of a perfect example, but the majority of certificates that get encoded have serverAuth and clientAuth, which encode to a compact 22 bytes.

  3. Converting IEEE1363 signatures to DER form, we can give a reasonable hint.

  4. On the flip side, there may be places we want to specify a larger capacity by default, such as the managed PKCS12 implementation.

The initialCapacity would just be a hint for the first internally allocated buffer. If the ends up needing to re-allocate+copy, the current behavior would continue.

API Proposal

namespace System.Formats.Asn1;

public sealed partial class AsnWriter {
+    public AsnWriter(AsnEncodingRules ruleSet, int initialCapacity);
}

API Usage

AsnWriter writer = new AsnWriter(AsnEncodingRules.DER, initialCapacity: 4);
writer.WriteNamedBitList(keyUsagesAsn);
byte[] encoded = writer.Encode();

Alternative Designs

No response

Risks

No response

Author: vcsjones
Assignees: -
Labels:

api-suggestion, area-System.Security

Milestone: -

msftbot[bot] avatar May 19 '22 19:05 msftbot[bot]

I was thinking about this recently, too. I wondered if it might be (differently/also) useful to give it an initial buffer... not to say "start with these contents", but that for small or repetitive tasks to be able to use a pre-allocated (even rented) buffer.

Of course, that then has the "should it block growth" / "can you ask if it's still writing to your buffer and you can avoid copying it out" / other similar problems...

Amusingly, for initialCapacity I was actually thinking bigger numbers, not smaller; just to avoid regrowth.

Assuming that initialCapacity is respected (new byte[initialCapacity]), would you expect it to return to the kilobyte-based growth it already has? So a lot of WriteNull calls would go 5 => 1024 => 2048, ...; or would you expect a different heuristic to come into play?

bartonjs avatar May 19 '22 21:05 bartonjs

I wondered if it might be (differently/also) useful to give it an initial buffer... not to say "start with these contents", but that for small or repetitive tasks to be able to use a pre-allocated (even rented) buffer.

A bunch of thoughts all jumped out at my head at once, so I will do my best to make sense of them.

My first thought is "how does ownership work". For example, let's say we accept a pre-allocated buffer... do we want to support continuing to own that buffer? One might think something like this works:

byte[] initialBuffer = new byte[512];
AsnWriter writer = new(AsnEncodingRules.DER, initialBuffer);
writer.WriteObjectIdentifier("1.2.3.4");

ReadOnlySpan<byte> written = initialBuffer.Slice(0, writer.GetEncodedLength());

Ah ha, this is neat because now I can use AsnWriter allocation free.

But do we want that to work?

What if the AsnWriter needs to re-allocate? What if some of the Write methods become lazy in the future and don't get flushed until Encode is called?

Accepting a "initial" buffer seems prone to ownership confusion.

Well, if we want to make sure ownership is clear, it could be callback based. This starts to feel very "Win32 custom allocator"-y to me. I don't like it, but it solves my concerns about ownership.

static byte[] AllocateBuffer(int requestedAtLeastSize)
{
    // Or do something scary here.
    return ArrayPool<byte>.Shared.Rent(requestedAtLeastSize);
}

static void ReleaseBuffer(byte[] buffer)
{
    ArrayPool<byte>.Shared.Return(buffer, clear: true);
}

AsnWriter writer = new(
    AsnEncodingRules.DER,
    allocator: AllocateBuffer,
    releaser: ReleaseBuffer);

It's certainly... code... and I am not sure I like this (I am actually pretty sure I don't like this), and it can still be misused if you do funny stuff with capturing... but I think it sets the expectations about "I own this buffer" a little more clearly.

Oh. But what if you never call {Try}Encode? What it seems like without dispose, something might leak. I don't want to add dispose on the AsnWriter.

So the allocator approach doesn't immediately make me happy either.

would you expect it to return to the kilobyte-based growth it already has?

Yeah. It's a hint. You got one chance to make your hint meaningful. If you're wrong, well, let's stick with what is working today.

Amusingly, for initialCapacity I was actually thinking bigger numbers, not smaller; just to avoid regrowth.

Yeah I tried to jump on that scenario with the PKCS12 example. The tricky thing with bigger numbers is knowing what your hint should be.

vcsjones avatar May 19 '22 21:05 vcsjones

A completely separate idea would be to support one shot encoding for non-constructed items.

Span<byte> buff = stackalloc byte[256];
int written = AsnWriter.EncodeObjectIdentifier("1.2.3.4", buff);

vcsjones avatar May 20 '22 17:05 vcsjones

I guess you could even do constructed items with ReadOnlyMemory:

byte[] oid1 = ArrayPool<byte>.Shared.Rent(512);
byte[] oid2 = ArrayPool<byte>.Shared.Rent(512);
byte[] seq = ArrayPool<byte>.Shared.Rent(512);

int written1 = AsnWriter.EncodeObjectIdentifier("1.2.3.4", oid1);
int written2 = AsnWriter.EncodeObjectIdentifier("1.2.3.5", oid2);

int seqWritten = AsnWriter.EncodeSequence(
    new [] { oid1.AsMemory(0, written1), oid2.AsMemory(0, written2) }
    , seq);

This might be a better direction for "little" writes.

vcsjones avatar May 20 '22 17:05 vcsjones

A completely separate idea would be to support one shot encoding for non-constructed items.

Yeah. I think a more "natural" representation, given the static AsnDecoder type, would be to make a static AsnEncoder type.

public static class AsnEncoder
{
    public static int WriteBoolean(Span<byte> destination, Asn1Tag? tag = null);
    public static int WriteInteger(NUMBERS value, Span<byte> destination, Asn1Tag? tag = null);
    public static int WriteBitString(ReadOnlySpan<byte> source, Span<byte> destination, int unusedBitCount, AsnEncodingRules encodingRules = AsnEncodingRules.DER, Asn1Tag? tag = null);
    public static int WriteOctetString(ReadOnlySpan<byte> source, Span<byte> destination, AsnEncodingRules encodingRules = AsnEncodingRules.DER, Asn1Tag? tag = null);
    public static int WriteNull(Span<byte> destination, Asn1Tag? tag = null);
    ...

    // Helper mainly for for SEQUENCE(OF)
    public static int WriteLength(int length, Span<byte> destination);

    // Maybe EXPLICIT is interesting enough to make a helper for?
    public static int WriteExplicit(ReadOnlySpan<byte> encodedValue, Span<byte> destination, int contextSpecificValue);
    public static int WriteExplicit(ReadOnlySpan<byte> encodedValue, Span<byte> destination, Asn1Tag tag);
}

(And Try versions, too, I'm just doodling)

In that doodle there the BIT STRING and OCTET STRING types were the only ones that needed to know the encoding rules (because of CER).

bartonjs avatar May 20 '22 18:05 bartonjs

In that doodle there the BIT STRING and OCTET STRING types were the only ones that needed to know the encoding rules (because of CER).

Well, I think character strings would need to know CER-or-not, too.

I think, at least for what I would hope for, just being able to do basic primitives would be enough. If SET OF and SEQUENCE OF, I think it's fairly reasonable to just go to a builder for that.

Anyway, if you have appetite for doing this, here's a complete proposal. Maybe it should be spun off it to it's own separate issue.

I omitted allocating versions of these. If you are okay allocating, just do new AsnWriter().

namespace System.Formats.Asn1 {
    public static class AsnEncoder {
        public static int WriteBitString(
            ReadOnlySpan<byte> value,
            Span<byte> destination,
            int unusedBitCount = 0,
            AsnEncodingRules encodingRules = AsnEncodingRules.DER,
            Asn1Tag? tag = null);

        public static bool TryWriteBitString(
            ReadOnlySpan<byte> value,
            Span<byte> destination,
            out int bytesWritten,
            int unusedBitCount = 0,
            AsnEncodingRules encodingRules = AsnEncodingRules.DER,
            Asn1Tag? tag = null);

        public static int WriteBoolean(bool value, Span<byte> destination, Asn1Tag? tag = null);
        public static bool TryWriteBoolean(bool value, Span<byte> destination, out int bytesWritten, Asn1Tag? tag = null);

        public static int WriteEnumeratedValue(Enum value, Span<byte> destination, Asn1Tag? tag = null);
        public static bool TryWriteEnumeratedValue(Enum value, Span<byte> destination, out int bytesWritten Asn1Tag? tag = null);

        public static int WriteEnumeratedValue<TEnum>(
            TEnum value,
            Span<byte> destination,
            Asn1Tag? tag = null) where TEnum : Enum;

        public static bool TryWriteEnumeratedValue<TEnum>(
            TEnum value,
            Span<byte> destination,
            out int bytesWritten,
            Asn1Tag? tag = null) where TEnum : Enum;

        public static int WriteGeneralizedTime(
            DateTimeOffset value,
            Span<byte> destination,
            bool omitFractionalSeconds = false,
            Asn1Tag? tag = null);

        public static bool TryWriteGeneralizedTime(
            DateTimeOffset value,
            Span<byte> destination,
            out int bytesWritten,
            bool omitFractionalSeconds = false,
            Asn1Tag? tag = null);

        public int WriteInteger(long value, Span<byte> destination, Asn1Tag? tag = null);
        public bool TryWriteInteger(long value, Span<byte> destination, out int bytesWritten, Asn1Tag? tag = null);
        public int WriteInteger(ulong value, Span<byte> destination, Asn1Tag? tag = null);
        public bool TryWriteInteger(ulong value, Span<byte> destination, out int bytesWritten, Asn1Tag? tag = null);
        public int WriteInteger(BigInteger value, Span<byte> destination, Asn1Tag? tag = null);
        public bool TryWriteInteger(BigInteger value, Span<byte> destination, out int bytesWritten, Asn1Tag? tag = null);
        public int WriteInteger(ReadOnlySpan<byte> value, Span<byte> destination, Asn1Tag? tag = null);
        public bool TryWriteInteger(ReadOnlySpan<byte> value, Span<byte> destination, out int bytesWritten, Asn1Tag? tag = null);
        public int WriteIntegerUnsigned(ReadOnlySpan<byte> value, Span<byte> destination, Asn1Tag? tag = null);
        public bool TryWriteIntegerUnsigned(ReadOnlySpan<byte> value, Span<byte> destination, out int bytesWritten, Asn1Tag? tag = null);

        public static int WriteNamedBitList(Enum value, Span<byte> destination, Asn1Tag? tag = null);
        public static bool TryWriteNamedBitList(Enum value, Span<byte> destination, out int bytesWritten Asn1Tag? tag = null);
        public static int WriteNamedBitList(BitArray value, Span<byte> destination, Asn1Tag? tag = null);
        public static bool TryWriteNamedBitList(BitArray value, Span<byte> destination, out int bytesWritten, Asn1Tag? tag = null) where TEnum : Enum;

        public static int WriteNamedBitList<TEnum>(
            TEnum value,
            Span<byte> destination,
            Asn1Tag? tag = null) where TEnum : Enum;

        public static bool TryWriteNamedBitList<TEnum>(
            TEnum value,
            Span<byte> destination,
            out int bytesWritten,
            Asn1Tag? tag = null) where TEnum : Enum;

        public static int WriteNull(Span<byte> destination, Asn1Tag? tag = null);
        public static bool TryWriteNull(Span<byte> destination, out int bytesWritten Asn1Tag? tag = null);

        public static int WriteOctetString(
            ReadOnlySpan<byte> value,
            Span<byte> destination,
            AsnEncodingRules encodingRules = AsnEncodingRules.DER,
            Asn1Tag? tag = null);

        public static bool TryWriteOctetString(
            ReadOnlySpan<byte> value,
            Span<byte> destination,
            out int bytesWritten,
            AsnEncodingRules encodingRules = AsnEncodingRules.DER,
            Asn1Tag? tag = null);

        public static int WriteObjectIdentifier(string value, Span<byte> destination, Asn1Tag? tag = null);
        public static int WriteObjectIdentifier(ReadOnlySpan<char> value, Span<byte> destination, Asn1Tag? tag = null);
        public static bool TryWriteObjectIdentifier(string value, Span<byte> destination, out int bytesWritten, Asn1Tag? tag = null);
        public static bool TryWriteObjectIdentifier(ReadOnlySpan<char> value, Span<byte> destination, out int bytesWritten, Asn1Tag? tag = null);

        public static int WriteCharacterString(
            UniversalTagNumber encodingType,
            string value,
            Span<byte> destination,
            AsnEncodingRules encodingRules = AsnEncodingRules.DER,
            Asn1Tag? tag = null);

        public static int WriteCharacterString(
            UniversalTagNumber encodingType,
            ReadOnlySpan<char> value,
            Span<byte> destination,
            AsnEncodingRules encodingRules = AsnEncodingRules.DER,
            Asn1Tag? tag = null);

        public static bool TryWriteCharacterString(
            UniversalTagNumber encodingType,
            string value,
            Span<byte> destination,
            out int bytesWritten,
            AsnEncodingRules encodingRules = AsnEncodingRules.DER,
            Asn1Tag? tag = null);

        public static bool TryWriteCharacterString(
            UniversalTagNumber encodingType,
            ReadOnlySpan<char> value,
            Span<byte> destination,
            out int bytesWritten,
            AsnEncodingRules encodingRules = AsnEncodingRules.DER,
            Asn1Tag? tag = null);

        public static int WriteUtcTime(
            DateTimeOffset value,
            Span<byte> destination,
            Asn1Tag? tag = null);

        public static int WriteUtcTime(
            DateTimeOffset value,
            int twoDigitYearMax,
            Span<byte> destination,
            Asn1Tag? tag = null);

        public static bool TryWriteUtcTime(
            DateTimeOffset value,
            Span<byte> destination,
            out int bytesWritten,
            Asn1Tag? tag = null);

        public static bool TryWriteUtcTime(
            DateTimeOffset value,
            int twoDigitYearMax,
            Span<byte> destination,
            out int bytesWritten,
            Asn1Tag? tag = null);

        public static int WriteExplicit(ReadOnlySpan<byte> encodedValue, Span<byte> destination, int contextSpecificValue);
        public static int WriteExplicit(ReadOnlySpan<byte> encodedValue, Span<byte> destination, Asn1Tag tag);

        public static bool TryWriteExplicit(ReadOnlySpan<byte> encodedValue, Span<byte> destination, out int bytesWritten, int contextSpecificValue);
        public static bool TryWriteExplicit(ReadOnlySpan<byte> encodedValue, Span<byte> destination, out int bytesWritten, Asn1Tag tag);
    }
}

vcsjones avatar May 20 '22 22:05 vcsjones

Explicit encodes as a SEQUENCE with one item, right? So it might need to know CER-ness, because I think CER might have to write an EXPLICIT [0] as

C0 80
   real value
   00 00

Maybe it should be spun off it to it's own separate issue.

Yeah, probably :smile:

bartonjs avatar May 20 '22 22:05 bartonjs

I don't think this'll land in 7; but marking it ready-for-review in 8. Let's take the AsnEncoder piece over to a new issue.

bartonjs avatar Jul 06 '22 23:07 bartonjs

Video

namespace System.Formats.Asn1;

public partial class AsnWriter
{
    public AsnWriter(AsnEncodingRules ruleSet, int initialBufferSize);
}

terrajobst avatar Jul 26 '22 18:07 terrajobst