MemoryPack icon indicating copy to clipboard operation
MemoryPack copied to clipboard

Garbage in serialized data.

Open cNoNim opened this issue 11 months ago • 4 comments

I was trying to understand how MemoryPack serialises nullable unmanaged types. And I came across some strange behaviour between Generic and NonGeneric APIs. Originally way to reproduce was unstable. But these tests that consistently show a problem.

using System.Buffers;
using System.Runtime.CompilerServices;

namespace MemoryPack.SimpleTests;

public class PrimitiveTest
{
    [Fact]
    public void UnmanagedGenericTest()
    {
        PoisonStack();
        var writer = new ArrayBufferWriter<byte>(1024);
        MemoryPackSerializer.Serialize(writer, (int?)0);
        Assert.Equal(8, writer.WrittenCount);
        Span<byte> expected = [1, 0, 0, 0, 0, 0, 0, 0];
        Assert.Equal(expected, writer.WrittenSpan);
    }

    [Fact]
    public void UnmanagedNonGenericIntTest()
    {
        PoisonStack();
        var writer = new ArrayBufferWriter<byte>(1024);
        MemoryPackSerializer.Serialize(typeof(int?), writer, 0);
        Assert.Equal(8, writer.WrittenCount);
        Span<byte> expected = [1, 0, 0, 0, 0, 0, 0, 0];
        Assert.Equal(expected, writer.WrittenSpan);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void PoisonStack()
    {
        var s = (stackalloc byte[1024]);
        s.Fill(0xFF);
        Consume(s);
        return;

        [MethodImpl(MethodImplOptions.NoInlining)]
        static void Consume<T>(Span<T> _)
        {
        }
    }
}
Assert.Equal() Failure: Collections differ
              ↓ (pos 1)
Expected: [1, 0, 0, 0, 0, ···]
Actual:   [1, 56, 15, 111, 0, ···]
              ↑ (pos 1)
   at MemoryPack.SimpleTests.PrimitiveTest.UnmanagedNonGenericIntTest() 

I guess it's the padding. But it doesn't look safe because part of the stack is serialised. Probably the generic version can also grab part of the stack in some cases, but I haven't been able to break it yet. Less problem is the API difference.

cNoNim avatar Feb 22 '25 20:02 cNoNim

I presume MemoryPack uses this path as a fast-path serialization: https://github.com/Cysharp/MemoryPack/blob/0f2fe0827907940768293b7bb81b9605a7d8cba8/src/MemoryPack.Core/MemoryPackSerializer.Serialize.cs#L24-L29

This is considered as a dangerous anti-pattern for anything other than blittable primitives/structs without paddings. E.g. Nullable<int> hits both issues - it's not blittable (bool is not) and contains paddings. This may lead to:

  • Garbage from stack is serialized - this may expose some secrets from stack that developer didn't plan to transfer.
  • Moreover, it uses non-zero initialized heap as a storage which increases chances of
  • Denormalized booleans - generally, anything other than 0/1 for booleans is considered UB. It is better to read/write booleans with normalization e.g. bool deserialized = bytes[n] != 0 ? true : false and same for write.

cc @jkotas @GrabYourPitchforks

EgorBo avatar Feb 22 '25 21:02 EgorBo

I am aware that padding gets serialized, and I understand that it is generally not recommended. Is there a possibility that including it partially in a user-defined struct could lead to specific critical issues? That said, regarding arrays, since they can contain padding on quite a large scale, I would like to avoid this approach.

neuecc avatar Mar 03 '25 10:03 neuecc

The main issue is the lack of explicit control. If the user understands the process and chooses to organize data this way, it's their responsibility. Perhaps a .NET API could indicate that a type is densely packed, allowing safe memory copying. However, such an API doesn’t exist, is not simple, and most users should stick to field-wise serialization. Is there a way to restrict access to the unsafe API in MemoryPack?

cNoNim avatar Mar 05 '25 13:03 cNoNim

We faced the same issue. The workaround here (for us) is write own formatter and mark each nullable property with it. like that:

[MyNullableFormatter<short>]
 public short? SomeProperty { get; set; }

But it would be nice to have such attribute out of the box or even better to mark whole class with it (to apply it to all properties). In our solution We decided to keep the format compatibility in size wise, so we just write zeros instead of potential garbage :

public sealed class NullableFormatterAttribute<T> : MemoryPackCustomFormatterAttribute<NullableFormatter<T>, T?>
    where T : struct
{
    public override NullableFormatter<T> GetFormatter()
    {
        return NullableFormatter<T>.Default;
    }
}


public sealed class NullableFormatter<T> : MemoryPackFormatter<T?> 
    where T : struct
{
    public static NullableFormatter<T> Default { get; } = new();

    private static readonly byte[] _hasValueY = [1, .. new byte[Unsafe.SizeOf<T?>() - Unsafe.SizeOf<T>() - 1]];
    private static readonly byte[] _hasValueN = [0, .. new byte[Unsafe.SizeOf<T?>() - Unsafe.SizeOf<T>() - 1]];

    /// <summary>
    /// When serialize, first we write the value of HasValue, filling rest with zeroes
    /// Then we write the actual value (or 0 when NULL)
    /// </summary>
    /// <typeparam name="TBufferWriter"></typeparam>
    public override void Serialize<TBufferWriter>(ref MemoryPackWriter<TBufferWriter> writer, scoped ref T? value)
    {
        if (value.HasValue)
        {
            writer.WriteSpanWithoutLengthHeader(new ReadOnlySpan<byte>(_hasValueY));
            writer.DangerousWriteUnmanaged(value.Value);
        }
        else
        {
            writer.WriteSpanWithoutLengthHeader(new ReadOnlySpan<byte>(_hasValueN));
            writer.DangerousWriteUnmanaged(default(T));
        }
    }

    /// <summary>
    /// This is a full copy of same method from original NullableFormatter
    /// https://github.com/Cysharp/MemoryPack/blob/main/src/MemoryPack.Core/Formatters/NullableFormatter.cs
    /// </summary>
    public override void Deserialize(ref MemoryPackReader reader, scoped ref T? value)
    {
        if (!RuntimeHelpers.IsReferenceOrContainsReferences<T>())
        {
            reader.DangerousReadUnmanaged(out value);
            return;
        }

        if (!reader.TryReadObjectHeader(out var count))
        {
            value = null;
            return;
        }

        if (count != 1)
        {
            MemoryPackSerializationException.ThrowInvalidPropertyCount(1, count);
        }

        value = reader.ReadValue<T>();
    }

    
}

troepolik avatar May 21 '25 14:05 troepolik