New field types in IDL

Open esrauchg opened this issue 1 month ago • 1 comments

There are several additional field types not supported in Protobuf today which have legitimate business use cases. This issue is a general tracking bug for those requests.

| Candidate Additional Type | Current Workaround | How a first class type could be better? | Notes |
| --- | --- | --- | --- |
| BigInts / BigFloats | string | More efficient wire representation, API ergonomics | |
| int128 (especially for UUIDs) | string | More efficient wire representation, API ergonomics | |
| int16 | int32 for singular, int32 or maybe bytes for repeated | API ergonomics, in-memory size | Varint representation makes int32s that are int16 size the same as an int16 type would be |
| bfloat16 or similar | bytes for repeated | API ergonomics | Mostly relevant as repeated/matrix for ML model payloads. There are several different incompatible half-float specs, and most languages have no native support for any of them |
| Decimals | google.type.Decimal or string | google.type.Decimal cannot be used as map keys, string is loose | |
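
To make the varint note in the int16 row concrete, here is a minimal Python sketch of base-128 varint encoding. It is not taken from any Protobuf library; `encode_varint` and `zigzag32` are hypothetical helper names, and the ZigZag step assumes the field is declared as sint32 rather than int32. Under those assumptions, every value in the int16 range fits in at most 3 bytes on the wire, so a dedicated int16 varint type would not be smaller.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 payload bits per byte)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(low7)         # last byte: continuation bit clear
            return bytes(out)


def zigzag32(n: int) -> int:
    """Map a signed 32-bit value to an unsigned one (the sint32 ZigZag mapping)."""
    return (n << 1) ^ (n >> 31)


if __name__ == "__main__":
    # Encoded sizes for non-negative values up to the int16 maximum.
    for v in (0, 127, 128, 16383, 16384, 32767):
        print(v, len(encode_varint(v)))
    # The most negative int16 value, ZigZag-encoded as a sint32 would be.
    print(-32768, len(encode_varint(zigzag32(-32768))))
```

The savings of a real int16 type would therefore be in memory and API ergonomics, as the table says, not in wire size.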

There are a number of considerations that make Protobuf move conservatively on new types, which makes us unlikely to add any without a correspondingly large benefit.

  • Adding a new tag type to the wire format is considered a breaking change and has never been done. The main problem is unknown field propagation, which would be broken by any new tag type. Certain types can't be added without a new tag type (fixed16, for example). It's not a hard constraint for most candidate new types, though, as tag type to IDL type is already a 1:N mapping. It is especially problematic for e.g. int128, where we could support neither a "fixed 16 byte" type nor a "varint of up to 128 bits": our varint definition is already capped at 10 bytes max, and even extending that to 20 max would break unknown field handling in current implementations.

  • Any new first class type would need good support in all languages. This is already a pain point with our limited set of types today: many languages, including Java, don't have good support for 'uint32'. For something like bfloat16 it would be very difficult to provide.

  • Adding a new WKT is possible and sidesteps the first two issues (languages that don't special case it would just use its message representation, and on the wire it would just look like a message), but introducing one still has problems and has never been done. Previously we tried to make should-be-benign changes to WKTs and found that the 'Hyrum's law' fallout from even those was large.
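
The unknown-field concern in the first bullet can be sketched in Python. This is an illustrative toy parser, not code from any Protobuf implementation: `read_varint` and `skip_field` are hypothetical names, group wire types (3 and 4) are left out for brevity, and wire type 6 stands in for an imagined "fixed 16 byte" tag type. The point is that a parser can only preserve an unknown field if it can work out the field's length from the wire type alone.

```python
MAX_VARINT_BYTES = 10  # cap mentioned in the issue text


def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode a varint starting at pos; return (value, position after it)."""
    result = 0
    for i in range(MAX_VARINT_BYTES):
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return result, pos
    raise ValueError("varint longer than 10 bytes")


def skip_field(buf: bytes, pos: int) -> tuple[int, int, bytes, int]:
    """Return (field_number, wire_type, raw_bytes, next_pos) for the field at pos.

    raw_bytes is what a parser would keep to re-emit the unknown field later.
    """
    start = pos
    tag, pos = read_varint(buf, pos)
    field_number, wire_type = tag >> 3, tag & 0x7
    if wire_type == 0:        # VARINT
        _, pos = read_varint(buf, pos)
    elif wire_type == 1:      # I64: fixed 8 bytes
        pos += 8
    elif wire_type == 2:      # LEN: varint length, then that many bytes
        length, pos = read_varint(buf, pos)
        pos += length
    elif wire_type == 5:      # I32: fixed 4 bytes
        pos += 4
    else:
        # An older parser has no way to know how long this field is.
        raise ValueError(f"cannot skip unknown wire type {wire_type}")
    if pos > len(buf):
        raise ValueError("truncated field")
    return field_number, wire_type, buf[start:pos], pos


if __name__ == "__main__":
    known = bytes([0x08, 0x96, 0x01])          # field 1, VARINT, value 150
    print(skip_field(known, 0))
    hypothetical = bytes([0x16]) + bytes(16)   # field 2, imagined wire type 6
    try:
        skip_field(hypothetical, 0)
    except ValueError as exc:
        print("old parser fails:", exc)
```

The same failure mode applies to a longer varint: a reader with the 10-byte cap rejects an 11-byte varint instead of carrying it through.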

WKT here means 'google.protobuf.*' things, which are special cased in the ProtoJSON representation. A much lower bar would be new things added to the google.type package (published here); for Google's published APIs these are expected to be 'common' types reused by many APIs. The distinction from the other WKTs is that google.type.Decimal is not special cased by the Protobuf library itself, so e.g. packing/unpacking it has to be done by the application code or in ancillary libraries, and it has no special representation in ProtoJSON format.
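
As a rough sketch of what "packing/unpacking in application code" looks like for google.type.Decimal, here is a Python example. To stay self-contained it uses a stand-in dataclass (`DecimalMessage`, a hypothetical name) in place of the generated message class, assuming only that the message carries the number as a single string field named `value`. The helper names `pack_decimal` and `unpack_decimal` are also made up for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation


@dataclass
class DecimalMessage:
    """Stand-in for a generated google.type.Decimal message (assumed: one string field)."""
    value: str = ""


def pack_decimal(number: Decimal) -> DecimalMessage:
    """Convert a Python Decimal into the message form, using plain (non-exponent) notation."""
    if not number.is_finite():
        raise ValueError("NaN and infinities are rejected by this sketch")
    return DecimalMessage(value=format(number, "f"))


def unpack_decimal(message: DecimalMessage) -> Decimal:
    """Parse the message's string back into a Decimal, validating it on the way."""
    try:
        number = Decimal(message.value)
    except InvalidOperation as exc:
        raise ValueError(f"not a decimal string: {message.value!r}") from exc
    if not number.is_finite():
        raise ValueError("NaN and infinities are rejected by this sketch")
    return number


if __name__ == "__main__":
    msg = pack_decimal(Decimal("19.990"))
    print(msg.value)
    print(unpack_decimal(msg))
```

Every application or ancillary library ends up writing some version of this validation, which is the ergonomic cost being described.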

So that basically leaves only a slim window here: if it's easily representable by a standard and very common shape of message, then google.type is a good place for it (no special casing in the wire formats at all). And Protobuf can't add new tags to the wire format. So the only cases where we may add new types are either:

  1. It makes sense to special case the type in the Protobuf libraries even though we can't add new tag types (which includes overloading old tag types). This is plausible/realistic if the business value is big enough.

  2. The addition is so hugely impactful that supporting it warrants a wire format evolution that will cause ecosystem fragmentation (technically possible but pretty unlikely).
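
One possible reading of option 1, sketched in Python for the int128/UUID case: carry the 16 raw bytes in an existing length-delimited field, so the wire format is unchanged and only the library-level API is special. This is an assumption about how such overloading could look, not a proposed design; `encode_len_field` is a hypothetical helper and field number 1 is arbitrary.

```python
import uuid


def encode_varint(value: int) -> bytes:
    """Base-128 varint for non-negative integers."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def encode_len_field(field_number: int, payload: bytes) -> bytes:
    """Encode a length-delimited field (existing wire type 2): tag, length, payload."""
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(payload)) + payload


def uuid_from_payload(payload: bytes) -> uuid.UUID:
    """What a library-level helper might do on read: 16 raw bytes back to a UUID."""
    if len(payload) != 16:
        raise ValueError("expected exactly 16 bytes")
    return uuid.UUID(bytes=payload)


if __name__ == "__main__":
    u = uuid.UUID("12345678-1234-5678-1234-567812345678")
    as_string = encode_len_field(1, str(u).encode("ascii"))  # today's string workaround
    as_bytes = encode_len_field(1, u.bytes)                  # 16 raw bytes, same wire type
    print(len(as_string), len(as_bytes))
    print(uuid_from_payload(as_bytes[2:]) == u)
```

Because the field still uses an existing wire type, parsers that don't know about the special casing can skip or preserve it like any other bytes field.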

esrauchg avatar Jan 05 '26 18:01 esrauchg

Seems like a reasonable/feasible approach to consider could be to add lib serde helper methods for types that are in the intersection of "significant business value" and "has native representation in sufficient percentage of lib target platforms"

IMO, it wouldn't be necessary for 100% of lib targets to have native support today in order for a supported type to deliver useful business value; native support coverage will improve over time, and when that happens that target's method could be switched from a stub to an implementation.

It seems to me like a good number of the types listed in the issue description could fit these criteria.

Cheers

ericsampson avatar Jan 05 '26 18:01 ericsampson