avsc icon indicating copy to clipboard operation
avsc copied to clipboard

forSchema mutates passed options

Open bdrobinson opened this issue 5 years ago • 4 comments

Given the following code:

import avsc from 'avsc';

const userSchemaV1 = { type: 'record', name: 'User', fields: [] };
const userSchemaV2 = { type: 'record', name: 'User', fields: [] };

const opts = {};

avsc.Type.forSchema(userSchemaV1, opts);
avsc.Type.forSchema(userSchemaV2, opts);

The code unexpectedly crashes when we try and parse userSchemaV2 with the error: Error: duplicate type name: User.

I looked at the source code and the problem is that forSchema mutates the passed opts to add a registry to it. So it tries to parse the second schema but User is already in the registry, so it crashes. Here's the offending line: https://github.com/mtth/avsc/blob/f69d61cdeddaee7ba16b9126e0bcf8eaf0a77026/lib/types.js#L120

It feels like very confusing behaviour that the top-level opts object is mutated in any way, and in fact has led to a crash for us when using https://github.com/kafkajs/confluent-schema-registry/, because that repo re-uses the options object between calls to forSchema – see https://github.com/kafkajs/confluent-schema-registry/blob/b342f4eb42599447a39c9506ca501e3a59afc7c3/src/cache.ts#L28

Thanks very much. Hope that makes sense, let me know if you need any more information.

bdrobinson avatar Jul 30 '20 09:07 bdrobinson

In fact, what's the reason for exposing registry as an option that can be passed in? I can't think of a time a user might want to pass it in as to me it looks like an implementation detail. But perhaps there's a use-case that hasn't occurred to me?

bdrobinson avatar Jul 30 '20 10:07 bdrobinson

Hi @bdrobinson. The registry is used for example to support custom long types and modular type definitions (see https://github.com/mtth/avsc/issues/305).

I agree that mutating the options object can be surprising. Since changing this behavior would not be backwards-compatible, it will be best done with the next major release. Let's keep this open to track it.

In the meantime, you can work around your underlying issue by adding the type hook below to forSchemaOptions:

function typeHook(schema, opts) {
  let name = schema.name;
  if (!name) {
    return; // Not a named type, use default logic.
  }
  if (!~name.indexOf('.')) {
    // We need to qualify the type's name.
    const namespace = schema.namespace || opts.namespace;
    if (namespace) {
      name = `${namespace}.${name}`;
    }
  }
  // Return the type registered with the same name, if any.
  return opts.registry[name];
}

See also https://github.com/mtth/avsc/issues/294 for more context.

mtth avatar Aug 01 '20 16:08 mtth

@mtth - appreciate this is an old thread at this point, but the typeHook function you suggest there will return the schema for userSchemaV1 (in this example).

When attempting to decode a userSchemaV2 message with the userSchemaV1 schema, it won't work (unless the schemas are identical).

Let me know if I've missed something important. Alternatively is there a world in which we could add an extra option into opts so that it won't mutate the registry? Which would allow the change to be backwards compatible for those who enable it?

kenneth-gray avatar Aug 29 '22 21:08 kenneth-gray