Vector example with @orama/plugin-embeddings not working

Open diego-betto opened this issue 1 year ago • 6 comments

Describe the bug

Hi,

it seems that the second example, the one using @orama/plugin-embeddings, does not work as expected.

From what I can see:

  • the first import is missing insert and search
  • then the search throws an error
    file:///..../node_modules/@orama/orama/dist/esm/methods/search-vector.js:15
      const vectorIndex = orama.data.index.vectorIndexes[vector.property];
    
    TypeError: Cannot read properties of undefined (reading 'property')
    
    Maybe we need to add a check that vector is defined here, or use another import.

Also check the first example. It imports remove and searchVector but doesn't use them. It's not a big deal, but I think it's better to remove them or add some code that uses them for better understanding.

Regards, Diego

To Reproduce

Just try the example with @orama/plugin-embeddings

Expected behavior

To run without issues

Environment Info

OS: Ubuntu 24.04.2 LTS
Node: v22.14.0

Packages
    "@orama/orama": "^3.1.6",
    "@orama/plugin-embeddings": "^3.1.6",
    "@tensorflow/tfjs-node": "^4.22.0"

Affected areas

Initialization, Search

Additional context

No response

diego-betto avatar Apr 17 '25 15:04 diego-betto

Just ran into this! It's a problem with Orama's bundler. beforeSearch is an async function:

https://github.com/oramasearch/orama/blob/c8051ce183d3628519fe61d07b9f6ff561b21465/packages/plugin-embeddings/src/index.ts#L62

Orama switches into a special mode when it detects that hook as an async function:

https://github.com/oramasearch/orama/blob/c8051ce183d3628519fe61d07b9f6ff561b21465/packages/orama/src/components/hooks.ts#L95

That's done by looking at the function's constructor:

https://github.com/oramasearch/orama/blob/c8051ce183d3628519fe61d07b9f6ff561b21465/packages/orama/src/utils.ts#L355

But...

Orama's bundler converts it away from an async function! If you look at the code published to npm, you'll see that beforeSearch is defined as (prettified):

beforeSearch: function t(t, i2) {
  return r(function () {
    var r2, t2, a3, l2, c2;
    return n(this, function (n2) {
      // omitted
    });
  })();
}

This is not an async function, because it was transformed by the bundler!
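
To make the failure mode concrete, here is a minimal sketch. It assumes the check boils down to comparing the name of the function's constructor with "AsyncFunction"; `looksAsync` and `downleveledHook` are hypothetical names for illustration, not Orama's actual source.

```typescript
// Hypothetical stand-in for the detection described above (an assumption, not Orama's code):
// a function counts as async only if its constructor is named "AsyncFunction".
function looksAsync(fn: Function): boolean {
  return fn.constructor.name === "AsyncFunction";
}

// What a downleveled hook effectively is: a plain function that returns a promise.
const downleveledHook = function (): Promise<string> {
  return Promise.resolve("done");
};

// Asynchronous in behavior, but invisible to a constructor-based check.
const detectedBefore = looksAsync(downleveledHook); // false

// Shadowing `constructor` with the AsyncFunction constructor changes the answer,
// but only if this file itself is run without downleveling async functions.
downleveledHook.constructor = (async () => {}).constructor;
const detectedAfter = looksAsync(downleveledHook);

console.log({ detectedBefore, detectedAfter });
```

The function's behavior never changes here; only the metadata the check inspects does, which is why any build step that rewrites async functions can silently flip the result.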

lights0123 avatar Apr 25 '25 16:04 lights0123

As a workaround, you can do this:

pluginEmbeddings({
	embeddings: {
		// Property used to store generated embeddings. Must be defined in the schema.
		defaultProperty: 'embeddings',
		onInsert: {
			// Generate embeddings at insert-time.
			// Turn off if you're inserting documents with embeddings already generated.
			generate: true,
			// Properties to use for generating embeddings at insert time.
			// These properties will be concatenated and used to generate embeddings.
			properties: ['contents'],
			verbose: true,
		},
	},
}).then(async (p) => {
	// shouldn't need this but there's a mistake in the TypeScript types
	const plugin = await p;
	// the fix
	plugin.beforeSearch.constructor = (async () => {}).constructor;
	plugin.beforeInsert.constructor = (async () => {}).constructor;
	// now use it
});

lights0123 avatar Apr 25 '25 16:04 lights0123

Hey @lights0123 that's a nice catch... any suggestion for a fix at the bundler level?

micheleriva avatar Apr 25 '25 17:04 micheleriva

Async functions are part of ES2017, and your bundler tsup reads tsconfig.json so it should be as simple as changing this line to ES2017:

https://github.com/oramasearch/orama/blob/c8051ce183d3628519fe61d07b9f6ff561b21465/packages/plugin-embeddings/tsconfig.json#L4

You'll also need to add to your documentation that users should configure their bundlers in the same way, as they typically re-process files. This drops support for IE11.


However, I think it would be better to just re-write some logic to look at the return type of the function—I don't really like that async () => {} works while () => new Promise(() => {}) doesn't. You could do that by replacing

https://github.com/oramasearch/orama/blob/c8051ce183d3628519fe61d07b9f6ff561b21465/packages/orama/src/components/hooks.ts#L95-L106

with (untested):

  let promise: Promise<void> | undefined;

  for (const hook of hooks) {
    if (promise) {
      // If we already have a promise chain, extend it
      // by running the current hook after previous promises resolve
      promise = promise.then(() => hook(db, params, language));
    } else {
      // First hook or no promises yet
      const result = hook(db, params, language);
      if (typeof result?.then === 'function') {
        // Start a promise chain
        promise = result;
      }
    }
  }

  return promise;
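
For anyone who wants to try the idea in isolation, here is a self-contained sketch of the same approach. The `Hook` type, `isThenable`, `runHooks`, and the sample hooks are hypothetical simplifications (real Orama hooks receive db, params, and language), so treat it as an illustration under those assumptions rather than a drop-in patch.

```typescript
// Hypothetical, simplified hook signature for illustration only.
type Hook = (log: string[]) => void | Promise<void>;

// Detects promises by behavior (a callable `then`) instead of by constructor.
function isThenable(value: unknown): value is PromiseLike<void> {
  return typeof (value as { then?: unknown } | null | undefined)?.then === "function";
}

// Runs hooks in order. Stays synchronous until a hook returns a thenable;
// from then on, every remaining hook is chained onto that promise.
function runHooks(hooks: Hook[], log: string[]): Promise<void> | undefined {
  let promise: Promise<void> | undefined;
  for (const hook of hooks) {
    if (promise) {
      promise = promise.then(() => hook(log));
    } else {
      const result = hook(log);
      if (isThenable(result)) {
        promise = Promise.resolve(result);
      }
    }
  }
  return promise;
}

const syncHook: Hook = (log) => { log.push("sync"); };
const asyncHook: Hook = async (log) => { log.push("async"); };
// A plain function returning a promise, like a downleveled async function.
const promiseHook: Hook = (log) =>
  new Promise<void>((resolve) => setTimeout(() => { log.push("promise"); resolve(); }, 0));

const syncLog: string[] = [];
const syncResult = runHooks([syncHook, syncHook], syncLog); // undefined: nothing to await

const mixedLog: string[] = [];
const mixedResult = runHooks([syncHook, promiseHook, asyncHook], mixedLog);
mixedResult?.then(() => console.log(mixedLog));
```

With this shape, a hook written as `async () => {}` and one written as `() => new Promise(...)` are treated the same way, and a run with only synchronous hooks returns without creating a promise.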

lights0123 avatar Apr 25 '25 22:04 lights0123

as an alternative to @lights0123's temporary fix, you can also do this for now to get a cleaner syntax imo

    const plugin = await pluginEmbeddings({
      embeddings: {
        defaultProperty: "embeddings",
        ...
        },
      },
    });

    // eslint-disable-next-line @typescript-eslint/no-empty-function
    const asyncConstructor = (async () => {}).constructor;
    plugin.beforeSearch.constructor = asyncConstructor;
    plugin.beforeInsert.constructor = asyncConstructor;

thank you @lights0123

maxschulmeister avatar Aug 25 '25 16:08 maxschulmeister

hey @micheleriva, have you already considered how to address this issue? if you have a specific solution in mind, i’d be happy to help by preparing a pr.

as discussed above, one possible workaround could be changing the typescript build target, which would prevent the async keyword from being stripped out during compilation. that would only be a temporary fix, since users could still transpile the code themselves and encounter the same problem. ideally, it would probably make sense to revisit how isAsyncFunction is used for a more robust solution.

if it helps in the short term, i can go ahead and put together a pr that simply increases the build target to es2017, since @orama/plugin-embeddings isn’t really usable right now as it stands. happy to help however you think is best :)


edit: i just found this comment, so i guess this is probably already a wip.

in the meantime, i think it might still make sense to set esnext as the target in @orama/plugin-embeddings as well, so that it solves this problem at least in some circumstances, and in any case it would be aligned with @orama/orama, which uses that target.

stefanofa avatar Sep 23 '25 20:09 stefanofa