node-red-node-watson

Discovery Document Loader loads text as 'value' attribute, not 'text' attribute

Open jaumemir opened this issue 7 years ago • 14 comments

The "Discovery Document Loader" node loads document content (the msg.payload) under an attribute named "value", whereas the standard expectation in Discovery is to have it under an attribute named "text". As a result, the "value" field has to be enriched, which generates "enriched_value" instead of "enriched_text". That is non-standard, and a client application may fail if enrichments are not where it expects them.
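To make the difference concrete, here is a minimal sketch of the document shape being asked for. The helper names `toDiscoveryDocument` and `toJsonUpload` are hypothetical and are not part of the node; the sketch only assumes that the content is wrapped in a JSON object before upload, and that the key used for that wrapper is what determines where enrichments end up.

```javascript
// Hypothetical helper (not from the node's source): wrap raw content under
// "text" rather than "value", so that enrichments are expected under
// "enriched_text" instead of "enriched_value".
function toDiscoveryDocument(payload) {
  const text = Buffer.isBuffer(payload) ? payload.toString('utf8') : String(payload);
  return { text };
}

// Serialize the wrapped document as a UTF-8 JSON buffer, ready to be sent
// with a content type of application/json.
function toJsonUpload(payload) {
  return Buffer.from(JSON.stringify(toDiscoveryDocument(payload)), 'utf8');
}

const upload = toJsonUpload('Hello Discovery');
console.log(upload.toString('utf8')); // {"text":"Hello Discovery"}
```

With the current behaviour the same content would presumably be serialized as `{"value":"Hello Discovery"}`, which is why the enrichment configuration has to point at "value".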

jaumemir avatar Nov 07 '18 00:11 jaumemir

Implemented in 0.7.5

chughts avatar Dec 18 '18 15:12 chughts

Had to back this out during testing: the stream input processing requires the attribute 'value', otherwise the open stream from a document file causes Node-RED to crash.

chughts avatar Dec 18 '18 16:12 chughts

And it doesn't upload JSON as is... the best I achieved is having the stringified JSON structure inside the value field: not useful at all.

Lotti avatar May 22 '19 05:05 Lotti

How are you injecting the json? It has been tested with HTTP request, HTTP multipart and File Inject, and PDF Hummus.

[{"id":"6a0eaf1a.5ae95","type":"fileinject","z":"e0b750ac.1c66c","name":"Inject a doc or json","x":134,"y":803,"wires":[["af11a3c2.83d6b"]]},{"id":"f6eb0f41.7a89b","type":"fileinject","z":"e0b750ac.1c66c","name":"Inject a pdf","x":104,"y":823,"wires":[["d52e8d8b.7f484"]]},{"id":"d52e8d8b.7f484","type":"pdf-hummus","z":"e0b750ac.1c66c","name":"","filename":"TempDoscV1.pdf","split":true,"mode":{"value":"asBuffer"},"x":282,"y":806,"wires":[["af11a3c2.83d6b"]]},{"id":"af11a3c2.83d6b","type":"watson-discovery-v1-document-loader","z":"e0b750ac.1c66c","name":"","environment_id":"6a02b192-b8ed-4839-b108-e02a842d26e5","collection_id":"06e8c1b0-9858-4a38-861f-d35cd96d70f6","filename":"Temp Docs","default-endpoint":true,"service-endpoint":"https://gateway.watsonplatform.net/discovery/api","x":518,"y":774,"wires":[["9d8fc50a.4acec8"]]},{"id":"9d8fc50a.4acec8","type":"debug","z":"e0b750ac.1c66c","name":"","active":true,"console":"false","complete":"true","x":540,"y":677,"wires":[]}]

chughts avatar May 22 '19 08:05 chughts

I have pure json, not stored in a file...

I modified the old discovery-insert node provided by community for my purposes: https://github.com/Lotti/node-red-contrib-discovery-insert

I will find time to fix and test the official node and propose a pull request... I've already set up the environment, but my deadline is today, so I preferred to go with something easier to modify than the official nodes.

Anyway... this is the solution I found for adding JSON as a document to Discovery with the latest SDK:

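                // Per-message IDs override the ones configured on the node.
                var env = (msg.hasOwnProperty('environment_id')) ? msg.environment_id : environment;
                var col = (msg.hasOwnProperty('collection_id')) ? msg.collection_id : collection;

                // Serialize the JSON payload and pass it to the SDK as an in-memory file.
                const string = JSON.stringify(msg.payload.content);
                const file = Buffer.from(string, 'utf8');
                // getSHA1 is a helper that is not shown in this snippet.
                const sha1 = getSHA1(string);
                // Fall back to a content-derived filename when none is supplied.
                const filename = msg.payload.filename || `${sha1}.json`;

                var document_obj = {
                    environment_id: env,
                    collection_id: col,
                    file: file,
                    filename: filename,
                    file_content_type: 'application/json',
                };

                // This call sits inside a Promise executor (not shown) that supplies resolve and reject.
                discovery.addDocument(document_obj, function (err, response) {
                    if (err) {
                        if (err.code === 429) {
                            // Rate limited: resolve with the status code instead of rejecting.
                            resolve(429);
                        } else {
                            reject(err);
                        }
                    } else {
                        resolve(response);
                    }
                });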
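The snippet above relies on two things it does not show: a `getSHA1` helper and an enclosing Promise. The following is one possible way to fill those in, offered as a sketch rather than the original implementation. `getSHA1` here is a stand-in built on Node's `crypto` module; `buildDocumentParams` and `addDocumentAsync` are hypothetical names; and `discovery` is assumed to be a client whose `addDocument` takes a callback, as in the snippet.

```javascript
const crypto = require('crypto');

// Stand-in for the getSHA1 helper that the snippet does not show:
// returns the hex-encoded SHA-1 digest of a string.
function getSHA1(input) {
  return crypto.createHash('sha1').update(input, 'utf8').digest('hex');
}

// Hypothetical helper: build the addDocument parameters for a plain JSON
// object, falling back to a content-derived filename when none is given.
function buildDocumentParams(content, environmentId, collectionId, filename) {
  const json = JSON.stringify(content);
  return {
    environment_id: environmentId,
    collection_id: collectionId,
    file: Buffer.from(json, 'utf8'),
    filename: filename || `${getSHA1(json)}.json`,
    file_content_type: 'application/json',
  };
}

// Hypothetical wrapper: turn the callback-style call into a Promise.
// A 429 (rate limit) resolves with the status code instead of rejecting,
// mirroring the behaviour of the snippet.
function addDocumentAsync(discovery, params) {
  return new Promise((resolve, reject) => {
    discovery.addDocument(params, (err, response) => {
      if (err && err.code === 429) resolve(429);
      else if (err) reject(err);
      else resolve(response);
    });
  });
}
```

A caller could then write something like `addDocumentAsync(discovery, buildDocumentParams(msg.payload.content, env, col, msg.payload.filename))` and decide what to do when the result is `429`.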

Lotti avatar May 23 '19 10:05 Lotti

NLU analysis is not performed unless text data is stored in the 'text' property. Change it to the 'text' property instead of the 'value' property.

prismboy avatar Mar 11 '20 03:03 prismboy

Hey Lotti, thanks for the code. It is exactly what I need. One thing though: when I try to run it, it gives an error stating "Error: Missing required parameters: apikey", even though the apikey is there.

As I am a newbie to all this, can you please help me? Thanks in advance!

rcorig avatar Jun 03 '20 08:06 rcorig

The Watson SDK changes frequently... maybe the API key problem you are encountering is related to that. I'll spend some time on this later today... I suggest using my package, because I don't have rights to merge code into this repo or into the official npm module.

Lotti avatar Jun 03 '20 08:06 Lotti

Sure thing! Thanks for the fast reply. As an ex-IBMer, I really appreciate it. You are a lifesaver!!

rcorig avatar Jun 03 '20 08:06 rcorig

Try using this module; install it from the Node-RED palette. Please remove the original before installing mine. https://www.npmjs.com/package/node-red-contrib-discovery-insert-temp

I haven't had time to test it on a real Discovery instance... I will try tomorrow if it doesn't work for you.

I'll also try to send a pull request here... but I think the owner of the repo doesn't maintain these nodes anymore.

Lotti avatar Jun 03 '20 21:06 Lotti

@Lotti @rcorig Please take this discussion to the GitHub repo that you are referring to https://github.com/GwilymNewton/node-red-contrib-discovery-insert

Perhaps if you report your issues to the correct repo, the owner might actually accept your pull request.

chughts avatar Jun 04 '20 08:06 chughts

@chughts OK for me. Any news on implementing this feature in the official Watson Discovery nodes? If not, should I invest time in a pull request for it?

Lotti avatar Jun 04 '20 09:06 Lotti

@chughts OK for me as well. @Lotti Thanks man! It worked perfectly. With https://www.npmjs.com/package/node-red-contrib-discovery-insert-temp, the upload to Discovery worked the correct way! You are the man!!! Thank you very much!

rcorig avatar Jun 04 '20 09:06 rcorig

@Lotti If you can do it and verify that there is no regression in all current working cases, then I will happily accept a pull request.

chughts avatar Jun 04 '20 10:06 chughts