Can we update/port this Elasticsearch plugin for arcadedb

Open tolgaulas opened this issue 4 years ago • 15 comments

This is a feature request: this is a good plugin for Elasticsearch, can we have it for ArcadeDB? https://github.com/orientechnologies/orientdb-elasticsearch

tolgaulas avatar Dec 06 '21 13:12 tolgaulas

What's your use case?

lvca avatar Dec 06 '21 16:12 lvca

You can use ArcadeDB's Full-Text index, which uses Lucene (the same as ES) to tokenize words.

lvca avatar Dec 13 '21 00:12 lvca

@tolgaulas did you use that plugin with OrientDB before?

lvca avatar Dec 14 '21 14:12 lvca

Hey, sorry for the late response. No, I did not. But your previous question made us rethink it: since ArcadeDB now supports many protocols (e.g. PostgreSQL), we are now experimenting with its Grafana integration. So far, the import from OrientDB was not successful with the PostgreSQL driver, nor with native JSON (which worked well with OrientDB).

I guess you can close this one, we may open a new one for sharing our endeavor.

tolgaulas avatar Dec 14 '21 14:12 tolgaulas

@tolgaulas did you import the database from orientdb with the orientdb importer https://docs.arcadedb.com/#OrientDB-Importer?

lvca avatar Dec 14 '21 14:12 lvca

Yes. I asked my colleague, who will share the details here as far as possible without revealing too much data :)

tolgaulas avatar Dec 14 '21 14:12 tolgaulas

Ok, since you're a sponsor you can write to support at arcadedb.com if you have sensitive data. If you need to sign an NDA please send it to the same email address.

lvca avatar Dec 14 '21 14:12 lvca

Will do if things are too hard to sanitize for the public; otherwise we'd like to keep this public for future generations. :)

tolgaulas avatar Dec 14 '21 14:12 tolgaulas

OK, I think the problem is not the import but our habits from the OrientDB SQL language. In OrientDB, we had v1 and v2 vertices and a v1v2 edge from v1 to v2. So in OrientDB we would have v1.out_v1v2 and v2.in_v1v2, with v1v2.in pointing to v1 and v1v2.out pointing to v2. We used to have SQL queries like select expand(out_v1v2) from v1 to reach v1v2 via the vertex, or select expand(in) from v1v2 to reach v1 via the edge. But I guess this logic has changed in ArcadeDB.

When we create v1, v2 and v1v2 and add related records, select from v1 does not show any out_* records; actually it shows nothing related to any edges.

Similarly, when we run select from v1v2, it displays @in and @out pointing to the respective vertices as in the database. But this time select expand(@out) from v1v2 returns an empty result.

Is there a recipe for this, or should I open a new issue, since this is not related to this issue's topic?

tolgaulas avatar Dec 15 '21 15:12 tolgaulas

As for the Grafana part, their Postgres adapter is definitely not compatible with the current ArcadeDB: we are not getting the results that we get out of ArcadeDB with Postgres clients. We also could not try MongoDB, as it requires Grafana Enterprise. As for Redis, the Redis plugin in Grafana needs specific parameters on the server it connects to, so it cannot recognize the ArcadeDB driver as a Redis server. So far, we could fetch data out of ArcadeDB within Grafana using the HTTP/JSON adapter, which satisfies the need we had for the Elasticsearch migrator, since we can use ArcadeDB with Grafana.

tolgaulas avatar Dec 15 '21 15:12 tolgaulas

This is select from v1v2:

{
  "result": {
    "vertices": [
      {
        "p": {
          "foo": "bar"
        },
        "r": "#25:0",
        "t": "v2",
        "i": 1,
        "o": 0
      },
      {
        "p": {
          "hello": "world"
        },
        "r": "#1:0",
        "t": "v1",
        "i": 0,
        "o": 1
      }
    ],
    "records": [
      {
        "@out": "#1:0",
        "@rid": "#49:0",
        "@in": "#25:0",
        "@type": "v1v2",
        "@cat": "e"
      },
      {
        "@out": 0,
        "@rid": "#25:0",
        "@in": 1,
        "@type": "v2",
        "foo": "bar",
        "@cat": "v"
      },
      {
        "@out": 1,
        "@rid": "#1:0",
        "@in": 0,
        "@type": "v1",
        "@cat": "v",
        "hello": "world"
      }
    ],
    "edges": [
      {
        "p": {},
        "r": "#49:0",
        "t": "v1v2",
        "i": "#25:0",
        "o": "#1:0"
      }
    ]
  },
  "user": "root",
  "version": "21.12.1-SNAPSHOT (build 94c9efa44b0430f8a79bf2e8b572c2fc81a7d50a/1638857567344/main)"
}

and this is select from v1:

{
  "result": {
    "vertices": [
      {
        "p": {
          "hello": "world"
        },
        "r": "#1:0",
        "t": "v1",
        "i": 0,
        "o": 1
      }
    ],
    "records": [
      {
        "@out": 1,
        "@rid": "#1:0",
        "@in": 0,
        "@type": "v1",
        "@cat": "v",
        "hello": "world"
      }
    ],
    "edges": []
  },
  "user": "root",
  "version": "21.12.1-SNAPSHOT (build 94c9efa44b0430f8a79bf2e8b572c2fc81a7d50a/1638857567344/main)"
}
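For readers consuming this response shape from a client, here is a minimal Python sketch that groups the entries of "records" by their "@cat" value. It assumes, based only on the two responses pasted above, that "e" marks an edge and "v" marks a vertex; the helper name group_by_category is hypothetical and not part of any ArcadeDB client.

```python
from collections import defaultdict


def group_by_category(response: dict) -> dict:
    """Group record ids by their "@cat" value ("e" or "v" in the samples above)."""
    groups = defaultdict(list)
    for record in response["result"]["records"]:
        # "?" is a placeholder bucket for records without a category.
        groups[record.get("@cat", "?")].append(record["@rid"])
    return dict(groups)


# Records copied from the "select from v1v2" response in this thread.
sample = {
    "result": {
        "records": [
            {"@out": "#1:0", "@rid": "#49:0", "@in": "#25:0", "@type": "v1v2", "@cat": "e"},
            {"@out": 0, "@rid": "#25:0", "@in": 1, "@type": "v2", "foo": "bar", "@cat": "v"},
            {"@out": 1, "@rid": "#1:0", "@in": 0, "@type": "v1", "@cat": "v", "hello": "world"},
        ]
    }
}

print(group_by_category(sample))
```

Note that in these samples "@in" and "@out" hold record ids on the edge record but integers on the vertex records, so client code should not treat them as the same kind of value.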

tolgaulas avatar Dec 15 '21 15:12 tolgaulas

With OrientDB we made this horrible mistake: mixing edges with properties. And we spent so much time fixing it and making it not super ugly. With ArcadeDB we finally have a physical separation between edges and vertices. Most of the time you don't want the edges in the result of your queries, because the client doesn't know what to do with them. On top of that, fetching and serializing those edges for no reason adds overhead.

So you won't find the in_* and out_* properties in ArcadeDB, but you can still access the edges. It's always suggested to use the graph functions to manipulate the graph.

Some SQL queries that work on both OrientDB and ArcadeDB:

select expand(out()) from v1

select expand(outV()) from v1v2
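When migrating many OrientDB queries, the field-style references can be rewritten mechanically. The Python sketch below is a hypothetical helper, not part of either product: it replaces out_<EdgeType> and in_<EdgeType> with outE('<EdgeType>') and inE('<EdgeType>'). It assumes those graph functions accept an edge type name as argument, which is worth confirming against the SQL function reference for your version.

```python
import re

# Matches OrientDB-style edge fields such as out_v1v2 or in_v1v2.
_EDGE_FIELD = re.compile(r"\b(out|in)_(\w+)")


def rewrite_edge_fields(sql: str) -> str:
    """Replace out_<Type>/in_<Type> with outE('<Type>')/inE('<Type>')."""
    return _EDGE_FIELD.sub(lambda m: f"{m.group(1)}E('{m.group(2)}')", sql)


print(rewrite_edge_fields("select expand(out_v1v2) from v1"))
print(rewrite_edge_fields("select expand(in_v1v2) from v2"))
```

The pattern is deliberately naive: any ordinary property whose name starts with in_ or out_ would also be rewritten, so review the output before running it.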

About the issues with Postgres and Redis, do you have any log to check that out? Adding commands to the Redis protocol is straightforward, with Postgres unfortunately the lack of documentation makes things harder, but still doable.

lvca avatar Dec 15 '21 16:12 lvca

Also, since you're using the HTTP POST command, you can specify the serializer you want:

  • graph: returns the result as a graph, separating vertices from edges
  • record: returns everything as records
  • by default it's like record, but with additional metadata for vertex records, such as the number of outgoing edges in the @out property and the number of incoming edges in the @in property. This serializer is used by Studio.
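A minimal Python sketch of choosing the serializer from a client follows. The endpoint path /api/v1/command/<database>, the body keys "language", "command" and "serializer", the port 2480, the database name and the credentials are all assumptions for illustration; check them against the HTTP API documentation for your ArcadeDB version. The code only builds the request and does not send it.

```python
import base64
import json
import urllib.request
from typing import Optional


def build_command_request(base_url: str, database: str, sql: str,
                          user: str, password: str,
                          serializer: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying an SQL command."""
    payload = {"language": "sql", "command": sql}
    if serializer is not None:
        # Omit the key entirely to get the server's default serializer.
        payload["serializer"] = serializer
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/api/v1/command/{database}",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Basic {token}"},
        method="POST",
    )


# Hypothetical server, database and credentials.
req = build_command_request("http://localhost:2480", "mydb",
                            "select from v1v2", "root", "secret",
                            serializer="graph")
# To actually send it: urllib.request.urlopen(req)
```

Leaving serializer unset keeps the default behaviour described in the last bullet.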

lvca avatar Dec 15 '21 16:12 lvca

In OrientDB, we used to keep a lot of data (and hence functionality) on edges, such as the ongoing activities between the vertices. Should I understand that with ArcadeDB we should use edges only for linking vertices, and nothing more? For example, v1v2 would mean a message sent from a user to a system, with all the message details. So, given that v1 is a person and v2 is a system, v1v2 would mean the messages from v1 to v2, and v2v1 would mean the same in the inverse direction. Should I understand that we should craft v_user <-> e_send <-> v_message <-> e_send <-> v_system instead?

The Postgres and Redis issues came from inside Grafana. I think we should not worry about them here any more, as Grafana is a third-party app and these protocols are currently in development. We're happy with HTTP/JSON at the moment. Therefore, if I can get the paragraph above clarified, I think we should close this issue.

tolgaulas avatar Dec 15 '21 22:12 tolgaulas

You can totally do the same with ArcadeDB and store properties on edges. In this respect the two products are very similar.

lvca avatar Dec 15 '21 23:12 lvca
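To illustrate the last answer, here is a short Python sketch that builds an SQL statement creating a v1v2 edge carrying message details as properties. The helper create_edge_sql and the property names are hypothetical, and the CREATE EDGE ... CONTENT form is assumed to be accepted by ArcadeDB as it is by OrientDB; verify it against the SQL reference before relying on it.

```python
import json


def create_edge_sql(edge_type: str, from_rid: str, to_rid: str, properties: dict) -> str:
    """Build a CREATE EDGE statement that stores properties on the edge."""
    return (f"CREATE EDGE {edge_type} FROM {from_rid} TO {to_rid} "
            f"CONTENT {json.dumps(properties)}")


# Record ids taken from the responses earlier in this thread;
# the message properties are made up for illustration.
stmt = create_edge_sql("v1v2", "#1:0", "#25:0", {"subject": "hello", "body": "world"})
print(stmt)

# The edge and its properties can then be read back through a graph function.
query = "select expand(outE('v1v2')) from v1"
```

With this model the message details stay on the edge itself, so no intermediate v_message vertex is needed.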