incubator-hugegraph icon indicating copy to clipboard operation
incubator-hugegraph copied to clipboard

[Bug] Merged query using logical operator AND returns false results

Open llooFlashooll opened this issue 2 years ago • 3 comments

Bug Type (问题类型)

None

Before submit

  • [X] 我已经确认现有的 IssuesFAQ 中没有相同 / 重复问题 (I have confirmed and searched that there are no similar problems in the historical issue and documents)

Environment (环境信息)

  • HugeGraph Version: 0.12.0
  • Storage Backend: inmemory
  • Mixed Index Backend: none
  • Link to discussed bug: none
  • Operating system: macOS 13.2.1
  • API/Driver: Java

Expected & Actual behavior (期望与实际表现)

  • Expected behavior: We construct the following scenario: we randomly generate two queries Q1, Q2, and merge these two queries using AND logical operator into a new query Q3. Based on the AND calculation. The Q3 query result set should be the intersection of result sets from Q1 and Q2. We generate graph schema and data based on random strings and values. Here is one of our examples that triggered the bug.
  1. g.V().outE('el1').has('ep0', inside(0.26257116683022674,0.9491151866694021)).outV() returns [698910130770542592, 698910130791514112, 698910130812485632, 698910130829262848, 698910130850234368]
  2. g.V().out('el0') returns [698910130674073600, 698910130674073600, 698910130711822336, 698910130711822336, 698910130749571072, 698910130770542592, 698910130791514112, 698910130812485632, 698910130829262848, 698910130850234368, 698910130850234368, 698910130867011584, 698910130867011584]
  3. g.V().and(outE('el1').has('ep0', inside(0.26257116683022674,0.9491151866694021)).outV(),out('el0')) returns an empty set.

We calculate the intersection result set of Q1 and Q2, which is [698910130770542592, 698910130791514112, 698910130812485632, 698910130829262848, 698910130850234368]. The intersection result set doesn't equal to Q3 result set.

  • Actual behavior: The intersection result set should equal to Q3 result set. We did trigger some cases conform to this requirement, but still there're some cases that violate this constraint.

Vertex/Edge example (问题点 / 边数据举例)

Steps to reproduce: We create a graph with 10 nodes and 20 edges. We try to make it clear to reproduce the bugs, hope to not cause much inconvenience to your reviewing, but we believe the problem does exist. Following the following graph data generation query, we can reproduce the bugs:

  • Create data
Vertex:
g.addV('vl0').property('vp0','false').property('vp3','0.11901258738738996').property('vp2','0.63591534').property('vp4','0.11901258738738996').property(T.id,698910130674073600)
g.addV('vl0').property('vp3','0.768218427293753').property(T.id,698910130711822336)
g.addV('vl0').property('vp0','false').property('vp3','0.587157222635081').property('vp4','0.039931592743848277').property(T.id,698910130728599552)
g.addV('vl0').property('vp0','false').property('vp3','0.8060767940441078').property('vp4','0.7147467048901186').property(T.id,698910130749571072)
g.addV('vl0').property('vp0','true').property('vp3','0.8060767940441078').property('vp2','0.13471144').property('vp4','0.08673768791354186').property(T.id,698910130770542592)
g.addV('vl0').property('vp0','false').property('vp3','0.7147467048901186').property('vp2','0.40616316').property(T.id,698910130791514112)
g.addV('vl0').property('vp0','false').property(T.id,698910130812485632)
g.addV('vl0').property('vp3','0.8984566565808907').property('vp2','0.8857693').property(T.id,698910130829262848)
g.addV('vl0').property('vp0','true').property('vp3','0.41031638957410377').property('vp2','0.025008798').property('vp4','0.1968631072861693').property(T.id,698910130850234368)
g.addV('vl0').property('vp0','true').property('vp4','0.5947836228331603').property(T.id,698910130867011584)
Edge:
g.V(698910130674073600).as('698910130674073600').V(698910130867011584).as('698910130867011584').addE('el0').from('698910130674073600').to('698910130867011584')
g.V(698910130867011584).as('698910130867011584').V(698910130812485632).as('698910130812485632').addE('el1').from('698910130867011584').to('698910130812485632')
g.V(698910130749571072).as('698910130749571072').V(698910130791514112).as('698910130791514112').addE('el1').from('698910130749571072').to('698910130791514112')
g.V(698910130711822336).as('698910130711822336').V(698910130829262848).as('698910130829262848').addE('el1').from('698910130711822336').to('698910130829262848')
g.V(698910130791514112).as('698910130791514112').V(698910130711822336).as('698910130711822336').addE('el1').from('698910130791514112').to('698910130711822336')
g.V(698910130850234368).as('698910130850234368').V(698910130770542592).as('698910130770542592').addE('el1').from('698910130850234368').to('698910130770542592')
g.V(698910130711822336).as('698910130711822336').V(698910130829262848).as('698910130829262848').addE('el0').from('698910130711822336').to('698910130829262848')
g.V(698910130770542592).as('698910130770542592').V(698910130867011584).as('698910130867011584').addE('el1').from('698910130770542592').to('698910130867011584')
g.V(698910130812485632).as('698910130812485632').V(698910130674073600).as('698910130674073600').addE('el1').from('698910130812485632').to('698910130674073600')
g.V(698910130674073600).as('698910130674073600').V(698910130850234368).as('698910130850234368').addE('el0').from('698910130674073600').to('698910130850234368')
g.V(698910130749571072).as('698910130749571072').V(698910130850234368).as('698910130850234368').addE('el1').from('698910130749571072').to('698910130850234368')
g.V(698910130749571072).as('698910130749571072').V(698910130728599552).as('698910130728599552').addE('el0').from('698910130749571072').to('698910130728599552')
g.V(698910130749571072).as('698910130749571072').V(698910130812485632).as('698910130812485632').addE('el1').from('698910130749571072').to('698910130812485632')
g.V(698910130867011584).as('698910130867011584').V(698910130711822336).as('698910130711822336').addE('el0').from('698910130867011584').to('698910130711822336')
g.V(698910130770542592).as('698910130770542592').V(698910130850234368).as('698910130850234368').addE('el0').from('698910130770542592').to('698910130850234368')
g.V(698910130711822336).as('698910130711822336').V(698910130791514112).as('698910130791514112').addE('el0').from('698910130711822336').to('698910130791514112')
g.V(698910130867011584).as('698910130867011584').V(698910130674073600).as('698910130674073600').addE('el0').from('698910130867011584').to('698910130674073600')
g.V(698910130850234368).as('698910130850234368').V(698910130812485632).as('698910130812485632').addE('el0').from('698910130850234368').to('698910130812485632')
g.V(698910130749571072).as('698910130749571072').V(698910130728599552).as('698910130728599552').addE('el1').from('698910130749571072').to('698910130728599552')
g.V(698910130674073600).as('698910130674073600').V(698910130867011584).as('698910130867011584').addE('el1').from('698910130674073600').to('698910130867011584')

Schema [VertexLabel, EdgeLabel, IndexLabel] (元数据结构)

  • Create schema
schema().propertyKey('vp0').asBoolean().ifNotExist().create();
schema().propertyKey('vp1').asLong().ifNotExist().create();
schema().propertyKey('vp2').asFloat().ifNotExist().create();
schema().propertyKey('vp3').asDouble().ifNotExist().create();
schema().propertyKey('vp4').asDouble().ifNotExist().create();
schema().propertyKey('vp5').asBoolean().ifNotExist().create();
schema().vertexLabel('vl0').ifNotExist().create();
schema().vertexLabel('vl0').properties('vp0').nullableKeys('vp0').append();
schema().indexLabel('vl0byvp0Shard').onV('vl0').by('vp0').shard().ifNotExist().create();
schema().vertexLabel('vl0').properties('vp4').nullableKeys('vp4').append();
schema().indexLabel('vl0byvp4Shard').onV('vl0').by('vp4').shard().ifNotExist().create();
schema().vertexLabel('vl0').properties('vp2').nullableKeys('vp2').append();
schema().indexLabel('vl0byvp2Shard').onV('vl0').by('vp2').shard().ifNotExist().create();
schema().vertexLabel('vl0').properties('vp3').nullableKeys('vp3').append();
schema().indexLabel('vl0byvp3Shard').onV('vl0').by('vp3').shard().ifNotExist().create();
schema().propertyKey('ep0').asDouble().ifNotExist().create();
schema().propertyKey('ep1').asDouble().ifNotExist().create();
schema().propertyKey('ep2').asInt().ifNotExist().create();
schema().propertyKey('ep3').asText().ifNotExist().create();
schema().propertyKey('ep4').asInt().ifNotExist().create();
schema().propertyKey('ep5').asDouble().ifNotExist().create();
schema().propertyKey('ep6').asFloat().ifNotExist().create();
schema().edgeLabel('el0').link('vl0', 'vl0').ifNotExist().create();
schema().edgeLabel('el0').properties('ep6').nullableKeys('ep6').append();
schema().indexLabel('el0byep6Shard').onE('el0').by('ep6').shard().ifNotExist().create();
schema().edgeLabel('el0').properties('ep5').nullableKeys('ep5').append();
schema().indexLabel('el0byep5Shard').onE('el0').by('ep5').shard().ifNotExist().create();
schema().edgeLabel('el0').properties('ep0').nullableKeys('ep0').append();
schema().indexLabel('el0byep0Shard').onE('el0').by('ep0').shard().ifNotExist().create();
schema().edgeLabel('el0').properties('ep2').nullableKeys('ep2').append();
schema().indexLabel('el0byep2Shard').onE('el0').by('ep2').shard().ifNotExist().create();
schema().edgeLabel('el0').properties('ep1').nullableKeys('ep1').append();
schema().indexLabel('el0byep1Shard').onE('el0').by('ep1').shard().ifNotExist().create();
schema().edgeLabel('el1').link('vl0', 'vl0').ifNotExist().create();
schema().edgeLabel('el1').properties('ep2').nullableKeys('ep2').append();
schema().indexLabel('el1byep2Shard').onE('el1').by('ep2').shard().ifNotExist().create();
schema().edgeLabel('el1').properties('ep0').nullableKeys('ep0').append();
schema().indexLabel('el1byep0Shard').onE('el1').by('ep0').shard().ifNotExist().create();
schema().edgeLabel('el1').properties('ep5').nullableKeys('ep5').append();
schema().indexLabel('el1byep5Shard').onE('el1').by('ep5').shard().ifNotExist().create();
schema().edgeLabel('el1').properties('ep3').nullableKeys('ep3').append();
schema().indexLabel('el1byep3Shard').onE('el1').by('ep3').shard().ifNotExist().create();

llooFlashooll avatar Mar 10 '23 07:03 llooFlashooll

note and() is a filter step: https://tinkerpop.apache.org/docs/current/reference/#and-step

you can try this sample:

g.V().outE('el1').has('ep0', inside(0.26257116683022674,0.9491151866694021)).outV().aggregate('a').cap('a').as('x')
 .V().out('el0').where(within('x'))

you can please also ref match() step: https://stackoverflow.com/questions/48067834/gremlin-intersection-operation

javeme avatar Mar 10 '23 13:03 javeme

Dear developers, I re-corrected the syntax and ran new experiments, still triggered the bugs.

  • Expected behavior: We construct the following scenario: we randomly generate two queries Q1, Q2, and merge these two queries using MATCH operator into a new query Q3. See reference at https://stackoverflow.com/questions/48067834/gremlin-intersection-operation. Based on the calculation. The Q3 query result set should be the intersection of result sets from Q1 and Q2. We generate graph schema and data based on random strings and values. Here is one of our examples that triggered the bug.
  1. g.V().in('el0','el2','el1') returns [701183111471300608, 701183111588741120, 701183111630684160, 701183111685210112, 701183111710375936, 701183111794262016].
  2. g.V().and(__.outE('el2','el1')).has('vp3',true).both('el0','el1') returns [701183111630684160, 701183111685210112, 701183111710375936, 701183111794262016].
  3. g.V().match(__.as('a').in('el0','el2','el1'),__.as('a').filter(and(__.outE('el2','el1')).has('vp3',true).both('el0','el1'))).select('a') returns [701183111471300608, 701183111588741120].

We calculate the intersection result set of Q1 and Q2, which is [701183111630684160, 701183111685210112, 701183111710375936, 701183111794262016]. The intersection result set doesn't equal to Q3 result set.

  • Actual behavior: The intersection result set should equal to Q3 result set. We did trigger some cases conform to this requirement, but still there're some cases that violate this constraint.

  • Steps to reproduce Create schema

schema().propertyKey('vp0').asInt().ifNotExist().create();
schema().propertyKey('vp1').asText().ifNotExist().create();
schema().propertyKey('vp2').asText().ifNotExist().create();
schema().propertyKey('vp3').asBoolean().ifNotExist().create();
schema().propertyKey('vp4').asInt().ifNotExist().create();
schema().propertyKey('vp5').asBoolean().ifNotExist().create();
schema().vertexLabel('vl0').ifNotExist().create();
schema().vertexLabel('vl0').properties('vp0').nullableKeys('vp0').append();
schema().indexLabel('vl0byvp0Shard').onV('vl0').by('vp0').shard().ifNotExist().create();
schema().vertexLabel('vl0').properties('vp2').nullableKeys('vp2').append();
schema().indexLabel('vl0byvp2Shard').onV('vl0').by('vp2').shard().ifNotExist().create();
schema().vertexLabel('vl1').ifNotExist().create();
schema().vertexLabel('vl1').properties('vp3').nullableKeys('vp3').append();
schema().indexLabel('vl1byvp3Shard').onV('vl1').by('vp3').shard().ifNotExist().create();
schema().vertexLabel('vl2').ifNotExist().create();
schema().vertexLabel('vl2').properties('vp2').nullableKeys('vp2').append();
schema().indexLabel('vl2byvp2Shard').onV('vl2').by('vp2').shard().ifNotExist().create();
schema().vertexLabel('vl2').properties('vp0').nullableKeys('vp0').append();
schema().indexLabel('vl2byvp0Shard').onV('vl2').by('vp0').shard().ifNotExist().create();
schema().propertyKey('ep0').asText().ifNotExist().create();
schema().propertyKey('ep1').asInt().ifNotExist().create();
schema().propertyKey('ep2').asFloat().ifNotExist().create();
schema().propertyKey('ep3').asDouble().ifNotExist().create();
schema().propertyKey('ep4').asText().ifNotExist().create();
schema().propertyKey('ep5').asLong().ifNotExist().create();
schema().propertyKey('ep6').asBoolean().ifNotExist().create();
schema().propertyKey('ep7').asText().ifNotExist().create();
schema().edgeLabel('el0').link('vl1', 'vl1').ifNotExist().create();
schema().edgeLabel('el0').properties('ep6').nullableKeys('ep6').append();
schema().indexLabel('el0byep6Shard').onE('el0').by('ep6').shard().ifNotExist().create();
schema().edgeLabel('el0').properties('ep2').nullableKeys('ep2').append();
schema().indexLabel('el0byep2Shard').onE('el0').by('ep2').shard().ifNotExist().create();
schema().edgeLabel('el0').properties('ep5').nullableKeys('ep5').append();
schema().indexLabel('el0byep5Shard').onE('el0').by('ep5').shard().ifNotExist().create();
schema().edgeLabel('el0').properties('ep3').nullableKeys('ep3').append();
schema().indexLabel('el0byep3Shard').onE('el0').by('ep3').shard().ifNotExist().create();
schema().edgeLabel('el0').properties('ep7').nullableKeys('ep7').append();
schema().indexLabel('el0byep7Shard').onE('el0').by('ep7').shard().ifNotExist().create();
schema().edgeLabel('el1').link('vl0', 'vl1').ifNotExist().create();
schema().edgeLabel('el1').properties('ep2').nullableKeys('ep2').append();
schema().indexLabel('el1byep2Shard').onE('el1').by('ep2').shard().ifNotExist().create();
schema().edgeLabel('el1').properties('ep0').nullableKeys('ep0').append();
schema().indexLabel('el1byep0Shard').onE('el1').by('ep0').shard().ifNotExist().create();
schema().edgeLabel('el2').link('vl1', 'vl1').ifNotExist().create();
schema().edgeLabel('el2').properties('ep6').nullableKeys('ep6').append();
schema().indexLabel('el2byep6Shard').onE('el2').by('ep6').shard().ifNotExist().create();

Create data

Vertex:
g.addV('vl1').property('vp3','true').property(T.id,701183111471300608)
g.addV('vl2').property('vp0','-1270611045').property(T.id,701183111559380992)
g.addV('vl1').property('vp3','true').property(T.id,701183111588741120)
g.addV('vl0').property('vp2',''nw)0Y'').property(T.id,701183111630684160)
g.addV('vl2').property('vp0','-1270611045').property('vp2',''~kK4?'').property(T.id,701183111655849984)
g.addV('vl1').property('vp3','false').property(T.id,701183111685210112)
g.addV('vl1').property('vp3','false').property(T.id,701183111710375936)
g.addV('vl2').property('vp2',''qrΛO|'').property(T.id,701183111735541760)
g.addV('vl2').property('vp2',''qrΛO|'').property(T.id,701183111764901888)
g.addV('vl1').property('vp3','false').property(T.id,701183111794262016)
Edge:
g.V(701183111685210112).as('701183111685210112').V(701183111471300608).as('701183111471300608').addE('el2').from('701183111685210112').to('701183111471300608')
g.V(701183111685210112).as('701183111685210112').V(701183111710375936).as('701183111710375936').addE('el0').from('701183111685210112').to('701183111710375936')
g.V(701183111685210112).as('701183111685210112').V(701183111794262016).as('701183111794262016').addE('el0').from('701183111685210112').to('701183111794262016')
g.V(701183111794262016).as('701183111794262016').V(701183111471300608).as('701183111471300608').addE('el2').from('701183111794262016').to('701183111471300608')
g.V(701183111710375936).as('701183111710375936').V(701183111685210112).as('701183111685210112').addE('el2').from('701183111710375936').to('701183111685210112')
g.V(701183111471300608).as('701183111471300608').V(701183111630684160).as('701183111630684160').addE('el1').from('701183111471300608').to('701183111630684160')
g.V(701183111685210112).as('701183111685210112').V(701183111630684160).as('701183111630684160').addE('el1').from('701183111685210112').to('701183111630684160')
g.V(701183111710375936).as('701183111710375936').V(701183111471300608).as('701183111471300608').addE('el0').from('701183111710375936').to('701183111471300608')
g.V(701183111588741120).as('701183111588741120').V(701183111685210112).as('701183111685210112').addE('el2').from('701183111588741120').to('701183111685210112')
g.V(701183111588741120).as('701183111588741120').V(701183111710375936).as('701183111710375936').addE('el2').from('701183111588741120').to('701183111710375936')
g.V(701183111794262016).as('701183111794262016').V(701183111630684160).as('701183111630684160').addE('el1').from('701183111794262016').to('701183111630684160')
g.V(701183111471300608).as('701183111471300608').V(701183111685210112).as('701183111685210112').addE('el2').from('701183111471300608').to('701183111685210112')
g.V(701183111710375936).as('701183111710375936').V(701183111588741120).as('701183111588741120').addE('el2').from('701183111710375936').to('701183111588741120')
g.V(701183111588741120).as('701183111588741120').V(701183111630684160).as('701183111630684160').addE('el1').from('701183111588741120').to('701183111630684160')
g.V(701183111685210112).as('701183111685210112').V(701183111471300608).as('701183111471300608').addE('el0').from('701183111685210112').to('701183111471300608')
g.V(701183111471300608).as('701183111471300608').V(701183111794262016).as('701183111794262016').addE('el2').from('701183111471300608').to('701183111794262016')
g.V(701183111794262016).as('701183111794262016').V(701183111685210112).as('701183111685210112').addE('el2').from('701183111794262016').to('701183111685210112')
g.V(701183111710375936).as('701183111710375936').V(701183111794262016).as('701183111794262016').addE('el2').from('701183111710375936').to('701183111794262016')
g.V(701183111588741120).as('701183111588741120').V(701183111794262016).as('701183111794262016').addE('el0').from('701183111588741120').to('701183111794262016')
g.V(701183111471300608).as('701183111471300608').V(701183111685210112).as('701183111685210112').addE('el0').from('701183111471300608').to('701183111685210112')

llooFlashooll avatar Mar 16 '23 01:03 llooFlashooll

g.V().match(.as('a').in('el0','el2','el1'),.as('a').filter(and(__.outE('el2','el1')).has('vp3',true).both('el0','el1'))).select('a') returns [701183111471300608, 701183111588741120].

please try:

g.V().match(
  __.as('s').outE('el1').has('ep0', inside(0.26257116683022674,0.9491151866694021)).outV().as('i'),
  __.as('s').out('el0').as('i'))
.select('i')

javeme avatar Mar 21 '23 13:03 javeme