
Grouping on captures

Open JessedeDoes opened this issue 5 years ago • 2 comments

Use case: a user has a complex pattern to extract, for instance verb-subject-object triples. It would be nice to be able to group directly on properties of one or more of the named captures.

Example:

Query:

 (v:[pos="VERB"] o:[pos="NOUN"] | [word="dat"] s:[pos="NOUN"] o:[pos="NOUN"] v:[pos="VERB"]) 

Result:

Piet v:eet o:worst.
Ik zag dat Klaas o:kaas v:at.

Group on v.lemma and o.lemma:

eten worst
eten kaas
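
To make the requested behaviour concrete, here is a minimal sketch of the grouping in plain Python. It is not BlackLab code: the `hits` structure and the `group_on_captures` helper are hypothetical, and the lemmas are the ones from the example above.

```python
from collections import Counter

# Hypothetical in-memory hits: capture name -> annotations of the captured token.
hits = [
    {"v": {"word": "eet", "lemma": "eten"}, "o": {"word": "worst", "lemma": "worst"}},
    {"v": {"word": "at", "lemma": "eten"}, "o": {"word": "kaas", "lemma": "kaas"}},
]

def group_on_captures(hits, keys):
    """Count hits per tuple of values, one value per (capture, annotation) pair."""
    return Counter(
        tuple(hit[capture][annotation] for capture, annotation in keys)
        for hit in hits
    )

# Group on v.lemma and o.lemma, as in the example.
groups = group_on_captures(hits, [("v", "lemma"), ("o", "lemma")])
for key, size in groups.items():
    print(" ".join(key), size)
```

Each group key is the tuple of requested capture properties, so both hits end up in separate groups that share the verb lemma.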

JessedeDoes, Nov 24 '20 14:11

Good idea, and shouldn't be too difficult I think. I might have a look soon.

I'm not sure if tagging different parts of the query with the same group name works at present; we should investigate that.

jan-niestadt, Apr 15 '22 12:04

https://github.com/INL/BlackLab/tree/group-by-capture

Work in progress.

jan-niestadt, Jun 29 '22 13:06

Should work, please test. :-)

(E.g. group by the property capture:word:i:X to group on the named capture X.)
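
For testing, a request could be put together along these lines. This is only a sketch under assumptions: the server URL and corpus name are placeholders, `grouped_hits_url` is a hypothetical helper, and using `lemma` in place of `word` as well as joining several captures with a comma are assumptions not confirmed in this thread. Only the `capture:word:i:X` form comes from the comment above.

```python
from urllib.parse import urlencode

def grouped_hits_url(base_url, corpus, patt, captures,
                     annotation="word", sensitivity="i"):
    """Build a hits URL that groups on one or more named captures.

    Hypothetical helper: it only assembles the query string and
    does not contact a server.
    """
    # One group property per capture, following capture:<annotation>:<sensitivity>:<name>.
    # Assumption: multiple group properties are joined with a comma.
    group = ",".join(
        f"capture:{annotation}:{sensitivity}:{name}" for name in captures
    )
    return f"{base_url}/{corpus}/hits?" + urlencode({"patt": patt, "group": group})

# The query from the issue description.
patt = ('(v:[pos="VERB"] o:[pos="NOUN"] | '
        '[word="dat"] s:[pos="NOUN"] o:[pos="NOUN"] v:[pos="VERB"])')

# Placeholder server and corpus name; adjust to the local installation.
url = grouped_hits_url("http://localhost:8080/blacklab-server", "mycorpus",
                       patt, ["v", "o"], annotation="lemma")
print(url)
```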

jan-niestadt, Sep 22 '22 09:09

Do we need to expose this in the frontend?

KCMertens, Sep 22 '22 09:09

That's above my pay grade. ;-)

jan-niestadt, Sep 22 '22 09:09