Quotes not tagged correctly

Open jrschumacher opened this issue 4 years ago • 2 comments

Expected: quotation marks are tagged as quotes, not as NNP or JJ

Given

package main

import (
	"fmt"
	"log"

	"github.com/jdkato/prose/v2"
)

func main() {
	doc, err := prose.NewDocument("Bob said \"Alice, could you send me the key?\" Alice shook her head no.")

	if err != nil {
		log.Fatal(err)
	}

	// Print the doc's tokens:
	fmt.Println("Tokens")
	fmt.Println(doc.Tokens())
	fmt.Println()

	// Print the doc's named entities:
	fmt.Println("Entities")
	fmt.Println(doc.Entities())
	fmt.Println()

	// Print the doc's sentences:
	fmt.Println("Sentences")
	fmt.Println(doc.Sentences())
	fmt.Println()
}

Then

Tokens
[{NNP Bob B-PERSON} {VBD said O} {NNP " O} {NNP Alice O} {, , O} {MD could O} {PRP you O} {VB send O} {PRP me O} {DT the O} {NN key O} {. ? O} {JJ " O} {NNP Alice O} {VBD shook O} {PRP$ her O} {NN head O} {DT no O} {. . O}]

Entities
[{Bob PERSON}]

Sentences
[{Bob said "Alice, could you send me the key?"} {Alice shook her head no.}]

jrschumacher • May 24 '21 18:05

Using single quotes instead:

  • The opening quote stays attached to the following word ('Alice), and that token is labeled B-GPE
  • The closing quote is tagged correctly ('')

Tokens
[{NNP Bob B-PERSON} {VBD said O} {NNP 'Alice B-GPE} {, , O} {MD could O} {PRP you O} {VB send O} {PRP me O} {DT the O} {NN key O} {. ? O} {'' ' O} {NNP Alice O} {VBD shook O} {PRP$ her O} {NN head O} {DT no O} {. . O}]

Entities
[{Bob PERSON} {'Alice GPE}]

Sentences
[{Bob said 'Alice, could you send me the key?'} {Alice shook her head no.}]
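As a stopgap on the caller's side, a leading quote that is glued to a word could be split off after tokenization. The sketch below is only an illustration under stated assumptions: `Token` is a local stand-in that mirrors the {Tag Text Label} shape printed above, and `splitLeadingQuote` is a hypothetical helper, not part of prose. A real fix would belong in the tokenizer itself.

```go
package main

import (
	"fmt"
	"strings"
)

// Token is a local stand-in mirroring the {Tag Text Label} shape in the
// output above. It is an assumption for illustration, not the library's type.
type Token struct {
	Tag, Text, Label string
}

// splitLeadingQuote (hypothetical helper) splits a token such as 'Alice
// into an opening-quote token and the bare word.
func splitLeadingQuote(tokens []Token) []Token {
	out := make([]Token, 0, len(tokens)+1)
	for _, t := range tokens {
		startsWithQuote := strings.HasPrefix(t.Text, "'") || strings.HasPrefix(t.Text, "\"")
		// Skip lone quotes and possessive markers ('s tagged POS).
		// Other clitics such as 'll are not handled by this sketch.
		if !startsWithQuote || len(t.Text) < 2 || t.Tag == "POS" {
			out = append(out, t)
			continue
		}
		// Opening quotes use the Penn Treebank tag ``.
		out = append(out, Token{Tag: "``", Text: t.Text[:1], Label: "O"})
		// The word keeps its tag; the label is reset to O because it was
		// computed on the quote-prefixed text (a design choice of this sketch).
		out = append(out, Token{Tag: t.Tag, Text: t.Text[1:], Label: "O"})
	}
	return out
}

func main() {
	tokens := []Token{
		{Tag: "VBD", Text: "said", Label: "O"},
		{Tag: "NNP", Text: "'Alice", Label: "B-GPE"},
		{Tag: ",", Text: ",", Label: "O"},
	}
	for _, t := range splitLeadingQuote(tokens) {
		fmt.Printf("%s/%s ", t.Text, t.Tag)
	}
	fmt.Println()
}
```

Resetting the label means entity recognition would have to be rerun on the cleaned tokens to recover a PERSON label for Alice.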

jrschumacher • May 24 '21 18:05

For comparison, this is how CoreNLP tags the same sentence:

Bob(NNP) said(VBD) "(``)Alice(NNP),(,) could(MD) you(PRP) send(VB) me(PRP) the(DT) key(NN)?(.)"('') Alice(NNP) shook(VBD) her(PRP$) head(NN) no(DT).(.)
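Until the tagger handles this, the CoreNLP-style quote tags could be approximated in a post-processing pass. This is a minimal sketch, assuming straight double quotes strictly alternate between opening and closing; `Token` is a local stand-in for the {Tag Text Label} shape in the output, and `retagQuotes` is a hypothetical helper rather than anything prose provides.

```go
package main

import "fmt"

// Token is a local stand-in mirroring the {Tag Text Label} shape in the
// output above. It is an assumption for illustration, not the library's type.
type Token struct {
	Tag, Text, Label string
}

// retagQuotes (hypothetical helper) returns a copy of tokens in which each
// straight double quote is tagged `` (opening) or '' (closing), alternating.
// Nested or unbalanced quotes are not handled.
func retagQuotes(tokens []Token) []Token {
	out := make([]Token, len(tokens))
	copy(out, tokens)
	inQuote := false
	for i := range out {
		if out[i].Text != "\"" {
			continue
		}
		if inQuote {
			out[i].Tag = "''"
		} else {
			out[i].Tag = "``"
		}
		inQuote = !inQuote
	}
	return out
}

func main() {
	// A shortened version of the token list from the issue.
	tokens := []Token{
		{Tag: "VBD", Text: "said", Label: "O"},
		{Tag: "NNP", Text: "\"", Label: "O"},
		{Tag: "NNP", Text: "Alice", Label: "O"},
		{Tag: ".", Text: "?", Label: "O"},
		{Tag: "JJ", Text: "\"", Label: "O"},
	}
	for _, t := range retagQuotes(tokens) {
		fmt.Printf("%s(%s) ", t.Text, t.Tag)
	}
	fmt.Println()
}
```

The pass only rewrites tags on quote tokens, so words and entity labels are left as the tagger produced them.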

jrschumacher • May 25 '21 00:05