Quotes not tagged correctly
Expected: quotation marks are tagged as quotes.
Given
package main

import (
	"fmt"
	"log"

	"github.com/jdkato/prose/v2"
)

func main() {
	doc, err := prose.NewDocument("Bob said \"Alice, could you send me the key?\" Alice shook her head no.")
	if err != nil {
		log.Fatal(err)
	}

	// Print the doc's tokens:
	fmt.Println("Tokens")
	fmt.Println(doc.Tokens())
	fmt.Println()

	// Print the doc's named entities:
	fmt.Println("Entities")
	fmt.Println(doc.Entities())
	fmt.Println()

	// Print the doc's sentences:
	fmt.Println("Sentences")
	fmt.Println(doc.Sentences())
	fmt.Println()
}
Then
Tokens
[{NNP Bob B-PERSON} {VBD said O} {NNP " O} {NNP Alice O} {, , O} {MD could O} {PRP you O} {VB send O} {PRP me O} {DT the O} {NN key O} {. ? O} {JJ " O} {NNP Alice O} {VBD shook O} {PRP$ her O} {NN head O} {DT no O} {. . O}]
Entities
[{Bob PERSON}]
Sentences
[{Bob said "Alice, could you send me the key?"} {Alice shook her head no.}]
Using single quotes:
- The opening quote stays attached to the following word ('Alice), which is then labeled as a GPE entity.
- The closing quote is tagged correctly ('').
Tokens
[{NNP Bob B-PERSON} {VBD said O} {NNP 'Alice B-GPE} {, , O} {MD could O} {PRP you O} {VB send O} {PRP me O} {DT the O} {NN key O} {. ? O} {'' ' O} {NNP Alice O} {VBD shook O} {PRP$ her O} {NN head O} {DT no O} {. . O}]
Entities
[{Bob PERSON} {'Alice GPE}]
Sentences
[{Bob said 'Alice, could you send me the key?'} {Alice shook her head no.}]
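As a possible stopgap for the single-quote case above, the leading quote could be detached from the word after tokenization. This is only a sketch, not how prose works internally: `Token` is a local stand-in that mirrors the (tag, text, label) triples printed above, and `splitLeadingQuote` and the `clitics` list are hypothetical names introduced for illustration. The entity label is left as it was, since it cannot be recomputed after the fact.

```go
package main

import (
	"fmt"
	"strings"
)

// Token is a local stand-in (assumption) mirroring the printed
// {Tag Text Label} triples; it is not the library's own type.
type Token struct {
	Tag   string
	Text  string
	Label string
}

// clitics lists tokens that legitimately start with an apostrophe
// and must not be split. The list is illustrative, not exhaustive.
var clitics = map[string]bool{
	"'s": true, "'re": true, "'ve": true, "'ll": true,
	"'d": true, "'m": true, "'em": true,
}

// splitLeadingQuote (hypothetical helper) detaches a leading single
// quote from a word and emits it as its own opening-quote token.
func splitLeadingQuote(tokens []Token) []Token {
	out := make([]Token, 0, len(tokens)+1)
	for _, t := range tokens {
		isQuotedWord := len(t.Text) > 1 &&
			strings.HasPrefix(t.Text, "'") &&
			!clitics[strings.ToLower(t.Text)]
		if isQuotedWord {
			// Penn Treebank uses `` for an opening quote.
			out = append(out, Token{Tag: "``", Text: "'", Label: "O"})
			t.Text = t.Text[1:]
		}
		out = append(out, t)
	}
	return out
}

func main() {
	in := []Token{
		{Tag: "VBD", Text: "said", Label: "O"},
		{Tag: "NNP", Text: "'Alice", Label: "B-GPE"},
	}
	for _, t := range splitLeadingQuote(in) {
		fmt.Printf("%s/%s ", t.Text, t.Tag)
	}
	fmt.Println()
}
```

This only repairs the token boundary. The wrong GPE label on Alice would still need to be fixed in the tokenizer or tagger itself.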
For comparison, this is how CoreNLP tags the same sentence:
Bob(NNP) said(VBD) "(``)Alice(NNP),(,) could(MD) you(PRP) send(VB) me(PRP) the(DT) key(NN)?(.)"('') Alice(NNP) shook(VBD) her(PRP$) head(NN) no(DT).(.)
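Until the tagger handles this, a post-processing pass could bring the double-quote tokens in line with the `` / '' convention in the CoreNLP output above. This is a rough sketch under stated assumptions: `Token` is a local stand-in for the printed (tag, text, label) triples and `retagQuotes` is a hypothetical helper, not part of prose. It assumes straight double quotes strictly alternate between opening and closing, which breaks on unbalanced or nested quotes.

```go
package main

import "fmt"

// Token is a local stand-in (assumption) mirroring the printed
// {Tag Text Label} triples; it is not the library's own type.
type Token struct {
	Tag   string
	Text  string
	Label string
}

// retagQuotes (hypothetical helper) returns a copy of tokens in which
// every straight double quote is tagged `` (opening) or '' (closing),
// alternating in order of appearance.
func retagQuotes(tokens []Token) []Token {
	out := make([]Token, len(tokens))
	copy(out, tokens)

	open := false // true while we are inside a quotation
	for i := range out {
		if out[i].Text != `"` {
			continue
		}
		if open {
			out[i].Tag = "''"
		} else {
			out[i].Tag = "``"
		}
		open = !open
	}
	return out
}

func main() {
	// A shortened version of the token list from the report.
	in := []Token{
		{Tag: "VBD", Text: "said", Label: "O"},
		{Tag: "NNP", Text: `"`, Label: "O"},
		{Tag: "NNP", Text: "Alice", Label: "O"},
		{Tag: ".", Text: "?", Label: "O"},
		{Tag: "JJ", Text: `"`, Label: "O"},
	}
	for _, t := range retagQuotes(in) {
		fmt.Printf("%s(%s) ", t.Text, t.Tag)
	}
	fmt.Println()
}
```

The input slice is copied rather than modified in place, so the original tags remain available for comparison.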