PPLM
Is the bag of words case-sensitive?
Hello, I found that some words in the bag of words are cased while others are uncased. The cased and uncased forms have different token ids in the GPT tokenizer's vocabulary.
What is the appropriate way to handle these words? Thanks.

It seems there is no better way to solve this than to include every case variant of each word in the bag of words.
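
A minimal sketch of that workaround, in case it helps. The helper names `expand_case_variants` and `bag_to_token_ids` are made up for illustration and are not part of PPLM, and `encode` stands for whatever tokenizer encode function you already use. The leading space assumes a GPT-2 style BPE vocabulary, where a word at the start of a token usually carries the preceding space.

```python
from typing import Callable, Iterable, List


def expand_case_variants(words: Iterable[str]) -> List[str]:
    """Return each word as given, lowercased, Capitalized, and UPPERCASED.

    Duplicates are dropped and first-seen order is kept.
    """
    seen = set()
    variants = []
    for word in words:
        word = word.strip()
        if not word:
            continue  # skip blank lines in a word list file
        for form in (word, word.lower(), word.capitalize(), word.upper()):
            if form not in seen:
                seen.add(form)
                variants.append(form)
    return variants


def bag_to_token_ids(
    words: Iterable[str], encode: Callable[[str], List[int]]
) -> List[List[int]]:
    """Encode every case variant of every word with the supplied encoder.

    A space is prepended on the assumption that the tokenizer is a
    GPT-2 style BPE, where " word" and "word" map to different ids.
    """
    return [encode(" " + variant) for variant in expand_case_variants(words)]


# Hypothetical usage with the Hugging Face tokenizer (not run here):
#   from transformers import GPT2Tokenizer
#   tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
#   ids = bag_to_token_ids(["science", "NASA"], tokenizer.encode)
```

Note that this roughly triples the size of the bag, and some variants may split into several tokens, so you may want to check how your setup treats multi-token entries.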