Telltale Signs of Output from GPT-2 (Double Quotes in Particular)
I want to note that GPT-2 output, while decent, usually has a ton of telltale signs:
- The frequent use of double quotes, as if it's quoting from an interview. Almost every paragraph uses double quotes at least once, especially at the end of the first paragraph.
- Long quotations inside a sentence. Removing these cleanly requires more advanced filtering, but a heuristic along the lines of dropping any sentence that contains both a quotation mark and the word "said" would catch most of them (see the sketch after this list).
- The use of advertisement text. Because of how GPT-2 was trained, it sometimes emits words like "Advertisement" on a line of their own.
- The use of colons to introduce quotations, as in interview transcripts.
- Characters that are not found on a standard keyboard.
- Stuff like "All photos © Michael C. Stough" on its own line.
- While you remove the text that comes after the final period, you don't strip the trailing newlines that are left behind.
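A minimal cleanup pass along these lines might look like the sketch below. The function name, the decision to treat printable ASCII as "standard keyboard" characters, and the naive sentence splitter are all my own assumptions and would need tuning against real GPT-2 samples:

```python
import re

# Printable ASCII plus newline; treat anything else as "not on a
# standard keyboard" (assumption; this also covers curly quotes).
NON_KEYBOARD = re.compile(r"[^\x20-\x7E\n]")

def clean_gpt2_output(text: str) -> str:
    kept_lines = []
    for line in text.splitlines():
        stripped = line.strip()
        # Training-data artifacts that show up on their own line.
        if stripped == "Advertisement":
            continue
        # Photo-credit / copyright lines.
        if stripped.startswith("All photos") or "©" in stripped:
            continue
        kept_lines.append(line)
    text = "\n".join(kept_lines)

    # Strip characters you can't type on a standard keyboard.
    text = NON_KEYBOARD.sub("", text)

    # Within each paragraph, drop sentences that look like quoted
    # interview speech: a quotation mark plus "said" or a colon.
    paragraphs = []
    for para in text.split("\n\n"):
        sentences = re.split(r"(?<=[.!?])\s+", para)
        sentences = [
            s for s in sentences
            if not ('"' in s and ("said" in s.lower() or ":" in s))
        ]
        paragraphs.append(" ".join(sentences))
    text = "\n\n".join(paragraphs)

    # Cut everything after the final period, which also removes
    # any trailing newlines.
    last_period = text.rfind(".")
    if last_period != -1:
        text = text[: last_period + 1]
    return text
```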
We should try to remove as many of these signs as possible from GPT-2's output. Some of these are easier to remove than others, but the ones that are difficult for us to remove are also difficult for them to filter on, while the easy ones cut both ways: for example, they could reject most of GPT-2's submissions just by dropping any submission that contains double quotes.
It might also be a good idea to leave these signs in the output every so often, so that any filters they implement also catch real submissions (false positives on their end). For example, if they filter on quotation marks, they will throw away genuine submissions that happen to contain quotation marks.
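Concretely, the decoy idea could be as simple as skipping the cleanup pass with a small probability. The rate here is an arbitrary placeholder, and clean_gpt2_output is the hypothetical cleanup function sketched above:

```python
import random

# Fraction of submissions that deliberately keep their telltale signs,
# to poison any filter they build around those signs. Arbitrary value.
DECOY_RATE = 0.05

def maybe_clean(text: str) -> str:
    if random.random() < DECOY_RATE:
        return text  # leave the double quotes etc. intact as bait
    return clean_gpt2_output(text)
```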
Also, someone on Discord pointed out a couple of things to fix to make it harder for humans to distinguish between fake and real form submissions:
- we should vary capitalization and punctuation usage, as well as spelling accuracy. someone who wants to dig into an interesting problem could write code that misspells words the way people commonly misspell them -- probably don't switch the "n" and "a" in "an", but do write "gynocoligist" instead of "gynecologist" (see the sketch after this list)
- vary output length
- vary the output of the other form entries (IP address, person getting the abortion, etc.) -- as long as those stay constant, it will be very easy for them to filter out our form responses no matter how good GPT-2 is
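A rough sketch of the typo and formatting variation, assuming a hand-maintained misspelling map (the words, rates, and function name are all placeholders; a real version would build the map from a list of commonly misspelled words). Varying output length is probably easiest at generation time, e.g. by sampling a different maximum token count per message:

```python
import random

# Placeholder misspelling map; a real one would be generated from a
# list of commonly misspelled words, not hand-picked examples.
MISSPELLINGS = {
    "gynecologist": ["gynocoligist", "gynacologist"],
    "definitely": ["definately"],
    "received": ["recieved"],
}

def vary_message(text: str, typo_rate: float = 0.1) -> str:
    out = []
    for word in text.split():
        lower = word.lower()
        # Only misspell words people actually misspell; leave short
        # function words like "an" alone.
        if lower in MISSPELLINGS and random.random() < typo_rate:
            word = random.choice(MISSPELLINGS[lower])
        out.append(word)
    text = " ".join(out)

    # Some people write in all lowercase.
    if random.random() < 0.2:
        text = text.lower()
    # Some people drop the final period.
    if text.endswith(".") and random.random() < 0.3:
        text = text[:-1]
    return text
```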
ideally someone could set up a sort of Turing test for our bot: have people write real-looking form entries, mix them with our auto-generated ones, and see whether humans can reliably tell the difference
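A bare-bones version of that test could be a script that shuffles real and generated entries together and records how often a rater guesses the source correctly; accuracy near 50% would mean the fakes are indistinguishable. Everything here is a sketch:

```python
import random

def turing_test(real_entries: list[str], fake_entries: list[str]) -> float:
    """Show shuffled entries to a human rater and return their accuracy."""
    labeled = ([(e, "real") for e in real_entries]
               + [(e, "fake") for e in fake_entries])
    random.shuffle(labeled)

    correct = 0
    for entry, label in labeled:
        print("\n" + entry)
        guess = input("real or fake? ").strip().lower()
        if guess == label:
            correct += 1
    return correct / len(labeled)
```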
These are all great ideas. I recommend submitting a PR to the form generation API repo here.
I have started working on introducing typos into the messages in the typos branch.