TempLAMA generation fails with UnicodeEncodeError
When running the TempLAMA generation code on the official Ubuntu 18.04 Docker image, non-ASCII characters cause a UnicodeEncodeError in sling2facts.py. To get it working, I had to change the write_kb method to write UTF-8 encoded bytes rather than text:
def write_kb(self, filename):
    """Write out all triples rel/subject/object, perhaps with qualifiers."""
    with open(filename, 'wb') as fp:  # <--- open file in binary mode
        for f in self.frames(filter_english=FLAGS.skip_nonenglish):
            for t in self.as_triples(SlotCollection(f)):
                fp.write((t + '\n').encode('utf-8'))  # <--- write utf-8 bytes
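
A possible alternative to binary mode, sketched here on the assumption that the error comes from the file being opened with an ASCII default encoding: keep the file in text mode but pass the encoding explicitly. The helper name `write_triples` and the sample triple are hypothetical and not part of sling2facts.py; `io.open` is used because it accepts `encoding` on both Python 2 and Python 3.

```python
import io
import os
import tempfile


def write_triples(filename, triples):
    """Write one triple per line as UTF-8 text (hypothetical helper)."""
    # Passing encoding explicitly avoids falling back to the locale
    # default, which may be ASCII in a minimal Docker image.
    with io.open(filename, 'w', encoding='utf-8') as fp:
        for t in triples:
            fp.write(t + u'\n')


# Usage: write a triple containing a non-ASCII character and read it back.
path = os.path.join(tempfile.mkdtemp(), 'kb.tsv')
write_triples(path, [u'P19\tQ1\tM\u00e1laga'])
with io.open(path, encoding='utf-8') as fp:
    content = fp.read()
```

I have not tried this variant inside sling2facts.py itself; the binary-mode change above is the one I tested.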