Remove non-printable chars from titles #31
My initial idea works only for the mentioned \u00ad char. I stripped the non-printable chars by replacing them with " ".
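For illustration, here is a minimal sketch of one way to generalize the replacement beyond \u00ad. It is not the code from this PR: the function name `strip_non_printable` is hypothetical, and it assumes titles are plain Python `str` values. It relies on the built-in `str.isprintable()`, which returns False for the soft hyphen as well as for control characters.

```python
def strip_non_printable(title: str) -> str:
    """Replace every non-printable character in a title with a space.

    Hypothetical helper, not the PR's implementation. str.isprintable()
    treats the ASCII space as printable, so ordinary spaces are kept.
    """
    return "".join(ch if ch.isprintable() else " " for ch in title)


if __name__ == "__main__":
    # The soft hyphen (\u00ad) mentioned above becomes a plain space.
    print(strip_non_printable("soft\u00adhyphen"))
```

Note that `isprintable()` also reports other separators, such as the no-break space \u00a0, as non-printable, so this sketch would replace those with a regular space too. Whether that is desirable for titles is a separate decision.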
I'm not sure whether we need json.dumps, or what it was needed for in the first place. If we do need it, we should add a test case for that.
Thanks, Lioman. According to the description in #23, the json.dumps() method:
should handle any arbitrary punctuation marks which may happen to be in the Title -
",',\,*,...etc.
I just tried putting those characters in article titles, and I didn't have any problems with the existing code in main, except that I see a backslash before double-quotation marks in the search result titles. The escaping logic from #15 is adding a backslash where there shouldn't be one.
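As a point of reference, the snippet below shows how Python's standard `json.dumps` treats a title containing double quotation marks. It is only a sketch of the standard library's behavior, with a made-up example title; it makes no claim about what the escaping logic from #15 actually does. The encoded string contains a backslash before each inner quote, and that backslash disappears again only if the value is decoded as JSON (for example with `json.loads`) rather than displayed raw.

```python
import json

# Hypothetical title used only for this illustration.
title = 'Say "hi"'

# json.dumps wraps the string in quotes and escapes the inner ones.
encoded = json.dumps(title)

# Decoding as JSON restores the original title without backslashes.
decoded = json.loads(encoded)

# Merely trimming the outer quotes leaves the backslashes in place.
trimmed = encoded[1:-1]

if __name__ == "__main__":
    print(encoded)
    print(decoded)
    print(trimmed)
```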
(I moved the rest of this comment to a more relevant issue.)