Leave invisible characters in string fields
Description
This is a further improvement to stripping invisible characters. Instead of always removing invisible unicode characters, now they're only removed from non-string fields so we don't modify values where the character may be desired.
medic/cht-core#7676
Code review checklist
- [ ] Readable: Concise, well named, follows the style guide, documented if necessary.
- [ ] Documented: Configuration and user documentation on cht-docs
- [ ] Tested: Unit and/or e2e where appropriate
- [ ] Internationalised: All user facing text
- [ ] Backwards compatible: Works with existing data and configuration or includes a migration. Any breaking changes documented in the release notes.
License
The software is provided under AGPL-3.0. Contributions to this project are accepted under the same license.
@binokaryg Can I get a review for this work? There's a bit of refactoring but hopefully it's the best of both worlds for processing zero width characters.
we need to make sure that the filed Patient ID should be an integer type, not a string in the configuration.
I don't think that'll work because the patient ID may start with a leading 0 (eg: "01234") which would get truncated if you convert to an integer. I've added special handling for patient_id and place_id which I think works.
However, in the raw message too, the zero-width character is removed from the form code. Could there be an easy way to preserve it?
It was a bit tricky because of the spaghetti like nature of this code but I've added a line to reset it to the raw content before saving it which I think achieves this, albeit in a hacky way.
@binokaryg Thanks! I've made those improvements - can you have another look please?