gccrs
Enforce file contents are valid UTF-8 in include_str! macro
`String`s and `str`s are always valid UTF-8 in Rust. We should enforce that in the `include_str!` builtin macro, which currently does not check the contents of the included file.
https://github.com/Rust-GCC/gccrs/blob/0fa882160df40cee56b5cdd0a2953b4abb4b9d18/gcc/rust/expand/rust-macro-builtins.cc#L232-L233
Yes! I think this is fine for now, but we should definitely open up an issue (and maybe turn this into a FIXME instead). rustc errors out if the included string is not valid UTF-8:
    arthur@platypus ~/G/gccrs (laxer-parser-on-stmt-fragment) [1]> wget https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt -O incorrect.utf
    # snip
    arthur@platypus ~/G/gccrs (laxer-parser-on-stmt-fragment)> cat test.rs
    fn main() {
        println!("{}", include_str!("incorrect.utf"));
    }
    arthur@platypus ~/G/gccrs (laxer-parser-on-stmt-fragment)> rustc test.rs
    error: incorrect.utf wasn't a utf-8 file
     --> test.rs:2:20
      |
    2 |     println!("{}", include_str!("incorrect.utf"));
      |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      |
      = note: this error originates in the macro `include_str` (in Nightly builds, run with -Z macro-backtrace for more info)

    error: aborting due to previous error
Originally posted by @CohenArthur in https://github.com/Rust-GCC/gccrs/pull/1043#discussion_r831879088
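For reference, here is a minimal sketch, written in Rust, of the check the macro expansion would need to perform on the bytes it reads. It is not the gccrs implementation, which lives in C++. The helper name `check_included_bytes` and the exact wording of the error message are assumptions made for illustration; only `std::str::from_utf8` and `Utf8Error::valid_up_to` are real standard-library APIs.

```rust
use std::str;

/// Hypothetical helper (not part of gccrs or rustc): validates that the raw
/// bytes read from an included file form valid UTF-8.
///
/// On success it returns the bytes viewed as a `&str`; on failure it returns
/// an error message modelled on the rustc diagnostic, plus the offset of the
/// first byte that is not part of a valid UTF-8 sequence.
fn check_included_bytes<'a>(path: &str, bytes: &'a [u8]) -> Result<&'a str, String> {
    str::from_utf8(bytes).map_err(|err| {
        format!(
            "{} wasn't a utf-8 file (first invalid byte at offset {})",
            path,
            err.valid_up_to()
        )
    })
}
```

Calling `check_included_bytes("incorrect.utf", &bytes)` on the contents of the test file above should return an `Err`, which the macro expander could then turn into a diagnostic at the invocation site instead of producing a string literal.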