Enforce file contents are valid UTF-8 in include_str! macro

Open dafaust opened this issue 3 years ago • 1 comment

In Rust, String and str values are always valid UTF-8. The include_str! builtin macro should enforce this and reject files whose contents are not valid UTF-8.

https://github.com/Rust-GCC/gccrs/blob/0fa882160df40cee56b5cdd0a2953b4abb4b9d18/gcc/rust/expand/rust-macro-builtins.cc#L232-L233

dafaust avatar Mar 24 '22 22:03 dafaust

Yes! I think this is fine for now, but we should definitely open up an issue (and maybe turn this into a FIXME instead). rustc errors out if the included string is not valid UTF-8:

arthur@platypus ~/G/gccrs (laxer-parser-on-stmt-fragment) [1]> wget https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt -O incorrect.utf # snip
arthur@platypus ~/G/gccrs (laxer-parser-on-stmt-fragment)> cat test.rs
fn main() {
    println!("{}", include_str!("incorrect.utf"));
}
arthur@platypus ~/G/gccrs (laxer-parser-on-stmt-fragment)> rustc test.rs
error: incorrect.utf wasn't a utf-8 file
 --> test.rs:2:20
  |
2 |     println!("{}", include_str!("incorrect.utf"));
  |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: this error originates in the macro `include_str` (in Nightly builds, run with -Z macro-backtrace for more info)

error: aborting due to previous error
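
For illustration, the check being asked for amounts to validating the raw file bytes before treating them as a string. The following is a minimal Rust sketch of that idea, not the gccrs implementation (which lives in C++ in rust-macro-builtins.cc). The helper name `validate_included` is hypothetical, and the error text is only modeled on the rustc message shown above, with the byte offset added as an assumption.

```rust
use std::str;

/// Hypothetical helper (not part of gccrs or rustc): checks that the bytes
/// read from `path` are valid UTF-8, as `include_str!` requires.
/// Returns the contents as a `&str`, or an error message loosely modeled
/// on the rustc diagnostic.
fn validate_included<'a>(path: &str, bytes: &'a [u8]) -> Result<&'a str, String> {
    match str::from_utf8(bytes) {
        // The whole buffer is valid UTF-8, so it can be used as a string.
        Ok(text) => Ok(text),
        // `valid_up_to` reports how many leading bytes were valid, which is
        // the offset of the first offending byte.
        Err(err) => Err(format!(
            "{} wasn't a utf-8 file (first invalid byte at offset {})",
            path,
            err.valid_up_to()
        )),
    }
}
```

Calling `validate_included("incorrect.utf", &[0x66, 0x6f, 0xff])` returns an `Err`, since 0xff can never appear in well-formed UTF-8, while any buffer produced by `str::as_bytes` is accepted.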

Originally posted by @CohenArthur in https://github.com/Rust-GCC/gccrs/pull/1043#discussion_r831879088

dafaust avatar Mar 24 '22 22:03 dafaust