encoding
Encoding Standard
The large majority of users of `TextEncoder` and `TextDecoder` do not make use of streaming capabilities. Instead they just need to decode a single chunk, or encode a single chunk....
NOTE: this PR is currently blocked on a technicality (https://github.com/tabatkins/bikeshed/issues/2270). This PR adds support for static `TextDecoder.decode`, `TextEncoder.encode` and `TextEncoder.encodeInto` methods. These do not add new functionality, rather just acting...
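For context, the one-shot pattern that such static methods would stand in for can be sketched with the existing constructors. This is only an illustration: the helper names `encodeOnce`, `decodeOnce`, and `encodeIntoOnce` are hypothetical and are not the API proposed in the PR.

```typescript
// Hypothetical one-shot helpers built on the existing TextEncoder and
// TextDecoder constructors. They are not the proposed static methods.

// Encode a whole string to UTF-8 bytes in one call.
function encodeOnce(text: string): Uint8Array {
  return new TextEncoder().encode(text);
}

// Decode a whole buffer in one call; a fresh decoder carries no stream state.
function decodeOnce(bytes: Uint8Array, label: string = "utf-8"): string {
  return new TextDecoder(label).decode(bytes);
}

// Encode into a caller-supplied buffer and report how much was read and written.
function encodeIntoOnce(text: string, dest: Uint8Array) {
  return new TextEncoder().encodeInto(text, dest);
}
```

Each helper creates a throwaway instance per call, which is the boilerplate a static method would presumably hide.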
Because GB18030-2005 is already a one-to-one mapping between Unicode and GB18030, except for the 14 characters that were still mapped to the Unicode PUA as of 2005. But nowadays, all 14 characters have...
Concatenating two ISO-2022-JP outputs from a conforming encoder doesn't result in conforming input
Encodings other than ISO-2022-JP have the property that if you concatenate two outputs from a conforming encoder and decode them together, you get the same result as when decoding them...
There are some sequences of bytes that are valid lead-trailing according to the description at https://encoding.spec.whatwg.org/#big5-decoder, but don't have a corresponding Unicode codepoint in the `index-big5.txt` mapping table. In this...
@sideshowbarker #260 fails to build because https://encoding.spec.whatwg.org/windows-1253-bmp.html has `aria-label` usage that the HTML checker warns for. Basically, it's a lot of stuff like: > `U+0000� `
https://encoding.spec.whatwg.org/commit-snapshots/75b988c17cc2b90266e69526f399c7916c3e0ef0/#index-gb18030-ranges-pointer https://encoding.spec.whatwg.org/commit-snapshots/75b988c17cc2b90266e69526f399c7916c3e0ef0/#gb18030-encoder index-gb18030-ranges.txt maps more-or-less continuous pointers to Unicode values. The reverse process may produce invalid results because Unicode values are not continuous: U+00A4 and U+00A5 map to...
https://encoding.spec.whatwg.org/commit-snapshots/4d54adce6a871cb03af3a919cbf644a43c22301a/#visualization > Let index be index Big5 excluding all entries whose pointer is less than (0xA1 - 0x81) × 157. > > Avoid returning Hong Kong Supplementary Character Set extensions...
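To make the threshold in the quoted step concrete, here is a minimal sketch of the Big5 pointer arithmetic as the Encoding Standard's Big5 decoder describes it. The names `big5Pointer` and `HKSCS_THRESHOLD` are hypothetical, and the sketch does not consult `index-big5.txt`, so it says nothing about whether a given pointer actually has a mapping.

```typescript
// Pointers below this value are the entries the quoted step excludes.
const HKSCS_THRESHOLD: number = (0xa1 - 0x81) * 157; // 32 * 157 = 5024

// Compute the index pointer for a lead/trail byte pair, or null when the
// pair is outside the ranges the Big5 decoder accepts.
function big5Pointer(lead: number, trail: number): number | null {
  if (lead < 0x81 || lead > 0xfe) return null;
  const inLowRange = trail >= 0x40 && trail <= 0x7e;
  const inHighRange = trail >= 0xa1 && trail <= 0xfe;
  if (!inLowRange && !inHighRange) return null;
  // The two trail ranges are packed back to back: 63 + 94 = 157 per lead byte.
  const offset = trail < 0x7f ? 0x40 : 0x62;
  return (lead - 0x81) * 157 + (trail - offset);
}
```

With this arithmetic, the pair 0xA1 0x40 is the first one whose pointer reaches the threshold, so every pair with a lead byte below 0xA1 falls in the excluded region.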
This is WIP, mainly since I'm a little unsure we want to go this far, but I also kinda like it. @andreubotella @ricea @domenic @aphillips thoughts?
Results for a series of tests for gb18030 encoding/decoding can be found at https://www.w3.org/International/tests/repo/results/encoding-dbl-byte.en#gb18030 The tests can be run from that page (select the link in the left-most column) or...