Should mime just use the MIME sniffing algorithm?
The target domain of the mime crate is webdev. Instead of following the original RFCs (as is done now), perhaps it's best to just use the sniffing algorithm that is now used by web browsers.
cc @nox @SimonSapin @rustonaut
https://mimesniff.spec.whatwg.org/ is called "MIME Sniffing" and contains a "parse a MIME type" algorithm that is relevant.
But "sniffing" refers to looking at the contents of a file or the body of an HTTP response (in addition to other signals) to make a guess at the actual file format, in case the Content-Type header is missing or unspecific or inaccurate. For example, if the first 6 bytes of a file are GIF89a in ASCII it’s very probably a GIF, especially if it’s used in <img>. That spec also has algorithms for this.
This kind of sniffing can be useful, but I don’t know if it should be in scope for this crate.
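To make the GIF example above concrete, here is a minimal sketch of signature-based sniffing. The function name `sniff_gif` is made up for illustration and is not part of the mime crate; the real spec also weighs other signals, which this ignores.

```rust
/// Hypothetical helper (not a mime crate API): guess the format from the
/// leading bytes of a body, as in the GIF example.
fn sniff_gif(body: &[u8]) -> Option<&'static str> {
    // GIF files start with the ASCII signature "GIF87a" or "GIF89a".
    if body.starts_with(b"GIF87a") || body.starts_with(b"GIF89a") {
        Some("image/gif")
    } else {
        // No match: a real sniffer would go on to try other patterns.
        None
    }
}
```

A caller would pass the first bytes of the response body and fall back to the Content-Type header (or another guess) when this returns `None`.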
Sorry, I don't mean sniffing the body bytes, just using the parse algorithm mentioned in that document.
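For reference, a rough sketch of the type/subtype part of that parse algorithm, as I understand it from the spec. This is a simplification that assumes parameters can be skipped entirely, and `parse_type_subtype` is a hypothetical name, not something the mime crate exposes.

```rust
/// HTTP token code points: ASCII alphanumerics plus a fixed punctuation set.
fn is_http_token_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || "!#$%&'*+-.^_`|~".contains(c)
}

/// HTTP whitespace: tab, LF, CR, space.
fn is_http_whitespace(c: char) -> bool {
    matches!(c, '\t' | '\n' | '\r' | ' ')
}

/// Hypothetical sketch: returns the lowercased (type, subtype) pair,
/// or None where the algorithm would return failure.
/// Simplification: everything after the first ';' is ignored.
fn parse_type_subtype(input: &str) -> Option<(String, String)> {
    // Strip leading and trailing HTTP whitespace.
    let input = input.trim_matches(is_http_whitespace);
    // The type is everything before the first '/'; no slash means failure.
    let (ty, rest) = input.split_once('/')?;
    if ty.is_empty() || !ty.chars().all(is_http_token_char) {
        return None;
    }
    // The subtype runs up to the first ';', minus trailing whitespace.
    let subtype = rest.split(';').next().unwrap_or("");
    let subtype = subtype.trim_end_matches(is_http_whitespace);
    if subtype.is_empty() || !subtype.chars().all(is_http_token_char) {
        return None;
    }
    Some((ty.to_ascii_lowercase(), subtype.to_ascii_lowercase()))
}
```

Note that `*` is a token code point, so under this sketch `text/*` parses successfully, just like any other type.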
So, looking through the test cases, I noticed this as a valid MIME type:
!#$%&'*+-.^_`|~0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz/!#$%&'*+-.^_`|~0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz;!#$%&'*+-.^_`|~0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz=!#$%&'*+-.^_`|~0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
Something I appreciate in the API in mime/master is the difference between MediaType and MediaRange. They allow things like text/* to be a MediaRange, but not MediaType. That combined with headers::ContentType would help prevent setting a frankly bogus content-type header (even though mimesniff says to parse it).
So I'm torn.
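To illustrate the MediaType/MediaRange distinction, here is a toy sketch of the idea. These are stand-in types written for this comment, not the actual definitions in mime/master, and the validation is deliberately minimal (it assumes no parameters and does not check token characters).

```rust
/// Stand-in for a range: may contain wildcards such as `text/*` or `*/*`.
#[derive(Debug, PartialEq)]
struct MediaRange(String);

/// Stand-in for a concrete type: wildcards are rejected.
#[derive(Debug, PartialEq)]
struct MediaType(String);

impl MediaRange {
    fn parse(s: &str) -> Option<MediaRange> {
        let (ty, sub) = s.split_once('/')?;
        if ty.is_empty() || sub.is_empty() {
            return None;
        }
        // A wildcard type with a concrete subtype (`*/html`) is rejected here.
        if ty == "*" && sub != "*" {
            return None;
        }
        Some(MediaRange(s.to_ascii_lowercase()))
    }
}

impl MediaType {
    fn parse(s: &str) -> Option<MediaType> {
        let (ty, sub) = s.split_once('/')?;
        // Same checks as a range, plus: no wildcard in either position.
        if ty.is_empty() || sub.is_empty() || ty == "*" || sub == "*" {
            return None;
        }
        Some(MediaType(s.to_ascii_lowercase()))
    }
}
```

With a split like this, an API that sets a Content-Type header could take only `MediaType`, so `text/*` would be rejected before it ever reaches the wire.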
After some more thought, the advantages of just following what the Fetch spec wants outweigh having the MediaType and MediaRange split.
So, the new plan is to remove the split, only having Mime again, and only supporting the mimesniff parsing algorithm.
The closer it is to the mimesniff algorithm, the more we can make use of it.
What would be useful too is a way to represent just the essence of a mime type, because many specs have prose about that.
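Something along these lines is what I have in mind for the essence. This is only a sketch: the `Mime` struct and its field names are made up for illustration, and the essence is taken to be the type and subtype joined by `/` with parameters dropped.

```rust
/// Hypothetical representation of a parsed MIME type (illustration only).
struct Mime {
    type_: String,
    subtype: String,
    // Parameters such as ("charset", "utf-8"); not part of the essence.
    parameters: Vec<(String, String)>,
}

impl Mime {
    /// The essence: "type/subtype", without any parameters.
    fn essence(&self) -> String {
        format!("{}/{}", self.type_, self.subtype)
    }

    /// Convenience check for spec prose like "if the essence is text/html".
    fn essence_is(&self, expected: &str) -> bool {
        self.essence() == expected
    }
}
```

Two values that differ only in parameters (say, with and without a `charset`) would then compare equal by essence, which is what much of the spec prose relies on.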
Hi,
Is there a way to expose both parsers (RFC and mimesniff)? I'd like to make some Servo tests pass, so I need to follow the mimesniff algorithm. @SimonSapin has already implemented it in rust-url (though it is not officially exposed by the crate). Should I duplicate the code in Servo, or can I help here?
Regards