Agree on a convention for whether to include HTTPS URLs
In https://github.com/citizenlab/test-lists/pull/38#issuecomment-211993308 we discussed the pros and cons of including HTTPS URLs in the test lists.
@sneft
This is a debate we've had internally a number of times. It's obviously inefficient to repeatedly test full-path HTTPS URLs for the same domain, but at the same time we didn't want to make any initial assumptions in constructing the lists that would limit their usefulness in the long run. In other words, we were trying to future-proof the lists, leave open the possibility that someone is being MITM'd, etc. We have generally settled on being exhaustive (in some cases testing both the HTTP and HTTPS versions), as the performance impact wasn't our primary consideration. One middle ground may be to include a small sample of full-path HTTPS URLs (e.g. a handful of full HTTPS URLs of sensitive Twitter accounts or YouTube videos) while testing the HTTP version of the rest.
@hellais
One thing to take into consideration is that if you only test the HTTPS version of a site, you will not catch cases in which a specific path is being filtered by a transparent proxy that doesn't do MITM. I guess in the end it's probably up to the tool that does the measurements to test each URL in both its HTTPS and HTTP versions. That said, if we decide to include the HTTPS version of websites in the lists, it should be consistent across all of them. That is, if the site supports HTTPS, the URL should always be listed using https, leaving it to the tool consuming the list to test both the HTTPS and HTTP versions.
Do we agree that the convention to adopt is "when the site supports HTTPS list it in HTTPS" and it should be up to the tool using the list to figure out that it should ALSO be testing for "HTTP" when testing a URL? If so, I can apply changes in batch to all the lists so that URLs of sites that support HTTPS use the https prefix. Changes will also be made to ooniprobe to support testing both HTTP and HTTPS when it finds HTTPS URLs.
I think this is also relevant to @rpanah for centinel testing.
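To make the proposed convention concrete, here is a minimal sketch of how a tool consuming the list might expand each entry. It is only an illustration: `urls_to_test` is a hypothetical helper, not existing ooniprobe or centinel code, and it assumes that entries with an explicit port should be left alone rather than rewritten to HTTP.

```python
from urllib.parse import urlsplit, urlunsplit


def urls_to_test(url):
    """Return the URL variants to measure for one test-list entry.

    Hypothetical helper: an https entry yields both the HTTPS URL and
    its HTTP counterpart; any other entry is returned unchanged.
    """
    parts = urlsplit(url)
    # Assumption: an explicit port (e.g. :8443) is tied to the scheme,
    # so no HTTP variant is derived for such entries.
    if parts.scheme.lower() != "https" or parts.port is not None:
        return [url]
    # Swap only the scheme; host, path, query and fragment are kept.
    http_variant = urlunsplit(parts._replace(scheme="http"))
    return [url, http_variant]


if __name__ == "__main__":
    for entry in ["https://example.org/some/path", "http://example.net/"]:
        print(entry, "->", urls_to_test(entry))
```

With this approach the lists stay free of duplicate HTTP/HTTPS entries, and the decision to also probe HTTP lives in the measurement tool.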
@hellais regarding:
"Do we agree that the convention to adopt is "when the site supports HTTPS list it in HTTPS" and it should be up to the tool using the list to figure out that it should ALSO be testing for "HTTP" when testing a URL?"
Yes, I agree (and have been following this logic when adding URLs). In some (rare) cases though, I have added both the HTTP and HTTPS versions of sites.