almanac.httparchive.org
Cookies sql 2024
Queries for the Cookies Chapter 2024 (#3617)
Extract cookies into an intermediate table to reduce the amount of data processed
- [x] 0_create_desktop_cookies.sql
- [x] 0_create_mobile_cookies.sql
Prevalence of cookie types and attributes
- [x] prevalence_attributes_per_type.sql
- [x] prevalence_type_attributes_per_rank.sql
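As a rough illustration of what a "prevalence of attributes per type" measurement computes, here is a minimal Python sketch. It is not the SQL from this PR: the record layout (`first_party`, `secure`, `http_only`, `partitioned`) and the sample rows are hypothetical and chosen only to make the idea concrete.

```python
from collections import defaultdict


def attribute_prevalence(cookies, attributes=("secure", "http_only", "partitioned")):
    """Return, per cookie type, the share of cookies that set each attribute.

    `cookies` is an iterable of dicts with a boolean `first_party` field and
    optional boolean attribute fields (hypothetical layout, not the PR schema).
    """
    totals = defaultdict(int)
    hits = defaultdict(lambda: defaultdict(int))
    for cookie in cookies:
        kind = "first_party" if cookie["first_party"] else "third_party"
        totals[kind] += 1
        for attr in attributes:
            if cookie.get(attr):
                hits[kind][attr] += 1
    return {
        kind: {attr: hits[kind][attr] / totals[kind] for attr in attributes}
        for kind in totals
    }


# Made-up sample rows, for illustration only.
sample = [
    {"first_party": True, "secure": True, "http_only": False},
    {"first_party": True, "secure": False, "http_only": True},
    {"first_party": False, "secure": True, "partitioned": True},
    {"first_party": False, "secure": True},
]
prevalence = attribute_prevalence(sample)
```

Grouping the same computation by site rank bucket instead of cookie type would give the per-rank variant.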
Top cookies of each type and the domains setting the most cookies
- [x] top_20_domains_setting_cookies.sql
- [x] top_20_first_party_cookies.sql
- [x] top_20_third_party_cookies.sql
Number of cookies
- [x] nb_cookies_cdf.sql
- [x] nb_cookies_per_type_quantiles.sql
- [x] nb_cookies_quantiles.sql
Size of cookies
- [x] size_cookies_cdf.sql
- [x] size_cookies_per_type_quantiles.sql
- [x] size_cookies_quantiles.sql
- [x] size_extract_largest.sql
Age of cookies
- [x] age_expire_cookies_per_type_quantiles.sql
- [x] age_expire_cookies_quantiles.sql
- [x] age_expires_cookies_cdf.sql
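The number, size, and age groups above each pair a CDF query with quantile queries. The following Python sketch shows the two statistics on a small made-up list of per-page cookie counts. It only illustrates the concepts and makes no claim about how the SQL files compute them. The function names and the sample data are assumptions.

```python
from bisect import bisect_right
from statistics import quantiles


def empirical_cdf(values):
    """Return (value, fraction of observations <= value) for each distinct value."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, bisect_right(ordered, v) / n) for v in sorted(set(ordered))]


def percentiles(values, wanted=(10, 25, 50, 75, 90)):
    """Return the requested percentiles using linear interpolation."""
    # quantiles(..., n=100) yields 99 cut points; cut point p-1 is percentile p.
    cuts = quantiles(values, n=100, method="inclusive")
    return {p: cuts[p - 1] for p in wanted}


# Made-up cookie counts for ten pages, for illustration only.
cookies_per_page = [0, 0, 1, 2, 2, 3, 5, 8, 13, 40]
cdf = empirical_cdf(cookies_per_page)
pcts = percentiles(cookies_per_page)
```

The same two functions apply unchanged to cookie sizes or expiry ages once those values are collected into a list.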
New Privacy Sandbox APIs
- CHIPS:
  - [x] CHIPS_top_20_first_party_cookies.sql
  - [x] CHIPS_top_20_third_party_cookies.sql
- Who is using them?
  - [x] See status in #3653
- RWS & Attestation File
  - [x] Get the list of domains that potentially serve the corresponding well-known file (see #3653)
  - [x] Use the custom crawler from https://privacysandstorm.com/datasets_tools/well_known_crawler/ to parse the well-known files and check that they are actually valid
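To make the validity check concrete: a server can answer a well-known URL with an HTML error page or other non-JSON content, so having a response is weaker evidence than having a valid file. The sketch below is a minimal Python illustration of that distinction. It is not the logic of the crawler linked above. The function name `parse_well_known` and the `required_keys` parameter are hypothetical, and which keys to require depends on the file being checked.

```python
import json


def parse_well_known(body, required_keys=()):
    """Return the parsed JSON object, or None if the body is not a plausible file.

    A body is rejected when it is not valid JSON, when the top-level value is
    not an object, or when any key in `required_keys` is missing.
    """
    try:
        data = json.loads(body)
    except (json.JSONDecodeError, TypeError):
        return None
    if not isinstance(data, dict):
        return None
    if any(key not in data for key in required_keys):
        return None
    return data


# Illustrative bodies only; real responses would come from a crawl.
valid = parse_well_known('{"primary": "https://example.com"}', ("primary",))
html_error = parse_well_known("<html>Not found</html>")
```

A stricter check would also validate the values against the relevant specification. This sketch stops at structural validity.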
cc @ydimova, @shaoormunir, @samdutton @ChrisBeeti