tranco-list
Tranco: An improved top websites ranking
Tranco: A Research-Oriented Top Sites Ranking Hardened Against Manipulation
By Victor Le Pochat, Tom Van Goethem, Samaneh Tajalizadehkhoob, Maciej Korczyński and Wouter Joosen
This repository contains the source code driving the generation of the Tranco ranking provided at https://tranco-list.eu/. This new top websites ranking was proposed in our paper Tranco: A Research-Oriented Top Sites Ranking Hardened Against Manipulation.
- combined_lists.py contains the core code for generating new lists based on a configuration passed to combined_lists.generate_combined_list.
- shared.py and global_config.py contain several configuration variables; shared.DEFAULT_TRANCO_CONFIG gives the configuration of the default (daily updated) Tranco list.
- generate_daily_list.py runs daily to generate the default Tranco list.
- job_handler.py contains either the code for submitting jobs to an rq queue for processing, or the code to relay requests for list generation to a remote host.
- job_server.py accepts requests for list generation on a remote host.
- notify_email.py contains code to notify users when their list has been generated.
- generate_domain_parts.py preprocesses rankings to extract the different components of domains.
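
As a rough illustration of the kind of aggregation a combined list involves, the sketch below merges several rankings by summing reciprocal ranks (a Dowdall-style rule). This is an assumption made for illustration only, not the code in combined_lists.py: the function name combine_rankings, its parameters, and the sample domains are all hypothetical.

```python
from collections import defaultdict


def combine_rankings(rankings, list_size=None):
    """Combine ranked domain lists using a Dowdall-style score (illustrative).

    Every domain receives 1/rank points from each list it appears in
    (ranks start at 1). Domains are then ordered by total score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, domain in enumerate(ranking, start=1):
            scores[domain] += 1.0 / rank
    # Highest score first; ties are broken alphabetically for determinism.
    combined = sorted(scores, key=lambda d: (-scores[d], d))
    return combined[:list_size] if list_size else combined


# Hypothetical input rankings (not real provider data).
list_a = ["example.com", "example.org", "example.net"]
list_b = ["example.org", "example.com", "example.edu"]

combined = combine_rankings([list_a, list_b], list_size=3)
print(combined)
```

Domains that rank well in several input lists rise to the top, while a domain that appears in only one list contributes a single reciprocal-rank term. For the actual configuration options and scoring, consult combined_lists.generate_combined_list and shared.DEFAULT_TRANCO_CONFIG.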