cc.py
cc.py copied to clipboard
Extracting URLs of a specific target based on the results of "commoncrawl.org"
Results
3
cc.py issues
Sort by
recently updated
recently updated
newest added
I was hoping to use this project to look at some newer data. I assume I should just add the name of the indexes in the file 'index.txt'?
``` [!] Processing year: 2019 [-] CC-MAIN-2019-51 Traceback (most recent call last): File "/usr/local/lib/python3.7/dist-packages/urllib3/response.py", line 543, in _update_chunk_length self.chunk_left = int(line, 16) ValueError: invalid literal for int() with base 16:...