inetdata
Utilize Project Sonar API if possible
Overview
Project Sonar provides free API access to archives that are updated more frequently than those available without an API key. These changes keep the older archives usable without API access. Additionally, Project Sonar archive downloads now run in parallel via hydra.
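As a rough illustration of the parallel download idea, here is a minimal Python sketch. It is not the code in this change: it swaps in a standard-library thread pool for hydra, and `download_all` and the `fetch` callable are hypothetical names introduced only for this example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable


def download_all(
    urls: Iterable[str],
    fetch: Callable[[str], int],
    max_workers: int = 4,
) -> Dict[str, int]:
    """Run fetch(url) for every URL concurrently and return bytes per URL.

    `fetch` is a hypothetical callable that downloads one archive and
    returns the number of bytes written.
    """
    results: Dict[str, int] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its URL so results can be labelled.
        futures = {pool.submit(fetch, url): url for url in urls}
        # as_completed yields futures in the order they finish,
        # not the order they were submitted.
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results
```

With downloads running concurrently, completion order follows how long each download takes rather than submission order.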
Authenticated Download
Download Project Sonar archives with an API key.
The updated config uses the default value for sonar_api_base_url and a redacted value for sonar_api_key.
$ grep sonar ./conf/inetdata.json
"sonar_base_url": "https://opendata.rapid7.com",
"sonar_api_base_url": "https://us.api.insight.rapid7.com/opendata/studies",
"sonar_api_key": "[REDACTED]",
Run the download.
$ ./bin/download.sh -s sonar
...
2020-08-20 19:14:21 [download] Download initiated with sources: sonar
2020-08-20 19:14:28 [download] [sonar] > Downloading of /data/inetdata/data/cache/sonar/2020-08-03-1596466509-fdns_txt_mx_dmarc.json.gz completed with 22890208 bytes
2020-08-20 19:14:28 [download] [sonar] > Downloading of /data/inetdata/data/cache/sonar/2020-08-02-1596407508-fdns_txt_mx_mta-sts.json.gz completed with 9905564 bytes
2020-08-20 19:18:22 [download] [sonar] > Downloading of /data/inetdata/data/cache/sonar/2020-08-16-1597591348-fdns_cname.json.gz completed with 2523636183 bytes
2020-08-20 19:18:49 [download] [sonar] > Downloading of /data/inetdata/data/cache/sonar/2020-08-19-1597840086-fdns_mx.json.gz completed with 3534611752 bytes
2020-08-20 19:20:06 [download] [sonar] > Downloading of /data/inetdata/data/cache/sonar/2020-08-17-1597704652-fdns_txt.json.gz completed with 5063172990 bytes
2020-08-20 19:20:39 [download] [sonar] > Downloading of /data/inetdata/data/cache/sonar/2020-08-18-1597751163-fdns_aaaa.json.gz completed with 4357061904 bytes
2020-08-20 19:26:33 [download] [sonar] > Downloading of /data/inetdata/data/cache/sonar/2020-08-18-1597786965-rdns.json.gz completed with 12512303461 bytes
2020-08-20 19:31:45 [download] [sonar] > Downloading of /data/inetdata/data/cache/sonar/2020-08-14-1597363608-fdns_a.json.gz completed with 24259215447 bytes
2020-08-20 19:44:01 [download] [sonar] > Downloading of /data/inetdata/data/cache/sonar/2020-08-15-1597533042-fdns_any.json.gz completed with 19534742886 bytes
2020-08-20 19:44:01 [download] Download completed with sources: sonar
Normalize.
$ ./bin/normalize.sh -s sonar
...
2020-08-20 20:02:41 [normalize] Normalize initiated with sources: sonar
2020-08-20 20:02:41 [normalize] [sonar] Running nice pigz -dc /data/inetdata/data/cache/sonar/2020-08-02-1596407508-fdns_txt_mx_mta-sts.json.gz | nice inetdata-sonardnsv2-split -t /home/...
...
[*] [inetdata-dns2mtbl] Read 53378837 and wrote 53378837 records in 521 seconds (102302/s in, 102302/s out) (merged: 157, invalid: 0)
[*] [inetdata-dns2mtbl] Read 53378837 and wrote 53378837 records in 522 seconds (102106/s in, 102106/s out) (merged: 157, invalid: 0)
2020-08-21 16:41:20 [normalize] Normalize completed with sources: sonar
Unauthenticated Download
Download Project Sonar archives without an API key.
The stock config includes the new entries sonar_api_base_url and sonar_api_key with their default values.
$ grep sonar ./conf/inetdata.json
"sonar_base_url": "https://opendata.rapid7.com",
"sonar_api_base_url": "https://us.api.insight.rapid7.com/opendata/studies",
"sonar_api_key": "",
Run the download.
$ ./bin/download.sh -s sonar
...
2020-08-21 17:24:19 [download] Download initiated with sources: sonar
2020-08-21 17:24:22 [download] [sonar] > Downloading of /data/inetdata/data/cache/sonar/2020-07-01-1593563501-fdns_txt_mx_dmarc.json.gz completed with 21769748 bytes
2020-08-21 17:24:23 [download] [sonar] > Downloading of /data/inetdata/data/cache/sonar/2020-07-01-1593561899-fdns_txt_mx_mta-sts.json.gz completed with 9430510 bytes
2020-08-21 17:28:12 [download] [sonar] > Downloading of /data/inetdata/data/cache/sonar/2020-07-25-1595720464-fdns_cname.json.gz completed with 2485498593 bytes
2020-08-21 17:29:37 [download] [sonar] > Downloading of /data/inetdata/data/cache/sonar/2020-07-25-1595638339-fdns_mx.json.gz completed with 3559892466 bytes
2020-08-21 17:30:08 [download] [sonar] > Downloading of /data/inetdata/data/cache/sonar/2020-07-24-1595549100-fdns_aaaa.json.gz completed with 4474523289 bytes
2020-08-21 17:30:22 [download] [sonar] > Downloading of /data/inetdata/data/cache/sonar/2020-07-26-1595801134-fdns_txt.json.gz completed with 5091576038 bytes
2020-08-21 17:34:26 [download] [sonar] > Downloading of /data/inetdata/data/cache/sonar/2020-07-29-1596003565-rdns.json.gz completed with 12323913408 bytes
2020-08-21 17:39:41 [download] [sonar] > Downloading of /data/inetdata/data/cache/sonar/2020-07-25-1595720498-fdns_a.json.gz completed with 24537931344 bytes
2020-08-21 17:54:20 [download] [sonar] > Downloading of /data/inetdata/data/cache/sonar/2020-07-24-1595549209-fdns_any.json.gz completed with 35350204643 bytes
2020-08-21 17:54:20 [download] Download completed with sources: sonar
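The difference in freshness between the two runs can be read off the archive file names. The sketch below assumes the names follow a `YYYY-MM-DD-<epoch>-<study>.json.gz` layout, which is inferred from the logs rather than from any documented format; `archive_age_days` is a hypothetical helper.

```python
import re
from datetime import date

# Assumed layout: YYYY-MM-DD-<epoch>-<study>.json.gz (inferred from the logs).
NAME = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-(\d+)-(.+)\.json\.gz$")


def archive_age_days(path, downloaded_on):
    """Days between the date in the archive file name and the download date."""
    filename = path.rsplit("/", 1)[-1]
    match = NAME.match(filename)
    if match is None:
        raise ValueError("unexpected archive name: " + filename)
    year, month, day = (int(match.group(i)) for i in (1, 2, 3))
    return (downloaded_on - date(year, month, day)).days


# fdns_a archive from the authenticated run, downloaded on 2020-08-20.
with_key = archive_age_days(
    "/data/inetdata/data/cache/sonar/2020-08-14-1597363608-fdns_a.json.gz",
    date(2020, 8, 20),
)
# fdns_a archive from the unauthenticated run, downloaded on 2020-08-21.
without_key = archive_age_days(
    "/data/inetdata/data/cache/sonar/2020-07-25-1595720498-fdns_a.json.gz",
    date(2020, 8, 21),
)
print(with_key, without_key)  # 6 27
```

For the fdns_a study in these two runs, the archive fetched with an API key is 6 days old at download time, while the one fetched without a key is 27 days old.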