keczuppp

Results 12 comments of keczuppp

There exist tools for it already:
> ZeroDot1 : Add a URL/Domain extractor.
- https://www.google.pl/search?q=domain+extractor
> ZeroDot1 : Add a search function
- that falls within the scope of regular expressions: https://regexr.com/...

> ZeroDot1 : Most of the online tools are simply unusable because they do not support the extraction of entire domains with subdomains and long TLDs such as .stream. -...
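The requirement quoted above (whole domains, including subdomains and long TLDs such as .stream) can be approximated with a regular expression. Below is a minimal Python sketch, not anything taken from PyFunceble or the tools linked in this thread: the names `DOMAIN_RE` and `extract_domains` are hypothetical, and the pattern only checks the shape of a hostname, so file names such as `result.zip` would still come through as false positives.

```python
import re

# Hypothetical pattern (an assumption, not PyFunceble's own logic):
# one or more dot-terminated labels followed by an alphabetic TLD of
# 2 to 63 letters, so long TLDs such as .stream are accepted.
DOMAIN_RE = re.compile(
    r"(?<![A-Za-z0-9-])"                                     # do not start mid-label
    r"(?:[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?\.)+"  # labels, incl. subdomains
    r"[A-Za-z]{2,63}"                                        # TLD
    r"(?![A-Za-z0-9-])"                                      # do not stop mid-label
)


def extract_domains(text):
    """Return unique lowercase domain candidates in order of first appearance."""
    seen = {}
    for match in DOMAIN_RE.finditer(text):
        seen.setdefault(match.group(0).lower(), None)
    return list(seen)
```

For example, `extract_domains("||ads.cdn.example.stream^")` yields `["ads.cdn.example.stream"]`. Whether a candidate is a real, registered domain would still have to be checked separately.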

> ZeroDot1 : Sorry, no the tool does not work.
It does work very well, but it extracts domains only and not URLs. I just missed that you wanted to extract...

> ZeroDot1 : 2GB text files
Wow. Are you sure you don't want to split the file into smaller chunks?:
https://stackoverflow.com/questions/18208524/how-do-i-read-a-text-file-of-about-2-gb
https://stackoverflow.com/questions/159521/text-editor-to-open-big-giant-huge-large-text-files
Also this is a good tool I use...
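Splitting a file of that size does not require loading it into memory. Below is a minimal Python sketch that reads the source one line at a time and writes numbered chunk files. The function name `split_text_file`, the `chunk_00000.txt` naming scheme, and the UTF-8 assumption are all illustrative choices, not something taken from the linked answers.

```python
import os


def split_text_file(path, lines_per_chunk, out_dir):
    """Split a text file into numbered chunk files, reading one line at a time."""
    os.makedirs(out_dir, exist_ok=True)
    chunk_paths = []
    out = None
    # Assumption: the input is UTF-8; undecodable bytes are replaced, not fatal.
    with open(path, "r", encoding="utf-8", errors="replace") as src:
        for index, line in enumerate(src):
            # Start a new chunk file every lines_per_chunk lines.
            if index % lines_per_chunk == 0:
                if out is not None:
                    out.close()
                chunk_path = os.path.join(
                    out_dir, f"chunk_{len(chunk_paths):05d}.txt"
                )
                out = open(chunk_path, "w", encoding="utf-8")
                chunk_paths.append(chunk_path)
            out.write(line)
    if out is not None:
        out.close()
    return chunk_paths
```

Because only one line is held in memory at a time, the same loop works for a 2 GB input as for a small one. The chunk size is a trade-off between the number of files and how comfortably each one opens in an editor.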

Yeah, I'm curious as well how he ended up with a 2GB text file...
- is it some log that accumulated over time (weeks/months)
- or is it...

**As for domain extraction:** As I said in https://github.com/funilrys/PyFunceble/issues/13#issuecomment-797590637 : extracting domains in Adblock Decoder's "Decode everything" mode will give too many useless false positives, which will...

Try this offline tool: https://www.softpedia.com/get/Office-tools/Other-Office-Tools/Web-Link-Extractor-Linas.shtml. I've tested it on easylist and:
- the result contains some garbage: [result.zip](https://github.com/funilrys/PyFunceble/files/6291798/result.zip)
- you can test your 2 GB file with it to see whether...

And what about Wine? > ZeroDot1 : I think the best solution would be to use PyFunceble to extract domains and subdomains from completely mixed text. It can be done,...

> spirillen : Sounds like an integration of [BeautifulSoup](https://pypi.org/project/beautifulsoup4/) could come in handy!!!
I saw it before, but I didn't mention it because it supports only HTML or XML,...

> spirillen : https://github.com/funilrys/PyFunceble/issues/255#issuecomment-933797906
I don't know much about these things, I would have to study all that stuff first
> funilrys : https://github.com/funilrys/PyFunceble/issues/255#issuecomment-941221455: text in German (IP-based). - in...