Add functionality to extract JS strings as links in a javascript blob

Open Sh4d0wHunt3rX opened this issue 2 years ago • 12 comments

I couldn't get bbot to output JS strings as links so that I can grep for them.

My command: bbot -t trickest.com -m httpx -c web_spider_distance=2 web_spider_depth=3 web_spider_links_per_page=1000 omit_event_types=[] url_extension_httpx_only=[]

image

image

🙏

Sh4d0wHunt3rX avatar Feb 23 '24 19:02 Sh4d0wHunt3rX

@liquidsec what do you think about this? We would essentially be implementing a JS link extractor.
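To make the idea concrete, here is a minimal sketch of what a JS link extractor could look like. It is not bbot's implementation: the regex, the function name `extract_js_links`, and the sample blob are all assumptions chosen for illustration. It looks for quoted strings that resemble absolute URLs or relative paths and resolves them against the page URL.

```python
import re
from urllib.parse import urljoin

# Hypothetical pattern (not bbot's actual regex): a quoted string that starts
# with http(s)://, "/", "./" or "../" and contains no quotes or whitespace.
JS_LINK_RE = re.compile(
    r"""["'`]((?:https?://|/|\.{1,2}/)[^"'`\s<>]+)["'`]"""
)


def extract_js_links(js_blob: str, base_url: str) -> list[str]:
    """Return de-duplicated absolute URLs for link-like strings in js_blob."""
    links: list[str] = []
    for match in JS_LINK_RE.finditer(js_blob):
        # Resolve relative paths against the URL the blob was served from.
        url = urljoin(base_url, match.group(1))
        if url not in links:
            links.append(url)
    return links


# Made-up blob for illustration only.
blob = 'n.src="/_next/static/chunks/webpack.js";fetch("https://example.com/api/v1")'
links = extract_js_links(blob, "https://react.dev/")
print(links)
```

A regex like this will produce false positives (for example, strings that merely start with a slash), so a real extractor would likely need extra filtering.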

TheTechromancer avatar Feb 24 '24 14:02 TheTechromancer

This is my command:

bbot -t react.dev -m httpx -c web_spider_distance=3 web_spider_depth=3 web_spider_links_per_page=500 omit_event_types=[]

And bbot doesn't detect any of these JS files as links:

image

For example, this link does not exist in the output file: https://react.dev/_next/static/chunks/webpack-8af07453075e2970.js

Sh4d0wHunt3rX avatar Feb 26 '24 16:02 Sh4d0wHunt3rX

Added support for extracting URLs from <link> elements: https://github.com/blacklanternsecurity/bbot/pull/1132.

TheTechromancer avatar Feb 26 '24 23:02 TheTechromancer

I'm adding some more examples here for future testing. I guess all of them are related to JS blobs.

openai.com image

shopify.com image

atlassian.com image

whatsapp.com image

ahrefs.com image

clickup.com image

Sh4d0wHunt3rX avatar Feb 27 '24 10:02 Sh4d0wHunt3rX

@amiremami thanks for testing. Did bbot fail to extract these? It always finds full URLs regardless of whether they're embedded in js blobs, so it definitely should have gotten the atlassian one.

TheTechromancer avatar Feb 27 '24 12:02 TheTechromancer

bbot -t https://www.atlassian.com/software -m httpx -c web_spider_distance=2 web_spider_depth=2 web_spider_links_per_page=500 omit_event_types=[]

image

It appears like this dozens of times in the output file, but never as "url": "https://atl-global.atlassian.com/js/atl-global.min.js"

Sh4d0wHunt3rX avatar Feb 27 '24 13:02 Sh4d0wHunt3rX

bbot -t https://www.atlassian.com/software -m httpx -c web_spider_distance=2 web_spider_depth=2 web_spider_links_per_page=500 omit_event_types=[]

I think you're forgetting a config option ;)

image

(The reason this config option exists is because most everyone wants to search javascript files for secrets etc., but if it didn't contain anything interesting, they usually don't want to see it in the output.)

TheTechromancer avatar Feb 27 '24 15:02 TheTechromancer

Thanks 🙏 I also used that config, but I still get the same result :(

Sh4d0wHunt3rX avatar Feb 27 '24 18:02 Sh4d0wHunt3rX

Regarding my react.dev example above: I just upgraded bbot to v1.1.7.2998rc, and this JS file only shows up as URL_UNVERIFIED. Shouldn't it show up as URL too?

https://react.dev/_next/static/chunks/webpack-a1ff329830897a9a.js

My command: bbot -t react.dev -m httpx -c web_spider_distance=2 web_spider_depth=2 web_spider_links_per_page=500 omit_event_types=[] url_extension_httpx_only=[]

image

Sh4d0wHunt3rX avatar Feb 27 '24 18:02 Sh4d0wHunt3rX

@amiremami that specific file is 4 levels deep. The reason it's not showing up is because the spider is set to a depth of 2 (web_spider_depth=2).

If you enable --debug, it will tell you the reason:

2024-02-27 17:00:10,924 [DEBUG] bbot.modules.internal.excavate base.py:1175 Tagging URL_UNVERIFIED("https://react.dev/_next/static/chunks/webpack-ccf89d5e32b01f59.js", module=excavate, tags={'in-scope', 'extension-js', 'endpoint'}) as spider-danger because its spider depth or distance exceeds the scan's limits
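As a rough illustration of the depth check described above, the sketch below counts the path segments of a URL and compares the count to `web_spider_depth`. The segment-counting rule and the helper names `path_depth` and `exceeds_spider_depth` are assumptions made for this example; bbot's internal calculation may differ.

```python
from urllib.parse import urlparse


def path_depth(url: str) -> int:
    """Count non-empty path segments (an assumed definition of "levels deep")."""
    return len([segment for segment in urlparse(url).path.split("/") if segment])


def exceeds_spider_depth(url: str, web_spider_depth: int) -> bool:
    """True if the URL sits deeper than the configured limit."""
    return path_depth(url) > web_spider_depth


url = "https://react.dev/_next/static/chunks/webpack-a1ff329830897a9a.js"
print(path_depth(url))               # 4 segments: _next, static, chunks, webpack-...js
print(exceeds_spider_depth(url, 2))  # True with web_spider_depth=2
```

Under this assumption, raising `web_spider_depth` to 4 or more would bring the file within the limit.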

TheTechromancer avatar Feb 27 '24 22:02 TheTechromancer

I still couldn't get the atlassian one, neither as URL nor as URL_UNVERIFIED. If this problem is different from the JS blob issue, please check. Thanks a lot 🙏

Got this today image

Sh4d0wHunt3rX avatar Feb 29 '24 13:02 Sh4d0wHunt3rX

@amiremami keep in mind that https://atl-global.atlassian.com/js/atl-global.min.js is on a different subdomain than www.atlassian.com, so it's not in scope. If you want to see it you will need to either:

  1. increase your scope report distance to see the URL_UNVERIFIED (-c scope_report_distance=1)
  2. whitelist all of atlassian.com to also produce a URL (-w atlassian.com)
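The scope rule behind this can be sketched as a simple host check. This is only an illustration under the assumption that a host is in scope when it equals an allowed entry or is a subdomain of one; `host_in_scope` is a hypothetical helper, not a bbot function.

```python
def host_in_scope(host: str, allowed: list[str]) -> bool:
    """True if host equals an allowed entry or is a subdomain of one."""
    host = host.lower().rstrip(".")
    for entry in allowed:
        entry = entry.lower().rstrip(".")
        if host == entry or host.endswith("." + entry):
            return True
    return False


js_host = "atl-global.atlassian.com"

# Target is only www.atlassian.com: the JS host is a sibling, so out of scope.
print(host_in_scope(js_host, ["www.atlassian.com"]))  # False

# Whitelisting the parent domain covers every subdomain.
print(host_in_scope(js_host, ["atlassian.com"]))      # True
```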

image

TheTechromancer avatar Feb 29 '24 15:02 TheTechromancer