How can we create this ./mbasicHeaders.json file?

Open Sarun1001 opened this issue 2 years ago • 9 comments

Can you please provide a sample Python script that works? I tested the following code and it doesn't return anything:

from facebook_scraper import get_posts, _scraper
import json

for post in get_posts('nintendo', base_url="https://mbasic.facebook.com", start_url="https://mbasic.facebook.com/nintendo?v=timeline", pages=3, cookies='trollmcookies.txt'):
    try:
        print(post['text'][:20])
        print(post)
    except Exception as e:
        print(f"Error processing post: {e}")

Sarun1001 • Dec 26 '23 10:12

@Sarun1001 The mbasic headers file is a JSON file that you build from the browser's developer tools. The steps are in the README, but here they are in more detail:

  1. Open an mbasic Facebook page in your browser.
  2. Open the developer tools (Option+Command+I) and go to the Network tab.
  3. In the same dev tools panel, turn on responsive mode and choose Samsung Galaxy S20 Ultra as your device.
  4. Refresh the page you opened in step 1 (Command+R).
  5. Filter the request list by "Doc", right-click the first page loaded, then click Copy -> Copy as cURL.
  6. Go to https://curlconverter.com/ and paste the command. From there you can choose Python as the output and get the headers in JSON format.
  7. Copy the headers into a file in your repo; this is the file you then inject into your scraper instance.
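
Steps 6 and 7 can be sketched roughly as follows, assuming `headers` is the dictionary curlconverter generated for you. The file name `mbasicHeaders.json` comes from the issue title; the `save_headers` and `load_headers` helpers and the placeholder header values are hypothetical and not part of facebook-scraper.

```python
import json
from pathlib import Path


def save_headers(headers: dict, path: str) -> None:
    """Write the headers copied from curlconverter to a JSON file."""
    Path(path).write_text(json.dumps(headers, indent=2), encoding="utf-8")


def load_headers(path: str) -> dict:
    """Read the headers file back as a plain dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Placeholder values (assumption); paste your own headers from curlconverter here.
headers = {
    "accept": "text/html,application/xhtml+xml",
    "user-agent": "Mozilla/5.0 (Linux; Android 13) Mobile",
}

save_headers(headers, "mbasicHeaders.json")
loaded = load_headers("mbasicHeaders.json")
print(sorted(loaded))
```

Elsewhere in this thread the loaded dict is assigned to `_scraper.mbasic_headers` before calling `get_posts`.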

moda20 • Dec 26 '23 15:12

Thanks for sharing the steps.

I successfully scraped posts, but after a few minutes I got a temporary block:

  File "/home/ubuntu/.local/lib/python3.8/site-packages/facebook_scraper/facebook_scraper.py", line 944, in get
    raise exceptions.TemporarilyBanned(title.text)
facebook_scraper.exceptions.TemporarilyBanned: You’re Temporarily Blocked

Is there any suggestion for avoiding this ban? Also, what now: do I need a new IP or a new FB account to continue using this? NB: before getting banned, I ran the script without a cookie file.

Sarun1001 • Dec 28 '23 01:12

@Sarun1001, do you have an example of how your headers turned out? It doesn't work for me even though I created the JSON file correctly.

Jowawis99 • Jan 01 '24 20:01

@Sarun1001 I can't really help you there; just don't use it a lot in rapid succession.
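
One way to avoid hitting the site in rapid succession is to pause between items. The sketch below is a generic wrapper, not a facebook-scraper feature: `throttled` and its delay values are assumptions, and nothing in this thread confirms which delays (if any) avoid the temporary block.

```python
import random
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def throttled(
    items: Iterable[T],
    min_delay: float = 5.0,
    max_delay: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield items one by one, pausing a random interval after each one."""
    for item in items:
        yield item
        # Randomized pause so requests are not evenly spaced.
        sleep(random.uniform(min_delay, max_delay))


# Demo with a stand-in list and tiny delays so it finishes quickly.
for post in throttled(["post 1", "post 2"], min_delay=0.01, max_delay=0.02):
    print(post)
```

With the scraper, you would wrap the `get_posts(...)` generator instead of the list. Because that generator fetches pages lazily, slowing the loop should also space out the page requests.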

moda20 • Jan 01 '24 20:01

@moda20 There are several headers when copying as cURL, and I don't know which ones you use. Can you help me, please?

Jowawis99 • Jan 01 '24 20:01

@Jowawis99 Reload the page at step 5 --> select the first item in the 'Name' column --> right-click --> Copy --> Copy as cURL --> paste it into curlconverter.com --> select JSON --> there you can see an object called header.
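
If you want to see which headers the copied command actually carries, a minimal sketch like the one below pulls every `-H 'name: value'` pair out of it. It assumes the bash flavor of "Copy as cURL"; `headers_from_curl` and the sample command are hypothetical, and curlconverter handles many more cases than this does.

```python
import shlex


def headers_from_curl(command: str) -> dict:
    """Collect every -H/--header 'name: value' pair from a cURL command."""
    # Join lines that end with a backslash continuation before tokenizing.
    tokens = shlex.split(command.replace("\\\n", " "))
    headers = {}
    for flag, value in zip(tokens, tokens[1:]):
        if flag in ("-H", "--header") and ":" in value:
            name, _, content = value.partition(":")
            headers[name.strip().lower()] = content.strip()
    return headers


# Made-up command in the shape the browser produces.
curl_command = (
    "curl 'https://mbasic.facebook.com/' \\\n"
    "  -H 'accept: text/html' \\\n"
    "  -H 'user-agent: ExampleBrowser/1.0'"
)
print(headers_from_curl(curl_command))
```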

Sarun1001 • Jan 03 '24 15:01

I did everything above, but it returns nothing. My header looks like this:

{
'authority': 'mbasic.facebook.com',
 'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
 'accept-language': 'zh-TW,zh;q=0.9,en-US;q=0.8,en;q=0.7',
 'cache-control': 'max-age=0',
 'cookie': 'sb=rbAJZVCvEh8Q_09qlyWHTUMS; datr=rrAJZV9gFAn919FtNTDrVruN; c_user=100002607319714; ps_n=0; ps_l=0; dpr=1.5; wd=1280x559; presence=C%7B%22t3%22%3A%5B%5D%2C%22utc3%22%3A1709127707853%2C%22v%22%3A1%7D; xs=116%3ANBx7p68qbfFI0Q%3A2%3A1700836402%3A-1%3A11322%3A%3AAcXWxB1htPS7UjKQ6jWw7RJDJtOSzlOdKRXDk9Uy0tA; fr=1u3FkUasnpLa34xOQ.AWUs6AgQrE7dzqtOlIHC6gUhzlo.Bl30HX..AAA.0.0.Bl30HX.AWXvjSaIynQ; m_page_voice=100002607319714',
 'dpr': '1.5',
 'sec-ch-prefers-color-scheme': 'light',
 'sec-ch-ua': '"Not A(Brand";v="99", "Google Chrome";v="121", "Chromium";v="121"',
 'sec-ch-ua-full-version-list': '"Not A(Brand";v="99.0.0.0", "Google Chrome";v="121.0.6167.185", "Chromium";v="121.0.6167.185"',
 'sec-ch-ua-mobile': '?1',
 'sec-ch-ua-model': '"SM-G981B"',
 'sec-ch-ua-platform': '"Android"',
 'sec-ch-ua-platform-version': '"13"',
 'sec-fetch-dest': 'document',
 'sec-fetch-mode': 'navigate',
 'sec-fetch-site': 'none',
 'sec-fetch-user': '?1',
 'upgrade-insecure-requests': '1',
 'user-agent': 'Mozilla/5.0 (Linux; Android 13; SM-G981B) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Mobile Safari/537.36',
 'viewport-width': '210'
}

and the code is:

import json
from facebook_scraper import get_posts, _scraper

# Load the saved headers and attach them to the scraper instance
with open('./nintendo.json', 'r') as file:
    _scraper.mbasic_headers = json.load(file)

for post in get_posts('NintendoAmerica', base_url="https://mbasic.facebook.com", start_url="https://mbasic.facebook.com/NintendoAmerica?v=timeline", pages=10):
    print(post['text'][:50])

Could someone please tell me what's wrong?

asheseux16 • Feb 29 '24 08:02
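
One thing worth checking, assuming the file was saved exactly as pasted above: the snippet uses single quotes, which makes it a Python dict literal rather than valid JSON, so `json.load` would raise `json.JSONDecodeError` on it. This may not be the cause of the empty result. A minimal standard-library sketch of the conversion, with a hypothetical shortened sample:

```python
import ast
import json

# Shortened sample in the same single-quoted style as the pasted headers.
python_style = "{'accept': 'text/html', 'sec-ch-ua-mobile': '?1'}"

try:
    json.loads(python_style)
    is_valid_json = True
except json.JSONDecodeError:
    # JSON requires double-quoted keys and strings.
    is_valid_json = False

# ast.literal_eval parses Python literals without executing arbitrary code.
headers = ast.literal_eval(python_style)
json_text = json.dumps(headers, indent=2)

print(is_valid_json)
print(json_text)
```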

@asheseux16 The mbasic headers don't affect the response, only the quality of the images. Please open another issue with your error. For a start, though, try enabling logging to see the library's responses:

import logging
logging.basicConfig(level=logging.DEBUG)

moda20 • Feb 29 '24 14:02

@asheseux16 It is a very bad idea to post cookies on the Internet; you should change your Facebook password now.
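
Before sharing headers in an issue, the session-bearing values can be blanked out first. The sketch below is one way to do that; `redact_headers`, the `SENSITIVE` set, and the sample values are hypothetical, and you may want to extend the set for your own case.

```python
# Header names whose values can identify or authenticate a session (assumed list).
SENSITIVE = {"cookie", "authorization"}


def redact_headers(headers: dict) -> dict:
    """Return a copy with the values of sensitive headers replaced."""
    return {
        name: "<redacted>" if name.lower() in SENSITIVE else value
        for name, value in headers.items()
    }


# Made-up values for illustration only.
example = {"accept": "text/html", "Cookie": "c_user=123; xs=secret"}
print(redact_headers(example))
```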

Drzhivago264 • Apr 12 '24 02:04