linkedin_scraper

Scraping Company employees list only returns Null

Open yoweeking opened this issue 3 years ago • 8 comments

I'm trying to scrape companies, and much of the time I get null entries in the employees list. Here is the code I am running:

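from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from linkedin_scraper import Company
chrome_options = Options()
chrome_options.add_argument("--headless")
# Pass the options to the driver; otherwise --headless is never applied
driver = webdriver.Chrome("../chromedriver.exe", options=chrome_options)
company = Company("https://www.linkedin.com/company/genesis-trading/", get_employees=True, driver=driver)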

and the response looks like this:

{
  "name": "Genesis",
  "about_us": "Genesis facilitates billions in digital currency trades, loans and transactions on a monthly basis.  Our team combines decades of experience at top Wall Street investment banks with a deep understanding of cryptocurrency markets.  Our platform provides a single point of access for digital asset trading, derivatives, borrowing, lending, custody and prime brokerage services.",
  "specialties": null,
  "website": "http://www.genesistrading.com",
  "industry": "Financial Services",
  "company_type": "Genesis",
  "headquarters": "New York, NY",
  "company_size": "51-200 employees",
  "founded": "2013",
  "affiliated_companies": [

  ],
  "employees": [
    null,
    null
  ],
  "headcount": null
}

It works for some companies but fails for many others. Do you know what the root cause might be?

yoweeking avatar Nov 19 '22 14:11 yoweeking

The CSS selector for the employee container is sometimes too generic, so it grabs the wrong element. Going down the Fortune 500 list, the library as-is couldn't get employees for the first company on the list, Walmart. I had to rewrite the selector to target the appropriate element, and also load as many DOM elements as I could from the infinite-scrolling list to get a good sample size.
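
A minimal sketch of the "load more from the infinite-scrolling list" idea, for illustration only. This is not the code from the fork: the helper name `scroll_until_stable` and its parameters are hypothetical, and it assumes `driver` is a Selenium WebDriver already on the company's people page. It keeps scrolling to the bottom until the page height stops growing or a round limit is hit.

```python
import time


def scroll_until_stable(driver, max_rounds=20, pause=1.0):
    """Scroll to the bottom repeatedly until the page height stops growing.

    `driver` is assumed to be a Selenium WebDriver (anything exposing
    `execute_script` works). Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        rounds += 1
        time.sleep(pause)  # give lazily loaded entries time to render
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded, so stop early
        last_height = new_height
    return rounds
```

After the call returns, the employee elements would still need to be collected with a selector specific enough to match only the people cards. The `max_rounds` cap matters on very large companies, where the list may never stop growing within a reasonable time.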

cygnus-dawn avatar Nov 24 '22 17:11 cygnus-dawn

Thanks @DA-Mena - do you have an example of how you managed to get it to work please?

yoweeking avatar Nov 24 '22 17:11 yoweeking

Hey @yoweeking and @DA-Mena, thanks for sharing this! Did either of you find a way around this problem?

tommysteryy avatar May 01 '23 22:05 tommysteryy

Sorry, I've been busy for a while. I'll try to post what I had. It's provided as is; it worked to some extent for me, depending on how large the employee list is. I'll post my edit here when I get back from my trip.

cygnus-dawn avatar May 01 '23 22:05 cygnus-dawn

I'll try to "parameterize" what I did and open a PR when I think it's good enough.

cygnus-dawn avatar May 01 '23 22:05 cygnus-dawn

Thanks @DA-Mena! Hope you're enjoying your trip - any luck with this?

tommysteryy avatar May 17 '23 23:05 tommysteryy

@tommysteryy please see if this works for you https://github.com/DA-Mena/linkedin_scraper/tree/fortuneListRunThroughFixes

Version 2.11.1 looks like it already attempts this; all I did was add some minor edits where it was failing for me on big company pages. It won't get you all employees, since that would be a lengthy process and depends on whether LinkedIn's frontend can handle all that paging while scraping. I tested a run on this new version and got 800+ employees from Walmart's 40k+ list (versus just 1 employee without my minor edits).

cygnus-dawn avatar May 18 '23 02:05 cygnus-dawn

To narrow the scope of employees, you could try entering a keyword in the search box:

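        # Requires: from selenium.webdriver.common.by import By
        #           from selenium.webdriver.common.keys import Keys
        search_box_id = "people-search-keywords"
        search_box = driver.find_element(By.ID, search_box_id)
        search_box.send_keys("project")  # keyword to filter employees by
        search_box.send_keys(Keys.RETURN)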

cygnus-dawn avatar May 18 '23 02:05 cygnus-dawn