I am trying to scrape data from LinkedIn's public profiles using Scrapy. However, every request returns a 999 response code. I am using RandomUserAgentMiddleware to randomize the User-Agent string.
The strange thing is that my IP does not appear to be blocked, since I can still open LinkedIn in my browser. Are there any specific fields I need to pass in the request headers?
I have tried setting 'Accept-Encoding': 'gzip, deflate' in the request headers, following a suggestion from another Stack Overflow question, but I still get the 999 response code.
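For reference, a minimal sketch of how that header might be set project-wide through Scrapy's `DEFAULT_REQUEST_HEADERS` setting. The `Accept` and `Accept-Language` values are copied from the logged headers further down. Placing this in `settings.py` is an assumption about the project layout, not necessarily how the header was set in the attempt described above.

```python
# settings.py (sketch): default headers applied to every request
# unless a request sets the same header itself.
DEFAULT_REQUEST_HEADERS = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.8",
    # The value suggested in the other question.
    "Accept-Encoding": "gzip, deflate",
}
```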
Edit:
If I manually set USER_AGENT in the settings file, it works, but if I set it through the RandomUserAgent middleware, it does not, even though the request headers are the same in both cases.
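To make the two setups concrete, here is a rough sketch of what a random user-agent downloader middleware typically looks like. This is an illustration only, not the actual RandomUserAgentMiddleware in use. The class name `SketchRandomUserAgentMiddleware` and the `USER_AGENTS` pool are hypothetical, and the single entry in the pool is the user agent from the logged headers.

```python
import random

# Hypothetical pool of user agents; a real project would list several.
USER_AGENTS = [
    "Mozilla/5.0 (Linux; Android 5.1.1; SM-G928X Build/LMY47X) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/47.0.2526.83 Mobile Safari/537.36",
]


class SketchRandomUserAgentMiddleware:
    """Illustrative downloader middleware (hypothetical, for comparison only)."""

    def __init__(self, user_agents=None):
        self.user_agents = list(user_agents or USER_AGENTS)

    def process_request(self, request, spider):
        # Overwrite the header so the randomly chosen value always wins,
        # even if an earlier middleware already set a User-Agent.
        request.headers["User-Agent"] = random.choice(self.user_agents)
        # Returning None tells Scrapy to keep processing the request.
        return None
```

The manual alternative is a single `USER_AGENT = "..."` line in `settings.py`, whereas a class like this one would be enabled through the `DOWNLOADER_MIDDLEWARES` setting.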
Request headers with the RandomUserAgent middleware:
{'Accept-Language': ['en-US,en;q=0.8'], 'Accept-Encoding': ['gzip, deflate, sdch, br'], 'Host': ['www.linkedin.com'], 'Accept': ['text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'], 'Upgrade-Insecure-Requests': ['1'], 'Connection': ['keep-alive'], 'User-Agent': ['Mozilla/5.0 (Linux; Android 5.1.1; SM-G928X Build/LMY47X) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.83 Mobile Safari/537.36']}
Request headers with a manually set user agent:
{'Accept-Language': ['en-US,en;q=0.8'], 'Accept-Encoding': ['gzip, deflate, sdch, br'], 'Host': ['www.linkedin.com'], 'Accept': ['text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'], 'Upgrade-Insecure-Requests': ['1'], 'Connection': ['keep-alive'], 'User-Agent': ['Mozilla/5.0 (Linux; Android 5.1.1; SM-G928X Build/LMY47X) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.83 Mobile Safari/537.36']}
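The claim that both header sets match can be checked mechanically instead of by eye. Below is a small sketch, assuming the two dumps are available as plain Python dicts. The helper name `diff_headers` is hypothetical, and only two of the logged headers are reproduced to keep the example short. A dict comparison ignores header order, so it only confirms that the names and values match.

```python
def diff_headers(a, b):
    """Return {name: (value_in_a, value_in_b)} for every header that differs."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


UA = ("Mozilla/5.0 (Linux; Android 5.1.1; SM-G928X Build/LMY47X) "
      "AppleWebKit/537.36 (KHTML, like Gecko) "
      "Chrome/47.0.2526.83 Mobile Safari/537.36")

# Subset of the headers logged in each case.
with_middleware = {"Host": ["www.linkedin.com"], "User-Agent": [UA]}
with_setting = {"Host": ["www.linkedin.com"], "User-Agent": [UA]}

print(diff_headers(with_middleware, with_setting))  # prints {}
```

An empty result means the logged headers are identical, so any difference between the two runs would have to come from something the dump does not show.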