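I am working through this project from the following link: https://medium.com/cognifeed/targeting-influencers-with-machine-learning-part-1-scraping-instagram-f3c5d875fcc9
As far as I understand, this part of the project only scrapes information based on what is listed in a .txt file. (I don't know much about machine learning or Python, but I want to learn as I go through the project.)
I am now trying to run the Python file from the Windows 10 cmd, but I get the errors shown in the image below. I crossed out the top part only because I mistyped the command a few times before I got it to run, so I don't think it has any bearing on the KeyErrors at the bottom.
Here is the traceback from the actual Python shell: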
Traceback (most recent call last):
  File "C:\Users\nzand\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\common\service.py", line 72, in start
    self.process = subprocess.Popen(cmd, env=self.env,
  File "C:\Users\nzand\AppData\Local\Programs\Python\Python38-32\lib\subprocess.py", line 854, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\nzand\AppData\Local\Programs\Python\Python38-32\lib\subprocess.py", line 1307, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\Users\nzand\Desktop\Nicholas Zachariah\Machine Learning\Projects\Instagram\SeaDeeper\cognifeed-instagram-scraper-master\main.py", line 55, in <module>
    main()
  File "C:\Users\nzand\Desktop\Nicholas Zachariah\Machine Learning\Projects\Instagram\SeaDeeper\cognifeed-instagram-scraper-master\main.py", line 43, in main
    scraper = InstagramScraper(args.chromedriver_path)
  File "C:\Users\nzand\Desktop\Nicholas Zachariah\Machine Learning\Projects\Instagram\SeaDeeper\cognifeed-instagram-scraper-master\scraper.py", line 17, in __init__
    self.browser = webdriver.Chrome(
  File "C:\Users\nzand\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\chrome\webdriver.py", line 73, in __init__
    self.service.start()
  File "C:\Users\nzand\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\common\service.py", line 81, in start
    raise WebDriverException(
selenium.common.exceptions.WebDriverException: Message: 'chromedriver.exe' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home
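The FileNotFoundError at the top of the traceback suggests that the path handed to webdriver.Chrome does not point to an existing file. A minimal diagnostic sketch for checking this, using only the standard library, might look like the following. The function name find_chromedriver is hypothetical (it is not part of the project or of Selenium), and the default path is the one set in main.py.

```python
import shutil
from pathlib import Path


def find_chromedriver(path_str):
    """Return a usable chromedriver path, or None if nothing is found.

    Hypothetical helper for diagnosis only; not part of the project or Selenium.
    """
    candidate = Path(path_str)
    # An explicit path only works if a file really exists at that location.
    if candidate.is_file():
        return str(candidate)
    # Otherwise fall back to searching the directories listed in PATH.
    return shutil.which("chromedriver")


if __name__ == "__main__":
    default_path = "C:/Chromedriver/chromedriver.exe"  # default from main.py
    found = find_chromedriver(default_path)
    if found is None:
        print("chromedriver not found at {} or on PATH".format(default_path))
    else:
        print("chromedriver found at {}".format(found))
```

If this reports that nothing was found, the binary is probably either not downloaded or stored somewhere other than the path the script is given.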
The code in main.py:
from scraper import InstagramScraper
import pandas as pd
import argparse

parser = argparse.ArgumentParser(
    description='Scrapes instagram posts from given users.')
parser.add_argument(
    '--username_list',
    dest='username_list_path',
    default='./lists/influencers.txt',
    help='Path to the list of instagram accounts that you want scraped.')
parser.add_argument(
    '--chromedriver_path',
    dest='chromedriver_path',
    default='C:/Chromedriver/chromedriver.exe',
    help=
    'Path to Chromedriver binary. See http://chromedriver.chromium.org/getting-started'
)
parser.add_argument(
    '--out_file',
    dest='out_file',
    default='./dataset.csv',
    help=
    'The file in which the scraped info will be stored. You should name it \'filename.csv\''
)


def load_influencer_list(path):
    with open(path, 'r') as in_file:
        influencers = in_file.read().splitlines()
    return influencers


def save_data(posts_info, save_path):
    df = pd.DataFrame.from_dict(posts_info, orient='index')
    df.set_index('shortcode', inplace=True)
    df.to_csv(save_path, sep=',', encoding='utf-8')


def main():
    args = parser.parse_args()
    scraper = InstagramScraper(args.chromedriver_path)
    influencers = load_influencer_list(args.username_list_path)
    all_data = {}
    for i, influencer in enumerate(influencers):
        print("{}/{}: Getting {}'s posts...".format(i + 1, len(influencers),
                                                    influencer))
        links = scraper.get_posts_from_user(influencer, 10)
        all_data.update(scraper.get_data(links))
    save_data(all_data, args.out_file)


if __name__ == "__main__":
    main()
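For context on how the chromedriver path reaches the scraper: with the argparse setup above, the default value is used unless --chromedriver_path is passed on the command line. A minimal standalone sketch of that behavior, reproducing only that one argument, could look like this. The D:/tools/... path is a made-up example, not a real location.

```python
import argparse

parser = argparse.ArgumentParser(
    description="Demo of the --chromedriver_path flag.")
parser.add_argument(
    "--chromedriver_path",
    dest="chromedriver_path",
    default="C:/Chromedriver/chromedriver.exe",
    help="Path to Chromedriver binary.")

# No flag given: argparse falls back to the default value.
args = parser.parse_args([])
print(args.chromedriver_path)

# Flag given: the command-line value replaces the default.
# The path below is a hypothetical example.
args = parser.parse_args(["--chromedriver_path", "D:/tools/chromedriver.exe"])
print(args.chromedriver_path)
```

On the command line this would correspond to something like `python main.py --chromedriver_path D:/tools/chromedriver.exe`, again with the path replaced by wherever the binary actually lives.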
I am hoping someone is familiar with this project and can help me understand why I am getting these errors.