Question

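This script visits each URL listed in a CSV file and scrolls to the bottom of the page. I am trying to collect the company URLs from each page, but I can't get it to work. If I use a single hard-coded URL instead of reading it from the CSV, the results print to PowerShell, but I still can't get them written to a CSV.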

Here are a couple of the URLs I am using:

https://www.facebook.com/search/pages/?q=Los%20Angeles%20remodeling
https://www.facebook.com/search/pages/?q=Boston%20remodeling

I thought it might need to be a loop inside a loop, or perhaps an if/elif. I'm not sure at this point. Any and all suggestions would be greatly appreciated.

import csv
import time

from selenium import webdriver
from selenium.webdriver.common.by import By


# Block the notification prompt so it does not cover the page.
chrome_options = webdriver.ChromeOptions()
prefs = {"profile.default_content_setting_values.notifications": 2}
chrome_options.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome(options=chrome_options)


# Log in once before visiting any search page.
driver.get('https://www.facebook.com')
username = driver.find_element(By.ID, "email")
password = driver.find_element(By.ID, "pass")
username.send_keys("*****")
password.send_keys("******")
driver.find_element(By.ID, 'loginbutton').click()
time.sleep(2)


# Scroll to the bottom and report the resulting page height.
scroll = "window.scrollTo(0, document.body.scrollHeight); return document.body.scrollHeight;"

with open('fb_urls.csv', newline='') as f_input, open('fb_profile_urls.csv', 'w', newline='') as f_output:
    csv_input = csv.reader(f_input)
    csv_output = csv.writer(f_output)
    for row in csv_input:
        if not row:
            continue  # skip blank lines in the input file
        driver.get(row[0])
        time.sleep(5)
        # Keep scrolling until the page height stops growing.
        len_of_page = driver.execute_script(scroll)
        while True:
            last_count = len_of_page
            time.sleep(1)
            len_of_page = driver.execute_script(scroll)
            if last_count == len_of_page:
                break
        # Look up the result links only after this page has fully loaded.
        # '_32mo' is the class name from the question; it is assumed to sit on the <a> elements.
        # href is an attribute, not a tag, so read it with get_attribute().
        for elem in driver.find_elements(By.CLASS_NAME, '_32mo'):
            csv_output.writerow([elem.get_attribute('href')])

driver.quit()
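
The scroll-until-the-height-stops-changing step can be pulled out into a small helper, which makes it possible to check the loop logic without opening a browser. This is only a sketch: `scroll_to_bottom`, `max_rounds`, and `FakeDriver` are hypothetical names that do not appear in the script above, and the cap on the number of scrolls is an added assumption to guard against pages that never stop growing.

```python
import time

SCROLL_JS = "window.scrollTo(0, document.body.scrollHeight); return document.body.scrollHeight;"


def scroll_to_bottom(driver, pause=1.0, max_rounds=50):
    """Scroll until the page height stops growing; return how many scrolls were issued."""
    last_height = driver.execute_script(SCROLL_JS)
    rounds = 1
    while rounds < max_rounds:
        time.sleep(pause)
        height = driver.execute_script(SCROLL_JS)
        rounds += 1
        if height == last_height:
            break  # nothing new was loaded, so this is the bottom
        last_height = height
    return rounds


class FakeDriver:
    """Hypothetical stand-in for a Selenium driver that reports scripted page heights."""

    def __init__(self, heights):
        self._heights = list(heights)

    def execute_script(self, script):
        # Return the next height, repeating the last one once the list runs out.
        if len(self._heights) > 1:
            return self._heights.pop(0)
        return self._heights[0]


rounds = scroll_to_bottom(FakeDriver([1000, 2000, 3000]), pause=0)
capped_rounds = scroll_to_bottom(FakeDriver(range(1, 100)), pause=0, max_rounds=5)
```

With a real driver, the call would be `scroll_to_bottom(driver)` right after `driver.get(...)`, followed by the element lookup and the CSV write.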

Answer 1

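Instead of opening the file in write mode, open('file','w'), open it in append mode, open('file','a'). Write mode truncates the file every time it is opened, whereas append mode adds new rows to the end.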

Found in "how to add lines to existing file using python".
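
A minimal sketch of the difference between the two modes, using only the standard library. The helpers `append_rows` and `read_rows`, the temporary file location, and the example.com URLs are all hypothetical placeholders, not part of the question.

```python
import csv
import os
import tempfile


def append_rows(path, rows):
    """Append rows to a CSV file, creating the file if it does not exist."""
    with open(path, 'a', newline='') as f_output:
        csv.writer(f_output).writerows(rows)


def read_rows(path):
    """Read the whole CSV file back as a list of rows."""
    with open(path, newline='') as f_input:
        return list(csv.reader(f_input))


# Hypothetical demo file in a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), 'fb_profile_urls.csv')

# Two separate opens in append mode: both rows survive.
append_rows(demo_path, [['https://example.com/page-a']])
append_rows(demo_path, [['https://example.com/page-b']])
rows_after_append = read_rows(demo_path)

# Reopening in write mode discards what was there before.
with open(demo_path, 'w', newline='') as f_output:
    csv.writer(f_output).writerow(['https://example.com/page-c'])
rows_after_write = read_rows(demo_path)
```

Append mode matters mainly when the output file is reopened for every URL. If the file is opened once and kept open for the whole run, write mode also keeps every row written during that run.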
