Question

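I scraped three lists from a website with Selenium and printed them. They are the teams, the odds and the hrefs. However, the lists are not written to the CSV file correctly. I would like each list to go into its own column (columns 1, 2 and 3). Any help?

I tend to get a lot of: <selenium.webdriver.remote.webelement.WebElement (session="211dc26889dedb4d1d5db5f355c9b225", element="0.936313100855265-9")>

My data currently looks like this: https://ibb.co/iW6rbk

I want it to look like this: https://ibb.co/fhna2Q

I think this happens because the code writes the web elements themselves rather than what I actually want. Any suggestions on how to adjust my code so that it writes the scraped values instead?

Thanks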

    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.common.by import By
    import csv
    import requests
    import time
    from selenium import webdriver
    driver = webdriver.Chrome(executable_path=r'C:\Brother\chromedriver.exe')
    driver.set_window_size(1024, 600)
    driver.maximize_window()


    driver.get('https://www.bookmaker.com.au/sports/soccer/37854435-football-australia-australian-npl-2-new-south-wales/')

    SCROLL_PAUSE_TIME = 0.5

    # Get scroll height
    last_height = driver.execute_script("return document.body.scrollHeight")

    while True:
        # Scroll down to bottom
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

        # Wait to load page
        time.sleep(SCROLL_PAUSE_TIME)

        # Calculate new scroll height and compare with last scroll height
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break
        last_height = new_height

    time.sleep( 5 )

    #link
    elems = driver.find_elements_by_css_selector("h3 a[Href*='/sports/soccer']")
    for elem in elems:
        print(elem.get_attribute("href"))



    #TEAM
    langs1 = driver.find_elements_by_css_selector(".row:nth-child(1) td:nth-child(1)")
    for lang in langs1:
        print (lang.text)



    time.sleep( 10)

    #ODDS
    langs = driver.find_elements_by_css_selector(".row:nth-child(1) span")
    for lang in langs:
        print (lang.text)






    time.sleep( 10 )

    import csv

    with open ('I AM HERE12345.csv','w') as file:
       writer=csv.writer(file)
       for row in langs, langs1, elems:
          writer.writerow(row)

Answer 1

There are two problems in your code.

    #TEAM
    langs1 = driver.find_elements_by_css_selector(".row:nth-child(1) td:nth-child(1)")
    for lang in langs1:
        print(lang.text)

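langs1 is a list of elements. You print the text of each element, but the list itself still holds the elements, not their text. If you never store the text, how can it end up in the CSV? So I changed it as follows. It is not the most optimized code, but it works: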

    langs1 = driver.find_elements_by_css_selector(".row:nth-child(1) td:nth-child(1)")
    langs1_text = []

    for lang in langs1:
        print(lang.text)
        langs1_text.append(lang.text)
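
The same collection step can be written more compactly with a list comprehension. The sketch below is only an illustration: `FakeElement` is a hypothetical stand-in for Selenium's `WebElement` that models nothing but the `.text` attribute, so the snippet runs without a browser, and `element_texts` is a helper name made up for this example. With a real driver you would pass in the `langs1` list returned by the `find_elements` call.

```python
from dataclasses import dataclass


@dataclass
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement; only .text is modelled."""
    text: str


def element_texts(elements):
    """Return the text of each element, stripped of surrounding whitespace."""
    return [element.text.strip() for element in elements]


# Placeholder data, not real scraped values.
langs1 = [FakeElement(" Team A "), FakeElement("Team B")]
langs1_text = element_texts(langs1)
print(langs1_text)
```

The hrefs would presumably be collected the same way, using `elem.get_attribute("href")` in place of `.text`.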

Next, your CSV loop is wrong:

    for row in langs_text, langs1_text, elem_href:
        writer.writerow(row)

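This loop writes each entire list as a single row, so you get three long rows instead of one row per match. What you need is one value from each list per row: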

    for row in zip(langs_text, langs1_text, elem_href):
        writer.writerow(row)
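
One thing to keep in mind with `zip` is that it stops at the shortest list, so a missing value in one list silently drops rows. The sketch below uses made-up placeholder values (not real scraped data) and an in-memory buffer instead of a file to show the difference between `zip` and `itertools.zip_longest`; `rows_to_csv` is a helper name invented for this example.

```python
import csv
import io
from itertools import zip_longest

# Placeholder values standing in for the scraped text lists.
langs_text = ["1.50", "2.75"]
langs1_text = ["Team A", "Team B"]
elem_href = ["https://example.com/match/1"]  # deliberately one item short


def rows_to_csv(rows):
    """Write rows to an in-memory CSV and return the resulting text."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerows(rows)
    return buffer.getvalue()


# zip stops at the shortest list, so the second match is dropped.
zipped = rows_to_csv(zip(langs_text, langs1_text, elem_href))

# zip_longest keeps every row and pads the gaps with an empty string.
padded = rows_to_csv(zip_longest(langs_text, langs1_text, elem_href, fillvalue=""))

print(zipped)
print(padded)
```

If the three lists can differ in length on the real page, padding at least makes the gap visible in the output, although it cannot tell you which match the missing value belonged to.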

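Edit-1

That gets your code working, but the approach itself is not right. When you want to capture data from multiple sections, you should iterate over each section and collect the data from within that section.

I changed the code to: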

    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.common.by import By
    import csv
    import requests
    import time

    driver = webdriver.Chrome()
    driver.set_window_size(1024, 600)
    driver.maximize_window()

    driver.get('https://www.bookmaker.com.au/sports/soccer/36116103-football-russia-russian-national-football-league/')

    SCROLL_PAUSE_TIME = 0.5

    # Get scroll height
    last_height = driver.execute_script("return document.body.scrollHeight")

    while True:
        # Scroll down to bottom
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

        # Wait to load page
        time.sleep(SCROLL_PAUSE_TIME)

        # Calculate new scroll height and compare with last scroll height
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break
        last_height = new_height

    time.sleep(5)

    sections = driver.find_elements_by_css_selector(".fullbox")

    with open('I AM HERE12345.csv', 'w') as file:
        writer = csv.writer(file)
        for section in sections:
            # Read all three values from the same section so they stay aligned
            link = section.find_element_by_css_selector("h3 a").get_attribute("href")
            team_name = section.find_element_by_css_selector("tr.row[data-teamname]").get_attribute("data-teamname")
            bet = section.find_element_by_css_selector("a.odds.quickbet").text
            writer.writerow((bet, team_name, link))

With this, the CSV is generated correctly.
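
Building each row from a single section keeps the columns aligned: a missing value leaves one blank cell instead of shifting every later row. The sketch below illustrates this with plain dictionaries as a hypothetical stand-in for the page sections. The keys, the values and the header names are assumptions made for the example and are not part of the code above. It also shows how a header row could be added with `csv.DictWriter`.

```python
import csv
import io

FIELDNAMES = ["odds", "team", "link"]  # assumed column names

# Hypothetical per-section data; the second section has no odds.
sections = [
    {"odds": "1.50", "team": "Team A", "link": "https://example.com/match/1"},
    {"team": "Team B", "link": "https://example.com/match/2"},
]

buffer = io.StringIO(newline="")
# restval is the value written for any key missing from a row.
writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, restval="")
writer.writeheader()
for section in sections:
    # Each row comes from one section, so the columns cannot drift apart.
    writer.writerow(section)

csv_text = buffer.getvalue()
print(csv_text)
```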

Edit-2

The blank-row problem is specific to Windows, which is why it did not show up on my Mac. Either of the following gets rid of it:

    with open('I AM HERE12345.csv', 'w', newline='') as file:

or

    with open('I AM HERE12345.csv', 'w', newline='\n') as file:
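
The reason both variants work is that `csv.writer` ends every row with `\r\n` itself, and a text-mode file on Windows then translates the `\n` into `\r\n` a second time, producing `\r\r\n`. The sketch below reproduces that on any platform by passing `newline='\r\n'` explicitly to imitate the Windows default; the helper name `written_bytes` and the sample row are made up for the example. The `csv` module documentation recommends `newline=''`.

```python
import csv
import os
import tempfile


def written_bytes(newline):
    """Write one CSV row with the given newline mode and return the raw bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.csv")
        with open(path, "w", newline=newline) as file:
            csv.writer(file).writerow(["Team A", "1.50"])
        with open(path, "rb") as file:
            return file.read()


# newline='\r\n' mimics the translation Windows applies by default.
print(written_bytes("\r\n"))

# newline='' and newline='\n' both leave the writer's '\r\n' untouched.
print(written_bytes(""))
print(written_bytes("\n"))
```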

How to write each list to a CSV file
