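I scraped three lists from a website and printed them with Selenium: teams, odds, and hrefs. However, the lists are not written to the CSV file correctly. I would like each list to go into columns 1, 2, and 3 respectively. Any help?
I tend to get a lot of: <selenium.webdriver.remote.webelement.WebElement (session="211dc26889dedb4d1d5db5f355c9b225", element="0.936313100855265-9")>
My data looks like this: https://ibb.co/iW6rbk
I would like it to look like this: https://ibb.co/fhna2Q
I think this happens because the code writes the web elements themselves rather than what I actually want. Any suggestions on how to adjust my code so that it writes the scraped values instead?
Thanks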
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import csv
import requests
import time
from selenium import webdriver
driver = webdriver.Chrome(executable_path=r'C:\Brother\chromedriver.exe')
driver.set_window_size(1024, 600)
driver.maximize_window()
driver.get('https://www.bookmaker.com.au/sports/soccer/37854435-football-australia-australian-npl-2-new-south-wales/')
SCROLL_PAUSE_TIME = 0.5
# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)
    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height
time.sleep( 5 )
#link
elems = driver.find_elements_by_css_selector("h3 a[Href*='/sports/soccer']")
for elem in elems:
    print(elem.get_attribute("href"))
#TEAM
langs1 = driver.find_elements_by_css_selector(".row:nth-child(1) td:nth-child(1)")
for lang in langs1:
    print (lang.text)
time.sleep( 10)
#ODDS
langs = driver.find_elements_by_css_selector(".row:nth-child(1) span")
for lang in langs:
    print (lang.text)
time.sleep( 10 )
import csv
with open ('I AM HERE12345.csv','w') as file:
    writer=csv.writer(file)
    for row in langs, langs1, elems:
        writer.writerow(row)
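Answer 0 (score: 0)
There are two problems in your code.
#TEAM
langs1 = driver.find_elements_by_css_selector(".row:nth-child(1) td:nth-child(1)")
for lang in langs1:
    print (lang.text)
langs1 is a list of elements. You print the text of each one, but the list still holds the elements, not their text. If you never store the text, how can it end up in the CSV? So I changed it as follows. It is not the most optimized code, but it works: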
langs1 = driver.find_elements_by_css_selector(".row:nth-child(1) td:nth-child(1)")
langs1_text = []
for lang in langs1:
    print(lang.text)
    langs1_text.append(lang.text)
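The same collection step can also be written as a list comprehension. The sketch below uses a hypothetical `FakeElement` class as a stand-in for Selenium's `WebElement` (only its `.text` attribute is modelled) so that it runs without a browser; in the real script `langs1` would come from the driver call above.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement; only .text is modelled."""

    def __init__(self, text):
        self.text = text


# Assumed sample data; in the real script this list comes from the driver.
langs1 = [FakeElement("Team A"), FakeElement("Team B")]

# Keep the strings, not the element objects, so csv.writer has real values to write.
langs1_text = [lang.text for lang in langs1]
print(langs1_text)
```

The odds and links would need the same treatment, for example with `elem.get_attribute("href")` for the links.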
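Next, your CSV loop is wrong:
for row in langs_text, langs1_text, elem_href:
    writer.writerow(row)
This loop writes each whole list as a single row, so you get three long rows instead of one row per match. What you need is one value from each list at a time:
for row in zip(langs_text, langs1_text, elem_href):
    writer.writerow(row)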
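To see the difference between the two loops without running a browser, here is a minimal sketch that writes to in-memory buffers. The values in the three lists are made up for illustration.

```python
import csv
import io

# Assumed sample values standing in for the scraped odds, teams and links.
langs_text = ["1.50", "2.75"]
langs1_text = ["Team A", "Team B"]
elem_href = ["https://example.com/a", "https://example.com/b"]

# Without zip: every list is written as one long row.
wrong = io.StringIO()
writer = csv.writer(wrong)
for row in (langs_text, langs1_text, elem_href):
    writer.writerow(row)

# With zip: one row per match, one column per list.
right = io.StringIO()
writer = csv.writer(right)
for row in zip(langs_text, langs1_text, elem_href):
    writer.writerow(row)

print(wrong.getvalue())
print(right.getvalue())
```

Note that `zip` stops at the shortest list, so the three lists have to be the same length for no data to be dropped.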
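Edit-1
Although your code can be made to work, the approach itself is not right. When you want to capture data from multiple sections, you should iterate over each section and collect the data from within that section.
I changed the code: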
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import csv
import requests
import time
from selenium import webdriver
driver = webdriver.Chrome()
driver.set_window_size(1024, 600)
driver.maximize_window()
driver.get('https://www.bookmaker.com.au/sports/soccer/36116103-football-russia-russian-national-football-league/')
SCROLL_PAUSE_TIME = 0.5
# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)
    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height
time.sleep(5)
sections = driver.find_elements_by_css_selector(".fullbox")
# link
import csv
with open('I AM HERE12345.csv', 'w') as file:
    writer = csv.writer(file)
    for section in sections:
        link = section.find_element_by_css_selector("h3 a").get_attribute("href")
        team_name = section.find_element_by_css_selector("tr.row[data-teamname]").get_attribute("data-teamname")
        bet = section.find_element_by_css_selector("a.odds.quickbet").text
        writer.writerow((bet, team_name, link))
The CSV is generated correctly.
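One reason collecting per section is more robust can be sketched with made-up data: if a section has no odds, lists collected independently across the whole page go out of step, and `zip` silently pairs values from different sections. The dictionaries below are hypothetical stand-ins for page sections, not the real page structure.

```python
# Hypothetical sections; the second one has no odds published.
sections = [
    {"team": "Team A", "odds": "1.50"},
    {"team": "Team B", "odds": None},
    {"team": "Team C", "odds": "3.20"},
]

# Page-wide lists: the missing value vanishes and later rows shift up.
teams = [s["team"] for s in sections]
odds = [s["odds"] for s in sections if s["odds"] is not None]
misaligned = list(zip(odds, teams))

# Per-section rows: each row is built from a single section, so values stay paired.
aligned = [(s["odds"] or "", s["team"]) for s in sections]

print(misaligned)
print(aligned)
```

In the page-wide version the odds of Team C end up next to Team B and Team C disappears; in the per-section version every team keeps its own row.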
Edit-2
The blank-row problem is specific to Windows, which is why it did not show up on my Mac. You can get rid of it with either of the following:
with open('I AM HERE12345.csv', 'w', newline='') as file:
or
with open('I AM HERE12345.csv', 'w', newline='\n') as file:
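For background: `csv.writer` ends each row with `'\r\n'` by default, and on Windows a file opened in text mode without a `newline` argument turns the `'\n'` into `'\r\n'` again, giving `'\r\r\n'`, which spreadsheet programs show as a blank row. The sketch below writes a small file with `newline=''` and reads it back as raw bytes to inspect the line endings; the file name and row values are placeholders.

```python
import csv
import os
import tempfile

# Hypothetical output path in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.csv")

# newline='' stops Python from translating the '\r\n' that csv.writer emits.
with open(path, "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(("1.50", "Team A"))
    writer.writerow(("2.75", "Team B"))

# Read raw bytes to see the exact line endings on disk.
with open(path, "rb") as file:
    raw = file.read()
print(raw)
```

Each row should end in a single `\r\n` regardless of platform, with no doubled carriage return.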