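How do I add ";" to split and merge the following text within a row? I would like to keep the original program structure.
Split:
In the tag text "HammarbyvsOstersunds", I want to split it into Hammarby; vs; Ostersunds;.
Combine:
In the tag text "Expected; In; Play; start; selling; time:", I want to combine it into Expected In Play start selling time:;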
# -*- coding:UTF-8 -*-
import sys
from selenium import webdriver
driver = webdriver.Firefox()
driver.get("url")
table = ['; '.join(["; ".join( j.text.split(" ")) for j in i.find_elements_by_class_name('couponRow') if j.text]) for i in driver.find_elements_by_xpath('//*[@id="todds"]//div[@class="couponTable"]') if i.text]
for line in table:
    print(line)
driver.close()
Output:
Monday;Matches
MON;41;HammarbyvsOstersunds;Expected;In;Play;start;selling;time:
10/07;01:00;1.85;3.50;3.35
Tuesday;Matches
TUE;1;FrancevsBelgium;Expected;In;Play;start;selling;time:
11/07;02:00;2.38;2.82;2.95
Wednesday;Matches
WED;1;CroatiavsEngland;Expected;In;Play;start;selling;time:
12/07;02:00;3.45;2.80;2.15
Expected result:
Monday; Matches;
MON;41;Hammarby; vs; Ostersunds;Expected In Play start selling time: ; 10/07;01:00;1.85;3.50;3.35
Tuesday; Matches;
TUE;1;France;vs;Belgium; Expected In Play start selling time:;11/07;02:00;2.38;2.82;2.95
Wednesday;Matches
WED;1;Croatia;vs;England;Expected In Play start selling time: ;12/07;02:00;3.45;2.80;2.15
Answer 0 (score: 1)
As already suggested, you should split the task into smaller, manageable pieces of code.
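As a rough illustration of what such smaller pieces might look like, here is a minimal sketch that works on plain strings only, with no Selenium involved. The helper names (`split_teams`, `merge_phrase`, `reformat_line`) are made up for this example, and it assumes that every team name is a single word starting with an uppercase letter and that the phrase always appears exactly as in the question's output.

```python
import re

# The phrase that should stay together as one field (taken from the question).
PHRASE = ["Expected", "In", "Play", "start", "selling", "time:"]


def split_teams(field):
    """Turn 'HammarbyvsOstersunds' into 'Hammarby;vs;Ostersunds'.

    Assumption: both team names are single words starting with an
    uppercase letter, with a lowercase 'vs' directly between them.
    Fields that do not match are returned unchanged.
    """
    return re.sub(r"^([A-Z]\w*?)vs([A-Z]\w*)$", r"\1;vs;\2", field)


def merge_phrase(line, sep=";"):
    """Replace the sep-joined phrase with its space-joined form."""
    return line.replace(sep.join(PHRASE), " ".join(PHRASE))


def reformat_line(line, sep=";"):
    """Merge the phrase first, then split the team field."""
    line = merge_phrase(line, sep)
    return sep.join(split_teams(field) for field in line.split(sep))


print(reformat_line("MON;41;HammarbyvsOstersunds;Expected;In;Play;start;selling;time:"))
# MON;41;Hammarby;vs;Ostersunds;Expected In Play start selling time:
```

Each helper can be checked on its own against a single field or line before being wired into the scraping code.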
I experimented with your code a bit, but it is hard to get a perfect result out of it. This is what I got:
import re
# Collect the text of every non-empty coupon row, one string per coupon table.
table = ['; '.join([" ".join( j.text.split(" ")) for j in i.find_elements_by_class_name('couponRow') if j.text]) for i in driver.find_elements_by_xpath('//*[@id="todds"]//div[@class="couponTable"]') if i.text]
lines = [' ; '.join(t.split('\n')) for t in table]
# Split "TeamAvsTeamB" into "; TeamA vs TeamB;".
result = [re.sub(r"([A-Z]\w+)vs([A-Z]\w+)", r'; \1 vs \2;', l, flags=re.MULTILINE) for l in lines]
Result:
['Tuesday Matches; TUE 1 ; France vs Belgium; Expected In Play start selling time: ; 11/07 02:00 --- --- ---',
'Wednesday Matches; WED 1 ; Croatia vs England; Expected In Play start selling time: ; 12/07 02:00 --- --- ---']
Not perfect, but probably good enough.
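To get closer to the layout the question asks for, where the date and odds sit on the same line as the match, one option is to append every row that starts with a date to the row before it. Below is a minimal sketch on the plain-text lines from the question's output. The function name `merge_odds_rows` is hypothetical, and it assumes that odds rows always start with a DD/MM date and that nothing else does.

```python
import re

# Assumption: an odds row starts with a date such as "10/07".
DATE_ROW = re.compile(r"^\d{2}/\d{2}\b")


def merge_odds_rows(lines, sep=";"):
    """Append each date/odds row to the row that precedes it."""
    merged = []
    for line in lines:
        if merged and DATE_ROW.match(line):
            # Glue the odds row onto the previous (match) row.
            merged[-1] = merged[-1] + sep + line
        else:
            merged.append(line)
    return merged


rows = [
    "Monday;Matches",
    "MON;41;HammarbyvsOstersunds;Expected;In;Play;start;selling;time:",
    "10/07;01:00;1.85;3.50;3.35",
]
for row in merge_odds_rows(rows):
    print(row)
```

The day header stays on its own line, while the match row and its odds row end up joined by the separator.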
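The main problem is that your selector is a bit too broad and picks up the raw text.
You could try zeroing in on the nodes you are really interested in, for example: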
teams = [team.text for team in driver.find_elements_by_xpath('//div[@class="cteams"]/span/span[@class="teamname"]')]
... and so on.