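Python newbie here. I'm running the Selenium WebDriver to query some information from a website (only accessible from within my organization; yes, an SQL query would be better, but this is what I'm working with at the moment). I'm using Selenium's .text attribute to retrieve text from a table, and print(XXX.text) returns something like this: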
XXX.pdf
[Remove]
XXX.pdf
[Remove]
etc...
The problem is that I want to remove [Remove], so that I'm left with something like:
XXX.pdf
XXX.pdf
or, even better:
XXX.pdf, XXX.pdf
Here is what I've tried so far, without success:
dataElement = driver.find_element_by_css_selector('''blah blah blah''')
datasheets = str(dataElement.text)
datasheets.replace('[Remove]','')
print(datasheets)
Python 3.5, Selenium 2
Thanks for your help. :)
Answer 0: (score: 2)
In [26]: data = '''\
    ...: XXX.pdf
    ...: [Remove]
    ...: XXX.pdf
    ...: [Remove]\
    ...: '''

In [27]: def func(string, rep):
    ...:     return ', '.join([x for x in string.split('\n') if x != rep])
    ...:

In [28]: func(data, '[Remove]')
Out[28]: 'XXX.pdf, XXX.pdf'
You can use something like this.
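For anyone who wants to try the same idea outside an IPython session, here is a minimal standalone sketch. The function name strip_marker and the sample string are made up for illustration; in the real script the text would presumably come from dataElement.text. It also skips blank lines and surrounding whitespace, which is an assumption about what the page might return.

```python
def strip_marker(text, marker="[Remove]"):
    """Drop every line equal to `marker` and join the rest with ', '."""
    kept = []
    # splitlines() copes with both '\n' and '\r\n' line endings.
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and lines that consist only of the marker.
        if line and line != marker:
            kept.append(line)
    return ", ".join(kept)


# Hypothetical sample standing in for dataElement.text.
sample = "XXX.pdf\n[Remove]\nYYY.pdf\n[Remove]"
result = strip_marker(sample)
print(result)
```

Comparing whole lines against the marker, rather than replacing the substring, means a file name that happens to contain "[Remove]" would be left alone.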
Answer 1: (score: 2)
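What does the result show? You probably forgot to assign the result: str.replace returns a new string instead of modifying the original, so the return value has to be stored.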
dataElement = driver.find_element_by_css_selector('''blah blah blah''')
datasheets = str(dataElement.text)
datasheets = datasheets.replace('[Remove]','')
print(datasheets)
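To see why the assignment matters, here is a small sketch that runs without Selenium. The sample string is a made-up stand-in for dataElement.text. Note that replacing the marker with an empty string leaves empty lines behind, so the sketch filters those out as an extra step that is not part of the answer above.

```python
# Hypothetical stand-in for dataElement.text.
text = "XXX.pdf\n[Remove]\nXXX.pdf\n[Remove]"

# Strings are immutable: replace() returns a new string and
# leaves `text` untouched, so this line on its own has no effect.
text.replace("[Remove]", "")
still_has_marker = "[Remove]" in text

# Storing the return value is what keeps the change.
cleaned = text.replace("[Remove]", "")

# The marker lines are now empty, so drop them before printing.
lines = [line for line in cleaned.splitlines() if line.strip()]
print(lines)
```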
Answer 2: (score: 0)
Try this:
<ul>
<li>Some List Item</li>
<li class="needAwayToMarkThisAsHasChildUl">Some List Item
<ul>
<li>Some sub list item</li>
</ul>
</li>
</ul>
Answer 3: (score: 0)
You need to do something like the following to parse your output.
dataElement = driver.find_element_by_css_selector("blah blah blah")
# dataElement.text is a single string, so split it into lines to iterate over it.
# You can keep a set of the strings you want to remove.
removes = {"[Remove]", "[Remove1]", "[Remove2]"}
for data in dataElement.text.splitlines():
    # For every line, check whether it is in the set of unwanted strings.
    if data not in removes:
        # This is your useful output;
        # do whatever you want with the filtered line here.
        print(data)
    else:
        # This is the stuff you don't want to use, the [Remove] lines.
        pass
I hope this gives you a hint. Regards.