Question

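Python newbie here. I'm running a Selenium WebDriver to query some information from a website (it is only accessible from within my organization; yes, an SQL query would be better, but this is what I'm working with at the moment). I'm using Selenium's .text property to retrieve text from a table, and print(XXX.text) returns something like this: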

XXX.pdf
[Remove]
XXX.pdf
[Remove]
etc...

The problem is that I want to remove the [Remove] lines, so that I'm left with something like this:

XXX.pdf
XXX.pdf

Or even better:

XXX.pdf, XXX.pdf

Here is my attempt so far, which hasn't worked:

dataElement = driver.find_element_by_css_selector('''blah blah blah''')                                             
datasheets = str(dataElement.text)
datasheets.replace('[Remove]','')
print(datasheets)

Python 3.5, Selenium 2

Thanks for your help. :)

Answer 1

In [26]: data = '''\
    ...: XXX.pdf
    ...: [Remove]
    ...: XXX.pdf
    ...: [Remove]\
    ...: '''

In [27]: def func(string, rep):
    ...:     return ', '.join([x for x in string.split('\n') if x != rep])
    ...: 

In [28]: func(data, '[Remove]')
Out[28]: 'XXX.pdf, XXX.pdf'

You can use something like this.
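Outside an IPython session, the same idea can be written as a plain function. This is only a sketch: the name join_kept_lines and its parameters are made up for illustration, and the sample string stands in for whatever dataElement.text returns. It also strips whitespace and skips empty lines, which the one-liner above does not do.

```python
def join_kept_lines(text, unwanted="[Remove]", sep=", "):
    """Join every non-empty line of text that is not the unwanted marker."""
    kept = []
    for line in text.splitlines():
        line = line.strip()  # tolerate stray spaces around each line
        if line and line != unwanted:
            kept.append(line)
    return sep.join(kept)


# Sample text standing in for dataElement.text (an assumption).
sample = "XXX.pdf\n[Remove]\nXXX.pdf\n[Remove]\n"
print(join_kept_lines(sample))
```

Passing sep="\n" instead keeps one file name per line.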

Answer 2

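What does the result show? Maybe you forgot to assign the result: str.replace returns a new string and leaves the original unchanged.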

dataElement = driver.find_element_by_css_selector('''blah blah blah''')
datasheets = str(dataElement.text)
datasheets = datasheets.replace('[Remove]','')
print(datasheets)
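To see why the assignment matters, here is a small sketch that needs no browser. The sample string is an assumption standing in for dataElement.text. Note that replacing '[Remove]' with an empty string still leaves blank lines behind, so the last step drops them.

```python
# Sample text standing in for dataElement.text (an assumption).
text = "XXX.pdf\n[Remove]\nXXX.pdf\n[Remove]"

# str.replace returns a new string; calling it without assigning
# the result leaves text exactly as it was.
text.replace("[Remove]", "")
unchanged = text

# Assigning the result keeps the change, but blank lines remain.
replaced = text.replace("[Remove]", "")

# Dropping the empty lines gives one file name per line.
cleaned = "\n".join(line for line in replaced.splitlines() if line.strip())
print(cleaned)
```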

Answer 3

Try this:

    <ul>
        <li>Some List Item</li>
        <li class="needAwayToMarkThisAsHasChildUl">Some List Item
            <ul>
                <li>Some sub list item</li>
            </ul>
        </li>
    </ul>

Answer 4

You need to do something like the following to parse your output.

dataElement = driver.find_element_by_css_selector("blah blah blah")
# Assumption: dataElement is a single WebElement (not an iterable),
# so we iterate over the lines of its .text instead.

removes = {"[Remove]", "[remove1]", "[remove2]"}
# You can have a set of the strings you want to remove.
for data in dataElement.text.splitlines():
    # For every line of the text, do the following.
    if data.strip() not in removes:
        # The line is not in the set of unwanted strings.
        print(data)
        # This is your useful output;
        # do whatever you want with the filtered lines here.
    else:
        # These are the lines you don't want, the [Remove] ones.
        pass

I hope this gives you a hint. Regards.

Editing a multi-line string in Python

4 answers: