Question

This is the HTML I am trying to scrape:

<a id="phone-lead" class="callseller-description-link" rel="050 395 7996" href="#">Show Phone Number</a>

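Goal:

Get the phone number out of the rel attribute. (Note that there are multiple instances of this kind of phone number, so I need to extract all of them and save them in a list.)

Here is what I am using: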

phone_result = []
try:
    phone_result = soup.find('a', {'id': 'phone-lead', 'rel': True}).get('rel')
    for a in soup.find_all('a', {'id': 'phone-lead', 'rel': True}):
        phone_result += (a['rel'])
    phone_result = str(phone_result)
    print phone_result
except StandardError as e:
    phone_result = "Error was {0}".format(e)
    print phone_result

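Problems:

1) It does not give unique output. I tried converting the string to a set, but that messed it up.
2) It takes the spaces into account and treats the parts as separate entries in the list.

Sample output: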

['050', '395', '7996', '050', '395', '7996', '04', '551', '9485', '050', '395', '7996', '050', '395', '7996', '04', '551', '9485', '04', '551', '9485', '050', '395', '7996', '050', '395', '7996', '04']

How can I fix it to get something like this:

[0503957996, 045519485]

Solution, after the help I received:

phone_result = []
try:
    # phone_result = soup.find('a', {'id': 'phone-lead', 'rel': True}).get('rel') (REMOVED)
    for a in soup.find_all('a', {'id': 'phone-lead', 'rel': True}):
        phone_result.append(','.join(a['rel']))
    phone_result = str(phone_result)
    print phone_result
except StandardError as e:
    phone_result = "Error was {0}".format(e)
    print phone_result

Problem: my output now looks like this

['055,442,4433','055,334,3342']

I believe I need to strip the numbers?

Answer 1

a['rel'] seems to return a list like ['050', '395', '7996']. So inside your loop you can do something like:

phone_result.append(''.join(a['rel']))

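Note that list.append adds a single element to the end of the list (and returns nothing), while + concatenates two lists.

Also, remove the first soup.find('a', ...) before the loop, otherwise you will get that number twice.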
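
To make the difference concrete, here is a minimal sketch in plain Python 3 that does not need BeautifulSoup. The rel_values list is a hypothetical stand-in for what a['rel'] appears to return for each matching tag; that shape is an assumption based on the sample output in the question.

```python
# Hypothetical stand-in for the a['rel'] value of each matching <a> tag.
rel_values = [
    ['050', '395', '7996'],
    ['04', '551', '9485'],
]

# += extends the list with every part, so each number ends up in pieces.
extended = []
for parts in rel_values:
    extended += parts

# append(''.join(...)) adds exactly one string per tag.
joined = []
for parts in rel_values:
    joined.append(''.join(parts))

print(extended)  # ['050', '395', '7996', '04', '551', '9485']
print(joined)    # ['0503957996', '045519485']
```

Joining with '' rather than ',' is also what removes the commas from results such as '055,442,4433'.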

Answer 2

I don't know this library, but it looks like you are creating the phone_result list several times.

phone_result = []  # creates the phone_result list
try:
    phone_result = soup.find('a', {'id': 'phone-lead', 'rel': True}).get('rel')  # don't know if this creates a list, but phone_result is assigned again here
    for a in soup.find_all('a', {'id': 'phone-lead', 'rel': True}):  # doesn't look right considering the above
        phone_result += (a['rel'])  # this takes the existing list and adds a['rel'] to it
    phone_result = str(phone_result)
    print phone_result
except StandardError as e:  # handler copied from the question so the snippet is complete
    print "Error was {0}".format(e)

Once you have the correct list of phone numbers, you can call set on it to get the unique values.
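
As a rough sketch of that last step (Python 3, plain lists only): the numbers list below is a hypothetical example of already-joined results, with duplicates like those in the question's output. set drops duplicates but does not keep the original order; dict.fromkeys is one way to keep first-seen order on Python 3.7 or later.

```python
# Hypothetical joined results containing duplicates.
numbers = ['0503957996', '0503957996', '045519485', '0503957996', '045519485']

# set() keeps one copy of each value; the resulting order is not guaranteed.
unique_unordered = set(numbers)

# dict.fromkeys() keeps the first occurrence of each value, in order.
unique_ordered = list(dict.fromkeys(numbers))

print(sorted(unique_unordered))  # ['045519485', '0503957996']
print(unique_ordered)            # ['0503957996', '045519485']
```

Deduplicate before calling str on the list; a set built from the string form would contain single characters rather than phone numbers.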

Answer 3

You missed my answer here. You don't need to use both find_all and find. If you want to retrieve all the descendants that match your filter, just use find_all. And, as I said, you need to join the results using str.join.

Unique list, without splitting entries on whitespace, in Python
