BeautifulSoup TypeError: unhashable type: 'list'

Time: 2017-01-26 21:42:03

Tags: python python-3.x parsing beautifulsoup

I have some code that visits each link and tries to find certain keywords in it.

Finally, if a link contains the keywords, it is stored in a list.

However, when I run my code, it raises TypeError: unhashable type: 'list' on this line:

for a in soup.find_all('a', class_="result-title hdrlnk", text=re.compile(job_kw,re.IGNORECASE)):

Here is the code:

jobs_by_city = [
'http://boston.website.org/search/widget',
]

job_kw = [['web site','user', 'account'],['permission', 'name']]
job_kw = sum(job_kw, [])

jobs = []

for job_in_city in jobs_by_city:
    a_job = requests.get(job_in_city)
    soup = BeautifulSoup(a_job.text, "lxml")
    for a in soup.find_all('a', class_="result-title hdrlnk", text=re.compile(job_kw,re.IGNORECASE)):
        print(a.get('href'))
        #jobs.append(a.get('href'))

What am I doing wrong here?

1 answer:

Answer 0: (score: 0)

re.compile does not accept a list as input. You have to iterate over the keywords:

from bs4 import BeautifulSoup
import requests
import re

jobs_by_city = [
'http://boston.website.org/search/widget',
]

job_kws = [['web site','user', 'account'],['permission', 'name']]
job_kws = sum(job_kws, [])

jobs = []

for job_in_city in jobs_by_city:
    a_job = requests.get(job_in_city)
    soup = BeautifulSoup(a_job.text, "lxml")
    # re.compile needs a single string, so compile one pattern per keyword.
    for job_kw in job_kws:
        for a in soup.find_all('a', class_="result-title hdrlnk", text=re.compile(job_kw,re.IGNORECASE)):
            print(a.get('href'))
            #jobs.append(a.get('href'))
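
To see the failure in isolation, here is a minimal sketch that needs neither requests nor BeautifulSoup. The `keywords` list is a shortened stand-in for the flattened list above, and the variable names are just for illustration. Passing the whole list to re.compile raises a TypeError, while compiling one keyword at a time works:

```python
import re

# Shortened stand-in for the flattened keyword list.
keywords = ['web site', 'user', 'account']

# Passing the whole list reproduces the failure from the question.
try:
    re.compile(keywords, re.IGNORECASE)
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__)

# Compiling one keyword at a time is accepted.
patterns = [re.compile(kw, re.IGNORECASE) for kw in keywords]
print(len(patterns))
```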

The given URL does not serve the HTML elements you are looking for :)
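
As a possible alternative to looping over the keywords, you could join them into one alternation pattern, so each link is tested once and is not reported again for every keyword it matches. This is only a sketch under assumptions: the `titles` list is hypothetical sample text standing in for the link text on the page, and re.escape is used so that each keyword is matched literally.

```python
import re

job_kws = ['web site', 'user', 'account', 'permission', 'name']

# Escape each keyword so it is matched literally, then join with "|"
# to get a single pattern that matches any of them.
pattern = re.compile('|'.join(re.escape(kw) for kw in job_kws), re.IGNORECASE)

# Hypothetical link titles standing in for the text of the scraped <a> tags.
titles = [
    'Web Site administrator',
    'User account manager',
    'Delivery driver',
]

# Keep only the titles in which any keyword occurs.
matching = [t for t in titles if pattern.search(t)]
print(matching)
```

Since find_all accepts a compiled pattern for its text argument, the same `pattern` object should be usable in place of re.compile(job_kw, re.IGNORECASE), which would make the inner keyword loop unnecessary.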