Web scraping with BS4 - "Length of passed values is 0, index implies 7"

Time: 2019-03-05 09:49:29

Tags: python pandas beautifulsoup

I am getting a "Length of passed values is 0" error.

Here is my code:

_.remove(this.myArray, {myArrayId: id})

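Can someone explain the reason behind it?

2 Answers:

Answer 0: (score: 4)

Use read_html, which returns a list of DataFrames, select the fourth DataFrame by indexing with [3], and then rename the columns with a dictionary: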

import pandas as pd

draft2018 = "https://en.wikipedia.org/wiki/2018_NBA_draft"
# Map the abbreviated Wikipedia headers to readable column names
d = {'Rnd.':'Round','Pos.':'Position','Nationality[n 1]':'Nationality'}
df = pd.read_html(draft2018)[3].rename(columns=d)
print(df.head())
   Round  Pick             Player Position    Nationality  \
0      1     1      Deandre Ayton        C        Bahamas   
1      1     2  Marvin Bagley III       PF  United States   
2      1     3        Luka Dončić    PG/SF       Slovenia   
3      1     4  Jaren Jackson Jr.       PF  United States   
4      1     5         Trae Young       PG  United States   

                                      Team    School / club team  
0                             Phoenix Suns         Arizona (Fr.)  
1                         Sacramento Kings            Duke (Fr.)  
2      Atlanta Hawks (traded to Dallas)[a]   Real Madrid (Spain)  
3                        Memphis Grizzlies  Michigan State (Fr.)  
4  Dallas Mavericks (traded to Atlanta)[a]        Oklahoma (Fr.)  
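
The index [3] depends on how many tables come before the draft table on the page, so it can help to look at what read_html returned before picking one. Below is a minimal offline sketch of that idea. The HTML string is a made-up stand-in, not the real Wikipedia markup, and read_html needs an HTML parser such as lxml installed.

```python
import io

import pandas as pd

# Hypothetical stand-in for a page with several tables (not the real Wikipedia markup)
html = """
<table><tr><th>Key</th><th>Meaning</th></tr>
<tr><td>PG</td><td>Point guard</td></tr></table>
<table><tr><th>Rnd.</th><th>Pick</th><th>Pos.</th></tr>
<tr><td>1</td><td>1</td><td>C</td></tr>
<tr><td>1</td><td>2</td><td>PF</td></tr></table>
"""

# read_html returns one DataFrame per <table> it finds, in document order
tables = pd.read_html(io.StringIO(html))
for i, t in enumerate(tables):
    print(i, list(t.columns))

# Pick the table by position, then map the abbreviated headers to readable names
d = {'Rnd.': 'Round', 'Pos.': 'Position'}
df = tables[1].rename(columns=d)
print(df)
```

Printing the column names of each table makes it easy to check which position holds the one you want if the page layout changes.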

Answer 1: (score: 3)

Just to demonstrate the problem, try printing your rows:

print(row)

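The first list prints as empty, and that is what raises the error: the DataFrame expects 7 values but you are supplying 0. The first tr is most likely the header row, whose cells are th rather than td, so find_all('td') returns nothing for it. Jezrael's solution is more elegant, but you can make the following change to get your code working: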

import urllib.request

import bs4 as bs
import pandas as pd

draft2018 = "https://en.wikipedia.org/wiki/2018_NBA_draft"
draftpage = urllib.request.urlopen(draft2018)
soup = bs.BeautifulSoup(draftpage, "html.parser")

columns = ['Round', 'Pick', 'Player', 'Position',
           'Nationality', 'Team', 'School/club team']

table = soup.find("table", {"class": "wikitable sortable plainrowheaders"}).tbody
# print(table)  # uncomment to inspect the parsed table
trs = table.find_all("tr")

rows = []
for tr in trs:
    tds = tr.find_all('td')
    row = [td.text.replace('\n', '') for td in tds]
    # Skip the header row and any other row with fewer than 7 cells
    if len(row) < 7:
        continue
    # print(row)
    rows.append(row)

# DataFrame.append was removed in pandas 2.0, so build the frame in one step
df = pd.DataFrame(rows, columns=columns)
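
To see the empty first row without fetching the page, here is a small offline sketch. The three-column table is a hypothetical miniature of a wikitable, written only for illustration; the real page has 7 columns and its markup may differ.

```python
import bs4 as bs
import pandas as pd

# Hypothetical miniature of a wikitable: header cells are <th>, data cells are <td>
html = """
<table class="wikitable">
<tr><th>Round</th><th>Pick</th><th>Player</th></tr>
<tr><td>1</td><td>1</td><td>Deandre Ayton</td></tr>
<tr><td>1</td><td>2</td><td>Marvin Bagley III</td></tr>
</table>
"""
columns = ['Round', 'Pick', 'Player']
soup = bs.BeautifulSoup(html, "html.parser")

rows = []
for tr in soup.find_all("tr"):
    row = [td.text.strip() for td in tr.find_all("td")]
    print(row)  # the header row prints [] because it has no <td> cells
    try:
        # An empty row against a 3-name index is the same kind of length
        # mismatch as in the question (0 values, index implies 7)
        pd.Series(row, index=columns)
    except ValueError as exc:
        print("skipped:", exc)
        continue
    rows.append(row)

df = pd.DataFrame(rows, columns=columns)
print(df)
```

The exact wording of the ValueError differs between pandas versions, which is why the sketch prints the message instead of matching on it.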