pandas pd.read_html: headings shifted to the right

Time: 2018-03-14 15:20:03

Tags: python-3.x pandas dataframe

I'm trying to convert a table on a wiki page to a DataFrame. The headings are shifted to the
right: 'Launches' should be where 'Successes' is now.

I used the skiprows option, but it did not help.

import pandas as pd

df = pd.read_html(r'https://en.wikipedia.org/wiki/2018_in_spaceflight', skiprows=[1, 2])[7]

df2 = df[df.columns[1:5]]

               1          2         3                 4
0       Launches  Successes  Failures  Partial failures
1          India          1         1                 0
2          Japan          3         3                 0
3    New Zealand          1         1                 0
4         Russia          3         3                 0
5  United States          8         8                 0
6             24         23         0                 1

1 answer:

Answer 0 (score: 1)

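The problem is that the first column of the original table contains merged cells. To parse it completely, you would need to write your own parser. In the meantime, you can try: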

df = pd.read_html(r'https://en.wikipedia.org/wiki/2018_in_spaceflight', header=0)[7]
df.columns = [""] + list(df.columns[:-1])
df.iloc[-1] = [""] + list(df.iloc[-1][:-1])
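
The same shift can be tried offline on a small stand-in frame. This is only a sketch under assumptions: the `raw` frame, its `Country` and `Unnamed: 5` labels, the `Total` cell, and the `shift_right` helper are hypothetical and merely mimic the misalignment described in the question. The real table on the page may be laid out differently.

```python
import pandas as pd


def shift_right(values, fill=""):
    """Return the values moved one position right; the last item is dropped."""
    values = list(values)
    return [fill] + values[:-1]


# Hypothetical stand-in for read_html(..., header=0)[7]: the header labels and
# the final totals row sit one cell too far left relative to the data rows.
raw = pd.DataFrame(
    [
        ["", "India", 1, 1, 0, 0],
        ["", "Japan", 3, 3, 0, 0],
        ["Total", 24, 23, 0, 1, ""],
    ],
    columns=["Country", "Launches", "Successes", "Failures",
             "Partial failures", "Unnamed: 5"],
)

# Work on an object-dtype copy so strings can be written into any column.
fixed = raw.astype(object)

# Shift the header labels, then the totals row, one position to the right.
fixed.columns = shift_right(fixed.columns)
fixed.iloc[-1] = shift_right(fixed.iloc[-1])

print(fixed)
```

Casting to `object` first is a defensive choice here: it keeps the row assignment from failing or warning when a string lands in a column that pandas inferred as numeric.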