Question

I'm trying to convert a table from a Wikipedia page to a DataFrame. The headings are shifted to the
right: 'Launches' should be where 'Successes' is now.

I tried the skiprows option, but it didn't help.

df = pd.read_html(r'https://en.wikipedia.org/wiki/2018_in_spaceflight',skiprows=[1,2])[7]

df2 = df[df.columns[1:5]]

               1          2         3                 4
0       Launches  Successes  Failures  Partial failures
1          India          1         1                 0
2          Japan          3         3                 0
3    New Zealand          1         1                 0
4         Russia          3         3                 0
5  United States          8         8                 0
6             24         23         0                 1

Answer 1

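The problem is that the first column of the original table contains merged cells. To parse it completely, you would need to write your own parser. In the meantime, you can try: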

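# header=0 takes the column names from the first table row.
df = pd.read_html(r'https://en.wikipedia.org/wiki/2018_in_spaceflight', header=0)[7]
# Shift the column names one position to the right, dropping the last one.
df.columns = [""] + list(df.columns[:-1])
# Shift the values of the last row to the right in the same way.
df.iloc[-1] = [""] + list(df.iloc[-1][:-1])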
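
If you do go the "write a parser" route, here is a minimal sketch of the idea, not a definitive implementation. It uses only the standard-library html.parser and repeats each merged cell into every slot it covers, so all rows end up with the same number of columns. The names TableGridParser, expand_spans and SAMPLE are made up for illustration, and the sample table is invented, not taken from the Wikipedia page. It assumes the HTML contains a single, non-nested table with numeric rowspan/colspan values.

```python
from html.parser import HTMLParser


class TableGridParser(HTMLParser):
    """Collect table cells row by row as [text, rowspan, colspan]."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one list of cells per <tr>
        self._cell = None   # cell currently being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            a = dict(attrs)
            # Missing or empty span attributes default to 1.
            self._cell = ["", int(a.get("rowspan") or 1), int(a.get("colspan") or 1)]

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell[0] += data

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._cell[0] = self._cell[0].strip()
            self.rows[-1].append(self._cell)
            self._cell = None


def expand_spans(html):
    """Return a rectangular list of rows with merged cells repeated."""
    parser = TableGridParser()
    parser.feed(html)
    grid = {}  # (row, col) -> cell text
    for r, cells in enumerate(parser.rows):
        c = 0
        for text, rowspan, colspan in cells:
            # Skip slots already filled by a rowspan from an earlier row.
            while (r, c) in grid:
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text
            c += colspan
    if not grid:
        return []
    n_rows = max(r for r, _ in grid) + 1
    n_cols = max(c for _, c in grid) + 1
    return [[grid.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


# Invented sample: the first column has a cell merged across two rows.
SAMPLE = """
<table>
<tr><th>Group</th><th>Country</th><th>Launches</th></tr>
<tr><td rowspan="2">Asia</td><td>India</td><td>1</td></tr>
<tr><td>Japan</td><td>3</td></tr>
</table>
"""

rows = expand_spans(SAMPLE)
# rows == [['Group', 'Country', 'Launches'],
#          ['Asia', 'India', '1'],
#          ['Asia', 'Japan', '3']]
```

The resulting list can then be turned into a DataFrame with pd.DataFrame(rows[1:], columns=rows[0]). For a real page you would first need to fetch the HTML and cut out the one table you want, which this sketch does not do.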
