Pandas: shifting column headers

Asked: 2018-04-16 20:01:54

Tags: python pandas io header multiple-columns

I extracted a table from a website, but the column headers in the result are misplaced.

For example,

Original table:

A A-explaned  B B-explaned C C-explaned
1  0.2 10%    2  0.7  20%  3 0.8  15%
1  0.2 10%    2  0.7  20%  3 0.8  15%
1  0.2 10%    2  0.7  20%  3 0.8  15%
1  0.2 10%    2  0.7  20%  3 0.8  15%

But what I get is:

A A-explaned  B B-explaned C C-explaned  NaN  NaN  NaN
1    0.2     10%    2     0.7   20%      3    0.8  15%
1    0.2     10%    2     0.7   20%      3    0.8  15%
1    0.2     10%    2     0.7   20%      3    0.8  15%
1    0.2     10%    2     0.7   20%      3    0.8  15%

The table I want:

A A-explaned  A_    B   B-explaned  B_      C  C-explaned    C_  
1    0.2     10%    2     0.7      20%      3    0.8        15%
1    0.2     10%    2     0.7      20%      3    0.8        15%
1    0.2     10%    2     0.7      20%      3    0.8        15%
1    0.2     10%    2     0.7      20%      3    0.8        15%

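How can I skip ahead after every two columns and insert another column header?

Thanks

1 answer:

Answer 0 (score: 0)

You can pass a regular expression as the sep parameter of read_csv: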

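from io import StringIO
import pandas as pd

# Sample data copied from the question
txt = StringIO("""A A-explaned  B B-explaned C C-explaned
1  0.2 10%    2  0.7  20%  3 0.8  15%
1  0.2 10%    2  0.7  20%  3 0.8  15%
1  0.2 10%    2  0.7  20%  3 0.8  15%
1  0.2 10%    2  0.7  20%  3 0.8  15%""")
# Split on a hyphen or on any run of whitespace, so "A-explaned" yields two header fields.
# A regex separator is handled by the Python parsing engine, so request it explicitly.
df = pd.read_csv(txt, sep=r'-|\s+', engine='python')

df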

Output:

   A  A.1 explaned  B  B.1 explaned.1  C  C.1 explaned.2
0  1  0.2      10%  2  0.7        20%  3  0.8        15%
1  1  0.2      10%  2  0.7        20%  3  0.8        15%
2  1  0.2      10%  2  0.7        20%  3  0.8        15%
3  1  0.2      10%  2  0.7        20%  3  0.8        15%
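
If the exact header names from the question (A, A-explaned, A_, and so on) are wanted rather than the auto-numbered ones above, one possible approach is to read the header line separately and build the nine names by hand. This is only a sketch: it assumes the header names always come in pairs and that each pair is followed by one unnamed column, and the variable names raw, names and new_names are illustrative, not from the original answer.

```python
from io import StringIO
import pandas as pd

raw = """A A-explaned  B B-explaned C C-explaned
1  0.2 10%    2  0.7  20%  3 0.8  15%
1  0.2 10%    2  0.7  20%  3 0.8  15%
1  0.2 10%    2  0.7  20%  3 0.8  15%
1  0.2 10%    2  0.7  20%  3 0.8  15%"""

# Take the six header names from the first line only
names = raw.splitlines()[0].split()

# Read the data rows without a header; skiprows=1 drops the header line
df = pd.read_csv(StringIO(raw), sep=r'\s+', skiprows=1, header=None)

# After every pair of names, add a third name "<first name>_" (assumed pattern)
new_names = []
for i in range(0, len(names), 2):
    new_names += [names[i], names[i + 1], names[i] + "_"]

df.columns = new_names
```

Here df.columns ends up as A, A-explaned, A_, B, B-explaned, B_, C, C-explaned, C_, which matches the layout asked for in the question.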