I extracted a table from a website, but the column headers in the result are misaligned.
For example,
Original table:
A A-explaned B B-explaned C C-explaned
1 0.2 10% 2 0.7 20% 3 0.8 15%
1 0.2 10% 2 0.7 20% 3 0.8 15%
1 0.2 10% 2 0.7 20% 3 0.8 15%
1 0.2 10% 2 0.7 20% 3 0.8 15%
But what I get is:
A A-explaned B B-explaned C C-explaned NaN NaN NaN
1 0.2 10% 2 0.7 20% 3 0.8 15%
1 0.2 10% 2 0.7 20% 3 0.8 15%
1 0.2 10% 2 0.7 20% 3 0.8 15%
1 0.2 10% 2 0.7 20% 3 0.8 15%
The table I want:
A A-explaned A_ B B-explaned B_ C C-explaned C_
1 0.2 10% 2 0.7 20% 3 0.8 15%
1 0.2 10% 2 0.7 20% 3 0.8 15%
1 0.2 10% 2 0.7 20% 3 0.8 15%
1 0.2 10% 2 0.7 20% 3 0.8 15%
How can I skip ahead after every two columns and add another column header there?
Thanks
Answer 0 (score: 0)
You can pass a regular expression as the sep parameter of read_csv:
from io import StringIO
import pandas as pd
txt = StringIO("""A A-explaned B B-explaned C C-explaned
1 0.2 10% 2 0.7 20% 3 0.8 15%
1 0.2 10% 2 0.7 20% 3 0.8 15%
1 0.2 10% 2 0.7 20% 3 0.8 15%
1 0.2 10% 2 0.7 20% 3 0.8 15%""")
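# Splitting on "-" as well as whitespace turns each "X-explaned" header into two names,
# so the header has nine fields like the data rows. A regex separator is handled by the
# Python parsing engine; naming it explicitly avoids a ParserWarning.
df = pd.read_csv(txt, sep=r'-|\s+', engine='python')
df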
Output:
A A.1 explaned B B.1 explaned.1 C C.1 explaned.2
0 1 0.2 10% 2 0.7 20% 3 0.8 15%
1 1 0.2 10% 2 0.7 20% 3 0.8 15%
2 1 0.2 10% 2 0.7 20% 3 0.8 15%
3 1 0.2 10% 2 0.7 20% 3 0.8 15%
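
The regex split lines the nine columns up, but the names come out as A.1, explaned, and so on rather than the header asked for in the question. If the exact header matters, one option is to rebuild the names and assign them to the frame. This is a minimal sketch, assuming every pair of named columns is followed by exactly one unnamed column; pad_headers and scraped are hypothetical names, and scraped only stands in for the frame obtained from the website.

```python
import pandas as pd


def pad_headers(names, group_size=2, suffix="_"):
    """Hypothetical helper: after every `group_size` names, append one extra
    name built from the first name of that group plus `suffix`."""
    padded = []
    for i in range(0, len(names), group_size):
        group = list(names[i:i + group_size])
        padded.extend(group)
        padded.append(f"{group[0]}{suffix}")
    return padded


# Stand-in for the scraped frame: six real names followed by three missing ones,
# which is the misaligned header described in the question.
scraped = pd.DataFrame(
    [[1, 0.2, "10%", 2, 0.7, "20%", 3, 0.8, "15%"]] * 4,
    columns=["A", "A-explaned", "B", "B-explaned", "C", "C-explaned",
             None, None, None],
)

# Keep only the real names, then interleave the extra ones.
real_names = [c for c in scraped.columns if pd.notna(c)]
scraped.columns = pad_headers(real_names)

print(scraped.columns.tolist())
```

Assigning to columns only relabels the frame, so the data rows stay where they are. If the layout differs (for example three named columns per group), group_size would need adjusting, and the assignment raises an error if the number of rebuilt names does not match the number of columns.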