I have a DataFrame like this:
import pandas as pd

df = pd.DataFrame([[1, "10/1/2019", "I", 2879],
                   [1, "10/1/2019", "O", 196],
                   [1, "10/2/2019", "I", 2840],
                   [1, "10/2/2019", "O", 189],
                   [2, "10/1/2019", "I", 2907],
                   [2, "10/1/2019", "O", 195]],
                  columns=["A", "B", "C", "D"])
# Use A, B and C as a three-level MultiIndex
df.set_index(["A", "B", "C"], inplace=True)
When I display the contents, they look more or less like this:
               D
A B         C
1 10/1/2019 I  2879
            O   196
  10/2/2019 I  2840
            O   189
2 10/1/2019 I  2907
            O   195
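So my question is: how can I generate a CSV file whose contents look like what the notebook displays, with the "A" and "B" values left blank on subsequent rows? In other words, this is what I need the CSV output to look like: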
A,B,C,D
1,10/1/2019,I,2879
,,O,196
,10/2/2019,I,2840
,,O,189
2,10/1/2019,I,2907
,,O,195
Answer 0 (score: 1)
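Don't set the index. Instead, create boolean Series (True or False) that mark the repeated values, and blank out the values wherever the Series is True: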
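import pandas as pd

df = pd.DataFrame([[1, "10/1/2019", "I", 2879],
                   [1, "10/1/2019", "O", 196],
                   [1, "10/2/2019", "I", 2840],
                   [1, "10/2/2019", "O", 189],
                   [2, "10/1/2019", "I", 2907],
                   [2, "10/1/2019", "O", 195]],
                  columns=["A", "B", "C", "D"])
# True for every row whose A value (or A/B pair) has already appeared
s1 = df.duplicated(subset=['A'])
s2 = df.duplicated(subset=['A', 'B'])
# Keep the first occurrence, replace the repeats with an empty string
df['A'] = df['A'].where(~s1, '')
df['B'] = df['B'].where(~s2, '')
df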
Out[1]:
   A          B  C     D
0  1  10/1/2019  I  2879
1                O   196
2     10/2/2019  I  2840
3                O   189
4  2  10/1/2019  I  2907
5                O   195
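The answer stops at the displayed frame, so here is a minimal sketch of the remaining step: writing the blanked frame to CSV without the default integer index. The helper name `blank_repeats` is hypothetical, not a pandas function; it generalizes the `s1`/`s2` idea to any list of key columns. It assumes the rows are already grouped by those keys, because `duplicated` flags every repeat, not only consecutive ones. The CSV is written to an in-memory buffer here; passing a file path to `to_csv` instead should work the same way.

```python
import io

import pandas as pd

df = pd.DataFrame([[1, "10/1/2019", "I", 2879],
                   [1, "10/1/2019", "O", 196],
                   [1, "10/2/2019", "I", 2840],
                   [1, "10/2/2019", "O", 189],
                   [2, "10/1/2019", "I", 2907],
                   [2, "10/1/2019", "O", 195]],
                  columns=["A", "B", "C", "D"])


def blank_repeats(frame, keys):
    """Hypothetical helper: blank each key column where its key prefix repeats."""
    out = frame.copy()
    for i, col in enumerate(keys):
        # True where the combination keys[0..i] has already appeared above
        repeated = frame.duplicated(subset=keys[: i + 1])
        # Cast to object first so integers and '' can share one column
        out[col] = frame[col].astype(object).where(~repeated, "")
    return out


buf = io.StringIO()
# index=False drops the 0..5 row labels so only A,B,C,D are written
blank_repeats(df, ["A", "B"]).to_csv(buf, index=False)
csv_text = buf.getvalue()
print(csv_text)
```

Empty strings are written as empty fields, so the repeated "A" and "B" cells come out blank, as in the layout requested in the question.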