I have data like this:
Column1 Column2 Column3
0 This Sushi is Awesome NaN NaN
1 NaN Id: 2261
2 NaN City: Tokyo
3 NaN Food: Positive
4 NaN Price: NaN
5 This food is really expensi... NaN NaN
6 NaN Id: 3`
7 NaN City: Osaka
8 NaN Food: Negative
9 NaN Price: Negative
I wrote the following code, but it raises an error:
pivoted = data.pivot(index='Column1',columns='Column2', values='Column3')
ValueError: Index contains duplicate entries, cannot reshape
pivot_table does not work either.
I want output like this:
0 Id City Food Price
1 This Sushi is Awesome 2261 Tokyo Positive NaN
2 This food is really expensi... 3 Osaka Negative Negative
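Answer 0 (score: 1)
Preprocess the data before calling pivot: build a mask m of the rows where Column1 is missing, forward-fill Column1 so each detail row carries the text of its review, strip the trailing : from Column2 with rstrip, and finally keep only the rows selected by the mask when pivoting: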
m = df['Column1'].isnull()
df['Column1'] = df['Column1'].ffill()
df['Column2'] = df['Column2'].str.rstrip(':')
pivoted = df[m].pivot(index='Column1',columns='Column2', values='Column3')
print(pivoted)
Column2 City Food Id Price
Column1
This Sushi is Awesome Tokyo Positive 2261 NaN
This food is really expensive Osaka Negative 3` Negative
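For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is retyped by hand from the sample in the question, so the exact values and dtypes are an assumption (all of Column3 is treated as strings, and the truncated second review is spelled out as it appears in the output above). The last two steps, dropping the Column2 axis label and reordering the columns, are not part of the answer's code; they are one possible way to get closer to the layout the question asks for.

```python
import numpy as np
import pandas as pd

# Sample data retyped from the question (assumed values, everything as strings).
df = pd.DataFrame({
    "Column1": ["This Sushi is Awesome", np.nan, np.nan, np.nan, np.nan,
                "This food is really expensive", np.nan, np.nan, np.nan, np.nan],
    "Column2": [np.nan, "Id:", "City:", "Food:", "Price:",
                np.nan, "Id:", "City:", "Food:", "Price:"],
    "Column3": [np.nan, "2261", "Tokyo", "Positive", np.nan,
                np.nan, "3`", "Osaka", "Negative", "Negative"],
})

# Rows where Column1 is missing are the "detail" rows (Id, City, Food, Price).
m = df["Column1"].isna()

# Give every detail row the review text it belongs to.
df["Column1"] = df["Column1"].ffill()

# Turn "Id:" into "Id", "City:" into "City", and so on.
df["Column2"] = df["Column2"].str.rstrip(":")

# Pivot only the detail rows; each (Column1, Column2) pair is now unique.
pivoted = df[m].pivot(index="Column1", columns="Column2", values="Column3")

# Optional tidy-up: drop the "Column2" axis label, make Column1 a regular
# column, and put the columns in the order shown in the question.
result = (
    pivoted.rename_axis(columns=None)
           .reset_index()[["Column1", "Id", "City", "Food", "Price"]]
)
print(result)
```

The original call fails because, before the forward fill, every detail row has NaN in Column1, so pairs such as (NaN, "Id:") occur more than once and pivot cannot place them in a single cell. After the fill, each pair is unique as long as no two reviews have identical text.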