I have a DataFrame in which every cell holds two observations, a count and a percentage:
         small      medium      large
apples   258 0.12%  39 0.0091%  89 0.18%
carrots  97 0.16%   6 0.012%    26 0.26%
bananas  377 0.14%  12 0.018%   128 0.22%
pears    206 0.17%  7 0.034%    116 0.24%
I want to create two separate DataFrames, one per observation. The first should look like this:
         small  medium  large
apples   258    39      89
carrots  97     6       26
bananas  377    12      128
pears    206    7       116
And the second:
         small  medium   large
apples   0.12%  0.0091%  0.18%
carrots  0.16%  0.012%   0.26%
bananas  0.14%  0.018%   0.22%
pears    0.17%  0.034%   0.24%
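I can do the split one column at a time:
# first run of non-whitespace characters (the count)
new_df1 = df['small'].str.extract(r'([^\s]+)', expand=True)
# last run of non-whitespace characters (the percentage)
new_df2 = df['small'].str.extract(r'([^\s]*$)', expand=True)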
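But I don't know how to do this for the whole DataFrame. I have many similar DataFrames with different column and row names, so I'm looking for a solution I can reuse. Thanks!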
Answer 0 (score: 2)
You can do it like this:
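# Split each cell on whitespace and keep the first / second piece.
# On pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map.
df1 = df.applymap(lambda x: x.split()[0])
df2 = df.applymap(lambda x: x.split()[1])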
Example df:
   small  medium
0  0 33%  0 33%
1  1 44%  1 33%
2  2 55%  1 55%
df1:
  small medium
0     0      0
1     1      1
2     2      1
df2:
  small medium
0   33%    33%
1   44%    33%
2   55%    55%
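Since the question asks for something reusable across many DataFrames, here is a minimal sketch of the same idea wrapped in a function. The name split_cells and its sep parameter are made up for illustration. It assumes every cell is a string with at least two whitespace-separated pieces, and it goes column by column with Series.str.split, so it does not rely on applymap.

```python
import pandas as pd


def split_cells(df, sep=None):
    """Return two frames holding the first and second piece of every cell.

    Hypothetical helper: assumes each cell is a string with at least
    two pieces separated by `sep` (whitespace when `sep` is None).
    """
    # Series.str.split works on a whole column; .str[i] picks the i-th piece.
    first = df.apply(lambda col: col.str.split(sep).str[0])
    second = df.apply(lambda col: col.str.split(sep).str[1])
    return first, second


df = pd.DataFrame(
    [["258 0.12%", "39 0.0091%", "89 0.18%"],
     ["97 0.16%", "6 0.012%", "26 0.26%"]],
    columns=["small", "medium", "large"],
    index=["apples", "carrots"],
)

counts, percents = split_cells(df)
print(counts)
print(percents)
```

Both results keep the original row and column labels, so the function should work unchanged on frames with different names. The values are still strings at this point.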
Answer 1 (score: 0)
Use pd.DataFrame.applymap together with operator.itemgetter to extract each component:
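from operator import itemgetter
import pandas as pd

df = pd.DataFrame([['258 0.12%', '39 0.0091%', '89 0.18%'],
                   ['97 0.16%', '6 0.012%', '26 0.26%']],
                  columns=['small', 'medium', 'large'],
                  index=['apples', 'carrots'])

# Turn every cell into a list of its two whitespace-separated pieces.
split = df.applymap(lambda x: x.split())
# First piece: the count, converted to int.
df1 = split.applymap(itemgetter(0)).astype(int)
# Second piece: drop the trailing '%', convert to float, scale to a fraction.
df2 = split.applymap(lambda x: x[1][:-1]).astype(float) / 100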
Note that you have to convert the strings to int and float yourself; splitting alone leaves every value as a string.
print(df1)
         small  medium  large
apples     258      39     89
carrots     97       6     26
print(df2)
          small    medium   large
apples   0.0012  0.000091  0.0018
carrots  0.0016  0.000120  0.0026
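If the pieces have already been split into two string-valued DataFrames, the conversion step can also be done column by column with vectorized string methods. This is only a sketch: the frames counts and percents and the helper percent_to_fraction are illustrative names, and it assumes every percentage string ends with a single '%'.

```python
import pandas as pd

# Illustrative inputs: the two string-valued frames produced by the split.
counts = pd.DataFrame(
    [["258", "39", "89"], ["97", "6", "26"]],
    columns=["small", "medium", "large"],
    index=["apples", "carrots"],
)
percents = pd.DataFrame(
    [["0.12%", "0.0091%", "0.18%"], ["0.16%", "0.012%", "0.26%"]],
    columns=["small", "medium", "large"],
    index=["apples", "carrots"],
)


def percent_to_fraction(col):
    """Hypothetical helper: '0.12%' -> 0.0012 for a whole column."""
    # Strip the trailing '%', parse as a number, then scale to a fraction.
    return pd.to_numeric(col.str.rstrip("%")) / 100


counts_int = counts.apply(pd.to_numeric)          # strings -> integers
fractions = percents.apply(percent_to_fraction)   # strings -> floats

print(counts_int.dtypes)
print(fractions)
```

By default pd.to_numeric raises an error on a value it cannot parse, which can be useful for spotting malformed cells early.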