Question

I have data in a DataFrame where each cell holds two observations:

                          small             medium        large
apples                258 0.12%         39 0.0091%     89 0.18%
carrots                97 0.16%          6  0.012%     26 0.26%
bananas               377 0.14%         12  0.018%    128 0.22%
pears                 206 0.17%          7  0.034%    116 0.24%

I want to create two separate DataFrames that split the observations apart, like this:

                    small           medium          large
apples                258               39             89
carrots                97                6             26
bananas               377               12            128
pears                 206                7            116

And the second one:

                      small             medium        large
apples                0.12%            0.0091%        0.18%
carrots               0.16%             0.012%        0.26%
bananas               0.14%             0.018%        0.22%
pears                 0.17%             0.034%        0.24%

I can do the split for a single column:

 # raw strings keep \s from being treated as a string escape
 new_df1 = df['small'].str.extract(r'([^\s]+)', expand=True)
 new_df2 = df['small'].str.extract(r'([^\s]*$)', expand=True)

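But I don't know how to do this for the whole DataFrame. I have many similar DataFrames with different column and row names, so I'm looking for a reusable solution. Thanks!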

Answer 1

You can do it like this:

df1 = df.applymap(lambda x: x.split()[0])
df2 = df.applymap(lambda x: x.split()[1])

Example df:

   small medium
0  0 33%  0 33%
1  1 44%  1 33%
2  2 55%  1 55%

df1:

  small medium
0     0      0
1     1      1
2     2      1

df2:

  small medium
0  33%  33%
1  44%  33%
2  55%  55%
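
To make this reusable across many DataFrames, the same idea can be wrapped in a small function. The sketch below is only one possible way to do it: `split_cells` is a hypothetical helper name, and it assumes every cell is a string with exactly two whitespace-separated parts. It works column by column with the `.str` accessor instead of `applymap`, which recent pandas releases (2.1 and later) deprecate in favor of `DataFrame.map`.

```python
import pandas as pd


def split_cells(df):
    """Hypothetical helper: split every 'count percent' cell into two DataFrames.

    Assumes each cell is a string with two whitespace-separated parts.
    """
    # Series.str.split() with no arguments splits on any run of whitespace
    first = df.apply(lambda col: col.str.split().str[0])
    second = df.apply(lambda col: col.str.split().str[1])
    return first, second


df = pd.DataFrame({'small': ['0 33%', '1 44%', '2 55%'],
                   'medium': ['0 33%', '1 33%', '1 55%']})

df1, df2 = split_cells(df)
```

Both results keep the original row and column labels, so the function does not need to know the column names in advance.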

Answer 2

Use pd.DataFrame.applymap and extract each component with operator.itemgetter:

from operator import itemgetter

import pandas as pd

df = pd.DataFrame([['258 0.12%', '39 0.0091%', '89 0.18%'],
                   ['97 0.16%', '6  0.012%', '26 0.26%']],
                  columns=['small', 'medium', 'large'],
                  index=['apples', 'carrots'])

split = df.applymap(lambda x: x.split())

df1 = split.applymap(itemgetter(0)).astype(int)
df2 = split.applymap(lambda x: x[1][:-1]).astype(float) / 100

Note that you have to take care of converting the strings to int and float, respectively.

print(df1)

         small  medium  large
apples     258      39     89
carrots     97       6     26

print(df2)

          small    medium   large
apples   0.0012  0.000091  0.0018
carrots  0.0016  0.000120  0.0026
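
As an alternative sketch that avoids element-wise lambdas, the frame can be stacked into a single Series, split once with the vectorized string methods, and unstacked again. This is not from the answer above; the names `parts`, `counts`, and `fractions` are illustrative, and it assumes every cell has exactly two whitespace-separated parts and that row and column labels are unique.

```python
import pandas as pd

df = pd.DataFrame([['258 0.12%', '39 0.0091%', '89 0.18%'],
                   ['97 0.16%', '6  0.012%', '26 0.26%']],
                  columns=['small', 'medium', 'large'],
                  index=['apples', 'carrots'])

# One long Series indexed by (row, column); split each string on whitespace
# into two columns labelled 0 and 1
parts = df.stack().str.split(expand=True)

# Column 0 holds the counts; reshape to the original layout and convert to int
counts = parts[0].unstack().reindex(columns=df.columns).astype(int)

# Column 1 holds the percentages; drop the '%' sign and convert to a fraction
fractions = (parts[1].str.rstrip('%').astype(float) / 100)
fractions = fractions.unstack().reindex(columns=df.columns)
```

The `reindex(columns=df.columns)` calls are there only to keep the original column order explicit after unstacking.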
