Question

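How do I correctly append a single item to every list in a Series whose elements are lists? I tried making a copy and appending an item to each list, but this approach also modifies the original DataFrame.

Here is my code: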

import pandas as pd

df = pd.DataFrame({'num': [['one'], ['three'], ['five']]})

# make a copy of the original df
copy_df = df.copy()

# add 'thing' to every single list
copy_df.num.apply(lambda x: x.append('thing'))

# show results of copy_df
print(copy_df)  # this will show [['one', 'thing'], ['three', 'thing'], ...]

print(df)  # this will also show [['one', 'thing'], ['three', 'thing'], ...]
# WHY?

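My questions are:

Why does the approach above also add the element to the original DataFrame?
Is there a better way to append an element to every list in a Series?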

Answer 1

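Because you are copying the DataFrame but not the lists stored inside it, the Series in the copy still references the same list objects as the original DataFrame.

A better way to do it: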

copy_df.num = copy_df.num.apply(lambda x: x + ['thing'])
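
As a minimal sketch of why this works (the variable names `original_lists` and `copied_lists` are only for illustration), `x + ['thing']` builds a new list for each row instead of mutating the list that both frames share, so the original stays untouched:

```python
import pandas as pd

df = pd.DataFrame({'num': [['one'], ['three'], ['five']]})
copy_df = df.copy()

# x + ['thing'] creates a new list per row; the lists held by df are not mutated
copy_df['num'] = copy_df['num'].apply(lambda x: x + ['thing'])

original_lists = df['num'].tolist()
copied_lists = copy_df['num'].tolist()
print(original_lists)
print(copied_lists)
```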

Answer 2

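1. The DataFrame holds references (pointers) to the lists, not the lists themselves. So when you modify a list through one DataFrame, the change shows up in every DataFrame that references it, because there is only a single list object. You can check this by looking at the ids of the lists: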

copy_df = df.copy()

copy_df['num'].apply(id)
0    140262813220744
1    140262813299528
2    140262813298888
Name: num, dtype: int64

df['num'].apply(id)
0    140262813220744
1    140262813299528
2    140262813298888
Name: num, dtype: int64

2. It is better not to store lists in a DataFrame at all, and to use a "long" table layout instead, for example:

   list_index      num
0           0    "one"
0           1  "thing"
1           0  "three"
1           1  "thing"
2           0   "five"
2           1  "thing"

You store the same data, but it is easier to work with using pandas methods.
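
One possible way to get to such a long layout is sketched below. It assumes a pandas version that provides `DataFrame.explode`; the names `long_df`, `extra` and `rebuilt` are illustrative and not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({'num': [['one'], ['three'], ['five']]})

# One row per list element; the original row label is kept in the index
long_df = df.explode('num')

# Adding 'thing' to every group is now an ordinary concat of rows
extra = pd.DataFrame({'num': 'thing'}, index=df.index)
long_df = pd.concat([long_df, extra]).sort_index(kind='mergesort')  # stable sort

# Position of each item within its original list
long_df['list_index'] = long_df.groupby(level=0).cumcount()

# If lists are needed again, they can be rebuilt per original row
rebuilt = long_df.groupby(level=0)['num'].agg(list)
print(long_df)
```

Since no list objects are mutated here, the original `df` is left as it was.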

Edit
If you use

copy_df.num = copy_df.num.apply(lambda x: x + ['thing'])

the column is rebuilt with brand-new lists:

copy_df.num
Out:
0      [one, thing]
1    [three, thing]
2     [five, thing]

copy_df.num.apply(id)
Out:
0    140262813289352
1    140262794045256
2    140262794050504

The ids have changed!

copy.deepcopy does not help either:

import copy

deepcopy_df = copy.deepcopy(df)
deepcopy_df.num.apply(id)
Out:
0    140262813220744
1    140262813299528
2    140262813298888

deepcopy_df.num.apply(lambda x: x.append('things'))
df.num  # original DataFrame
Out:
0      [one, things]
1    [three, things]
2     [five, things]
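
If an independent copy that can safely be mutated in place is really needed, one option is to deep-copy each cell explicitly. This is only a sketch, and `independent_df` is a name chosen for illustration:

```python
import copy
import pandas as pd

df = pd.DataFrame({'num': [['one'], ['three'], ['five']]})

independent_df = df.copy()
# Deep-copy every cell so the new frame owns its own list objects
independent_df['num'] = df['num'].apply(copy.deepcopy)

# In-place appends now touch only the copied lists
for items in independent_df['num']:
    items.append('things')

print(independent_df['num'].tolist())
print(df['num'].tolist())
```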

Answer 3

Alternatively, a lambda-free version of Sunil's answer:

copy_df.num=copy_df.num.apply(['thing'].__add__)

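If you are concerned that 'thing' ends up at the start, reverse each list afterwards (this reverses the whole list, so the original order is only preserved when each list held a single item):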

copy_df.num=copy_df.num.apply(['thing'].__add__).str[::-1]
