Question

For example, I have a DataFrame:

df

    category                              name
0   [['Clothing & Jewelry', 'Shoes']]     Jason
1   [['Clothing & Jewelry', 'Jewelry']]   Molly

How can I store the category column as a string, with the entries separated by commas?

The result I want:

    category                              name
0   Clothing & Jewelry, Shoes             Jason
1   Clothing & Jewelry, Jewelry           Molly

Answer 1

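You can use apply and call a lambda. Note that list.remove only works here when each cell holds a flat list of strings, as in the output below; on a nested list such as [['Clothing & Jewelry', 'Shoes']] it raises a ValueError: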

In [21]:
df['category'].apply(lambda x: x.remove('Clothing & Jewelry'))
df

Out[21]:
    category   name
0    [Shoes]  Jason
1  [Jewelry]  Molly

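Note that storing non-scalar values in a Series is problematic, because filtering and vectorized operations won't work on them. It is better to store a string that uses commas to separate the entries.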
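As a minimal sketch of that suggestion, the nested lists from the question can be flattened into comma-separated strings. The helper name to_comma_string is hypothetical, and the sample frame is rebuilt here on the assumption that each cell is a list wrapping one inner list of strings, as the question shows:

```python
import pandas as pd

# Rebuild the sample data from the question (assumed shape: [[str, str]] per cell).
df = pd.DataFrame({
    "category": [
        [["Clothing & Jewelry", "Shoes"]],
        [["Clothing & Jewelry", "Jewelry"]],
    ],
    "name": ["Jason", "Molly"],
})


def to_comma_string(cell):
    """Join the inner list of a [[...]] cell into one comma-separated string."""
    # cell[0] unwraps the outer list; hypothetical helper, not a pandas API.
    return ", ".join(cell[0])


df["category"] = df["category"].apply(to_comma_string)
print(df)
```

Once the column holds plain strings, vectorized accessors such as df["category"].str.contains("Shoes") become available for filtering.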

Edit

To answer your updated question, I would store the data elements in separate rows, since that makes filtering simpler:

In [79]:
df['category'].apply(lambda x: ','.join(x[0])).str.split(',',expand=True).stack().reset_index().drop('level_1', axis=1)

Out[79]:
   level_0                   0
0        0  Clothing & Jewelry
1        0               Shoes
2        1  Clothing & Jewelry
3        1             Jewelry

We can then merge this back onto the original df, after which we can filter:

In [80]:
# assign the merged result back to df so the following steps can use it
df = df.merge(df['category'].apply(lambda x: ','.join(x[0])).str.split(',',expand=True).stack().reset_index().drop('level_1', axis=1), left_index=True, right_on='level_0', how='left')
df

Out[80]:
                          category   name  level_0                   0
0    [[Clothing & Jewelry, Shoes]]  Jason        0  Clothing & Jewelry
1    [[Clothing & Jewelry, Shoes]]  Jason        0               Shoes
2  [[Clothing & Jewelry, Jewelry]]  Molly        1  Clothing & Jewelry
3  [[Clothing & Jewelry, Jewelry]]  Molly        1             Jewelry

In [82]:
# level_0 was only needed as the merge key
df = df.drop('level_0', axis=1)
df

Out[82]:
                          category   name                   0
0    [[Clothing & Jewelry, Shoes]]  Jason  Clothing & Jewelry
1    [[Clothing & Jewelry, Shoes]]  Jason               Shoes
2  [[Clothing & Jewelry, Jewelry]]  Molly  Clothing & Jewelry
3  [[Clothing & Jewelry, Jewelry]]  Molly             Jewelry

In [84]:
# give the unnamed column 0 a descriptive name
df.rename(columns={0:'category_values'},inplace=True)
df

Out[84]:
                          category   name     category_values
0    [[Clothing & Jewelry, Shoes]]  Jason  Clothing & Jewelry
1    [[Clothing & Jewelry, Shoes]]  Jason               Shoes
2  [[Clothing & Jewelry, Jewelry]]  Molly  Clothing & Jewelry
3  [[Clothing & Jewelry, Jewelry]]  Molly             Jewelry

In [85]:
# keep only the rows whose value is not the shared top-level category
df[df['category_values']!='Clothing & Jewelry']

Out[85]:
                          category   name category_values
1    [[Clothing & Jewelry, Shoes]]  Jason           Shoes
3  [[Clothing & Jewelry, Jewelry]]  Molly         Jewelry
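The same one-value-per-row layout can also be sketched with DataFrame.explode, which is available in pandas 0.25 and later. This is a swapped-in alternative to the join/split/stack/merge chain, not the method used in this answer; the sample frame is rebuilt from the question, and the column name category_values is reused from the session here:

```python
import pandas as pd

# Rebuild the sample data from the question (assumed shape: [[str, str]] per cell).
df = pd.DataFrame({
    "category": [
        [["Clothing & Jewelry", "Shoes"]],
        [["Clothing & Jewelry", "Jewelry"]],
    ],
    "name": ["Jason", "Molly"],
})

exploded = (
    # Unwrap the outer list so each cell is a flat list of strings.
    df.assign(category_values=df["category"].apply(lambda cell: cell[0]))
      # Emit one row per list element, repeating the other columns.
      .explode("category_values")
      # Replace the repeated original index with a fresh 0..n-1 index.
      .reset_index(drop=True)
)

# Filter out the shared top-level category, as in the last step of the session.
filtered = exploded[exploded["category_values"] != "Clothing & Jewelry"]
print(filtered)
```

Because explode works on the list elements directly, this version does not round-trip through a comma-joined string, so category names that themselves contain commas would not be split apart.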
