Extract multiple values from one column into a new column in pandas

Time: 2018-11-30 16:46:06

Tags: python python-3.x pandas dataframe

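I have a DataFrame df with a column named Category, whose values are
Category
Furniture
Technology
Office Supplies

These three values repeat, and the column has 1000 values in total. From the Category column I want to create a new column named Category_Filter that contains only the values Furniture and Technology.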

df['Category_Filter'] = df[df['Category'].isin(['Furniture', 'Technology'])]

I tried the code above to create the new column, but it did not work.

Category_Filter
Furniture
Technology

This is the desired output.

2 answers:

Answer 0: (score: 0)

I assume you mean that you want a DataFrame containing only the rows where Category is Furniture or Technology. Here is what you can do.

df[df['Category'].isin(['Furniture', 'Technology'])]

If that is not what you meant, perhaps you could clarify.

Edit: in response to your comment below

df['Category_filter'] = df['Category'].where(df['Category'].isin(['Furniture', 'Technology']))
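
To make the difference between the two snippets above concrete, here is a minimal sketch. The sample data is made up for illustration and is not the asker's actual frame. Boolean indexing drops the non-matching rows, while `Series.where` keeps every row and puts NaN where the condition is False.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's 1000-row frame.
df = pd.DataFrame({"Category": ["Furniture", "Technology", "Office Supplies"] * 2})

wanted = ["Furniture", "Technology"]
mask = df["Category"].isin(wanted)  # boolean Series, True where the category is wanted

# Row filter: keeps only the matching rows.
filtered = df[mask]

# New column: same length as df, NaN where the category is not wanted.
df["Category_filter"] = df["Category"].where(mask)
```

The second form is probably what the question is after, since a new column has to line up with every existing row.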

Answer 1: (score: 0)

If I understand you correctly, you are looking for the total number of values repeated for each element in that column.

Example DataFrame:

>>> df
        Category
0      Furniture
1     Technology
2  Office Supply
3      Furniture
4     Technology
5  Office Supply
6      Furniture
7     Technology
8  Office Supply

With your updated code, this should work on the example above; only the values that do not match are reported as NaN:

>>> df['Category_Filter'] = df[df['Category'].isin(['Furniture', 'Technology'])]
>>> df
        Category Category_Filter
0      Furniture       Furniture
1     Technology      Technology
2  Office Supply             NaN
3      Furniture       Furniture
4     Technology      Technology
5  Office Supply             NaN
6      Furniture       Furniture
7     Technology      Technology
8  Office Supply             NaN
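
One caveat, as far as I can tell: the assignment above puts a whole DataFrame on the right-hand side, and it goes through here because the example frame has a single column. With more columns, pandas typically raises a ValueError. A sketch of a variant that selects the one column explicitly (the `Sales` column is hypothetical, added only to make the frame multi-column):

```python
import pandas as pd

df = pd.DataFrame({
    "Category": ["Furniture", "Technology", "Office Supply"] * 3,
    "Sales": range(9),  # hypothetical extra column, not in the original question
})

mask = df["Category"].isin(["Furniture", "Technology"])

# Take the Category column (a Series) from the matching rows only.
# Column assignment aligns on the index, so the rows that were
# filtered out end up as NaN in the new column.
df["Category_Filter"] = df.loc[mask, "Category"]
```

This relies on index alignment, so it assumes the index labels of df are unique.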

Alternatively, if you want to drop all rows that contain NaN values, just try:

>>> df.dropna()
# df.dropna(inplace=True)   # make it permanent in the DataFrame
     Category Category_Filter
0   Furniture       Furniture
1  Technology      Technology
3   Furniture       Furniture
4  Technology      Technology
6   Furniture       Furniture
7  Technology      Technology
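
Note that a bare `df.dropna()` removes a row if any column holds NaN. If the real frame has other columns with missing values, restricting the check to the new column may be safer. A minimal sketch, again with a hypothetical `Sales` column that is not part of the question:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Category": ["Furniture", "Technology", "Office Supply", "Furniture"],
    "Sales": [10.0, np.nan, 30.0, 40.0],  # hypothetical column with a missing value
})
mask = df["Category"].isin(["Furniture", "Technology"])
df["Category_Filter"] = df["Category"].where(mask)

# Only Category_Filter is inspected, so the row whose Sales is NaN is kept.
result = df.dropna(subset=["Category_Filter"])
```

Here `result` keeps the Technology row even though its `Sales` value is missing, whereas `df.dropna()` would drop it.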