Question

I have the following DataFrame df:

col1  col2
1     C
1     B
1     A
2     C
2     C
3     A
3     C
3     B

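I need to create a new column col3 that assigns T or F per unique col1 value: for each unique col1, if at least one row has col2 equal to A, then col3 is T for every row of that group. Otherwise, it is F.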

Expected result:

col1  col2  col3
1     C     T
1     B     T
1     A     T
2     C     F
2     C     F
3     A     T
3     C     T
3     B     T

How can I do this? I tried an apply(lambda ...) solution, but it works row by row and only assigns T where col1 is 1 (basically because the last row for 1 equals A).

Answer 1

Check with groupby and transform:

df['col2'].eq('A').groupby(df['col1']).transform('any')
0     True
1     True
2     True
3    False
4    False
5     True
6     True
7     True
Name: col2, dtype: bool

df['col3']=df['col2'].eq('A').groupby(df['col1']).transform('any').map({True:'T', False:'F'})
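
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes df holds the sample data from the question; the intermediate name has_a is only illustrative.

```python
import pandas as pd

# Sample data, assumed to match the question's DataFrame.
df = pd.DataFrame({
    "col1": [1, 1, 1, 2, 2, 3, 3, 3],
    "col2": ["C", "B", "A", "C", "C", "A", "C", "B"],
})

# Row-wise test (col2 == 'A'), then broadcast "any True in this col1 group"
# back onto every row of the group.
has_a = df["col2"].eq("A").groupby(df["col1"]).transform("any")

# Turn the boolean flags into the requested 'T' / 'F' labels.
df["col3"] = has_a.map({True: "T", False: "F"})
print(df)
```

transform returns a result aligned with the original index, which is why it can be assigned straight back as a column, unlike a plain aggregation that returns one row per group.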

Answer 2

You can also use numpy's where function (np.where) for this.
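
A minimal sketch of how np.where could be applied here, assuming the sample data from the question. This is not necessarily the exact code this answer posted; the per-group mask is built with the same groupby/transform idea as in Answer 1, and the name has_a is illustrative.

```python
import numpy as np
import pandas as pd

# Sample data, assumed to match the question's DataFrame.
df = pd.DataFrame({
    "col1": [1, 1, 1, 2, 2, 3, 3, 3],
    "col2": ["C", "B", "A", "C", "C", "A", "C", "B"],
})

# Boolean mask: True for every row whose col1 group contains at least one 'A'.
has_a = df["col2"].eq("A").groupby(df["col1"]).transform("any")

# np.where(condition, value_if_true, value_if_false) picks element-wise.
df["col3"] = np.where(has_a, "T", "F")
print(df)
```

Compared with map({True: 'T', False: 'F'}), np.where does the selection in one vectorized call and returns a NumPy array, which pandas accepts for column assignment.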

Answer 3

As another solution, you can convert the int column to str with astype(str) and then use str.contains:

>>> df.assign(col3=df['col1'].astype(str).str.contains('1|3').map({True:'T', False:'F'}))
   col1 col2 col3
0     1    C    T
1     1    B    T
2     1    A    T
3     2    C    F
4     2    C    F
5     3    A    T
6     3    C    T
7     3    B    T
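
The pattern '1|3' above is written out by hand for this particular data. A variation that looks up the matching col1 values instead could be sketched as follows; it swaps str.contains for isin, assumes the question's sample data, and uses the illustrative names groups_with_a and result.

```python
import pandas as pd

# Sample data, assumed to match the question's DataFrame.
df = pd.DataFrame({
    "col1": [1, 1, 1, 2, 2, 3, 3, 3],
    "col2": ["C", "B", "A", "C", "C", "A", "C", "B"],
})

# Collect the col1 values that have at least one row with col2 == 'A'.
groups_with_a = df.loc[df["col2"].eq("A"), "col1"].unique()

# Flag rows whose col1 is one of those values; isin matches whole values,
# so a col1 of 13 would not be matched by 1 or 3.
result = df.assign(
    col3=df["col1"].isin(groups_with_a).map({True: "T", False: "F"})
)
print(result)
```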

How do I create a new column based on a condition?
