Question

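How can I check whether the column values in each row of a pandas DataFrame are the same, and store the result in a fourth column?

Original:

   red  blue  green
a    1     1      1
b    1     2      1
c    2     2      2

becomes:

   red  blue  green  match
a    1     1      1      1
b    1     2      1      0
c    2     2      2      1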

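Initially I only had 2 columns, and could get something similar by doing this:

df['match']=df['blue']-df['red']

But this does not work for 3 columns.

Many thanks for your help!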

Answer 1

To make it more generic, compare the values of each row using the apply method.

Using set:

In [54]: df['match'] = df.apply(lambda x: len(set(x)) == 1, axis=1).astype(int)

In [55]: df
Out[55]:
   red  blue  green  match
a    1     1      1      1
b    1     2      1      0
c    2     2      2      1

Alternatively, use pd.Series.nunique to count the number of unique values in each row:

In [56]: (df.apply(pd.Series.nunique, axis=1) == 1).astype(int)
Out[56]:
a    1
b    0
c    1
dtype: int32

Or, take the first column's values with df.iloc[:, 0] and compare them against df with eq:

In [57]: df.eq(df.iloc[:, 0], axis=0).all(axis=1).astype(int)
Out[57]:
a    1
b    0
c    1
dtype: int32
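The three variants in this answer can be checked against each other in one self-contained script. This is only a minimal sketch that assumes the example frame from the question; the variable names by_set, by_nunique and by_eq are illustrative and not part of the answer.

```python
import pandas as pd

# Example frame from the question
df = pd.DataFrame(
    {"red": [1, 1, 2], "blue": [1, 2, 2], "green": [1, 1, 2]},
    index=["a", "b", "c"],
)

# 1) The set of a row has exactly one element when all values match
by_set = df.apply(lambda x: len(set(x)) == 1, axis=1).astype(int)

# 2) Count the distinct values in each row
by_nunique = (df.apply(pd.Series.nunique, axis=1) == 1).astype(int)

# 3) Compare every column with the first column, row by row
by_eq = df.eq(df.iloc[:, 0], axis=0).all(axis=1).astype(int)

# Compare before adding the result column, so 'match' itself takes no part
assert by_set.equals(by_nunique) and by_nunique.equals(by_eq)

df["match"] = by_eq
print(df)
```

The variants can disagree on rows that contain NaN: nunique skips missing values by default, while eq treats NaN as unequal to everything, including another NaN.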

Answer 2

You can try this:

df["match"] = df.apply(lambda x: int(x[0]==x[1]==x[2]), axis=1)

where:

x[0]==x[1]==x[2]: tests whether the first 3 columns are equal
axis=1: applies the function to each row, that is, across the columns
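Recent pandas releases deprecate treating integer keys such as x[0] as positions on a Series that has string labels, so a position-based version is probably safer written with .iloc. A minimal sketch, assuming the example frame from the question:

```python
import pandas as pd

# Example frame from the question
df = pd.DataFrame(
    {"red": [1, 1, 2], "blue": [1, 2, 2], "green": [1, 1, 2]},
    index=["a", "b", "c"],
)

# x is one row (a Series); .iloc selects by position regardless of the labels
df["match"] = df.apply(
    lambda x: int(x.iloc[0] == x.iloc[1] == x.iloc[2]), axis=1
)
print(df["match"].tolist())  # [1, 0, 1]
```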

Alternatively, you can also refer to the columns by name:

df["match"] = df.apply(lambda x: int(x["red"]==x["blue"]==x["green"]), axis=1)

This is more convenient if you have many columns and want to compare a subset of them without knowing their positions.
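For a longer list of named columns, the chained comparison gets unwieldy, and the same check can be written without apply. The sketch below is one possible approach rather than part of this answer: the list name cols and the extra column other are hypothetical, and the comparison relies on DataFrame.nunique with axis=1.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "red": [1, 1, 2],
        "blue": [1, 2, 2],
        "green": [1, 1, 2],
        "other": [9, 9, 9],  # hypothetical column left out of the comparison
    },
    index=["a", "b", "c"],
)

cols = ["red", "blue", "green"]  # hypothetical subset of columns to compare

# A row matches when the selected columns hold exactly one distinct value
df["match"] = df[cols].nunique(axis=1).eq(1).astype(int)
print(df["match"].tolist())  # [1, 0, 1]
```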

If you want to compare all columns, use John Galt's solution.
