Question

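I have a dataframe like the following.

df = pd.DataFrame({'A' : ['Bob','Jean','Sally','Sue'], 'B' : [1,2,3, 2],'C' : [7,8,9,8] })

Assume that column A is always in the dataframe, but the other columns vary: sometimes there is only column B, sometimes columns B and C, or any number of columns.

I have written code that saves the column names (other than A) in a list, and the unique permutations of the values in those other columns in another list. For instance, in this example, columns B and C are saved into col:

col = ['B','C']

The permutations in this simple df are 1,7; 2,8; 3,9. For simplicity, assume one permutation is saved as follows:

permutation = [2,8]

How do I select the entire rows (and only those rows) that equal that permutation?

Right now, I am using:

a[a[col].isin(permutation)]

Unfortunately, I don't get the values in column A.

(I know how to drop the NaN values afterwards.) But how should I keep this dynamic? Sometimes there will be more than one column. (Eventually, I will run this in a loop and save the different iterations, based on multiple permutations of the columns other than A.)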
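For context, here is a minimal sketch of why column A comes back empty. Indexing a dataframe with a boolean dataframe behaves like `DataFrame.where`: every row is kept, and any cell the mask does not flag as True becomes NaN. Column A is not part of the mask at all, so it is blanked out entirely. The snippet assumes the sample data from this thread and calls the dataframe `df` rather than `a`; the names `cell_mask` and `masked` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': ['Bob', 'Jean', 'Sally', 'Sue'],
                   'B': [1, 2, 3, 2],
                   'C': [7, 8, 9, 8]})
col = ['B', 'C']
permutation = [2, 8]

# Boolean DataFrame covering only columns B and C (one flag per cell).
cell_mask = df[col].isin(permutation)

# Indexing with a boolean DataFrame keeps all rows and replaces every
# cell that is not flagged True with NaN. Column A has no flags at all,
# so the whole column turns into NaN.
masked = df[cell_mask]
print(masked)
```

A row selection needs a one-dimensional boolean Series (one flag per row) instead of a per-cell mask.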

Answer 1

Use the intersection of boolean Series (both conditions must be true). First, the setup code:

import pandas as pd
df = pd.DataFrame({'A' : ['Bob','Jean','Sally','Sue'], 'B' : [1,2,3, 2],'C' : [7,8,9,8] })
col = ['B','C']
permutation = [2,8]

Here is a solution for this limited example:

>>> df[(df[col[0]] == permutation[0]) & (df[col[1]] == permutation[1])]
      A  B  C
1  Jean  2  8
3   Sue  2  8

To break this down:

>>> b, c = col
>>> per_b, per_c = permutation
>>> column_b_matches = df[b] == per_b
>>> column_c_matches = df[c] == per_c
>>> intersection = column_b_matches & column_c_matches
>>> df[intersection]
      A  B  C
1  Jean  2  8
3   Sue  2  8

Additional columns and values

To handle any number of columns and values, I would write a function:

def select_rows(df, columns, values):
    # Reject empty or mismatched inputs early.
    if not columns or not values:
        raise ValueError('must pass columns and values')
    if len(columns) != len(values):
        raise ValueError('columns and values must be same length')
    # Start from True and AND in one boolean Series per column.
    intersection = True
    for c, v in zip(columns, values):
        intersection &= df[c] == v
    return df[intersection]

And use it:

>>> select_rows(df, col, permutation)
      A  B  C
1  Jean  2  8
3   Sue  2  8
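As a rough sketch of how the same idea extends to more columns, the variant below builds one mask per column and combines them with `functools.reduce`. The function name `select_rows_reduce` and the extra column `D` are invented for illustration and are not part of the question's data. Because each column is compared separately, this also copes with non-numeric values.

```python
from functools import reduce
import operator

import pandas as pd


def select_rows_reduce(df, columns, values):
    """Hypothetical variant of select_rows: AND together one mask per column."""
    if not columns or len(columns) != len(values):
        raise ValueError('columns and values must be non-empty and the same length')
    # One boolean Series per (column, value) pair.
    masks = [df[c] == v for c, v in zip(columns, values)]
    # Combine all masks with element-wise AND.
    return df[reduce(operator.and_, masks)]


df = pd.DataFrame({'A': ['Bob', 'Jean', 'Sally', 'Sue'],
                   'B': [1, 2, 3, 2],
                   'C': [7, 8, 9, 8],
                   'D': ['x', 'y', 'x', 'x']})  # D is an invented extra column

two_cols = select_rows_reduce(df, ['B', 'C'], [2, 8])
three_cols = select_rows_reduce(df, ['B', 'C', 'D'], [2, 8, 'x'])
print(three_cols)
```

With two columns this selects Jean and Sue as before; adding the condition on `D` leaves only Sue.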

Or you could coerce the permutation to an array and do this with a single comparison, assuming numeric values:

import numpy as np

def select_rows(df, columns, values):
    return df[(df[columns] == np.array(values)).all(axis=1)]

But this does not work with your code sample.
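Since the question mentions eventually looping over several permutations, one possible alternative is an inner merge against a small dataframe of the wanted combinations. This is only a sketch under assumptions: the list `permutations` and the names `wanted` and `matches` are hypothetical, and the permutations are assumed to be unique (duplicates would duplicate the matching rows).

```python
import pandas as pd

df = pd.DataFrame({'A': ['Bob', 'Jean', 'Sally', 'Sue'],
                   'B': [1, 2, 3, 2],
                   'C': [7, 8, 9, 8]})
col = ['B', 'C']
permutations = [[2, 8], [3, 9]]  # hypothetical list of wanted permutations

# One row per wanted combination, using the same column names as df.
wanted = pd.DataFrame(permutations, columns=col)

# An inner merge keeps only the rows of df whose (B, C) pair appears in wanted.
# Note: merge returns a fresh 0..n-1 index, not the original row labels.
matches = df.merge(wanted, on=col, how='inner')
print(matches)
```

This handles all permutations in one call, at the cost of losing the original index.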

Answer 2

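I found a solution. Aaron's answer above works well if I only have two columns. I need a solution that works regardless of the size of the df (it will have 3 to 7 columns).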

df = pd.DataFrame({'A' : ['Bob','Jean','Sally','Sue'], 'B' : [1,2,3, 2],'C' : [7,8,9,8] })
permutation = [2,8]
col = ['B','C']
interim = df[col].isin(permutation)
df[df.index.isin(interim[(interim != 0).all(1)].index)]

Answer 3

You can do it this way:

In [77]: permutation = np.array([0,2,2])

In [78]: col
Out[78]: ['a', 'b', 'c']

In [79]: df.loc[(df[col] == permutation).all(axis=1)]
Out[79]:
    a  b  c
10  0  2  2
15  0  2  2
16  0  2  2

Your solution will not always work correctly:

Sample DF:

In [71]: df
Out[71]:
    a  b  c
0   0  2  1
1   1  1  1
2   0  1  2
3   2  0  1
4   0  1  0
5   2  0  0
6   2  0  0
7   0  1  0
8   2  1  0
9   0  0  0
10  0  2  2
11  1  0  1
12  2  1  1
13  1  0  0
14  2  1  0
15  0  2  2
16  0  2  2
17  1  0  2
18  0  1  1
19  1  2  0

In [67]: col = ['a','b','c']

In [68]: permutation = [0,2,2]

In [69]: interim = df[col].isin(permutation)

Note the result:

In [70]: df[df.index.isin(interim[(interim != 0).all(1)].index)]
Out[70]:
    a  b  c
5   2  0  0
6   2  0  0
9   0  0  0
10  0  2  2
15  0  2  2
16  0  2  2
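The extra rows (5, 6 and 9) appear because `isin` only asks whether each cell's value occurs anywhere in the list, regardless of which column the cell belongs to. A minimal sketch on four rows taken from the sample DF above; the variable names `small`, `loose` and `exact` are made up for illustration:

```python
import numpy as np
import pandas as pd

# Rows 0, 5, 9 and 10 of the sample DF, keeping their original labels.
small = pd.DataFrame({'a': [0, 2, 0, 0],
                      'b': [2, 0, 0, 2],
                      'c': [1, 0, 0, 2]},
                     index=[0, 5, 9, 10])
col = ['a', 'b', 'c']
permutation = [0, 2, 2]

# isin: a row passes when every cell holds 0 or 2, in any position.
loose = small[small[col].isin(permutation).all(axis=1)]

# Positional comparison: column a must be 0, b must be 2, c must be 2.
exact = small[(small[col] == np.array(permutation)).all(axis=1)]

print(list(loose.index))
print(list(exact.index))
```

Row 5 (2, 0, 0) and row 9 (0, 0, 0) pass the `isin` test even though they are not the permutation (0, 2, 2); only the positional comparison narrows the result down to row 10.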

Python: given a list of columns and a list of values, return the subset of a dataframe that meets all criteria

3 answers: