Question

Suppose we have this DataFrame:

import pandas

d = {'col1': [[0,1], [0,2], [1,2], [2,3]], 'col2': ["a", "b", "c", "d"]}
df = pandas.DataFrame(data=d)

     col1 col2
0  [0, 1]    a
1  [0, 2]    b
2  [1, 2]    c
3  [2, 3]    d

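Now I need to find a specific list in col1 and return the value from col2 in that row.

For example, I want to look up [0,2] and get "b" in return.

I have already read the thread on how to do this: extract column value based on another column pandas dataframe

But when I try to apply the answers from it here, I don't get the desired result: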

df.loc[df['col1'] == [0,2], 'col2']
ValueError: Arrays were different lengths: 4 vs 2

df.query('col1==[0,2]')
SystemError: <built-in method view of numpy.ndarray object at 0x000000000D67FA80> returned a result with an error set

Answer 1

One possible solution is to compare tuples or sets:

mask = df['col1'].apply(tuple) == tuple([0,2])

mask = df['col1'].apply(set) == set([0,2])
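
As a minimal runnable sketch of the two masks above: the version below does the comparison inside the mapped function, which is an assumption made here so the result does not depend on how pandas treats a tuple or set on the right-hand side of `==`. The second frame, `df2`, is hypothetical and only illustrates that the set comparison ignores element order (and duplicates), so it also matches [2, 0].

```python
import pandas as pd

df = pd.DataFrame({'col1': [[0, 1], [0, 2], [1, 2], [2, 3]],
                   'col2': ['a', 'b', 'c', 'd']})

# Tuple comparison: exact, order-sensitive match against (0, 2).
tuple_mask = df['col1'].map(lambda v: tuple(v) == (0, 2))
tuple_result = df.loc[tuple_mask, 'col2']

# Hypothetical extra data: the same elements in a different order.
df2 = pd.DataFrame({'col1': [[0, 2], [2, 0]], 'col2': ['b', 'x']})

# Set comparison: order-insensitive, so both rows match.
set_mask = df2['col1'].map(lambda v: set(v) == {0, 2})
# Tuple comparison on the same data matches only the first row.
strict_mask = df2['col1'].map(lambda v: tuple(v) == (0, 2))
```

Use the tuple form when [0, 2] and [2, 0] should count as different lists, and the set form only when order does not matter.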

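If every value in the Series has the same length, and the list or array you compare against has that same length, you can compare as arrays:

import numpy as np
mask = (np.array(df['col1'].values.tolist()) == [0,2]).all(axis=1)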

s = df.loc[mask, 'col2']
print (s)
1    b
Name: col2, dtype: object
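
The array approach can be sketched end to end as follows. This assumes, as the answer states, that every list in col1 has the same length as the target; with lists of differing lengths the stacking step would not produce a regular 2-D array and the comparison would not work as shown.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [[0, 1], [0, 2], [1, 2], [2, 3]],
                   'col2': ['a', 'b', 'c', 'd']})
target = [0, 2]

# Stack the lists into a 2-D array of shape (n_rows, list_length).
arr = np.array(df['col1'].tolist())

# Compare element-wise against the target (broadcast across rows),
# then keep only the rows where every position matches.
mask = (arr == target).all(axis=1)

result = df.loc[mask, 'col2']
print(result)
```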

Answer 2

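I'm not sure whether logical indexing in a pandas DataFrame is possible with values that are neither numeric nor strings. Here is a simple one-line workaround that compares strings instead of lists.

df.loc[df['col1'].apply(str) == str([0,1]), 'col2'].iloc[0]

Basically, you convert every list in col1 to a string and then compare it with the string str([0,1]).

Note the .iloc[0] at the end of the line. It is there because more than one row may contain the list [0,1]; I chose to show the first matching value. (A plain [0] would look up the index label 0 rather than the first match, so it would fail for a list such as [0,2], which sits in row 1.)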
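
A small sketch of the string-comparison workaround wrapped in a function, for cases where the target may not be present at all. The helper name `first_match` and its `None` fallback are hypothetical choices for illustration, not part of the original answer. Because the match relies on the string form of each list, values must have the same types and order as the target (for example, [0, 2.0] would not match [0, 2]).

```python
import pandas as pd

df = pd.DataFrame({'col1': [[0, 1], [0, 2], [1, 2], [2, 3]],
                   'col2': ['a', 'b', 'c', 'd']})


def first_match(frame, target):
    """Return col2 of the first row whose col1 list equals target, else None.

    Hypothetical helper: compares the string form of each list.
    """
    mask = frame['col1'].apply(str) == str(target)
    matches = frame.loc[mask, 'col2']
    # Positional access, so the row's index label does not matter.
    return matches.iloc[0] if not matches.empty else None


found = first_match(df, [0, 2])
missing = first_match(df, [9, 9])
```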

How to find rows in a DataFrame that contain a specific list
