Question

I need help combining the following two steps into a single command:

df['Col2'] = df['Col1'].apply(part_is_in, values = list_1)
df['Col2'] = df['Col1'].apply(part_is_in, values = list_2)

where list_1 and list_2 are lists of strings, and

def part_is_in(x, values):
    # Return 'Yes' as soon as any string in values is a substring of x.
    for val in values:
        if val in x:
            return 'Yes'
    return 'No'

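I want to check whether the elements in Col1 are in list_1 and/or list_2. Right now I am using sequential updates, but I would like to change the definition so that it can check whether a value is in several lists. I also use the function above to check elements in other columns, so I need to keep the single-list case working as well.

Any help would be greatly appreciated. Thanks.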

Answer 1

pandas has a built-in method for this:

df[df['Col1'].isin(list1+list2)]['Col1']

This returns the elements of the 'Col1' column that are in list1 or list2.
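One caveat worth spelling out: isin tests for exact equality, whereas part_is_in in the question tests whether a list entry is a substring of the cell. The following is a minimal sketch of that difference, not the answerer's code. The sample DataFrame and list contents are made up for illustration, and the substring variant builds a regular expression with re.escape and Series.str.contains.

```python
import re

import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({"Col1": ["apple pie", "banana", "cherry tart", "kiwi"]})
list_1 = ["apple", "kiwi"]
list_2 = ["cherry"]

# Exact match: True only where the whole cell equals a list entry.
exact = df["Col1"].isin(list_1 + list_2)

# Substring match: escape each entry, then join into one alternation pattern.
pattern = "|".join(re.escape(v) for v in list_1 + list_2)
partial = df["Col1"].str.contains(pattern, regex=True)

# Convert the boolean mask to the 'Yes'/'No' labels used in the question.
df["Col2"] = partial.map({True: "Yes", False: "No"})
```

If both lists are empty, the joined pattern is an empty string, which matches every row, so that case would need a separate guard.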

Answer 2

Try this:

df['Col2'] = df['Col1'].apply(part_is_in, values = list_1 + list_2)
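Concatenating the lists keeps the original function unchanged. If the function itself should accept any number of lists, one possible variant is sketched below. This is an assumption about what the asker wants, not part of the answer: the signature changes from a values keyword to variadic positional lists, so existing calls that use values=... would have to pass args=(...) instead. The sample data is hypothetical.

```python
from itertools import chain

import pandas as pd


def part_is_in(x, *value_lists):
    """Return 'Yes' if any string from any of the given lists is a substring of x."""
    candidates = chain.from_iterable(value_lists)
    return "Yes" if any(val in x for val in candidates) else "No"


# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({"Col1": ["apple pie", "banana", "cherry tart"]})
list_1 = ["apple"]
list_2 = ["cherry"]

# Single-list case: pass a one-element tuple through Series.apply's args.
df["Col2"] = df["Col1"].apply(part_is_in, args=(list_1,))

# Several lists at once: add them to the same tuple.
df["Col3"] = df["Col1"].apply(part_is_in, args=(list_1, list_2))
```

The generator inside any() stops at the first match, so the early-return behaviour of the original loop is preserved.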

Answer 3

If you are working with a very large dataset, avoid using custom functions.

# Assume the columns you want to look at are listed in cols
new_cols = [f'{item}_1' for item in cols]
for old_col, new_col in zip(cols, new_cols):
    # where values(iterable) is whatever you want to check
    # this checks if each value in column is in values(iterable)
    # overwrite old_col if you can, if not this will add lots of new columns
    df[new_col] = df[old_col].map(lambda item: item in values) 

# Reduce each row with any(): every element in row x is True or False,
# indicating whether the original value (from old_col at row x) is in values.
# any(axis=1) is True for a row when at least one of its flags is True.
df['result'] = df[new_cols].any(axis=1)
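The same idea can be written without the helper columns. The sketch below uses DataFrame.isin followed by any(axis=1). The column names, cell values, and the values list are assumptions made up for illustration. Like the code above, it checks exact membership rather than substrings.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({"A": ["x", "y", "z"], "B": ["q", "w", "r"]})
cols = ["A", "B"]
values = ["x", "r"]

# Boolean frame: True where a cell equals one of the entries in values.
flags = df[cols].isin(values)

# True for each row that has at least one matching cell.
df["result"] = flags.any(axis=1)
```

Here no intermediate columns are added to df. Only the final result column is stored.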

Check whether an element is contained in one or more lists

3 answers: