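I am trying to create a lambda function that removes duplicate words from each row of a column. I assigned my column to a variable and wrote a function that removes duplicate words from a sentence, but I do not know how to use a lambda to apply that function to every row of the column.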
def unique_list(l):
    lst = []
    [lst.append(x) for x in l if x not in lst]
    return lst

a = 'shoes dress apple shoes mango apple'
a = ' '.join(unique_list(a.split()))
My column is 'dup_words'. Could you help me understand how to use a lambda to apply the function above to every row in my column?
Answer 0 (score: 0)
Assuming that by rows and columns you mean a structure like this:
data = [
['shoes dress apple shoes mango apple','shoes dress apple shoes mango apple','shoes dress apple shoes mango apple'], # Column 1
['shoes dress apple shoes mango apple','shoes dress apple shoes mango apple','shoes dress apple shoes mango apple'], # Column 2
['shoes dress apple shoes mango apple','shoes dress apple shoes mango apple','shoes dress apple shoes mango apple'] # Column 3
]
You can combine a list comprehension with a lambda that wraps the function you defined:
dup_words = data[0] # ['shoes dress apple shoes mango apple', 'shoes dress apple shoes mango apple', 'shoes dress apple shoes mango apple']
unique_words = [(lambda x: ' '.join(unique_list(x.split())))(row) for row in dup_words] # ['shoes dress apple mango', 'shoes dress apple mango', 'shoes dress apple mango']
This can be simplified by changing your function so that it returns the joined string:
def unique_list(l):
    lst = []
    [lst.append(x) for x in l if x not in lst]
    return ' '.join(lst)
Your lambda then becomes:
unique_words = [(lambda x: unique_list(x.split()))(row) for row in dup_words] # ['shoes dress apple mango', 'shoes dress apple mango', 'shoes dress apple mango']
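As a side note, the same order-preserving result can be obtained without the side-effect list comprehension. The sketch below is one possible alternative, not the asker's original code: `dedupe_words` is a hypothetical name, the sample rows are made up, and it relies on dicts keeping insertion order (guaranteed from Python 3.7).

```python
def dedupe_words(sentence):
    """Remove repeated words, keeping the first occurrence of each."""
    # dict.fromkeys keeps keys in insertion order (guaranteed since Python 3.7)
    return ' '.join(dict.fromkeys(sentence.split()))

# Hypothetical sample rows standing in for one column
dup_words = [
    'shoes dress apple shoes mango apple',
    'red red blue',
]

# Apply the function to every row with a lambda, as the question asks
unique_words = list(map(lambda row: dedupe_words(row), dup_words))
print(unique_words)
```

The lambda here only forwards its argument, so `map(dedupe_words, dup_words)` would work just as well; it is kept to mirror the question.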
Answer 1 (score: 0)
If the order of the words does not matter, use set(). Short and clear.
def unique_list(l):
    return list(set(l))

a = 'shoes dress apple shoes mango apple'
a = ' '.join(unique_list(a.split())) # e.g. 'dress mango shoes apple' (word order is not guaranteed)
All hail the simple one-liner:
a = 'shoes dress apple shoes mango apple'
a = ' '.join(set(a.split())) # the same four words, in arbitrary order
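Because a set has no defined order, the joined string can differ from run to run. If a stable result is wanted and alphabetical order is acceptable, one option is to sort the set before joining. This is only a sketch of that idea; the variable names `unordered` and `ordered` are illustrative.

```python
a = 'shoes dress apple shoes mango apple'

# Word order here depends on the set's internal ordering and may vary between runs
unordered = ' '.join(set(a.split()))

# Sorting gives a deterministic (alphabetical) result
ordered = ' '.join(sorted(set(a.split())))

print(ordered)  # apple dress mango shoes
```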
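Your new column can then be written like this (each cell is split into words first, because calling set() on the raw string would deduplicate characters rather than words):
df['deduped'] = df['some_column'].apply(lambda x: ' '.join(set(x.split())))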
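If the original word order should be kept in the new column, a small self-contained sketch might look like the following. It assumes pandas is installed and that every cell is a string; the DataFrame contents are made-up sample data, and only the column name 'dup_words' comes from the question.

```python
import pandas as pd

# Hypothetical sample data; 'dup_words' is the column name from the question
df = pd.DataFrame({
    'dup_words': ['shoes dress apple shoes mango apple', 'red red blue'],
})

# Split each cell into words, drop repeats while keeping first-seen order, rejoin
df['deduped'] = df['dup_words'].apply(
    lambda s: ' '.join(dict.fromkeys(s.split()))
)

print(df['deduped'].tolist())
```

Cells that are missing (NaN) would raise an error in the lambda, so they would need to be filled or skipped first.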