We have several lists:
lst1 = ['asd', '123', 'uniq','all']
lst2 = ['asd', '123', 'all']
lst3 = ['asd', 'al']
lst4 = ['all']
result = {}
shadow_list = []
for i in lst1:
    if statement1:
        result[i] = 0.25
    elif statement2:
        result[i] = 0.50
    elif statement3:
        result[i] = 0.75
    else:
        shadow_list.append(i)
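I'm talking about how popular each word is.
statement1: the word from lst1 appears only in lst1 (1/4 = 0.25)
statement2: the word from lst1 appears in lst1 and in exactly one of lst2, lst3, or lst4 (2/4 = 0.50)
statement3: the word from lst1 appears in lst1 and in exactly two of the other lists (2,3 / 2,4 / 3,4), so 3/4 = 0.75
So how do I combine boolean operators in Python to get the result above?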
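Update
We take the word 'asd' from lst1 and see that it is in only 3 of the 4 lists, so its spread is 3/4.
We take '123' and see that it is in only 2 of the 4 lists, so 2/4.
Then we take 'uniq'; this word is only in lst1, so 1/4.
I need something like this clumsy, brute-force code: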
for i in lst1:
    if i in (lst2 and lst3 and i not in lst4) or i in (lst3 and lst4 and not in lst2) or i in (lst2 and lst4 and not in lst3)
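I need to check each word's spread. We have only 4 lists. If the word is in only 3 of the 4 lists, its spread is 3/4; if it is in 2 of the 4, 2/4; if it is unique (in only one list), 1/4. Sorry for my poor English.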
Answer 0 (score: 1)
Now I understand what you want. The best fit for this is collections.Counter:
import collections
import itertools
s1 = set(['asd', '123', 'uniq','all'])
s2 = set(['asd', '123', 'all'])
s3 = set(['asd', 'all'])
s4 = set(['all'])
l = [s1, s2, s3, s4]
nrof_lists = len(l)
result = {k : v*1.0/nrof_lists for k, v in collections.Counter(itertools.chain.from_iterable(l)).items()}
print(result)
{'uniq': 0.25, 'all': 1.0, '123': 0.5, 'asd': 0.75}
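The Counter approach above relies on each collection holding a word at most once, which is why the inputs are sets. If the inputs are plain lists that may repeat a word, a minimal sketch (written for Python 3; the function name `word_spread` and the sample data with a repeated '123' are illustrative, not from the question) is to deduplicate each list before counting:

```python
from collections import Counter
from itertools import chain


def word_spread(word_lists):
    """Return {word: fraction of the lists that contain the word}."""
    # set() collapses repeats so each list contributes at most 1 per word.
    unique_per_list = [set(words) for words in word_lists]
    counts = Counter(chain.from_iterable(unique_per_list))
    return {word: n / len(unique_per_list) for word, n in counts.items()}


# Hypothetical input: '123' is repeated inside the second list.
lists = [
    ['asd', '123', 'uniq', 'all'],
    ['asd', '123', '123', 'all'],
    ['asd', 'all'],
    ['all'],
]
spread = word_spread(lists)
print(spread['123'])  # 0.5, because the repeat is counted only once
```

Without the set() step, the repeated '123' would be counted three times and reported as 0.75.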
A basic solution:
s1 = set(['asd', '123', 'uniq','all'])
s2 = set(['asd', '123', 'all'])
s3 = set(['asd', 'all'])
s4 = set(['all'])
result = {}
shadow_list = []
l = [s1, s2, s3, s4]
nrof_lists = len(l)
for word in s1:
    # Each True counts as 1, so this is the number of sets containing word
    times = sum([word in s for s in l])
    if times:
        result[word] = times*1.0/nrof_lists
    else:
        shadow_list.append(word)
# Output
print(result)
{'123': 0.5, 'all': 1.0, 'uniq': 0.25, 'asd': 0.75}
print(shadow_list)
[]
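To stay closer to the if/elif layout in the question, the same membership count can drive the branches. Note that an expression such as `i in (lst2 and lst3)` does not test both lists: `and` returns one of its operands, so for non-empty lists it only checks lst3. Counting memberships avoids that. The sketch below assumes Python 3, uses the answer's data (lst3 contains 'all'), and assumes that words found in all four lists are the ones meant for shadow_list:

```python
lst1 = ['asd', '123', 'uniq', 'all']
lst2 = ['asd', '123', 'all']
lst3 = ['asd', 'all']
lst4 = ['all']

others = [lst2, lst3, lst4]
result = {}
shadow_list = []

for word in lst1:
    # True counts as 1, so this is the number of other lists containing word.
    hits = sum(word in lst for lst in others)
    if hits == 0:      # only in lst1
        result[word] = 0.25
    elif hits == 1:    # lst1 plus exactly one other list
        result[word] = 0.50
    elif hits == 2:    # lst1 plus exactly two other lists
        result[word] = 0.75
    else:              # in all four lists (assumed shadow_list case)
        shadow_list.append(word)

print(result)       # {'asd': 0.75, '123': 0.5, 'uniq': 0.25}
print(shadow_list)  # ['all']
```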
Answer 1 (score: 0)
Counter is a great fit for this:
from collections import Counter
counter = Counter()
word_sets = [
    # Use sets (with curly brackets: {}) to prevent duplicates, i.e. {'abc'} == {'abc', 'abc'}
    {'asd', '123', 'uniq', 'all'},
    {'asd', '123', 'all'},
    {'asd', 'all'},
    {'all'}
]
for word_set in word_sets:
    counter.update(word_set)
print counter # Counter({'all': 4, 'asd': 3, '123': 2, 'uniq': 1})
def spread(word):
    return float(counter[word]) / len(word_sets)
for word in word_sets[0]:
    print(word, spread(word))
Final output (as printed by Python 2, where print(word, spread(word)) prints a tuple):
('123', 0.5)
('all', 1.0)
('uniq', 0.25)
('asd', 0.75)
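Since the question is about word popularity, it may also help to list the words from most to least widespread. The following is a sketch for Python 3 built on the same counter; the `ranked` name is illustrative, and it uses `Counter.most_common()`, which returns (word, count) pairs ordered from the highest count down:

```python
from collections import Counter

word_sets = [
    {'asd', '123', 'uniq', 'all'},
    {'asd', '123', 'all'},
    {'asd', 'all'},
    {'all'},
]

counter = Counter()
for word_set in word_sets:
    counter.update(word_set)

# Convert each count into the fraction of sets that contain the word.
ranked = [(word, count / len(word_sets)) for word, count in counter.most_common()]
for word, fraction in ranked:
    print(word, fraction)
```

With this data every word has a different count, so the order is all, asd, 123, uniq.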