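I want to de-duplicate the list below, but also keep a list of the duplicates so I can show them on the next screen. The data comes from a CSV file, so it would be nice to show the user what was added and what was not (the "dupes").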
[
['first_name', 'last_name', 'email'],
['Danny', 'Lastnme', 'name@email.com'],
['Sally', 'Surname', 'name@email.com'],
['Sally', 'Surname', 'name@email.com'], < -- Dupe
['Sally', 'Surname', 'name@email.com'], < -- Dupe
['Chris', 'Lastnam', 'name@email.com'],
['Larry', 'Seconds', 'name@email.com'],
['Barry', 'Barrins', 'name@email.com'],
['Glenn', 'Melting', 'name@email.com'],
['Glenn', 'Melting', 'name@email.com'], < -- Dupe
]
The end result should be two lists: a clean, de-duplicated list and a list of the duplicates.
Uniques:
[
['first_name', 'last_name', 'email'],
['Danny', 'Lastnme', 'name@email.com'],
['Sally', 'Surname', 'name@email.com'],
['Chris', 'Lastnam', 'name@email.com'],
['Larry', 'Seconds', 'name@email.com'],
['Barry', 'Barrins', 'name@email.com'],
['Glenn', 'Melting', 'name@email.com'],
]
Dupes:
[
['Sally', 'Surname', 'name@email.com'],
['Sally', 'Surname', 'name@email.com'],
['Glenn', 'Melting', 'name@email.com'],
]
Answer 0 (score: 1)
You can copy and paste this code to get a dictionary containing the dupes and the uniques:
a = [
['first_name', 'last_name', 'email'],
['Danny', 'Lastnme', 'name@email.com'],
['Sally', 'Surname', 'name@email.com'],
['Sally', 'Surname', 'name@email.com'],
['Sally', 'Surname', 'name@email.com'],
['Chris', 'Lastnam', 'name@email.com'],
['Larry', 'Seconds', 'name@email.com'],
['Barry', 'Barrins', 'name@email.com'],
['Glenn', 'Melting', 'name@email.com'],
['Glenn', 'Melting', 'name@email.com'],
]
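result = {}
# Rows are lists (unhashable), so convert them to tuples; the header row is skipped.
b = [tuple(x) for x in a[1:]]
# dict.fromkeys keeps first-seen order (Python 3.7+), which set() does not.
result['unique'] = [list(x) for x in dict.fromkeys(b)]
# To show which ones have duplicates, use Mr E's solution:
from collections import Counter
t = Counter(b)
dupes = []
for k, v in t.items():
    if v > 1:
        # Add one row for each extra occurrence beyond the first.
        dupes.extend(list(k) for _ in range(v - 1))
result['dupes'] = dupes
print(result)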
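If the original row order matters and the header row should stay in the result, a single pass with a `seen` set may be enough. The following is only a minimal sketch, assuming every row is a list of hashable values and the first row is the header; `split_dupes` is a hypothetical helper name, not something from the answer above.

```python
def split_dupes(rows):
    """Split rows into (uniques, dupes), preserving first-seen order.

    Assumption: the first row is a header and is always kept in uniques.
    """
    if not rows:
        return [], []
    header, *data = rows
    seen = set()
    uniques = [header]
    dupes = []
    for row in data:
        key = tuple(row)  # lists are unhashable, tuples are hashable
        if key in seen:
            dupes.append(row)      # repeat of an earlier row
        else:
            seen.add(key)
            uniques.append(row)    # first time this row appears
    return uniques, dupes
```

Calling `uniques, dupes = split_dupes(a)` on the list above should give the two lists from the question, with each repeated row appearing in `dupes` once per extra occurrence.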
Answer 1 (score: 1)
Try this. It is the simplest approach.
name_list = [
['first_name', 'last_name', 'email'],
['Danny', 'Lastnme', 'name@email.com'],
['Sally', 'Surname', 'name@email.com'],
['Sally', 'Surname', 'name@email.com'],
['Sally', 'Surname', 'name@email.com'],
['Chris', 'Lastnam', 'name@email.com'],
['Larry', 'Seconds', 'name@email.com'],
['Barry', 'Barrins', 'name@email.com'],
['Glenn', 'Melting', 'name@email.com'],
['Glenn', 'Melting', 'name@email.com'],
]
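# Sort the data rows (header skipped) so identical records end up next to each other.
sorted_name_list = sorted(name_list[1:])
last_record = None
Unique = []
Dupes = []
for record in sorted_name_list:
    if last_record != record:
        # First time this record is seen.
        Unique.append(record)
    else:
        # Same as the previous record, so it is a duplicate.
        Dupes.append(record)
    last_record = record
print(Unique)
print(Dupes)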
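The sort-then-compare idea above can also be expressed with `itertools.groupby`, which groups adjacent equal items. This is a rough sketch rather than a drop-in replacement: it assumes the header has already been removed and that all fields are mutually comparable (for example, all strings, as they would be when read from a CSV). `split_sorted` is a hypothetical name.

```python
from itertools import groupby

def split_sorted(rows):
    """Return (uniques, dupes) from data rows, in sorted order."""
    uniques, dupes = [], []
    # After sorting, identical rows are adjacent, so groupby collects them.
    for _, group in groupby(sorted(rows)):
        copies = list(group)
        uniques.append(copies[0])   # keep one copy of each record
        dupes.extend(copies[1:])    # every further copy is a duplicate
    return uniques, dupes
```

Like the loop above, this returns the rows in sorted order rather than in their original order.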
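Answer 2 (score: 0)
You can get the frequencies with Counter:
from collections import Counter
t = Counter(tuple(x) for x in data[1:])
# One copy of every distinct row, whether or not it was repeated.
uniques = [list(k) for k in t]
# One row for each extra occurrence beyond the first.
dupes = [list(k) for k, v in t.items() for _ in range(v - 1)]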
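Since the question mentions that the rows come from a CSV file, the same Counter idea can be applied directly to a `csv.reader`. The sketch below uses an in-memory string as a stand-in for the real file; the sample text and variable names are assumptions for illustration only.

```python
import csv
import io
from collections import Counter

# Hypothetical in-memory CSV standing in for the uploaded file.
csv_text = """first_name,last_name,email
Danny,Lastnme,name@email.com
Sally,Surname,name@email.com
Sally,Surname,name@email.com
Glenn,Melting,name@email.com
Glenn,Melting,name@email.com
"""

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)  # first line is the header row
counts = Counter(tuple(row) for row in reader)

# Counter remembers insertion order (Python 3.7+), so file order is kept.
uniques = [header] + [list(k) for k in counts]
dupes = [list(k) for k, v in counts.items() for _ in range(v - 1)]
```

With a real file, `io.StringIO(csv_text)` would be replaced by a file object opened with `newline=''`, as the `csv` module documentation recommends.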