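I am new to Python, so apologies in advance if my code is not written in the most "pythonic" way.
I am uploading a CSV file to this script, and I want to filter that CSV file when a row meets certain criteria.
I have two lists, a_lst and b_lst. Once the dictionaries are in a_lst, I check whether b_lst contains a dictionary with a corresponding key:value pair. If there is a matching item, it is printed to the console. Instead of printing to the console, I would like to remove that item from a_lst. How do I do this?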
import csv

a_lst = []
b_lst = []
with open(file_name, 'rt') as f:
    reader = csv.DictReader(f)
    for row in reader:
        if row['Minutes'] == '0' and row['MB'] == '0' and row['Calls'] == '1':
            a_lst.append(row)
        elif row['Minutes'] == '0' and row['MB'] == '' and row['Calls'] == '1':
            a_lst.append(row)
        elif row['Minutes'] == '' and row['MB'] == '0' and row['Calls'] == '1':
            a_lst.append(row)
        elif row['Minutes'] == '' and row['MB'] == '' and row['Calls'] == '1':
            a_lst.append(row)
        else:
            b_lst.append(row)
i = 0
while i < len(a_lst):
    # check whether any row in b_lst has the same name
    if not any(d['Name'] == a_lst[i]['Name'] for d in b_lst):
        print(a_lst[i]['Name']+"(Row"+str(i)+") is not in b_lst.")
    else:
        print(a_lst[i]['Name']+"(Row"+str(i)+") is present.")
    i += 1
Edit: my desired result
Name, PhoneNo, Minutes, MB, Calls
Steve,0777777777,0,0,1
Steve,0777777777,0,2,14
Steve,0777777777,0,0,1
John,078888888,0,0,1
John,078888888,0,0,1
John,078888888,0,0,1
Dave,07999999,2,3,4
Dave,07999999,2,6,24
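If the data above were my input, I would only want to see John's name, because he is the only one for whom every row under his name contains the values '0,0,1'.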
Answer 0 (score: 2)
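To simply remove an element from the list when it has the same key/value, you also want to drop the not before any, so that the element is removed when there is a match. Iterate over a copy (a_lst[:]) so that removing from a_lst does not skip elements: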
for ele in a_lst[:]:
    if any(d['Name'] == ele['Name'] for d in b_lst):
        a_lst.remove(ele)
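As an alternative to removing items in place, here is a minimal sketch that builds a new list instead. It assumes a_lst and b_lst are lists of dicts with a 'Name' key, as in the question; the sample rows and the name names_in_b are made up for illustration.

```python
# Hypothetical sample data shaped like the rows csv.DictReader would yield.
a_lst = [{'Name': 'Steve', 'Calls': '1'}, {'Name': 'John', 'Calls': '1'}]
b_lst = [{'Name': 'Steve', 'Calls': '14'}, {'Name': 'Dave', 'Calls': '4'}]

# Collect the names once so each membership test does not rescan b_lst.
names_in_b = {d['Name'] for d in b_lst}

# Keep only the rows whose name never appears in b_lst.
a_lst = [row for row in a_lst if row['Name'] not in names_in_b]

print(a_lst)
```

Building the set first avoids the nested scan that any(...) performs for every element, which may matter for larger files.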
Or forget about using any and filter before you append: add row['Name'] to a set and check whether we have already seen it:
import csv

a_lst = []
seen = set()
with open(file_name, 'rt') as f:
    reader = csv.DictReader(f)
    for row in reader:
        # skip names that already had a non-matching row
        if row['Name'] in seen:
            continue
        if all((row['Minutes'] == '0', (row['MB'] == '0' or not row['MB']), row['Calls'] == '1')):
            a_lst.append(row)
        # MB must be '0' or empty here as well, matching the conditions in the question
        elif all((not row['Minutes'], (row['MB'] == '0' or not row['MB']), row['Calls'] == '1')):
            a_lst.append(row)
        else:
            seen.add(row['Name'])
        # remove "else:" and just use seen.add(row['Name']) outside the elif if you want all dups removed
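The four branches in the question differ only in whether Minutes and MB are '0' or empty, so the test can be collapsed into one predicate. The following is a sketch only; the function name is_idle is hypothetical, and it assumes the rows are dicts as produced by csv.DictReader.

```python
def is_idle(row):
    """True when Minutes and MB are '0' or empty and Calls is exactly '1'."""
    return (row['Minutes'] in ('0', '')
            and row['MB'] in ('0', '')
            and row['Calls'] == '1')

# Hypothetical rows for illustration.
rows = [
    {'Name': 'John', 'Minutes': '0', 'MB': '', 'Calls': '1'},
    {'Name': 'Dave', 'Minutes': '2', 'MB': '3', 'Calls': '4'},
]
idle_names = [row['Name'] for row in rows if is_idle(row)]
print(idle_names)
```

With such a helper, the if/elif pair in the loop could become a single if is_idle(row): branch.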
Based on your edit:
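import csv

seen = set()
with open(infile, 'rt') as f:
    reader = csv.reader(f, delimiter=",")
    for row in reader:
        # only look at the first row for each name
        if row[0] in seen:
            continue
        # print the row if every value after the phone number is 0 or 1
        if all(x in {"0", "1"} for x in row[2:]):
            print(row)
        seen.add(row[0])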
Output:
['Steve', '0777777777', '0', '0', '1']
['John', '078888888', '0', '0', '1']
These are the first rows for Steve and John, which contain only 0 and 1 in the relevant columns.
If you only want the names that have exclusively 0 and 1 in those columns:
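import csv
from collections import defaultdict

d = defaultdict(list)
with open(infile, 'rt') as f:
    reader = csv.reader(f, delimiter=",")
    for row in reader:
        # store each row together with the set of its usage values
        d[row[0]].append([row, set(row[2:])])
# keep the first row of every name whose rows all reduce to {"0", "1"}
print([v[0][0] for k, v in d.items() if all(sub[1] == {"0", "1"} for sub in v)])
Output: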
[['John', '078888888', '0', '0', '1']]
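Comparing sets treats a row such as 1,0,1 the same as 0,0,1. If the exact values '0,0,1' from the question are required on every row, a stricter check could look like the sketch below. It is an illustration only: the function name names_with_only is hypothetical, and the sample data is inlined with io.StringIO instead of being read from infile.

```python
import csv
import io

# Sample data from the question, inlined so the sketch is self-contained.
data = """Name,PhoneNo,Minutes,MB,Calls
Steve,0777777777,0,0,1
Steve,0777777777,0,2,14
Steve,0777777777,0,0,1
John,078888888,0,0,1
John,078888888,0,0,1
John,078888888,0,0,1
Dave,07999999,2,3,4
Dave,07999999,2,6,24
"""

def names_with_only(rows, wanted=("0", "0", "1")):
    """Return the names whose every row has exactly `wanted` after the phone number."""
    ok = {}
    for row in rows:
        name, usage = row[0], tuple(row[2:])
        # a name stays True only while all of its rows match
        ok[name] = ok.get(name, True) and usage == wanted
    return [name for name, good in ok.items() if good]

reader = csv.reader(io.StringIO(data))
next(reader)  # skip the header row
result = names_with_only(reader)
print(result)
```

For the sample data this leaves only John, since Steve has a 0,2,14 row and Dave has no matching rows at all.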
If your names are always grouped together, you can just use sets:
import csv

seen = set()
temp = set()
with open(infile, 'rt') as f:
    reader = csv.reader(f, delimiter=",")
    next(reader)  # skip the header row
    prev = None
    for row in reader:
        # found new name and it is not the first
        if row[0] not in seen and temp:
            # set should only have 0 and 1 if all columns only contain 0,1
            if temp == {"0", "1"}:
                print(prev)  # print previous row
            # reset temp
            temp = set()
        seen.add(row[0])
        temp.update(row[2:])
        # need to keep track of previous row
        prev = row
    # the last group is never followed by a new name, so check it here
    if temp == {"0", "1"}:
        print(prev)
Output:
['John', '078888888', '0', '0', '1']
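When the names are grouped together, itertools.groupby from the standard library can do the bookkeeping of detecting where one name ends and the next begins. This is a sketch under the same assumption of contiguous names; the sample data is inlined with io.StringIO, and the variable names are illustrative.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter

# Sample data from the question, inlined so the sketch is self-contained.
data = """Name,PhoneNo,Minutes,MB,Calls
Steve,0777777777,0,0,1
Steve,0777777777,0,2,14
Steve,0777777777,0,0,1
John,078888888,0,0,1
John,078888888,0,0,1
John,078888888,0,0,1
Dave,07999999,2,3,4
Dave,07999999,2,6,24
"""

reader = csv.reader(io.StringIO(data))
next(reader)  # skip the header row

matches = []
# groupby yields one (name, rows) pair per run of consecutive equal names
for name, group in groupby(reader, key=itemgetter(0)):
    rows = list(group)
    values = set()
    for row in rows:
        values.update(row[2:])
    # the name qualifies only if its rows contain nothing but 0 and 1
    if values == {"0", "1"}:
        matches.append(rows[-1])

print(matches)
```

Because groupby emits every run, including the final one, no separate handling is needed for the last name in the file.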