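Now I want to remove duplicates from the Area column, but the check must be based on the key. In other words, one key must not have duplicate Area values. An Area may repeat across different keys, but not within the same key.
I am trying to build this, but I can't work out the logic behind it.
This is my code: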
import csv
OUTPUT_FILE = 'Desired_format.csv'
filename = "optionsbook.csv"
sublist = []
with open("./" + filename, "r") as file, open(OUTPUT_FILE, 'w') as f_out:
    reader = csv.DictReader(file)
    for line in reader:
        line["key"] = line["bhk"], line["Area"], line["Property_Type"]
        if line["Area"] in line:
            continue
        else:
            sublist.append(line["key"])
Answer 0 (score: 1)
You can use toolz.unique. If you don't have access to this library, you can use the equivalent unique_everseen recipe from the itertools documentation.
Here's a demo:
from io import StringIO
import csv
from toolz import unique
x = StringIO("""key,Area,SomeField
12345,53.5,THIS
12345,56.1,IS
12345,76.0,A
67572,35.7,MINIMAL
67572,76.1,EXAMPLE""")
# replace x with open('file.csv', 'r')
with x as fin:
    reader = unique(csv.DictReader(fin), lambda x: x['key'])
    res = list(reader)
print(res)
[OrderedDict([('key', '12345'), ('Area', '53.5'), ('SomeField', 'THIS')]),
OrderedDict([('key', '67572'), ('Area', '35.7'), ('SomeField', 'MINIMAL')])]
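
Note that the demo above keys on `key` alone, so it keeps only the first row for each key. If the goal is to drop an Area only when it repeats within the same key, one option is to deduplicate on the pair `(key, Area)` instead. The following is a minimal sketch using only the standard library. The `unique_everseen` function here is a simplified version written for this example in the spirit of the itertools recipe, not a copy of it, and the sample rows are made up for illustration.

```python
import csv
from io import StringIO


def unique_everseen(iterable, key=None):
    """Yield items in order, skipping any whose key was already seen."""
    seen = set()
    for item in iterable:
        k = item if key is None else key(item)
        if k not in seen:
            seen.add(k)
            yield item


# Hypothetical sample data: Area 53.5 repeats within key 12345
# and also appears once under key 67572.
data = StringIO("""key,Area,SomeField
12345,53.5,THIS
12345,53.5,DUPLICATE
12345,76.0,A
67572,53.5,MINIMAL
67572,76.1,EXAMPLE""")

# replace data with open('file.csv', 'r', newline='') for a real file
with data as fin:
    reader = csv.DictReader(fin)
    # A tuple key means a row is dropped only if both key and Area repeat.
    rows = list(unique_everseen(reader, key=lambda row: (row['key'], row['Area'])))

for row in rows:
    print(row['key'], row['Area'], row['SomeField'])
```

With this composite key, the second `12345,53.5` row is skipped, while `67572,53.5` is kept because it belongs to a different key. The same `key` function can be passed to `toolz.unique` if that library is available.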