I have a CSV file, file1.csv:
Territory Sales Zipcode city statename
00001000 10 99764
and another file containing the city details:
Zipcode city Statename
99764 Northway Alaska
I want to update file1.csv so that it looks like this:
Territory Sales Zipcode city statename
00001000 10 99764 Northway Alaska
This is like a typical UPDATE statement in SQL:
UPDATE file1 SET file1.value = (SELECT file2.CODE
FROM file2
WHERE file1.value = file2.DESC)
How can I do this in Python?
Answer 0 (score: 3)
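import pandas as pd

# Read every column as a string so leading zeros (e.g. Territory 00001000) survive
file1 = pd.read_csv('file1.csv', dtype=str)
file2 = pd.read_csv('file2.csv', dtype=str)
# A left join keeps every row of file1, even if its Zipcode is missing from file2
df = pd.merge(file1, file2, how='left', on='Zipcode')
# index=False stops pandas from writing its row index as an extra column
df.to_csv('new_file.csv', index=False)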
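With a left join, rows of file1 whose Zipcode has no match in file2 are kept, with empty city and state values. A minimal sketch of how to spot those rows, assuming small in-memory frames that stand in for the two files (the unmatched zip code 00000 is made up for illustration):

```python
import pandas as pd

# Hypothetical in-memory stand-ins for file1.csv and file2.csv
file1 = pd.DataFrame({
    "Territory": ["00001000", "00001001"],
    "Sales": ["10", "11"],
    "Zipcode": ["99764", "00000"],  # "00000" is a made-up zip code with no match
})
file2 = pd.DataFrame({
    "Zipcode": ["99764"],
    "city": ["Northway"],
    "Statename": ["Alaska"],
})

# how="left" keeps every file1 row; indicator=True adds a "_merge" column
# that is "both" for matched rows and "left_only" for unmatched ones
merged = pd.merge(file1, file2, how="left", on="Zipcode", indicator=True)

# Rows whose Zipcode was not found in file2 (city and Statename are NaN)
unmatched = merged[merged["_merge"] == "left_only"]
print(merged)
print(unmatched["Zipcode"].tolist())
```

Dropping the _merge column before saving keeps the output file in the shape the question asks for.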
Answer 1 (score: 1)
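If you don't have access to pandas or don't want to install it, you can use the csv module. Note the use of an intermediate dictionary, d2, which maps each zip code to the city and state name from file2.csv: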
import csv

with open('file1.csv', newline='') as file1, open('file2.csv', newline='') as file2, \
        open('output.csv', 'w', newline='') as outfile:
    output = csv.writer(outfile, delimiter=' ')
    # d2 maps each zip code to the remaining columns of its row (city, state name)
    d2 = {zipcode: cols for zipcode, *cols in csv.reader(file2, delimiter=' ', skipinitialspace=True)}
    for *cols, zipcode in csv.reader(file1, delimiter=' ', skipinitialspace=True):
        output.writerow([*cols, zipcode, *d2.get(zipcode, [])])
Given the following contents of file1.csv:
Territory Sales Zipcode city statename
00001000 10 99764
00001001 11 99999
and the following contents of file2.csv:
Zipcode city Statename
99764 Northway Alaska
99999 Somewhere CoolState
output.csv will contain:
Territory Sales Zipcode city statename
00001000 10 99764 Northway Alaska
00001001 11 99999 Somewhere CoolState
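Also note that because city and state names can contain spaces, you should avoid using a space as the delimiter and use actual commas instead; in that case the space-specific options in the code above are no longer needed.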
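A minimal sketch of the comma-separated variant, assuming both files have been converted to use commas and carry a header row. The in-memory file objects, the lookup name, the multi-word "Some Where" and "Cool State" values, and the unmatched zip code 12345 are all made up for illustration:

```python
import csv
import io

# Hypothetical comma-separated versions of the two files (in memory for brevity)
file1 = io.StringIO(
    "Territory,Sales,Zipcode\n"
    "00001000,10,99764\n"
    "00001001,11,99999\n"
    "00001002,12,12345\n"
)
file2 = io.StringIO(
    "Zipcode,city,Statename\n"
    "99764,Northway,Alaska\n"
    "99999,Some Where,Cool State\n"
)
outfile = io.StringIO()

# Map each zip code to its full row from file2
lookup = {row["Zipcode"]: row for row in csv.DictReader(file2)}

fieldnames = ["Territory", "Sales", "Zipcode", "city", "Statename"]
writer = csv.DictWriter(outfile, fieldnames=fieldnames, lineterminator="\n")
writer.writeheader()
for row in csv.DictReader(file1):
    # Unknown zip codes get empty city/state cells instead of raising KeyError
    match = lookup.get(row["Zipcode"], {})
    writer.writerow({
        **row,
        "city": match.get("city", ""),
        "Statename": match.get("Statename", ""),
    })

print(outfile.getvalue())
```

Because a comma is the csv module's default delimiter, no delimiter arguments are needed, and names containing spaces stay in a single column.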
Answer 2 (score: 0)
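The files you provided are not well-formed, because they contain multiple consecutive spaces. Each column of a DSV (delimiter-separated values) file needs to be separated by a single special character (for example, a comma).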
In this example I am using Pandas, but since Pandas sometimes has trouble when a space is used as the separator, I converted the files as follows:
file1.csv
Territory,Sales,Zipcode
00001000,10,99764
file2.csv
Zipcode,city,Statename
99764,Northway,Alaska
A script that uses Pandas to write file3.csv looks like this:
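import pandas as pd

# Load both files via pandas; dtype=str keeps leading zeros (e.g. Territory 00001000)
file1 = pd.read_csv('file1.csv', sep=',', dtype=str)
file2 = pd.read_csv('file2.csv', sep=',', dtype=str)

# Merge results (inner join by default) and save them without the index column
merge = file1.merge(file2, on='Zipcode')
merge.to_csv('file3.csv', sep=',', index=False)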
You could also use sep=' ', but I advise against it because the DSV files are malformed, as mentioned above.
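If converting the files is not an option, one possible workaround is to pass a regular expression as the separator so that any run of whitespace counts as a single delimiter. A minimal sketch, assuming hypothetical space-padded inputs (simplified to three header columns in file1) and values that contain no spaces themselves:

```python
import io
import pandas as pd

# Hypothetical space-padded inputs (in memory for brevity)
file1_text = "Territory  Sales Zipcode\n00001000   10    99764\n"
file2_text = "Zipcode city     Statename\n99764   Northway Alaska\n"

# sep=r"\s+" treats any run of whitespace as one delimiter;
# dtype=str keeps leading zeros in Territory and Zipcode
file1 = pd.read_csv(io.StringIO(file1_text), sep=r"\s+", dtype=str)
file2 = pd.read_csv(io.StringIO(file2_text), sep=r"\s+", dtype=str)

merged = file1.merge(file2, on="Zipcode", how="left")
print(merged.to_csv(index=False))
```

This only works while no value contains whitespace: a city name made of two words would be split into separate fields, which is why commas remain the safer choice.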