Question

Here is one file, result.csv:

M11251TH1230 
M11543TH4292 
M11435TDS144

Here is another file, sample.csv:

M11435TDS144,STB#1,Router#1 
M11543TH4292,STB#2,Router#1 
M11509TD9937,STB#3,Router#1
M11543TH4258,STB#4,Router#1

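Can I write a Python program that compares the two files and, for every line in sample.csv, appends 1 if the first field of that line matches a line in result.csv, and appends 0 otherwise?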

Answer 1

The following snippet should work for you:

import csv

# Collect every value from result.csv.
with open('result.csv', newline='') as f:
    reader = csv.reader(f)
    result_list = []
    for row in reader:
        result_list.extend(row)

# Append 1 to each sample.csv row whose first field appears in result.csv, else 0.
with open('sample.csv', newline='') as f:
    reader = csv.reader(f)
    sample_list = []
    for row in reader:
        if row[0] in result_list:
            sample_list.append(row + [1])
        else:
            sample_list.append(row + [0])

# Overwrite sample.csv with the flagged rows.
with open('sample.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(sample_list)
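
As a variation on the snippet above, here is a minimal Python 3 sketch that wraps the same logic in a function and writes to a separate output file instead of overwriting sample.csv. The function name `flag_matches`, the `output_path` parameter, and the whitespace stripping are assumptions added for illustration; they are not part of the original answer.

```python
import csv


def flag_matches(result_path, sample_path, output_path):
    """Append 1 or 0 to each row of sample_path and write it to output_path.

    flag_matches and output_path are hypothetical names for this sketch.
    """
    # Collect every non-empty value from the result file into a set.
    with open(result_path, newline='') as f:
        wanted = {cell.strip() for row in csv.reader(f) for cell in row if cell.strip()}

    flagged = []
    with open(sample_path, newline='') as f:
        for row in csv.reader(f):
            if not row:
                continue  # skip blank lines
            # Compare the stripped first field so stray spaces do not block a match.
            flagged.append(row + [1 if row[0].strip() in wanted else 0])

    with open(output_path, 'w', newline='') as f:
        csv.writer(f).writerows(flagged)
    return flagged
```

Calling `flag_matches('result.csv', 'sample.csv', 'flagged.csv')` would leave both inputs untouched, which makes it easier to rerun the script while testing.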

Answer 2

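import pandas as pd

d1 = pd.read_csv("1.csv", names=["Type"])
d2 = pd.read_csv("2.csv", names=["Type", "Col2", "Col3"])
d2["Index"] = 0

# Use .loc so the assignment modifies d2 itself (chained indexing may not).
for x in d1["Type"]:
    d2.loc[d2["Type"] == x, "Index"] = 1

# index=False keeps pandas' row numbers out of the output file.
d2.to_csv("3.csv", header=False, index=False)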

Here "1.csv" and "2.csv" are your CSV input files, and "3.csv" is the result you need.
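
The per-value loop can probably be replaced by a single vectorized call. The sketch below assumes pandas is installed and uses `Series.isin`; the in-memory `io.StringIO` buffers are stand-ins for "1.csv" and "2.csv" so the example runs on its own, and the variable names `result_csv`, `sample_csv`, and `output` are illustrative only.

```python
import io

import pandas as pd

# Inline stand-ins for the two input files (assumption: same layout as in the question).
result_csv = io.StringIO("M11251TH1230\nM11543TH4292\nM11435TDS144\n")
sample_csv = io.StringIO(
    "M11435TDS144,STB#1,Router#1\n"
    "M11543TH4292,STB#2,Router#1\n"
    "M11509TD9937,STB#3,Router#1\n"
    "M11543TH4258,STB#4,Router#1\n"
)

d1 = pd.read_csv(result_csv, names=["Type"])
d2 = pd.read_csv(sample_csv, names=["Type", "Col2", "Col3"])

# isin() yields a boolean Series; astype(int) turns it into 1/0.
d2["Index"] = d2["Type"].isin(d1["Type"]).astype(int)

# With no path argument, to_csv returns the CSV text as a string.
output = d2.to_csv(header=False, index=False)
print(output)
```

To write a file instead, pass a path such as "3.csv" as the first argument to `to_csv`.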

Answer 3

A solution using csv.reader and csv.writer (the csv module):

import csv

newLines = []
# change the file path to the actual one
# newline='' is what the csv module documentation recommends for files it reads or writes
with open('./data/result.csv', newline='') as csvfile:
    data = csv.reader(csvfile)
    items = [''.join(line) for line in data]

with open('./data/sample.csv', newline='') as csvfile:
    data = list(csv.reader(csvfile))
    for line in data:
        # append 1 if the first field appears in result.csv, else 0
        line.append(1 if line[0] in items else 0)
        newLines.append(line)

with open('./data/sample.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows(newLines)

Resulting sample.csv contents:

M11435TDS144,STB#1,Router#1,1
M11543TH4292,STB#2,Router#1,1
M11509TD9937,STB#3,Router#1,0
M11543TH4258,STB#4,Router#1,0

Answer 4

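With only one column, I wonder why you made result.csv a CSV at all. If it is never going to have more columns, a simple file read is enough. Converting the data from result.csv into a dictionary also helps the lookups run quickly.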

import csv

result_file = "result.csv"
sample_file = "sample.csv"

with open(result_file) as fp:
    result_data = fp.read()
    # Dictionary keys give fast membership tests.
    result_dict = dict.fromkeys(result_data.splitlines())
    # You can change the above logic, in case you have very few fields on csv, like this:
    # result_data = fp.readlines()
    # result_dict = {}
    # for result in result_data:
    #     key, other_field = result.split(",", 1)
    #     result_dict[key] = other_field.strip()

# Since sample.csv is a real csv, using csv reader and writer
with open(sample_file, newline="") as fp:
    sample_data = csv.reader(fp)
    output_data = []
    for data in sample_data:
        # int(True) is 1 and int(False) is 0
        output_data.append(data + [int(data[0] in result_dict)])

with open(sample_file, "w", newline="") as fp:
    data_writer = csv.writer(fp)
    data_writer.writerows(output_data)
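
To illustrate the point about fast lookups: a set offers the same average constant-time membership test as dictionary keys, and filtering blank lines avoids the empty-string key that splitting file contents on "\n" leaves behind when the file ends with a newline. This is only a sketch; `load_keys` is a hypothetical helper, and the inline `result_text` string stands in for the contents of result.csv.

```python
def load_keys(text):
    """Return the non-empty, whitespace-stripped lines of text as a set.

    load_keys is a hypothetical helper for this sketch.
    """
    return {line.strip() for line in text.splitlines() if line.strip()}


# Stand-in for the contents of result.csv (assumption: one ID per line).
result_text = "M11251TH1230\nM11543TH4292\nM11435TDS144\n"
keys = load_keys(result_text)

# Membership tests on a set (or dict) take constant time on average,
# whereas `in` on a list scans the elements one by one.
print("M11435TDS144" in keys)  # True
print("M11509TD9937" in keys)  # False
print("" in keys)              # False
```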

Python file matching and appending

4 answers: