Question

Suppose we have the following CSV file:

file1.csv

#groups id  owner
abc id1 owner1
abc id2 owner1
bcx id1 owner2
cpa id3 owner1

The following script reads file1.csv, filters on the first column (#groups), and adds extra characters to the matching values:

#!/bin/env python2
#!/usr/bin/python

import re
import csv
print "Enter path to original file"
GROUPS = raw_input()
print "Enter path to modified file"
WORKING = raw_input()

def filter_lines(f):
    """This generator function uses a regular expression
    to yield only lines that have `abc` at the start
    and NO `gep` anywhere in the record.
    """
    # (?!.*gep) rejects `gep` anywhere after `abc`, not just directly after it
    filter_regex = r'^abc(?!.*gep)'
    for line in f:
        line = line.strip()
        m = re.match(filter_regex, line)
        if m:
            yield line

pat = re.compile(r'^(abc)(?!.*gep.*)')  # insert gep into any abc record that doesn't already contain gep

#insert gep 
variable1 = 0  

with open(GROUPS, 'r') as f: 
    with open(WORKING, 'w') as data:
        #next(f)  # Skip over header in input file.
        #filter
        filter_generator = filter_lines(f)
        csv_reader = csv.reader(filter_generator)
        count = 0
        writer = csv.writer(data) #, quoting=csv.QUOTE_ALL
        for row in csv_reader:
            count += 1
            variable1 = pat.sub(r'\1_gep', row[0])  # modify all filtered records to include gep (abc -> abc_gep)
            fields = [variable1]
            writer.writerow(fields)

print 'Filtered (abc at Start and NO gep) Rows Count = ' + str(count)

For example, abc becomes abc_gep, and we write it to another CSV file, file2.csv.

So file2.csv now contains only:

abc_gep
abc_gep

So far, so good.
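For anyone less familiar with negative lookaheads, the substitution step can be tried on its own. This is only an illustrative sketch: the helper name `add_gep` and the sample values are made up, and the `_gep` suffix is taken from the output shown above rather than from any confirmed requirement.

```python
import re

# Match a leading "abc" only when "gep" does not appear anywhere after it.
pat = re.compile(r'^(abc)(?!.*gep)')

def add_gep(value):
    """Append '_gep' to a leading 'abc' unless the value already contains 'gep'."""
    return pat.sub(r'\1_gep', value)

# Hypothetical sample values: one to rewrite, one already tagged, one unrelated.
samples = ['abc', 'abc_gep', 'bcx']
results = [add_gep(s) for s in samples]
print(results)
```

Values that already contain `gep`, or that do not start with `abc`, pass through unchanged.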

Now I want to add the remaining columns from file1.csv for the rows that match abc.

How can I do that?

I tried the following:

fields = [variable1,row[1],row[2]]

But this hard-codes the columns instead of handling them dynamically. I am looking for something more like this:

fields = [variable1, row[i]]

Essentially, this is the result I am looking for in file2.csv:

abc_gep id1 owner1
abc_gep id2 owner1

How can I add columns to a .csv file dynamically?
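One possible way to avoid hard-coding the column indexes is list slicing: `row[1:]` is every column after the first, however many there are. The following is a minimal sketch, not a confirmed answer. It assumes Python 3, a space-delimited file (pass a different `delimiter` if the real file uses commas or tabs), and that checking only the first column for `abc` and `gep` is acceptable. The function name `rewrite_rows` and the in-memory sample data are hypothetical.

```python
import csv
import io
import re

# Leading "abc" with no "gep" anywhere after it.
pat = re.compile(r'^(abc)(?!.*gep)')

def rewrite_rows(src, dst, delimiter=' '):
    """Copy rows whose first column matches, renaming it and keeping all other columns."""
    reader = csv.reader(src, delimiter=delimiter, skipinitialspace=True)
    writer = csv.writer(dst, delimiter=delimiter, lineterminator='\n')
    count = 0
    for row in reader:
        # Skip blank lines, the header, and rows that do not qualify.
        if not row or not pat.match(row[0]):
            continue
        # Modified first column plus whatever columns follow it.
        fields = [pat.sub(r'\1_gep', row[0])] + row[1:]
        writer.writerow(fields)
        count += 1
    return count

# In-memory stand-ins for file1.csv and file2.csv (assumed layout).
sample = io.StringIO(
    "#groups id owner\n"
    "abc id1 owner1\n"
    "abc id2 owner1\n"
    "bcx id1 owner2\n"
    "cpa id3 owner1\n"
)
out = io.StringIO()
n = rewrite_rows(sample, out)
print(out.getvalue())
```

With real files, `src` and `dst` would be the handles from `open(GROUPS)` and `open(WORKING, 'w', newline='')`.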

0 answers: