Question

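I'm trying to extract data from an input file and iterate over a symbol file to build the output file, but my code creates unwanted duplicates in the output file. The input file is very large, so I need to filter the input first and only then look each row up in the symbol (city/state) file to generate the output.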

import csv

i_file = 'InputFile.csv'
o_file = 'OutputFile.csv'
symbol_file = 'SymbolFile.csv'
city = 'Tampa'
state = 'FL'

with open(symbol_file, 'r') as symfile:
    with open(i_file, 'r') as infile:
        with open(o_file, 'w') as outfile:

            reader = csv.reader(infile)
            symbol = csv.reader(symfile)
            writer = csv.writer(outfile, delimiter=',')

            for row in reader:
                if row[2] == city and row[3] == state:

                    # rescan the whole symbol file for every matching input row
                    for line in symbol:
                        if row[4] == line[0]:
                            # writes one output row per matching symbol line
                            nline = [str(city), str(line[3])]
                            writer.writerow(nline)
                    symfile.seek(0)

Answer 1

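"I only need one output row for each row of the input file, provided there is a matching row in the symbol file."

Then try it like this: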

import csv

i_file = 'InputFile.csv'
o_file = 'OutputFile.csv'
symbol_file = 'SymbolFile.csv'

city = 'Tampa'
state = 'FL'

# load the symbols from the symbol file and store them in a dictionary
symbols = {}
with open(symbol_file, 'r') as symfile:
    for line in csv.reader(symfile):
        # key is line[0] which is the thing we match against
        # value is line[3] which appears to be the only thing of interest later
        symbols[line[0]] = line[3]

# now read the other files
with open(i_file, 'r') as infile, open(o_file, 'w') as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile, delimiter = ',')

    for row in reader:
        # check if `row[4] in symbols`
        # which essentially checks whether row[4] is equal to a line[0] in the symbols file
        if row[2] == city and row[3] == state and row[4] in symbols:
            # get the symbol
            symbol = symbols[row[4]]

            # write output
            nline = [city, symbol]
            writer.writerow(nline)
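
To see why the dictionary gives at most one output row per input row, here is a small self-contained sketch of the same approach. It uses in-memory strings instead of the real files; the sample data and the helper name `filter_rows` are made up for illustration, and it assumes the duplicates come from a key appearing more than once in the symbol file.

```python
import csv
import io


def filter_rows(input_text, symbol_text, city, state):
    """Return CSV text with at most one [city, symbol] row per matching input row."""
    # Build the lookup once; if a key repeats, the last value seen wins.
    symbols = {}
    for line in csv.reader(io.StringIO(symbol_text)):
        symbols[line[0]] = line[3]

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(input_text)):
        # Same test as in the answer: city, state, and a known symbol key.
        if row[2] == city and row[3] == state and row[4] in symbols:
            writer.writerow([city, symbols[row[4]]])
    return out.getvalue()


# Hypothetical sample data: the key "A1" appears twice in the symbol file.
symbol_text = "A1,x,y,Alpha\nA1,x,y,Alpha\nB2,x,y,Beta\n"
input_text = "1,n,Tampa,FL,A1\n2,n,Miami,FL,B2\n3,n,Tampa,FL,B2\n"

result = filter_rows(input_text, symbol_text, "Tampa", "FL")
print(result)
```

With this sample data the nested-loop version would write the "A1" match twice, because it emits one row for every matching symbol line. The dictionary collapses repeated keys, so each input row can produce only one output row. If repeated keys in your real symbol file carry different values in column 3, you would need to decide which one to keep.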

Loop creating unwanted duplicates
