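Comparing strings in one txt file with another

Date: 2017-03-21 21:35:32

Tags: python string python-3.x

I am working on a script that will be used for daily automation.

I have 2 files. One is a static file containing a list of cusips. The second is a data file that looks like this: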

<Imports/>
<InterpretFXRates/>
<SCXList date="20170309">
<SCX type="cs" iso="USD" symbol="SPLS" cusip="855030102" name="STAPLES INC COM" issuer="us" record="20170324" maturity="20170413" intdiv=".48" sap="NR" moody="NR" apinternalid="USD855030102" action="a"/>
<SCX type="cs" iso="USD" symbol="ARE" cusip="015271109" name="ALEXANDRIA REAL ESTATE EQUITIES INC COM" issuer="us" record="20170331" maturity="20170417" intdiv="3.32" sap="NR" moody="NR" apinternalid="USD015271109" action="a"/>
<SCX type="cs" iso="USD" symbol="AMGN" cusip="031162100" name="AMGEN INC COM" issuer="us" record="20170517" maturity="20170608" intdiv="4.6" sap="NR" moody="NR" apinternalid="USD031162100" action="a"/>

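What I want to do is iterate over every cusip in the static file and check whether it appears in any of the lines above. If it is found, that line should be left out of the new file.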

import csv

bond_list = 'BondFilterList.txt'  # contains the list of cusips
dataport_file = 'test.scx'        # contains the <SCX ... data
output_file = 'out.scx'

data = []

with open(bond_list, 'r') as bl, open(dataport_file, 'r') as df:

    for cusip in bl:
        lines = [y.strip() for y in df]
        for line in lines:
            if cusip in line:
                print("Matched")
            else:
                data.append(line)


with open(output_file, "w") as output:
     writer = csv.writer(output, lineterminator = '\n', escapechar = ' ',     quoting = csv.QUOTE_NONE)
     for x in data:
         writer.writerow([x])

output.close

I must be missing something, because my if statement always evaluates to False.

2 Answers:

Answer 0: (score: 1)

for cusip in bl:
    for line in (y.strip() for y in df):   

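This is a double loop over 2 file iterators. The inner loop only works on the first pass. On every later pass it is not even entered, because df has already reached the end of the file.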
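The exhaustion is easy to reproduce without real files. A minimal sketch, using `io.StringIO` objects as stand-ins for the two open files (the sample contents are made up for illustration):

```python
import io

# In-memory stand-ins for the two open files (hypothetical sample data).
bl = io.StringIO("855030102\n015271109\n")
df = io.StringIO("line one\nline two\nline three\n")

# Count how many data lines the inner loop sees on each outer pass.
lines_seen_per_pass = []
for cusip in bl:
    count = 0
    for line in df:  # df is an iterator: it is consumed as it is read
        count += 1
    lines_seen_per_pass.append(count)

print(lines_seen_per_pass)  # [3, 0]
```

Calling `df.seek(0)` before each inner loop would also rewind the file, but reading the lines into a list once avoids re-reading the file for every cusip.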

Rewrite:

lines = [y.strip() for y in df]  # listcomp not gencomp: compute a real list
for cusip in bl:
    for line in lines:   
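Put together, the rewrite might look like the sketch below. It makes two additions that are not in the snippet above. First, it strips each cusip: lines read from a file keep their trailing newline, so an unstripped `cusip` can never be a substring of a stripped `line`. Second, it loops over the lines on the outside so that each line is kept at most once. The function name `filter_lines` and the sample data are hypothetical.

```python
import io

def filter_lines(cusip_file, data_file):
    """Return the data lines that contain none of the listed cusips."""
    # Read both inputs once, so neither iterator is exhausted mid-loop.
    lines = [y.strip() for y in data_file]
    # Strip the newline from each cusip and skip blank entries.
    cusips = [c.strip() for c in cusip_file if c.strip()]

    kept = []
    for line in lines:
        for cusip in cusips:
            if cusip in line:
                break          # matched: drop this line
        else:
            kept.append(line)  # no cusip matched: keep the line
    return kept

# In-memory stand-ins for BondFilterList.txt and test.scx.
bl = io.StringIO("855030102\n")
df = io.StringIO(
    '<SCX symbol="SPLS" cusip="855030102"/>\n'
    '<SCX symbol="ARE" cusip="015271109"/>\n'
)
result = filter_lines(bl, df)
print(result)  # ['<SCX symbol="ARE" cusip="015271109"/>']
```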

Answer 1: (score: 0)

Here is my final code. Thanks again to Jean for the help.

import csv

bond_list = 'BondFilterList.txt'  # contains the list of cusips
dataport_file = 'test.scx'        # contains the <SCX ... data
output_file = 'out.scx'

data = []

with open(bond_list, 'r') as bl, open(dataport_file, 'r') as df:
    # Read both files once, up front, so neither iterator is exhausted mid-loop.
    lines = [y.strip() for y in df]
    # Skip blank entries: an empty string is "in" every line.
    cusips = [cusip.upper().strip() for cusip in bl if cusip.strip()]

    for line in lines:
        # Keep the line only if none of the cusips appears in it.
        if not any(cusip in line for cusip in cusips):
            data.append(line)

# The with block closes the file, so no explicit close() call is needed.
with open(output_file, "w") as output:
    writer = csv.writer(output, lineterminator='\n', escapechar=' ', quoting=csv.QUOTE_NONE)
    for x in data:
        writer.writerow([x])
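One possible variation, not part of the answer above: pull the `cusip="..."` attribute out of each line with a regular expression, test it against a set, and write the kept lines with plain `write` calls. This compares the cusip attribute itself rather than searching the whole line, and it sidesteps `csv.writer`, which with `QUOTE_NONE` and an `escapechar` should put the escape character in front of quote characters and delimiters found in the data. The helper name `filter_scx` and the regex are assumptions based on the sample lines in the question.

```python
import io
import re

# Assumed attribute layout, taken from the sample data: cusip="855030102"
CUSIP_RE = re.compile(r'\bcusip="([^"]*)"')

def filter_scx(cusip_file, data_file, output):
    """Copy data_file to output, skipping lines whose cusip is listed."""
    # A set gives fast membership tests; blank lines are ignored.
    cusips = {c.strip().upper() for c in cusip_file if c.strip()}
    for line in data_file:
        match = CUSIP_RE.search(line)
        if match and match.group(1).upper() in cusips:
            continue  # listed cusip: drop the line
        output.write(line.strip() + "\n")  # other lines are copied through

# In-memory stand-ins for the three files (sample data is illustrative).
bl = io.StringIO("855030102\n\n")
df = io.StringIO(
    '<SCXList date="20170309">\n'
    '<SCX symbol="SPLS" cusip="855030102" apinternalid="USD855030102"/>\n'
    '<SCX symbol="ARE" cusip="015271109" apinternalid="USD015271109"/>\n'
)
out = io.StringIO()
filter_scx(bl, df, out)
print(out.getvalue())
```

With real files, the same function can be called inside `with open('BondFilterList.txt') as bl, open('test.scx') as df, open('out.scx', 'w') as out:`.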