Matching files with a Python program

Time: 2011-08-18 11:33:37

Tags: python

I have two files as follows:

File1
ids
CID5265
CID7263
CID9289
....

File 2
ids
CID7363  3.5e-06 -3837 
CID5265  4.5      -938
CID9289  8.9      -9873
....

I want to compare file1 with file2 and, for every line in file2 whose ID also appears in file1, print the whole line, like this:

CID9289  8.9  -9873
....

The Python script I wrote to do this is as follows:

infile = open("file1","r")

searchtxt = open("file2.txt","r")

for line in infile.readlines():

    if searchtxt in line:

       print line

But I get the following error:

Traceback (most recent call last):
  File "finding_words.py", line 7, in <module>
    if searchtxt in line:
TypeError: 'in <string>' requires string as left operand, not file

I know I am making a very simple mistake, but I cannot figure it out. Can anyone tell me how to fix this?

Thanks in advance

NI

4 Answers:

Answer 0: (score: 2)

Use the following:

# searchTxtData and inFileData are presumably the lines of the ID file and of the data file
print [line for id in searchTxtData for line in inFileData if id.strip() in line]

Or use the with statement:

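# Read the non-empty IDs from the first file
ids = [id.strip() for id in open("file1.txt","r") if id.strip()]

with open("file2.txt","r") as dataFile:
    for line in dataFile:
        # Skip blank lines and compare only the first column
        if line.strip() and line.split()[0] in ids:
            print line,   # trailing comma: the line already ends with a newline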

Answer 1: (score: 2)

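# Usage: foo.py ID_FILE DATA_FILE
import sys

# Collect the IDs into a set for fast membership tests
with open(sys.argv[1]) as id_file:
    ids = set(line.strip() for line in id_file)

with open(sys.argv[2]) as data_file:
    for line in data_file:
        fields = line.split()
        # Skip blank lines; match on the first column only
        if fields and fields[0] in ids:
            print line,   # trailing comma: the line already ends with a newline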

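Answer 2: (score: 1)

Your program fails because searchtxt is a file object, not a string. Presumably you want to add another loop over that file object and check the text read from searchtxt against line.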
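
A minimal sketch of that extra loop, written for Python 3 (so print is a function) and using in-memory lists in place of the two files. The helper name find_matches is hypothetical, and the sample rows are copied from the question with the header lines left out.

```python
def find_matches(id_lines, data_lines):
    """Return the data lines that contain any of the given IDs (substring test)."""
    matches = []
    for data_line in data_lines:
        for id_line in id_lines:
            wanted = id_line.strip()
            # Compare a string against a string, never a file object against a string
            if wanted and wanted in data_line:
                matches.append(data_line)
                break  # one hit is enough for this data line
    return matches


# Sample rows taken from the question (header lines omitted)
id_lines = ["CID5265\n", "CID7263\n", "CID9289\n"]
data_lines = [
    "CID7363  3.5e-06 -3837\n",
    "CID5265  4.5      -938\n",
    "CID9289  8.9      -9873\n",
]

for line in find_matches(id_lines, data_lines):
    print(line, end="")
```

The nested loop is quadratic in the number of lines, which is fine for small files; for large ones a set lookup on the first column is likely the better choice.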

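Answer 3: (score: 0)

The problem is that you are searching backwards: instead of checking whether a line from the first file appears in the second file, you are asking whether the entire second file is contained in a line of the first file. You need to iterate over the second file and check whether the key that starts each line is in the data from the first file. Change your code as follows: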

infile = open("file1","r")

keys = set()
next(infile, None)   # skip first line
for line in infile:
    keys.add(line.strip())   # get rid of trailing newline

searchtxt = open("file2.txt","r")

next(searchtxt, None)   # skip first line
for line in searchtxt:
    fields = line.split(None, 1)   # split on any run of whitespace
    # The guard avoids an unpacking error on blank or single-column lines
    if fields and fields[0] in keys:
        print line,   # the line already ends with a newline
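
For readers on Python 3, the approach described in this answer might be sketched as below. This is only an illustration under stated assumptions: filter_by_ids is a hypothetical helper name, io.StringIO objects stand in for file1 and file2.txt, and both inputs are assumed to start with a one-line "ids" header as in the question.

```python
import io


def filter_by_ids(id_file, data_file):
    """Yield data lines whose first column is an ID listed in id_file."""
    next(id_file, None)  # skip the "ids" header line
    keys = {line.strip() for line in id_file if line.strip()}

    next(data_file, None)  # skip the header line
    for line in data_file:
        fields = line.split(None, 1)  # split on any run of whitespace
        if fields and fields[0] in keys:
            yield line


# In-memory stand-ins for file1 and file2.txt, using rows from the question
file1 = io.StringIO("ids\nCID5265\nCID7263\nCID9289\n")
file2 = io.StringIO(
    "ids\n"
    "CID7363  3.5e-06 -3837\n"
    "CID5265  4.5      -938\n"
    "CID9289  8.9      -9873\n"
)

matched = list(filter_by_ids(file1, file2))
for line in matched:
    print(line, end="")
```

With real files, the two StringIO objects would be replaced by open(...) calls inside with blocks so the files are closed automatically.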