Python: script won't find a word in a text file

Time: 2016-06-03 15:14:42

Tags: python file

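I am trying to find specific words in a text file, but my script never seems to match a word against what is written on a line of the file, even though I know it is there. I have noticed there is whitespace, but since I am testing entry in line, shouldn't that still work?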

I have also tried:

  if str(entry) in line:, 
  if str(entry) in str(line): and 
  if entry in str(line): 

But none of them seem to work.

I can't see where I'm going wrong. Any help would be greatly appreciated.

Here is my code:

with open(address+'file_containing_data_I_want.txt') as f:
    for entry in System_data:
        print "Entry:"
        print entry 
        for line in f:
            print "Start of line"
            print line
            print"End of line"
            if entry in line:
                print "Found entry in line" #This never gets printed

With the print statements (for the first entry only) I see:

Entry:
Manufacturer


Start of line
??

End of line
Start of line


End of line
Start of line
Manufacturer=manufacturer_data

End of line
Start of line
Model=model_data

End of line
Start of line


End of line
Start of line


End of line

The text file looks like this (note: I cannot change the text file, because this is how I receive it; ' denotes an empty line):

'
'
Manufacturer=manufacturer_data
Model=model_data
'
'
'

Update: I changed my script to:

with open(address+'file_containing_data_I_want.txt') as f:
    for line in f:
        print "Start of line %s" % line
        print"End of line" 
        for entry in System_data:
            print "Entry: %s" % entry
            if entry in line.strip():
                print "Found entry in line"

This is what gets printed (still no "Found entry in line"):

Entry: Manufacturer
Entry: Model
Start of line: 
End of line
Entry: Manufacturer
Entry: Model
Start of line: Manufacturer=manufacturer_data
End of line
Entry: Manufacturer
Entry: Model
Start of line: Model=model_data
Entry: Manufacturer
Entry: Model
Start of line: 
End of line
Entry: Manufacturer
Entry: Model
Start of line: 
End of line

Changing my code to:

for line in f:
    print "Start of line: %s" % line.strip("\r\n")
    print "End of line" 
    for entry in System_data:
        print "Entry: %s" % entry.strip()
        if entry.strip() in line.strip("\r\n"):
            print "FOUND!!!!!!!!!!!!!"

gives me this:

Start of line: ??
End of line
Entry: Manufacturer
Entry: Model
Start of line: 
End of line
Entry: Manufacturer
Entry: Model
Start of line: Manufacturer=manufacturer_data
End of line
Entry: Manufacturer
Entry: Model
Start of line: Model=model_data
End of line

3 answers:

Answer 0: (score: 1)

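After the first pass of the outer loop, the file has already been read to the end, so the inner loop has nothing left to iterate over for the remaining entries. Instead, swap the loops so that every entry in System_data is checked against each line of the file: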

for line in f:
    print "Start of line %s" % line
    print "End of line" 
    for entry in System_data:
        print "Entry: %s" % entry
        if entry.strip() in line.strip("\r\n"):
            print "Found entry in line" #This now gets printed

Alternatively, you can correct this behavior in your current code by calling f.seek(0) before for line in f.
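
To illustrate the point above, here is a minimal sketch that compares the original loop order with and without f.seek(0). The temporary file and the system_data list are hypothetical stand-ins for the asker's file and System_data, and the file is written as plain ASCII, so this only demonstrates the exhausted-iterator issue.

```python
import os
import tempfile

# Hypothetical stand-in for the question's System_data list.
system_data = ["Manufacturer", "Model"]

# Hypothetical plain-text copy of the data file.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("\n\nManufacturer=manufacturer_data\nModel=model_data\n\n")

found_without_seek = []
found_with_seek = []

# Original loop order: the file is exhausted after the first entry.
with open(path) as f:
    for entry in system_data:
        for line in f:
            if entry in line:
                found_without_seek.append(entry)

# Same loop order, but rewinding the file before each pass.
with open(path) as f:
    for entry in system_data:
        f.seek(0)
        for line in f:
            if entry in line:
                found_with_seek.append(entry)

os.remove(path)

print(found_without_seek)  # ['Manufacturer']
print(found_with_seek)     # ['Manufacturer', 'Model']
```

Swapping the loops reads the file only once, which is usually preferable to rewinding it for every entry.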

Answer 1: (score: 0)

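You should strip all whitespace/newline characters from both the entries and the lines of the file. So, prefix everything with

entry = entry.strip()

and change

if entry in line:

to

if entry in line.strip():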

Edit: also, what Moses Koledoye said.
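
As a small sketch of why stripping the entry matters more than stripping the line: a substring test ignores whatever surrounds the match in the line, but any extra whitespace inside the entry itself has to match too. The padded entry below is a hypothetical example, not taken from the question's data.

```python
line = "Manufacturer=manufacturer_data\r\n"

entry_clean = "Manufacturer"
entry_padded = "Manufacturer \n"  # hypothetical entry with trailing whitespace

clean_match = entry_clean in line              # trailing "\r\n" on the line is harmless
padded_match = entry_padded in line            # whitespace in the entry breaks the match
stripped_match = entry_padded.strip() in line  # stripping the entry restores it

print(clean_match)     # True
print(padded_match)    # False
print(stripped_match)  # True
```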

Answer 2: (score: 0)

OK, it seems the problem is that the string is actually in hex form, although it is only shown to me that way when I use print repr(line). It looks like: '\x00m\x00a\x00n\x00u\x00f\x00a\x00c\x00t\x00u\x00r\x00e\x00r\x00_\x00d\x00a\x00t\x00a\x00'

So I changed my code to the following:

import re

with open(address+'file_containing_data_I_want.txt') as f:
    for line in f:
        # Keep only word characters and '=', which drops the \x00 bytes
        line = re.sub(r'[^\w=]', '', line.strip())
        for entry in System_data:
            if entry in line:
                print "Found entry in line"

This script now enters the if entry in line: branch and prints "Found entry in line".
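
The "??" printed for the first line and the \x00 byte between every character are consistent with a UTF-16 encoded file (a byte-order mark followed by two-byte characters). The thread does not confirm the encoding, so treat that as an assumption. If it holds, decoding the file on open is an alternative to removing the bytes with a regex. A minimal sketch, using a hypothetical temporary file and a stand-in system_data list:

```python
import io
import os
import tempfile

# Hypothetical stand-in for the question's System_data list.
system_data = ["Manufacturer", "Model"]

# Write a hypothetical UTF-16 copy of the data file (Python adds a BOM).
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with io.open(path, "w", encoding="utf-16") as tmp:
    tmp.write(u"\n\nManufacturer=manufacturer_data\nModel=model_data\n\n")

# Read without decoding: the raw bytes contain a NUL next to each character.
with open(path, "rb") as raw:
    raw_bytes = raw.read()

# Assumption: the file is UTF-16. io.open decodes it and consumes the BOM.
found = []
with io.open(path, encoding="utf-16") as f:
    for line in f:
        for entry in system_data:
            if entry in line:
                found.append(entry)

os.remove(path)

print(found)  # ['Manufacturer', 'Model']
```

io.open is used here because it accepts an encoding argument on both Python 2 and Python 3. If the real file uses a different encoding, the encoding name would need to change accordingly.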