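I am trying to search multiple files to find a match for each ingredient in a recipe. Most of the time the search results are correct; in some cases, however, the results are wrong, and I cannot work out where the problem is. The code I am using and the text files it scans can be found below. Any help would be greatly appreciated.
The text files being searched can be found here: All ingredient text file material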
import os
import re
path_files = '/users/haida/desktop/Ingredients'
listing = os.listdir(path_files)
flag = 0
category = []
x = [u' water ', u' fresh lemon juice ',
     u' uncooked quick-cooking bulgur ',
     u' cubed cooked chicken breast (about 1/2 pound) ',
     u' finely chopped fresh parsley ',
     u' (14-ounce) can quartered artichoke hearts, drained and coarsely chopped ',
     u' grape tomatoes, halved ',
     u" light Northern Italian salad dressing with basil and Romano (such as Ken's Steak House Lite) ",
     u' fresh lemon juice ', u'pork', u'cheese', u'apple', u'sugar', u'lime']
# For each ingredient, scan every file; the file name (minus .txt) is the category.
for i in x:
    for infile in listing:
        text = open(path_files + '/' + infile, 'r')
        for line in text.readlines():
            line = line.strip()
            #if line.lower() in i[].lower():
            # Each line of the file is used as the regex pattern.
            if re.search(line.lower(), i.lower()):
                category.append(infile.rsplit('.txt', 1)[0])
                flag = 1
                break
    if flag != 1:
        category.append('others')
    flag = 0
i = 0
print 'Right result:'
while i < len(x):
    print x[i] + ' ' + category[i]
    i = i + 1
y = [u' bulgur wheat ',
     u' coarsely chopped onions ',
     u' ground lean (7% fat or less) beef or lamb ',
     u'ice',
     u' pepper ',
     u' ground cumin ',
     u' ground cinnamon ',
     u' About 1 teaspoon salt ', 'meat',
     u' chopped parsley ',
     'ice',
     u'meat',
     u' plain nonfat yogurt ']
# Same lookup as above, repeated for the second ingredient list.
for i in y:
    for infile in listing:
        text = open(path_files + '/' + infile, 'r')
        for line in text.readlines():
            line = line.strip()
            #if line.lower() in i[].lower():
            if re.search(line.lower(), i.lower()):
                category.append(infile.rsplit('.txt', 1)[0])
                flag = 1
                break
    if flag != 1:
        category.append('others')
    flag = 0
i = 0
print '\n \n'
print 'Correct Result: '
while i < len(y):
    print y[i] + ' ' + category[i]
    i = i + 1
Note: to see the results, you need to change the value of "path_files" and split the linked data into separate text files.
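For comparison, here is a minimal sketch of the same lookup wrapped in a function, in case it helps narrow the problem down. It rests on a few observations about the script above that may or may not explain the wrong results: `category` is never cleared between the two loops, so `category[i]` in the second printout still indexes the entries appended for `x`; an ingredient that matches in more than one file appends more than one category, because `break` only leaves the line loop; a blank line in a file is an empty pattern, which matches every ingredient; and a bare pattern such as `ice` also matches inside `juice`. The helper names `load_categories` and `categorize` are hypothetical, the sketch is written for Python 3, and the whole-word rule (`\b`) is an assumption about the intended matching, not something the data files confirm.

```python
import os
import re


def load_categories(path_files):
    """Map each category (file name without .txt) to its non-empty, lower-cased lines."""
    categories = {}
    for infile in sorted(os.listdir(path_files)):
        name = infile.rsplit('.txt', 1)[0]
        with open(os.path.join(path_files, infile), 'r') as text:
            terms = [line.strip().lower() for line in text]
        # Drop blank lines: an empty pattern would match every ingredient.
        categories[name] = [t for t in terms if t]
    return categories


def categorize(ingredients, categories):
    """Return exactly one category per ingredient, in the same order."""
    result = []  # a fresh list on every call, so two ingredient lists never mix
    for ingredient in ingredients:
        found = 'others'
        for name in sorted(categories):
            # re.escape treats the term as literal text; \b asks for a whole-word
            # match, so 'ice' does not match inside 'juice'.
            if any(re.search(r'\b' + re.escape(term) + r'\b', ingredient.lower())
                   for term in categories[name]):
                found = name
                break  # stop at the first file that matches
        result.append(found)
    return result
```

With this shape, each list is processed by its own call, for example `categorize(x, cats)` and `categorize(y, cats)` after `cats = load_categories(path_files)`, and the results can be paired with the ingredients through `zip` rather than a shared index. Note that `\b` will not match next to terms that begin or end with punctuation, so that rule may need adjusting for the real data.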