I have 4 text files in the following format:
keycountry.txt
UK USA Germany
country.txt
Brexit - UK
USA UK Relations
France win world cup
keylink.txt
www.abc.com
www.ddd.com
www.eee.com
link.txt
www.abc.com
www.eee.com
Code:
import re
keycountryfile = "keycountry.txt"
countryfile = "country.txt"
links = open('links.txt', 'r')
links_data = links.read()
links.close()
keys = open('keylink.txt', 'r')
keys_data = keys.read()
keys.close()
keys_split = keys_data.splitlines()
print('LINKS')
for url in keys_split:
    if url in links_data:
        print(url)
        print("matching")
else:
    print("Not matching")
keys = set(key.lower() for key in
           re.findall(r'\w+', open(keycountryfile , "r").readline()))
print("COUNTRY")
with open(countryfile) as f:
    for line in f:
        words = set(word.lower() for word in re.findall(r'\w+', line))
        if keys & words:
            print(line, end='')
            print("matching")
    else:
        print("Not matching")
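In this code, print("matching") is repeated multiple times. I know it repeats because it sits inside the loop, and print("Not matching") is not shown when there is no match. I tried placing the print statements both inside and outside the loop, but I could not solve the problem.
If there is a match, the output should look like: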
LINKS
www.abc.com
www.eee.com
matching
COUNTRY
Brexit - UK
USA UK Relations
matching
If there is no match, the output should look like:
LINKS
Not matching
COUNTRY
Not matching
How can I handle this?
Answer 0 (score: 0)
You can save the results in lists and print them once all matches have been found.
import re

# Inline stand-ins for the four files, so the example runs without file I/O.
keycountryfile = '''UK USA Germany'''
countryfile = '''Brexit - UK
USA UK Relations
France win world cup'''
links = '''www.abc.com
www.eee.com'''
links_data = links.split()
keys = '''www.abc.com
www.ddd.com
www.eee.com'''
keys_data = keys.split()
keys_split = keys_data

# Sort each key link into a "matching" or a "not matching" list.
matching_links = []
not_links = []
for url in keys_split:
    if url in links_data:
        matching_links.append(url)
    else:
        not_links.append(url)

# Lowercase the keys so they compare equal to the lowercased words below.
keys = set(key.lower() for key in keycountryfile.split())
matching_country = []
not_country = []
# splitlines() keeps each headline whole; split() would break it into words.
for line in countryfile.splitlines():
    words = set(word.lower() for word in re.findall(r'\w+', line))
    if keys & words:
        matching_country.append(line)
    else:
        not_country.append(line)

# Print the matching entries first, then the entries that did not match.
print('LINKS')
if matching_links:
    print('\n'.join(matching_links))
    print("matching")
print("COUNTRY")
print()
if matching_country:
    print('\n'.join(matching_country))
    print("matching")
print('LINKS')
if not_links:
    print('\n'.join(not_links))
    print("Not matching")
print("COUNTRY")
if not_country:
    print('\n'.join(not_country))
    print("Not matching")
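You can try this code here.
Answer 1 (score: 0)
It seems your problem is partly related to the for-else construct: the else branch of a for loop runs whenever the loop finishes without a break, so in your code it will always execute.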
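To illustrate the for-else behavior described above, here is a minimal sketch. The function name `scan` and the sample values are made up for this example and do not come from the question's code; messages are collected in a list only so the result is easy to inspect.

```python
def scan(items, wanted):
    """Return the messages a for-else loop would print, as a list."""
    messages = []
    for item in items:
        if item in wanted:
            messages.append(f"{item} matching")
    else:
        # This branch runs whenever the loop ends without `break`,
        # even if matches were found above.
        messages.append("Not matching")
    return messages


print(scan(["www.abc.com", "www.ddd.com"], {"www.abc.com"}))
```

Because the loop never breaks, "Not matching" is appended even though one URL matched, which is why the else branch is not a reliable "nothing matched" signal here.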
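In addition, building on kaihami's answer: to achieve what you describe, you need to store the matching links/lines in a separate structure (such as a list), then check whether that list is empty in order to print either the matching entries or the string "Not matching". Here is my suggested solution: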
import re
keycountryfile = "keycountry.txt"
countryfile = "country.txt"
with open('links.txt', 'r') as links:
    links_data = [line.strip() for line in links.readlines()]
with open('keylink.txt', 'r') as keys:
    keys_links = set([line.strip() for line in keys.readlines()])
matching_links = []
for url in links_data:
    if url in keys_links:
        matching_links.append(url)
print('LINKS')
if matching_links:
    print('\n'.join(matching_links))
    print("matching")
else:
    print("Not matching")
print()
with open(keycountryfile , "r") as f:
    country_keys = set(key.lower() for key in
                       re.findall(r'\w+', f.readline()))
matching_lines = []
with open(countryfile) as f:
    for line in f:
        words = set(word.lower() for word in re.findall(r'\w+', line))
        if country_keys & words:
            matching_lines.append(line.strip())
print("COUNTRY")
if matching_lines:
    print('\n'.join(matching_lines))
    print("matching")
else:
    print("Not matching")
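Since both sections follow the same "collect first, then print once" pattern, it could be factored into small helpers. The sketch below is only one possible arrangement: the names `report`, `matching_links` and `matching_countries` are hypothetical, and the file contents are inlined as lists (taken from the question's sample data) so it runs without the four text files.

```python
import re


def report(title, matches):
    """Build the output lines for one section: header, matches, verdict."""
    lines = [title]
    if matches:
        lines.extend(matches)
        lines.append("matching")
    else:
        lines.append("Not matching")
    return lines


def matching_links(links, key_links):
    """Keep the links that also appear among the key links."""
    keys = set(key_links)
    return [url for url in links if url in keys]


def matching_countries(lines, country_keys):
    """Keep the lines that contain at least one key word (case-insensitive)."""
    keys = {key.lower() for key in country_keys}
    return [line for line in lines
            if keys & {word.lower() for word in re.findall(r"\w+", line)}]


# Sample data copied from the question, inlined instead of read from files.
links = ["www.abc.com", "www.eee.com"]
key_links = ["www.abc.com", "www.ddd.com", "www.eee.com"]
headlines = ["Brexit - UK", "USA UK Relations", "France win world cup"]
country_keys = ["UK", "USA", "Germany"]

print("\n".join(report("LINKS", matching_links(links, key_links))))
print("\n".join(report("COUNTRY", matching_countries(headlines, country_keys))))
```

Returning the lines from `report` rather than printing inside it keeps the verdict logic in one place and makes it easy to check without capturing standard output.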