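I want to use queryfile.txt as the source file: each of its lines will be searched for and matched against datafile.txt. However, datafile.txt has a different structure.
queryfile.txt looks like this: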
Gina Cooper
Asthon Smith
Kim Lee
while datafile.txt looks like this:
Gina Cooper
112 Blahblah St., NY
Leigh Walsh
09D blablah, Blah
Asthon Smith
another address here
Kim Lee
another address here
I need to get each name and the line after it. Below is the code that gets the names that match in both files; it is a modified version of code by dstromberg (https://stackoverflow.com/a/19934477):
with open('queryfile.txt', 'r') as input_file:
    input_addresses = set(names.rstrip() for names in input_file)
with open('datafile.txt', 'r') as data_file:
    data_addresses = set(names.rstrip() for names in data_file)
with open('names_address.txt', 'w') as output:
    names_address = "\n".join(input_addresses.intersection(data_addresses))
    output.write(names_address)
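In short, what I want to see in the output file (names_address.txt) is each name plus the address that corresponds to it, which is basically the next line. I only started playing with Python a month ago and I believe I am stuck. Thanks for your help.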
Answer 0 (score: 0)
Replace this:
with open('datafile.txt', 'r') as data_file:
    data_addresses = set(names.rstrip() for names in data_file)
with this:
with open('datafile.txt', 'r') as data_file:
    data = data_file.readlines()
    # Keep non-empty lines that do not start with a digit, without the trailing newline
    data_addresses = [line.rstrip() for line in data if line.strip() and not line[0].isdigit()]
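To see what the digit test in this answer keeps, here is a small check on the sample lines from the question. The lines are held in a list rather than read from datafile.txt, and the name `sample_lines` is only for illustration. Note that an address that does not start with a digit (such as "another address here") passes the filter along with the names.

```python
# Sample lines copied from the question's datafile.txt (in memory, for illustration).
sample_lines = [
    "Gina Cooper\n",
    "112 Blahblah St., NY\n",
    "Leigh Walsh\n",
    "09D blablah, Blah\n",
    "Asthon Smith\n",
    "another address here\n",
]

# Same idea as the answer: drop lines whose first character is a digit.
kept = [line.rstrip() for line in sample_lines
        if line.strip() and not line[0].isdigit()]

print(kept)
```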
Answer 1 (score: 0)
Loop through the entries instead; then you can grab the next index:
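# data_addresses must be an ordered list here (not a set) so that i + 1 is the next line
for i in range(len(data_addresses) - 1):
    for entry in input_addresses:
        if entry == data_addresses[i]:
            # Write the matched name followed by the line after it (the address)
            output.write(data_addresses[i] + '\n' + data_addresses[i + 1] + '\n')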
This may not have great time complexity, but your dataset appears
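If the nested loop becomes too slow, one possible alternative is a dictionary lookup. The sketch below assumes that datafile.txt strictly alternates name lines and address lines, as in the sample from the question; the helper names `pair_names_with_addresses` and `match_queries` are made up for illustration, and the sample data is held in lists rather than read from the files.

```python
def pair_names_with_addresses(lines):
    """Map each name to the line that follows it.

    Assumes strict alternation: name, address, name, address, ...
    """
    cleaned = [line.rstrip("\n") for line in lines]
    # Even positions hold names, odd positions hold the matching addresses.
    return dict(zip(cleaned[0::2], cleaned[1::2]))


def match_queries(query_lines, data_lines):
    """Return one 'name\\naddress' block per queried name found in the data."""
    address_by_name = pair_names_with_addresses(data_lines)
    blocks = []
    for name in (query.strip() for query in query_lines):
        if name in address_by_name:
            blocks.append(name + "\n" + address_by_name[name])
    return blocks


# Sample lines copied from the question (in memory, for illustration).
sample_data = [
    "Gina Cooper\n", "112 Blahblah St., NY\n",
    "Leigh Walsh\n", "09D blablah, Blah\n",
    "Asthon Smith\n", "another address here\n",
    "Kim Lee\n", "another address here\n",
]
sample_queries = ["Gina Cooper\n", "Asthon Smith\n", "Kim Lee\n"]

matches = match_queries(sample_queries, sample_data)
print("\n".join(matches))
```

Both functions accept any iterable of lines, so open file objects for queryfile.txt and datafile.txt could be passed in directly, with the joined result written to names_address.txt. Iterating over the queries rather than intersecting sets also keeps the output in the order of queryfile.txt.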