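I have two files: one is user input (f1) and the other is a database (f2). I want to check whether each string from f1 exists in the database (f2), and print the ones that do not exist in f2. My code has a problem and does not work correctly. This is f1: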
rbs003491
rbs003499
rbs003531
rbs003539
rbs111111
This is f2:
AHPTUR13,rbs003411
AHPTUR13,rbs003419
AHPTUR13,rbs003451
AHPTUR13,rbs003459
AHPTUR13,rbs003469
AHPTUR13,rbs003471
AHPTUR13,rbs003479
AHPTUR13,rbs003491
AHPTUR13,rbs003499
AHPTUR13,rbs003531
AHPTUR13,rbs003539
AHPTUR13,rbs003541
AHPTUR13,rbs003549
AHPTUR13,rbs003581
In this case it should return rbs111111, because it is not in f2.
The code is:
with open(c,'r') as f1:
    s1 = set(x.strip() for x in f1)
    print s1
with open("/tmp/ARNE/blt",'r') as f2:
    for line in f2:
        if line not in s1:
            print line
Answer 0: (score: 1)
If you only care about the second part of each line (rbs003411 in AHPTUR13,rbs003411):
with open(user_input_path) as f1, open('/tmp/ARNE/blt') as f2:
    not_found = set(f1.read().split())
    for line in f2:
        _, found = line.strip().split(',')
        not_found.discard(found) # remove found word
print not_found
# for x in not_found:
#     print x
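The same idea can also be written as a set difference. The following is only a minimal sketch in Python 3 syntax, not the answerer's code: the helper name missing_ids is made up for illustration, and in-memory lists copied from the question stand in for the two files.

```python
# Sample data taken from the question; real code would read the two files instead.
user_ids = ["rbs003491", "rbs003499", "rbs003531", "rbs003539", "rbs111111"]
db_lines = [
    "AHPTUR13,rbs003491\n",
    "AHPTUR13,rbs003499\n",
    "AHPTUR13,rbs003531\n",
    "AHPTUR13,rbs003539\n",
    "AHPTUR13,rbs003541\n",
]


def missing_ids(user_ids, db_lines):
    """Return the user IDs that never appear as the last field of a database line.

    Hypothetical helper, named here only for illustration.
    """
    # rpartition splits on the last comma; a line without a comma is kept whole.
    db_ids = {line.strip().rpartition(",")[2] for line in db_lines}
    return set(user_ids) - db_ids


print(missing_ids(user_ids, db_lines))
```

Unlike the two-value unpacking above, rpartition does not raise an error on a line that has no comma or more than one, which may or may not be what you want for malformed database rows.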
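Answer 1: (score: 0)
The line variable in the for loop will contain something like "AHPTUR13,rbs003411", but you are only interested in the second part. You should do something like: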
for line in f2:
    line = line.strip().split(",")[1]
    if line not in s1:
        print line
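To see why the whole-line comparison never matches, here is a small self-contained check in Python 3 syntax. The two-element set and the single line are sample values taken from the question, not the real files.

```python
s1 = {"rbs003491", "rbs111111"}   # what the set built from f1 might contain
line = "AHPTUR13,rbs003491\n"     # what iterating over f2 yields, newline included

# The raw line carries the prefix and the newline, so it is never equal to a set member.
whole_line_hit = line in s1

# After stripping and splitting, only the second field is compared.
second_field = line.strip().split(",")[1]
field_hit = second_field in s1

print(whole_line_hit, second_field, field_hit)
```

Note that this loop still walks f2, so it reports database entries that are missing from the user list, which is the reverse of what the question asks for.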
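Answer 2: (score: 0)
You need to check the last part of each line rather than the whole line. You can split each line from f2 on , and pick the last part (x.strip().split(',')[-1]). Also, if you want to check whether the strings from f1 are in the database (f2), then your logic is wrong: you need to build your set from f2: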
with open(c,'r') as f1,open("/tmp/ARNE/blt",'r') as f2:
    s1 = set(x.strip().split(',')[-1] for x in f2)
    print s1
    for line in f1:
        if line.strip() not in s1:
            print line
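As a rough sketch of the same approach wrapped in a function (Python 3 syntax), the following keeps the input order and skips blank lines. The function name report_missing and the temporary file names are assumptions made for the demonstration; they do not come from the question.

```python
import os
import tempfile


def report_missing(user_path, db_path):
    """Return the entries of user_path, in file order, that are absent from db_path.

    Hypothetical helper, named here only for illustration.
    """
    with open(db_path) as db:
        # Keep only the last comma-separated field of each non-blank database line.
        known = {row.strip().split(",")[-1] for row in db if row.strip()}
    with open(user_path) as user:
        entries = (entry.strip() for entry in user)
        return [entry for entry in entries if entry and entry not in known]


# Small demonstration with temporary files standing in for f1 and f2.
with tempfile.TemporaryDirectory() as tmp:
    user_path = os.path.join(tmp, "f1.txt")
    db_path = os.path.join(tmp, "f2.txt")
    with open(user_path, "w") as fh:
        fh.write("rbs003491\nrbs003499\n\nrbs111111\n")
    with open(db_path, "w") as fh:
        fh.write("AHPTUR13,rbs003491\nAHPTUR13,rbs003499\nAHPTUR13,rbs003531\n")
    missing = report_missing(user_path, db_path)

print(missing)
```

Returning a list rather than printing inside the loop also avoids the blank lines that appear when a line read from f1 is printed with its trailing newline.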