Question

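I want to compare two .txt files. The first file is a "key" with three values separated by tabs (i.e., "item number", "response", "code"). The second file contains two values separated by tabs ("item number" and "response"). I need my program to search the first file, find any "item number/response" pair that matches one in the second file, and then output the correct "code". If there is no match, I'd like the output to just be a blank (" "). I'm not a programmer, but figuring this out would greatly reduce the time I spend on certain tasks at work.

I found this thread helpful for setting up my code. I'm trying to accomplish the same thing.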

file 1, "Key.txt":  
1   dog C  
2   cat C  
3   bird    C  
4   pig C  
5   horse   C  
1   cat Sem  
2   bat TA  
3   animal  Super  
4   panda   M  
5   pencil  U  

file2, "Uncoded.txt":  
4   pig  
3   animal  
5   bird  
2   bat  
2   cat  
0   
1   fluffy  
0   dog  
1   

desired output:  
4   pig  C  
3   animal  Super  
5   bird    
2   bat  TA  
2   cat  C  
0     
1   fluffy    
0   dog    
1

Here is my code:

f1 = open("Key.txt")  
f2 = open("Uncoded.txt")    
d = {}  

while True:  
    line = f1.readline()  
    if not line:  
        break  
    c0,c1,c2 = line.split('\t')  
    d[(c0,c1)] = (c0,c1,c2)  
while True:  
    line = f2.readline()  
    if not line:  
        break  
    c0,c1 = line.split('\t')  
    if (c0,c1) in d:  
        vals = d[(c0,c1)]  
        print (c0, c1, vals[1])  

f1.close()  
f2.close()

If I try to split the lines by tab ('\t'), I get a ValueError: too many values to unpack at the line "c0,c1,c2 = line.split('\t')".

Any insight or help is greatly appreciated!

Answer 1

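The problem you're running into is that one of the lines in one of your files doesn't have the number of items you expect. A likely cause is an extra newline (probably at the end of the file). Python sees this as a line containing only a newline after the last real line. When it can't split that blank line into three parts, your logic fails.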
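To see how the item count drives the error, here is a minimal sketch, assuming Python 3 (the exact wording of the messages differs between Python versions). The helper name `try_unpack` is made up for illustration. It shows that a blank line yields too few items, while a line with a doubled tab yields too many, so it may be worth checking the key file for both.

```python
def try_unpack(line):
    """Return the ValueError message from unpacking a tab-split line
    into three names, or None if the unpacking succeeds."""
    try:
        c0, c1, c2 = line.split("\t")
    except ValueError as exc:
        return str(exc)
    return None


# A well-formed line splits into exactly three items.
print(try_unpack("1\tdog\tC\n"))
# A blank line splits into a single item, which is too few.
print(try_unpack("\n"))
# A doubled tab creates an empty extra field, which is too many.
print(try_unpack("3\tbird\t\tC\n"))
```

In both failing cases the line does not split into exactly three pieces, which is why the unpacking raises.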

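One way to work around this is to split into a single variable without unpacking the values. Then you can check how many items the split produced and go on to unpack only if it is the expected number: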

while True:
    line = f1.readline()
    if not line:
        break
    vals = line.split('\t')  # don't unpack immediately
    if len(vals) == 3:       # check you got the expected number of items
        c0, c1, c2 = vals    # unpack only if it will work
        d[(c0,c1)] = (c0,c1,c2)
    else:
        print("got unexpected number of values: {}".format(vals))  # if not, report the error
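If the report shows many bad lines, it can help to know where they are in the file. The following is a rough sketch of that idea, not part of the original code: `find_bad_lines` and its `expected` parameter are hypothetical names, and it assumes fields are separated by single tab characters.

```python
def find_bad_lines(lines, expected=3):
    """Return (line_number, fields) for every line that does not
    split into `expected` tab-separated fields."""
    bad = []
    for number, line in enumerate(lines, start=1):
        # Drop the trailing newline so it does not end up in the last field.
        fields = line.rstrip("\n").split("\t")
        if len(fields) != expected:
            bad.append((number, fields))
    return bad


sample = ["1\tdog\tC\n", "\n", "3\tbird\t\tC\n"]
for number, fields in find_bad_lines(sample):
    print("line {}: {}".format(number, fields))
```

To check a real file, pass the open file object instead of the sample list, for example `with open("Key.txt") as f: print(find_bad_lines(f))`.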

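This is unrelated to your error, but if you like, you can simplify your loops by using for loops instead of while loops. File objects are iterable, yielding the lines of the file (just like what you'd get from readline()). The best part is that you don't need to look for the end of the file yourself; the iteration simply ends when the file is exhausted: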

for line in f1:    # this does the same thing as the first four lines in the code above
    ...
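Putting the pieces together, the whole task might look roughly like the sketch below. It is one possible arrangement rather than a definitive solution, and it rests on some assumptions: the function names `build_key`, `code_responses`, and `main` are invented here, surrounding whitespace in each field is stripped, malformed key lines are silently skipped, and an unmatched pair gets an empty code column.

```python
def build_key(key_lines):
    """Map (item_number, response) -> code, skipping malformed lines."""
    key = {}
    for line in key_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) == 3:  # unpack only when it will work
            item, response, code = (f.strip() for f in fields)
            key[(item, response)] = code
    return key


def code_responses(key, uncoded_lines):
    """Yield 'item<TAB>response<TAB>code'; the code is empty if there is no match."""
    for line in uncoded_lines:
        fields = [f.strip() for f in line.rstrip("\n").split("\t")]
        if fields == [""]:
            continue  # skip completely blank lines
        item = fields[0]
        # Some lines have an item number but no response.
        response = fields[1] if len(fields) > 1 else ""
        yield "\t".join([item, response, key.get((item, response), "")])


def main(key_path="Key.txt", uncoded_path="Uncoded.txt"):
    """Read both files and print one coded line per uncoded line."""
    with open(key_path) as f1:
        key = build_key(f1)
    with open(uncoded_path) as f2:
        for out in code_responses(key, f2):
            print(out)
```

Calling `main()` from the folder that holds both files prints the coded lines. Storing only the code as the dictionary value, rather than the full tuple, means the lookup returns the column you want to print directly.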

ValueError: too many values to unpack (when using a dict with tuple keys)
