Reading a file and comparing columns in Python

Time: 2013-08-13 08:40:32

Tags: python

For example, consider the contents of file1.txt:

1   0   9227    1152    34  2
2   111 7622    1120    34  2
3   68486   710 1024    14  2
6   265065  3389    800 22  2
7   393152  48438   64  132 3
8   412251  46744   64  132 3
9   430593  50866   256 95  4
10  430730  10770   256 95  4
11  433750  12701   256 14  3
12  437926  2794    64  34  2
13  440070  43  32  96  3
14  440102  44  32  96  3
15  440357  43  32  96  3
16  440545  43  32  96  3
17  440599  43  32  96  3
18  440625  43  32  96  3
19  440999  84  32  96  0
20  441574  44  32  96  3
...

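Each line is a job with 6 fields (i.e. columns 0-5).

Now, for example, I take the first 19 jobs as history. Then I need to read from job 20 onward and compare columns 3, 4 and 5 of each job against the history above to find the jobs that match. For example, job 20 matches 6 jobs in the history (13, 14, 15, 16, 17, 18). I then need to create a list that contains only column 2 of the matching jobs.

Can anyone suggest Python code that reads line 20, compares it against the history above, and then continues with 21, 22, 23, ... until the end of the file?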

1 answer:

Answer 0: (score: 2)

Check whether this works for you:

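>>> history = {}
>>> historycount = 17  # number of leading lines to treat as history
>>> with open("filename") as f:
    for line in f:
        job = line.split()
        if not job:
            continue  # skip blank lines
        # Match key: the last three columns (columns 3, 4 and 5)
        jobmatch_criteria = '-'.join(job[-3:])
        if historycount > 0:
            # Still filling the history: group jobs by their match key
            history.setdefault(jobmatch_criteria, []).append(job)
            historycount -= 1
        else:
            # Past the history: report every history job with the same key
            matched = '\n\t'.join(' '.join(i) for i in history[jobmatch_criteria]) if jobmatch_criteria in history else "None"
            print("Job", job[0], "Matched with:", matched)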


Job 20 Matched with: 13 440070 43 32 96 3
    14 440102 44 32 96 3
    15 440357 43 32 96 3
    16 440545 43 32 96 3
    17 440599 43 32 96 3
    18 440625 43 32 96 3
Job 21 Matched with: 6 265065 3389 800 22 2

I used this as test data:

1   0   9227    1152    34  2
2   111 7622    1120    34  2
3   68486   710 1024    14  2
6   265065  3389    800 22  2
7   393152  48438   64  132 3
8   412251  46744   64  132 3
9   430593  50866   256 95  4
10  430730  10770   256 95  4
11  433750  12701   256 14  3
12  437926  2794    64  34  2
13  440070  43  32  96  3
14  440102  44  32  96  3
15  440357  43  32  96  3
16  440545  43  32  96  3
17  440599  43  32  96  3
18  440625  43  32  96  3
19  440999  84  32  96  0
20  441574  44  32  96  3
21  265065  3389    800 22  2
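
The question also asks for a list holding only column 2 of the matching jobs, which the snippet above does not build (it prints the whole matching rows). A minimal sketch of one way to collect it is below. It assumes columns are counted from 0, so "column 2" is the third field, and that the history is fixed at the first `history_size` lines. The function name `match_column2` and its parameters are made up for illustration and are not part of the original answer.

```python
from collections import defaultdict


def match_column2(lines, history_size, key_cols=(3, 4, 5), value_col=2):
    """Map each job after the history to column `value_col` of its matches.

    Assumption: columns are zero-based and the first `history_size`
    non-blank lines form a fixed history.
    """
    history = defaultdict(list)  # match key -> column-2 values of history jobs
    results = {}                 # job id -> values of the matching history jobs
    seen = 0
    for line in lines:
        job = line.split()
        if not job:
            continue  # skip blank lines
        key = tuple(job[c] for c in key_cols)
        if seen < history_size:
            history[key].append(job[value_col])
            seen += 1
        else:
            # .get() avoids creating an empty entry for an unseen key
            results[job[0]] = list(history.get(key, []))
    return results


sample = """\
13 440070 43 32 96 3
14 440102 44 32 96 3
19 440999 84 32 96 0
20 441574 44 32 96 3
""".splitlines()

print(match_column2(sample, history_size=3))
# {'20': ['43', '44']}
```

For the real file, an open file object can be passed as `lines`, for example `with open("file1.txt") as f: match_column2(f, 17)`. The values stay as strings; wrap them in `int()` if numbers are needed.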