I am trying to solve this text processing task using Python, but I am not able to compare the columns.

What I have tried:

#!/usr/bin/env python
import sys

def Main():
    print "This is your input Files %s,%s" % ( file1,file2 )
    f1 = open(file1, 'r')
    f2 = open(file2, 'r')
    for line in f1:
        column1_f1 = line.split()[:1]
        #print column1_f1
        for check in f2:
            column2_f2 = check.split()[:1]
            print column1_f1,column2_f2
            if column1_f1 == column2_f2:
                print "Match",line
            else:
                print line,check
    f1.close()
    f2.close()

if __name__ == '__main__':
    if len(sys.argv) != 3:
        print >> sys.stderr, "This Script need exact 2 argument, aborting"
        exit(1)
    else:
        ThisScript, file1, file2 = sys.argv
        Main()

I am new to Python, please help me learn and understand this...

Answer 0 (score: 2)
I would solve this with python3, in much the same way as one would with awk. Read the second file and save its keys in a dictionary. Then, for each line of the first file, check whether its key exists in that dictionary:

import sys

# Map the first column of file2 (the key) to its second column (the code).
codes = {}

with open(sys.argv[2], 'r') as f2:
    for line in f2:
        fields = line.split()
        codes[fields[0]] = fields[1]

with open(sys.argv[1], 'r') as f1:
    for line in f1:
        fields = line.split(None, 1)
        if fields[0] in codes:
            # Overwrite the first four characters of the line with the code.
            print('{0:4s}{1:s}'.format(codes[fields[0]], line[4:]), end='')
        else:
            print(line, end='')

Run it like:

python3 script.py file1 file2

That yields:

     060090 AKRABERG FYR DN 6138 -666 101
EKVG 060100 VAGA FLOGHAVN DN 6205 -728 88
     060110 TORSHAVN DN 6201 -675 55
     060120 KIRKJA DN 6231 -631 55
     060130 KLAKSVIK HELIPORT DN 6221 -656 75
     060160 HORNS REV A DN 5550 786 21
     060170 HORNS REV B DN 5558 761 10
     060190 SILSTRUP DN 5691 863 0
     060210 HANSTHOLM DN 5711 858 0
EKGF 060220 TYRA OEST DN 5571 480 43
EKTS 060240 THISTED LUFTHAVN DN 5706 870 8
     060290 GROENLANDSHAVNEN DN 5703 1005 0
EKYT 060300 FLYVESTATION AALBORG DN 5708 985 13
     060310 TYLSTRUP DN 5718 995 0
     060320 STENHOEJ DN 5736 1033 56
     060330 HIRTSHALS DN 5758 995 0
EKSN 060340 SINDAL FLYVEPLADS DN 5750 1021 28
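
The dictionary lookup described in this answer can also be tried without any files. The following is only a sketch: the function names build_code_table and annotate are hypothetical, and the layout of the sample lines (a key column followed by a code in the second file, and blank leading columns in the first file) is an assumption based on the output shown above, not something the question confirms.

```python
def build_code_table(code_lines):
    """Map the first column of each line to its second column."""
    codes = {}
    for line in code_lines:
        fields = line.split()
        if len(fields) >= 2:  # skip blank or one-column lines
            codes[fields[0]] = fields[1]
    return codes


def annotate(station_lines, codes):
    """Yield each line, with its first four characters replaced by the code
    when the line's first column is a known key."""
    for line in station_lines:
        fields = line.split(None, 1)
        if fields and fields[0] in codes:
            yield '{0:4s}{1}'.format(codes[fields[0]], line[4:])
        else:
            yield line


# Assumed sample data: the leading blanks leave room for the code column.
sample_codes = ["060100 EKVG\n", "060220 EKGF\n"]
sample_stations = [
    "     060090 AKRABERG FYR\n",
    "     060100 VAGA FLOGHAVN\n",
]

result = list(annotate(sample_stations, build_code_table(sample_codes)))
print(''.join(result), end='')
```

Reading the second file once into a dictionary also avoids a problem with the nested loops in the question: a file object is an iterator, so the inner loop over f2 consumes it during the first pass of the outer loop, and every later pass finds nothing left to compare.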