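I have the following code, but there seems to be a bug somewhere. I get output (b) but need output (c); see below. Can anyone see where I'm going wrong? All files are tab-delimited.
Code: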
import sys

outfile_name = sys.argv[-1]
filename1 = sys.argv[-2]
filename2 = sys.argv[-3]

fileIn1 = open(filename1, "r")
fileIn2 = open(filename2, "r")
fileOut = open(outfile_name, "w")

dict = {}
a = open(filename1)
b = open(filename2)

for line in a:
    words = line.split("\t")
    if len(words) != 1:
        target = words[0]
        for word in words[1:]:
            dict[word] = target

for line in b:
    words = line.split("\t")
    if words[0] in dict.keys() and words[1] in dict.keys():
        fileOut.write(dict[words[0]] + "\t" + dict[words[1]] + "\n")
    elif words[0] in dict.keys() and words[1] not in dict.keys():
        fileOut.write(dict[words[0]] + "\t" + words[1] + "\n")
    elif words[0] not in dict.keys() and words[1] in dict.keys():
        fileOut.write(words[0] + "\t" + dict[words[1]] + "\n")
    elif words[0] not in dict.keys() and words[1] not in dict.keys():
        fileOut.write(words[0] + "\t" + words[1] + "\n")

fileOut.close()
filename1:
Area_1 Area_2
A B
A C
A D
D B
D C
L B
L C
L A
D L
K A
K B
K C
K D
K L
D P
D R
L P
L R
K P
K R
A H
D H
L H
K H
B P
B R
R P
A I
D I
I L
I K
C H
I H
C H
J K
J X
J Y
J Z
K X
K Y
Y Z
K Z
X Y
X Z
M G
N T
O S
S Q
filename2:
Incident_00000001 A D L K
Incident_00000002 B P R
Incident_00000003 C F W
Incident_00000004 J I
M
N
O
Incident_00000005 Q S
X
Y
Z
G
T
Output (b), the bad output I get:
Area_1 Area_2
Incident_00000001 B
Incident_00000001 C
Incident_00000001 D
Incident_00000001 B
Incident_00000001 C
Incident_00000001 B
Incident_00000001 C
Incident_00000001 A
Incident_00000001 L
K A
K B
K C
K D
K L
Incident_00000001 P
Incident_00000001 Incident_00000002
Incident_00000001 P
Incident_00000001 Incident_00000002
K P
K Incident_00000002
Incident_00000001 H
Incident_00000001 H
Incident_00000001 H
K H
Incident_00000002 P
Incident_00000002 Incident_00000002
R P
Incident_00000001 Incident_00000003
Incident_00000001 Incident_00000003
I L
I Incident_00000004
Incident_00000003 H
I H
Incident_00000003 H
Incident_00000004 Incident_00000004
Incident_00000004 X
Incident_00000004 Y
Incident_00000004 Z
K X
K Y
Y Z
K Z
X Y
X Z
M G
N T
O S
Incident_00000005 Incident_00000005
The result I expected (output (c)) is:
Area_1 Area_2
Incident_00000001 Incident_00000002
Incident_00000001 Incident_00000003
Incident_00000001 Incident_00000001
Incident_00000001 Incident_00000002
Incident_00000001 Incident_00000003
Incident_00000001 Incident_00000002
Incident_00000001 Incident_00000003
Incident_00000001 Incident_00000001
Incident_00000001 Incident_00000001
Incident_00000001 Incident_00000001
Incident_00000001 Incident_00000002
Incident_00000001 Incident_00000003
Incident_00000001 Incident_00000001
Incident_00000001 Incident_00000001
Incident_00000001 Incident_00000002
Incident_00000001 Incident_00000002
Incident_00000001 Incident_00000002
Incident_00000001 Incident_00000002
Incident_00000001 Incident_00000002
Incident_00000001 Incident_00000002
Incident_00000001 H
Incident_00000001 H
Incident_00000001 H
Incident_00000001 H
Incident_00000002 Incident_00000002
Incident_00000002 Incident_00000002
Incident_00000002 Incident_00000002
Incident_00000001 Incident_00000004
Incident_00000001 Incident_00000004
Incident_00000004 Incident_00000001
Incident_00000004 Incident_00000001
Incident_00000003 H
Incident_00000004 H
Incident_00000003 H
Incident_00000004 Incident_00000001
Incident_00000004 X
Incident_00000004 Y
Incident_00000004 Z
Incident_00000001 X
Incident_00000001 Y
Y Z
Incident_00000001 Z
X Y
X Z
M G
N T
O Incident_00000005
Incident_00000005 Incident_00000005
Answer 0 (score: 1)
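import csv

# filename1 (the pair list) and filename2 (the incident list) are the
# paths to the two files shown in the question.
graph = {}
with open(filename2) as infile:
    for row in csv.reader(infile, delimiter='\t'):
        # Skip blank lines and rows that have no members after the first field.
        if len(row) < 2:
            continue
        incident, *rest = row
        for node in rest:
            graph[node] = incident

with open(filename1) as infile, open('path/to/output', 'w') as outfile:
    # lineterminator='\n' matches the "\n" written by the original script.
    writer = csv.writer(outfile, delimiter='\t', lineterminator='\n')
    for source, dest in csv.reader(infile, delimiter='\t'):
        # Replace each area with its incident when one is known.
        if source in graph: source = graph[source]
        if dest in graph: dest = graph[dest]
        writer.writerow([source, dest])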
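
As for why the script in the question misses some replacements: a likely cause is that iterating over a file yields each line with its trailing newline, and line.split("\t") leaves that newline attached to the last field. The dictionary then holds a key such as "K\n" instead of "K", and the second column of every pair is looked up as "B\n" instead of "B". That would fit the bad output, where K (the last field on its line) is never replaced in the first column. csv.reader strips the line terminator, which is why it sidesteps the problem. The sketch below shows the same fix without csv, by stripping each line before splitting. The function names build_mapping and relabel are made up for illustration and work on lists of lines rather than on open files.

```python
def build_mapping(incident_lines):
    """Map every member field to the first field of its line."""
    mapping = {}
    for line in incident_lines:
        # Drop the line terminator first, so the last field is "K", not "K\n".
        words = line.rstrip("\r\n").split("\t")
        if len(words) > 1:
            for word in words[1:]:
                mapping[word] = words[0]
    return mapping


def relabel(pair_lines, mapping):
    """Replace each of the two fields by its mapped value when one exists."""
    result = []
    for line in pair_lines:
        words = line.rstrip("\r\n").split("\t")
        if len(words) < 2:
            continue  # skip blank or malformed lines
        result.append("\t".join(mapping.get(w, w) for w in words[:2]))
    return result


# Without stripping, the last field keeps its newline:
print("Incident_00000001\tA\tD\tL\tK\n".split("\t")[-1] == "K\n")  # True

mapping = build_mapping([
    "Incident_00000001\tA\tD\tL\tK\n",
    "Incident_00000002\tB\tP\tR\n",
])
for row in relabel(["K\tA\n", "D\tR\n", "A\tH\n"], mapping):
    print(row)
```

With the two sample incident lines above, K, A, D and R are all replaced in either column, while H, which has no incident, is passed through unchanged.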