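I'm having trouble getting my loop to iterate the way I expect. I am trying to look up every item of df1 in df2: the row to search is given by the value in the 'Start' row of df1 for that column, and I then want to return the name of the matching column. For example, for df1[2, 0] it should search row 'C' of df2 and return 'C', which is the column that contains the matching value (5).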
df1:
       0  1  2
0      1  3  6
1      4  4  3
2      5  6  2
Start  C  A  B
df2:
   A  B  C
A  6  3  4
B  2  3  6
C  4  1  5
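To make the example reproducible, the two frames can be built roughly as follows. This is only a sketch: it assumes df1 has integer column labels 0 to 2 and a mixed index (0, 1, 2, 'Start'), which may differ from how the real data was loaded.

```python
import pandas as pd

# df1: three rows of numbers plus a 'Start' row that names the df2 row to search.
# Mixing numbers and strings makes every column object dtype.
df1 = pd.DataFrame(
    [[1, 3, 6], [4, 4, 3], [5, 6, 2], ["C", "A", "B"]],
    index=[0, 1, 2, "Start"],
    columns=[0, 1, 2],
)

# df2: the lookup table, indexed by the same labels that appear in 'Start'.
df2 = pd.DataFrame(
    [[6, 3, 4], [2, 3, 6], [4, 1, 5]],
    index=["A", "B", "C"],
    columns=["A", "B", "C"],
)

print(df1)
print(df2)
```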
So far I have:
for i, row in df1.iterrows():
    for ii in range(0, len(df1.columns)):
        col = df1.columns[ii]
        result = pd.DataFrame(df2.loc[df1.loc['Start']].eq(col).idxmin(1))
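This gives me a Series (C, B, C), which only matches against row 0 of df1. The ideal output is a 3x3 DataFrame corresponding to df1 without the 'Start' row:
   0  1  2
0  C  B  C
1  A  C  B
2  ...
Any pointers are much appreciated!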
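Answer 0: (score: 0)
If I understand the question correctly, the output you gave is incorrect. It should be:
   0  1  2
0  B  B  C
1  A  C  B
2  C  A  A
I'm not very fluent in pandas, but I was able to get a version working: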
import pandas as pd

def find_key_by_value(dic, value):
    # Return the first key whose value equals `value` (None if nothing matches)
    for k, v in dic.items():
        if v == value:
            return k

data = {0: [], 1: [], 2: []}
index = [0, 1, 2]
for i, row in df1.iterrows():
    if i != 'Start':  # Avoid calculating last line
        for ii in range(0, len(df1.columns)):
            col = df1.columns[ii]
            to_match = row[col]  # number to match
            to_start = df1.loc['Start', col]  # row under Start label
            # this is where my lack of pandas knowledge appears
            df2_row_keys = df2.loc[to_start].to_dict()
            result = find_key_by_value(df2_row_keys, to_match)
            data[ii].insert(i, result)
# data = {0: ['B', 'A', 'C'], 1: ['B', 'C', 'A'], 2: ['C', 'B', 'A']}
result = pd.DataFrame(data=data, index=index)
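The dictionary round trip can probably be avoided by letting pandas do the reverse lookup. A minimal sketch, assuming df2 looks like the table in the question; `find_column_by_value` is a hypothetical helper name, not a pandas function:

```python
import pandas as pd

df2 = pd.DataFrame(
    [[6, 3, 4], [2, 3, 6], [4, 1, 5]],
    index=["A", "B", "C"],
    columns=["A", "B", "C"],
)

def find_column_by_value(table, row_label, value):
    """Return the first column in row `row_label` equal to `value`, or None."""
    hits = table.loc[row_label].eq(value)  # boolean Series indexed by column name
    if not hits.any():
        return None  # same behaviour as the dict helper when nothing matches
    return hits.idxmax()  # label of the first True entry

# The df1[2, 0] example from the question: value 5, searched in row 'C'.
print(find_column_by_value(df2, "C", 5))
```

The explicit `hits.any()` check matters because `idxmax` on an all-False Series would still return the first label rather than signalling "no match".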
Answer 1: (score: 0)
The way I would suggest is:
import pandas as pd

result = []
for y, row in df1.iterrows():
    if y == 'Start':  # Skip the row named 'Start'
        continue
    result.append([])  # Make a new row in the result
    for x, item in row.items():  # '.iteritems' was removed in pandas 2.0
        start = df1.loc['Start', x]  # The same column, but in the start row
        search_row = df2.loc[start]  # The row to look for a match in
        occurrences = search_row.where(search_row == item)  # NaN where there is no match
        # '.idxmax' returns a column label and limits it to one occurrence
        # (in current pandas, '.argmax' returns an integer position instead).
        result[-1].append(occurrences.idxmax())
print(pd.DataFrame(result))
which gives the output:
   0  1  2
0  B  B  C
1  A  C  B
2  C  A  A
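For comparison, the same table can be built more compactly by dropping the 'Start' row first and looping over columns in a comprehension. This is a sketch under the assumption that every value in df1 occurs in the df2 row it is searched in; if a value were missing, `idxmax` would silently return the first column label. The frames are rebuilt here so the snippet runs on its own.

```python
import pandas as pd

df1 = pd.DataFrame(
    [[1, 3, 6], [4, 4, 3], [5, 6, 2], ["C", "A", "B"]],
    index=[0, 1, 2, "Start"],
    columns=[0, 1, 2],
)
df2 = pd.DataFrame(
    [[6, 3, 4], [2, 3, 6], [4, 1, 5]],
    index=["A", "B", "C"],
    columns=["A", "B", "C"],
)

values = df1.drop(index="Start")  # the 3x3 block of numbers to look up
start = df1.loc["Start"]          # maps each df1 column to a df2 row label

# For each df1 column, search its df2 row for every value and keep
# the name of the first matching df2 column.
result = pd.DataFrame(
    {
        col: [df2.loc[start.loc[col]].eq(v).idxmax() for v in values[col]]
        for col in values.columns
    },
    index=values.index,
)
print(result)
```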