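I'm looking to add information to a DataFrame that refers to other rows in the same DataFrame. The DataFrame holds pairs of scientific terms arranged in a ranked hierarchy (species-genus pairs, genus-family pairs, and so on). I need the taxrank of both the source and the target on the same row. Each row already has the target's taxrank, but I need to find the row where the source value appears as the target, extract its taxrank, and add it as a new taxrank column. Here is a sample of what I have so far: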
TAXRANK_target target source
45139 order Salmoniformes Protacanthopterygii
45140 family Salmonidae Salmoniformes
45201 genus Salmo Salmonidae
45202 species labrax Salmo
45203 species carpio Salmo
45204 species trutta Salmo
45205 species letnica Salmo
45206 species marmoratus Salmo
45207 species fibreni Salmo
What I'd like it to look like:
TAXRANK_target target source TAXRANK_source
45139 order Salmoniformes Protacanthopterygii NaN
45140 family Salmonidae Salmoniformes order
45201 genus Salmo Salmonidae family
45202 species labrax Salmo genus
45203 species carpio Salmo genus
45204 species trutta Salmo genus
45205 species letnica Salmo genus
45206 species marmoratus Salmo genus
45207 species fibreni Salmo genus
45208 species obtusirostris Salmo genus
I don't know how to deliberately reference one row in order to affect another.
Answer 0 (score: 1)
Use Series.map with a Series created by DataFrame.set_index:
# if values in the target column are not duplicated
s = df.set_index('target')['TAXRANK_target']
# if duplicates are possible, keep only the first value
# s = df.drop_duplicates('target').set_index('target')['TAXRANK_target']
df['TAXRANK_source'] = df['source'].map(s)
print(df)
TAXRANK_target target source TAXRANK_source
45139 order Salmoniformes Protacanthopterygii NaN
45140 family Salmonidae Salmoniformes order
45201 genus Salmo Salmonidae family
45202 species labrax Salmo genus
45203 species carpio Salmo genus
45204 species trutta Salmo genus
45205 species letnica Salmo genus
45206 species marmoratus Salmo genus
45207 species fibreni Salmo genus
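For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the same lookup. The frame is rebuilt by hand from a few of the rows shown in the question, and the name `rank_by_name` is only an illustrative choice, not something from the original post. It uses the `drop_duplicates` variant so that each target name maps to a single rank even if a name were to appear twice.

```python
import pandas as pd

# A few rows copied from the question; the index values are the original row labels.
df = pd.DataFrame(
    {
        "TAXRANK_target": ["order", "family", "genus", "species", "species"],
        "target": ["Salmoniformes", "Salmonidae", "Salmo", "labrax", "carpio"],
        "source": ["Protacanthopterygii", "Salmoniformes", "Salmonidae", "Salmo", "Salmo"],
    },
    index=[45139, 45140, 45201, 45202, 45203],
)

# Lookup Series: index is the target name, value is that target's rank.
# drop_duplicates keeps the first occurrence of each target name.
rank_by_name = df.drop_duplicates("target").set_index("target")["TAXRANK_target"]

# For every row, look up the rank of its source name.
# Names that never occur as a target (here Protacanthopterygii) become NaN.
df["TAXRANK_source"] = df["source"].map(rank_by_name)
print(df)
```

The first row stays NaN because Protacanthopterygii never appears in the target column, which matches the desired output in the question.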