This is the first dataframe:
Umls                                 Snomed
C0027497/Nausea /Sign or Symptom     Nausea (finding)[FN/422587007]
C0151786 / Muscle/Sign or Symptom    Muscle weakness [(finding) /FN/26544005]
C2127305 /bitter/ Sign or Symptom    ?
NA                                   NA
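I created a dictionary from it with the following code:
df_dic_1 = df_dic_1[['UMLS', 'snomed']]
# assign the result back instead of calling fillna(inplace=True) on a column selection
df_dic_1['UMLS'] = df_dic_1['UMLS'].fillna(0)
df_dic_1['snomed'] = df_dic_1['snomed'].fillna(0)
equiv_snomed = df_dic_1.set_index('UMLS')['snomed'].to_dict()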
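Note that filling the missing values with 0 leaves a 0 key (mapped to 0) in the dictionary. A minimal sketch of an alternative, assuming the column names 'UMLS' and 'snomed' from the code above and a small frame rebuilt by hand from the sample rows, drops the rows without a key instead:

```python
import numpy as np
import pandas as pd

# Sample rows copied from the first dataframe (layout is an assumption).
df_dic_1 = pd.DataFrame({
    'UMLS': ['C0027497/Nausea /Sign or Symptom',
             'C0151786 / Muscle/Sign or Symptom',
             'C2127305 /bitter/ Sign or Symptom',
             np.nan],
    'snomed': ['Nausea (finding)[FN/422587007]',
               'Muscle weakness [(finding) /FN/26544005]',
               '?',
               np.nan],
})

# Drop rows that have no UMLS key, then build the UMLS -> snomed lookup.
equiv_snomed = (
    df_dic_1.dropna(subset=['UMLS'])
            .set_index('UMLS')['snomed']
            .to_dict()
)
```

Whether a placeholder key is wanted depends on how the dictionary is used later, so this is only one option.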
Now, for dataframe B:
id  symptom   UMLS
1   nausea    C0027497/Nausea /Sign or Symptom
2   muscle    C2127305 /bitter/ Sign or Symptom
3   headache
4   pain
5   bitter    C2127305 /bitter/ Sign or Symptom
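For any value in the "UMLS" column that is available in the dictionary, I want to create another column, "Snomed", holding the corresponding "snomed" value from the dictionary. So dataframe C should look like this: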
id  symptom   UMLS                                 Snomed
1   nausea    C0027497/Nausea /Sign or Symptom     Nausea (finding)[FN/422]
2   muscle    C0151786 / Muscle/Sign or Symptom    Muscle [(fi)/FN/25]
3   headache
4   pain
5   bitter    C2127305 /bitter/ Sign or Symptom    ?
Any help? Thanks.
Answer 0: (score: 2)
See EdChum's answer to this Stack Overflow question.
Applied to your case, it would look like this:
import pandas as pd
# create dictionary
d = {'umls1':'snomed1','umls2':'snomed2','umls3':'snomed3'}
# create empty dataframe
columns = ['symptom','umls','snomed']
df = pd.DataFrame(columns = columns)
# fill it with symptoms and with umls, with some umls NULL
df['symptom'] = ['nausea','muscle','headache','pain','bitter']
# .ix has been removed from pandas; .loc does the same label-based assignment here
df.loc[0, 'umls'] = 'umls1'
df.loc[1, 'umls'] = 'umls2'
df.loc[4, 'umls'] = 'umls3'
# add a third column with snomed values from dictionary
df['snomed'] = df['umls'].map(d)
This gives the following output:
df.head()
Out[21]:
symptom umls snomed
0 nausea umls1 snomed1
1 muscle umls2 snomed2
2 headache NaN NaN
3 pain NaN NaN
4 bitter umls3 snomed3
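With the column names from the question, the same idea might look like the sketch below. The frame `df_b` and the dictionary are assumptions rebuilt by hand from the sample rows, and row 2 uses the UMLS value shown in the expected output (C0151786) rather than the one listed in dataframe B.

```python
import pandas as pd

# Lookup rebuilt from the first dataframe in the question (assumed content).
equiv_snomed = {
    'C0027497/Nausea /Sign or Symptom': 'Nausea (finding)[FN/422587007]',
    'C0151786 / Muscle/Sign or Symptom': 'Muscle weakness [(finding) /FN/26544005]',
    'C2127305 /bitter/ Sign or Symptom': '?',
}

# Dataframe B, with missing UMLS values for 'headache' and 'pain'.
df_b = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'symptom': ['nausea', 'muscle', 'headache', 'pain', 'bitter'],
    'UMLS': ['C0027497/Nausea /Sign or Symptom',
             'C0151786 / Muscle/Sign or Symptom',
             None,
             None,
             'C2127305 /bitter/ Sign or Symptom'],
})

# map() looks up each UMLS value in the dictionary;
# values without a matching key (including missing ones) become NaN.
df_b['Snomed'] = df_b['UMLS'].map(equiv_snomed)
```

The lookup is an exact string match, so keys that differ only in whitespace will not match and will come back as NaN.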
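Answer 1: (score: 1)
You can use the apply function on each element of the UMLS column and look up its value in the dictionary equiv_snomed. If the key is not in the dictionary, you can return np.nan.
If your dataframe B is named df2, then:
import numpy as np
df2['Snomed'] = df2['UMLS'].apply(lambda x: equiv_snomed.get(x, np.nan))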
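One practical difference from a plain dictionary map is that dict.get lets you choose the fallback value. A small runnable sketch, where the dictionary contents, the frame, and the 'not mapped' label are all hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical lookup and frame, standing in for equiv_snomed and dataframe B.
equiv_snomed = {'umls1': 'snomed1', 'umls3': 'snomed3'}
df2 = pd.DataFrame({
    'symptom': ['nausea', 'headache', 'bitter'],
    'UMLS': ['umls1', np.nan, 'umls3'],
})

# Missing or unknown keys fall back to NaN ...
df2['Snomed'] = df2['UMLS'].apply(lambda x: equiv_snomed.get(x, np.nan))

# ... or to any other default you prefer.
df2['Snomed_filled'] = df2['UMLS'].apply(
    lambda x: equiv_snomed.get(x, 'not mapped')
)
```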