I have 2 tables of data. Table 1 is called TERMS and contains three columns: P_TERM, STRESC, and CATEGORY. It looks like this:
P_TERM     STRESC  CATEGORY
__________ _______ ________
Discolored Stained Hair
Discolored Stained Skin
The second table is called FINDINGS and contains 4 columns. It looks like this:
SPECIES STRAIN STRESC  CATEGORY
_______ ______ _______ _________
Rat     Wistar Stained Hair
Dog     Beagle Stained Skin
I'm reading both tables into dataframes.
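How the tables are loaded isn't shown here, so as a stand-in the sample rows above can be built inline. This is only a minimal sketch for reproducing the problem; the variable names `terms` and `findings` match the ones used in the attempt further down.

```python
import pandas as pd

# Sample rows copied from the TERMS table above.
terms = pd.DataFrame({
    "P_TERM": ["Discolored", "Discolored"],
    "STRESC": ["Stained", "Stained"],
    "CATEGORY": ["Hair", "Skin"],
})

# Sample rows copied from the FINDINGS table above.
findings = pd.DataFrame({
    "SPECIES": ["Rat", "Dog"],
    "STRAIN": ["Wistar", "Beagle"],
    "STRESC": ["Stained", "Stained"],
    "CATEGORY": ["Hair", "Skin"],
})

print(terms)
print(findings)
```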
I need to replace every value of STRESC in the FINDINGS dataframe with the P_TERM value from the TERMS dataframe, matching rows on the values of STRESC and CATEGORY in the 2 dataframes.
So after the process the FINDINGS table would look like:
SPECIES STRAIN STRESC     CATEGORY
_______ ______ __________ _________
Rat     Wistar Discolored Hair
Dog     Beagle Discolored Skin
I'd like to do this without iterating through the thousands of rows in the FINDINGS dataframe, using the value 'UNMAPPED' when no match is found.
I've tried the following:
s = terms.drop_duplicates(subset=['STRESC', 'CATEGORY']).set_index(['STRESC', 'CATEGORY'])['P_TERM']
findings['STRESC'] = findings.loc[:,['STRESC', 'CATEGORY']].map(s).fillna('UNMAPPED')
But I'm obviously not using the map function correctly. Can anyone point me in the right direction?
Answer 0 (score: 0)
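If I understand your question correctly, you can use pandas merge.
Merge the dataframes on the STRESC and CATEGORY columns, then select the columns you want to keep. https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.merge.html
findings = findings.merge(terms, on=['STRESC', 'CATEGORY'], how='left')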
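To make the merge idea concrete, here is a rough sketch of how it might be carried through to the result the question asks for. It assumes the sample data from the question plus one extra FINDINGS row (Mouse / CD-1 / Swollen / Paw), which is invented here purely to show the 'UNMAPPED' fallback. The second half shows one possible way to keep the question's lookup-Series idea, using `reindex` on a MultiIndex instead of `map`.

```python
import pandas as pd

terms = pd.DataFrame({
    "P_TERM": ["Discolored", "Discolored"],
    "STRESC": ["Stained", "Stained"],
    "CATEGORY": ["Hair", "Skin"],
})

findings = pd.DataFrame({
    "SPECIES": ["Rat", "Dog", "Mouse"],          # "Mouse" row is hypothetical
    "STRAIN": ["Wistar", "Beagle", "CD-1"],
    "STRESC": ["Stained", "Stained", "Swollen"],
    "CATEGORY": ["Hair", "Skin", "Paw"],
})

# Keep one P_TERM per (STRESC, CATEGORY) pair so the merge cannot duplicate rows.
lookup = terms.drop_duplicates(subset=["STRESC", "CATEGORY"])

# Left merge keeps every findings row; rows without a match get NaN in P_TERM.
merged = findings.merge(lookup, on=["STRESC", "CATEGORY"], how="left")

# Overwrite STRESC with the preferred term, falling back to 'UNMAPPED'.
merged["STRESC"] = merged["P_TERM"].fillna("UNMAPPED")
result = merged.drop(columns="P_TERM")
print(result)

# Variant: a Series indexed by (STRESC, CATEGORY), looked up with reindex.
s = lookup.set_index(["STRESC", "CATEGORY"])["P_TERM"]
keys = pd.MultiIndex.from_frame(findings[["STRESC", "CATEGORY"]])
alt = findings.copy()
# to_numpy() assigns by position, since the reindexed Series carries the key index.
alt["STRESC"] = s.reindex(keys).fillna("UNMAPPED").to_numpy()
print(alt)
```

Both versions avoid an explicit Python loop over the rows. The `how="left"` argument is what keeps unmatched rows in the output; the default inner merge would drop them.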