Question

I have 2 tables of data. Table 1 is called TERMS and contains three columns: P_TERM, STRESC, and CATEGORY. It looks like this:

P_TERM      STRESC   CATEGORY
__________  _______  ________
Discolored  Stained  Hair
Discolored  Stained  Skin

The second table contains 4 columns and is called FINDINGS. It looks like:

SPECIES  STRAIN  STRESC   CATEGORY
_______  ______  _______  _________
Rat      Wistar  Stained  Hair
Dog      Beagle  Stained  Skin

I'm reading both tables into dataframes.

I need to replace every value of STRESC in the FINDINGS dataframe with the P_TERM value from the TERMS dataframe, matching rows on the STRESC and CATEGORY values in the 2 dataframes.

After the process, the FINDINGS table would look like:

SPECIES  STRAIN  STRESC     CATEGORY
_______  ______  ______     _________
Rat      Wistar  Discolored Hair
Dog      Beagle  Discolored Skin

I'd like to do this without iterating through the thousands of rows in the FINDINGS dataframe, using the value 'UNMAPPED' when no match is found.

I've tried the following:

s = terms.drop_duplicates(subset=['STRESC', 'CATEGORY']).set_index(['STRESC', 'CATEGORY'])['P_TERM']
findings['STRESC'] = findings.loc[:,['STRESC', 'CATEGORY']].map(s).fillna('UNMAPPED')

But I'm obviously not using the map function correctly. Can anyone point me in the right direction?

Answer 1

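If I understand your question correctly, you can use pandas merge to combine the dataframes on the category and then select the columns you want to keep. https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.merge.html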

findings = findings.merge(terms, on='CATEGORY')
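Since the question asks to match on both STRESC and CATEGORY and to fall back to 'UNMAPPED', a left merge on both columns may be closer to what is wanted than a merge on CATEGORY alone. The following is only a minimal sketch under assumptions: the sample data is rebuilt from the tables in the question, the third FINDINGS row (Cat / Siamese / Swollen / Paw) is hypothetical and added purely to show the fallback, and the names `lookup` and `merged` are illustrative.

```python
import pandas as pd

# Sample data rebuilt from the tables in the question.
terms = pd.DataFrame({
    "P_TERM": ["Discolored", "Discolored"],
    "STRESC": ["Stained", "Stained"],
    "CATEGORY": ["Hair", "Skin"],
})
findings = pd.DataFrame({
    "SPECIES": ["Rat", "Dog", "Cat"],           # third row is hypothetical
    "STRAIN": ["Wistar", "Beagle", "Siamese"],
    "STRESC": ["Stained", "Stained", "Swollen"],
    "CATEGORY": ["Hair", "Skin", "Paw"],
})

# Keep one P_TERM per (STRESC, CATEGORY) pair so the merge cannot multiply rows.
lookup = terms.drop_duplicates(subset=["STRESC", "CATEGORY"])

# A left merge keeps every findings row in its original order;
# rows with no match get NaN in P_TERM.
merged = findings.merge(lookup, on=["STRESC", "CATEGORY"], how="left")

# Overwrite STRESC with the mapped term, using 'UNMAPPED' where nothing matched.
# to_numpy() assigns by position, so a non-default index on findings is fine.
findings["STRESC"] = merged["P_TERM"].fillna("UNMAPPED").to_numpy()

print(findings)
```

With this data the two rows from the question end up with STRESC set to Discolored and the hypothetical third row is set to UNMAPPED, with no Python-level loop over the rows.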

