Pandas: Use the map function to look up values in another df

Time: 2018-11-16 07:43:54

Tags: python pandas dataframe

I want to update the values in df1 using the map function, based on lookup values in df2. The lookup column is ISIN_CUSIP_CODE.

import pandas as pd

# The original post built these frames with pd.DataFrame.from_items,
# which was removed in pandas 1.0; a dict constructor is used here instead.
df1 = pd.DataFrame({
    'ISIN_CUSIP_CODE': ['US68323ABL70', '9128284D9', '912828W89', 'CA135087J470', 'CA135087J470',
                        '912796QP7', 'US20030NCM11', 'US912810SD19', 'XS1851277969'],
    'Product': ['GOVT', 'GOVT', 'GOVT', 'GOVT', 'GOVT', 'GOVT', '', '', ''],
})
print(df1)

df2 = pd.DataFrame({
    'ISIN_CUSIP_CODE': ['US20030NCM11', 'US912810SD19', 'XS1851277969',
                        'XS1391086987', 'CA064151BL66', 'CA13595ZZ661'],
    'Product_MRD': ['CORP', 'GOVT', 'CORP', 'CORP', 'CORP', 'CORP'],
})
print(df2)

df1

  ISIN_CUSIP_CODE Product
0    US68323ABL70    GOVT
1       9128284D9    GOVT
2       912828W89    GOVT
3    CA135087J470    GOVT
4    CA135087J470    GOVT
5       912796QP7    GOVT
6    US20030NCM11
7    US912810SD19
8    XS1851277969

df2

  ISIN_CUSIP_CODE Product_MRD
0    US20030NCM11        CORP
1    US912810SD19        GOVT
2    XS1851277969        CORP
3    XS1391086987        CORP
4    CA064151BL66        CORP
5    CA13595ZZ661        CORP

My map function is not returning the lookup values from df2.

2 answers:

Answer 0 (score: 2)

A pure pandas solution:

(
    pd.concat([df1, df2.rename(columns={'Product_MRD': 'Product'})])
    .drop_duplicates(['ISIN_CUSIP_CODE'], keep='last')
    .sort_values('ISIN_CUSIP_CODE')
)

No extra libraries are needed.
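To see what the concat one-liner above produces, here is a self-contained sketch that rebuilds the question's frames. It assumes plain dict constructors in place of `DataFrame.from_items` (which is not available from pandas 1.0 onward), and the name `result` is introduced here only for illustration.

```python
import pandas as pd

# Frames from the question, rebuilt with dict constructors (an assumption;
# the original post used DataFrame.from_items).
df1 = pd.DataFrame({
    'ISIN_CUSIP_CODE': ['US68323ABL70', '9128284D9', '912828W89', 'CA135087J470', 'CA135087J470',
                        '912796QP7', 'US20030NCM11', 'US912810SD19', 'XS1851277969'],
    'Product': ['GOVT', 'GOVT', 'GOVT', 'GOVT', 'GOVT', 'GOVT', '', '', ''],
})
df2 = pd.DataFrame({
    'ISIN_CUSIP_CODE': ['US20030NCM11', 'US912810SD19', 'XS1851277969',
                        'XS1391086987', 'CA064151BL66', 'CA13595ZZ661'],
    'Product_MRD': ['CORP', 'GOVT', 'CORP', 'CORP', 'CORP', 'CORP'],
})

# Stack df2 under df1, then keep the last row seen for each code,
# so a df2 row wins whenever the same code appears in both frames.
result = (
    pd.concat([df1, df2.rename(columns={'Product_MRD': 'Product'})])
    .drop_duplicates(['ISIN_CUSIP_CODE'], keep='last')
    .sort_values('ISIN_CUSIP_CODE')
)
print(result)
```

Note that the result holds one row per distinct code across both frames: codes that appear only in df2 are added, and the repeated CA135087J470 row in df1 collapses into one. If df1 must keep its original rows, this approach may not be what you want.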

Answer 1 (score: 1)

Here is a simple solution using functools.partial.

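from functools import partial

def lookup(row, lookup_df):
    # Return Product_MRD for the row's code if it exists in lookup_df
    try:
        return lookup_df[lookup_df.ISIN_CUSIP_CODE == row['ISIN_CUSIP_CODE']].Product_MRD.values[0]
    except IndexError:
        # No match in lookup_df: keep the existing Product value
        return row['Product']

# Apply row by row, binding df2 as the lookup table
df1['ProductLooked'] = df1.apply(partial(lookup, lookup_df=df2), axis=1)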
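Since the question asks specifically about map, here is a minimal sketch of a map-based lookup for comparison. It is not taken from either answer. It assumes the codes in df2 are unique (mapping through a Series with a duplicated index fails), and the name `lookup_series` is introduced here for illustration.

```python
import pandas as pd

# Frames from the question, rebuilt with dict constructors (an assumption;
# the original post used DataFrame.from_items).
df1 = pd.DataFrame({
    'ISIN_CUSIP_CODE': ['US68323ABL70', '9128284D9', '912828W89', 'CA135087J470', 'CA135087J470',
                        '912796QP7', 'US20030NCM11', 'US912810SD19', 'XS1851277969'],
    'Product': ['GOVT', 'GOVT', 'GOVT', 'GOVT', 'GOVT', 'GOVT', '', '', ''],
})
df2 = pd.DataFrame({
    'ISIN_CUSIP_CODE': ['US20030NCM11', 'US912810SD19', 'XS1851277969',
                        'XS1391086987', 'CA064151BL66', 'CA13595ZZ661'],
    'Product_MRD': ['CORP', 'GOVT', 'CORP', 'CORP', 'CORP', 'CORP'],
})

# Series indexed by code, with Product_MRD as the values.
lookup_series = df2.set_index('ISIN_CUSIP_CODE')['Product_MRD']

# map yields NaN for codes missing from df2; fillna then falls back
# to the existing Product value on the same row.
df1['ProductLooked'] = df1['ISIN_CUSIP_CODE'].map(lookup_series).fillna(df1['Product'])
print(df1)
```

Unlike the concat approach, this keeps df1's rows and order unchanged and only fills in values for codes found in df2.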