pandas: merge (join) two DataFrames on multiple columns

Time: 2017-01-23 20:32:10

Tags: python python-3.x pandas join

I am trying to join two pandas DataFrames using two columns:

new_df = pd.merge(A_df, B_df,  how='left', left_on='[A_c1,c2]', right_on = '[B_c1,c2]')

but I get the following error:

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4164)()

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4028)()

pandas/src/hashtable_class_helper.pxi in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:13166)()

pandas/src/hashtable_class_helper.pxi in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:13120)()

KeyError: '[B_1, c2]'

Any idea what the right way to do this is? Thanks!

3 answers:

Answer 0: (score: 158)

Try this

new_df = pd.merge(A_df, B_df,  how='left', left_on=['A_c1','c2'], right_on = ['B_c1','c2'])

http://pandas.pydata.org/pandas-docs/version/0.19.1/generated/pandas.DataFrame.merge.html

  

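left_on : label or list, or array-like. Field names to join on in the left DataFrame. Can be a vector or list of vectors of the length of the DataFrame to use a particular vector as the join key instead of columns.

right_on : label or list, or array-like. Field names to join on in the right DataFrame, or vector/list of vectors per the left_on docs.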
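As a minimal sketch of what passing lists to left_on and right_on does, the snippet below builds two small DataFrames and left-joins them on two columns. The column names A_c1, B_c1 and c2 come from the question; the data and the a_val and b_val columns are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data; only the key column names come from the question.
A_df = pd.DataFrame({
    "A_c1": [1, 1, 2],
    "c2": ["x", "y", "x"],
    "a_val": [10, 20, 30],
})
B_df = pd.DataFrame({
    "B_c1": [1, 2],
    "c2": ["x", "x"],
    "b_val": [100, 300],
})

# Keys are paired by position: A_c1 with B_c1, and c2 with c2.
new_df = pd.merge(
    A_df,
    B_df,
    how="left",
    left_on=["A_c1", "c2"],
    right_on=["B_c1", "c2"],
)
print(new_df)
```

Because this is a left join, every row of A_df is kept. The row with A_c1 = 1 and c2 = "y" has no partner in B_df, so its b_val is NaN.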

Answer 1: (score: 1)

Another way: new_df = A_df.merge(B_df, left_on=['A_c1','c2'], right_on = ['B_c1','c2'], how='left')

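Answer 2: (score: 0)

The problem here is that putting quotes (apostrophes) around the whole bracketed expression turns it into a single string. As @Shijo pointed out from the documentation, the function expects a label or a list, so pandas looks for one column literally named '[A_c1,c2]' and cannot find it. When you pass a list of column names for the left and right DataFrames, each column name must be quoted individually. That is why the following is incorrect: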

new_df = pd.merge(A_df, B_df,  how='left', left_on='[A_c1,c2]', right_on = '[B_c1,c2]')

This is the correct way to call the function:

new_df = pd.merge(A_df, B_df,  how='left', left_on=['A_c1','c2'], right_on = ['B_c1','c2'])
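To see the difference concretely, here is a small sketch that tries the quoted-string form on toy data and catches the resulting exception. The DataFrames and the b_val column are hypothetical; only the key column names follow the question.

```python
import pandas as pd

# Hypothetical toy data using the question's key column names.
A_df = pd.DataFrame({"A_c1": [1, 2], "c2": ["x", "y"]})
B_df = pd.DataFrame({"B_c1": [1, 2], "c2": ["x", "y"], "b_val": [10, 20]})

# Incorrect: each argument is one string, treated as a single column label.
try:
    pd.merge(A_df, B_df, how="left", left_on="[A_c1,c2]", right_on="[B_c1,c2]")
    error = None
except KeyError as exc:
    error = exc
print(repr(error))

# Correct: each argument is a list of individually quoted column names.
new_df = pd.merge(
    A_df, B_df, how="left", left_on=["A_c1", "c2"], right_on=["B_c1", "c2"]
)
print(new_df)
```

The first call fails with a KeyError because no column has the literal name given in the string, while the second call matches rows on both key columns.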