How to match two columns using one of them as a reference?

Time: 2016-08-17 09:40:58

Tags: python pandas dataframe mapping multiple-columns

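I did some analysis and found a specific pattern, and now I am trying to make some predictions. I have a dataset for predicting the ratings of students who had a certain number of accidents during childhood. My prediction matrix looks like this: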

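   A
   injuries      ratings
         0            5
         1            4.89
         2            4.34
         3            3.99
         4            3.89
         5            3.77

My dataset looks like this:

   B
   siblings  income  injuries  total_scoldings_from_father
          3   12000         4                            9
          4   34000         5                           22
          1   23400         3                           12
          3   24330         1                            1
          0   12000         1                           12

Now I want to create a column named predictions that matches the injuries entries of B against the injuries entries of A and returns the corresponding ratings as a new column in B.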

Please help.

Please also suggest a better title, since mine lacks everything that matters for future reference.

1 answer:

Answer 0 (score: 1)

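If all the values to be mapped are present in DataFrame A, you can use map (any value of B['injuries'] that is missing from A would become NaN):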

B['predictions'] = B['injuries'].map(A.set_index('injuries')['ratings'])
print (B)
   siblings  income  injuries  total_scoldings_from_father  predictions
0         3   12000         4                            9         3.89
1         4   34000         5                           22         3.77
2         1   23400         3                           12         3.99
3         3   24330         1                            1         4.89
4         0   12000         1                           12         4.89
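
To make the map approach reproducible, here is a minimal self-contained sketch. It rebuilds A and B from the tables in the question and then applies the same map call. The construction code is an assumption, since the question only shows the printed tables.

```python
import pandas as pd

# Lookup table from the question: injuries -> ratings.
A = pd.DataFrame({
    "injuries": [0, 1, 2, 3, 4, 5],
    "ratings": [5, 4.89, 4.34, 3.99, 3.89, 3.77],
})

# Dataset from the question (built by hand here, which is an assumption).
B = pd.DataFrame({
    "siblings": [3, 4, 1, 3, 0],
    "income": [12000, 34000, 23400, 24330, 12000],
    "injuries": [4, 5, 3, 1, 1],
    "total_scoldings_from_father": [9, 22, 12, 1, 12],
})

# A Series indexed by injuries works as a lookup table for Series.map.
lookup = A.set_index("injuries")["ratings"]
B["predictions"] = B["injuries"].map(lookup)

print(B)
```

Setting injuries as the index is what turns A into a key-to-value mapping. This relies on each injuries value appearing only once in A.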

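Another solution uses merge, which by default performs an inner join on the columns the two DataFrames share (here, injuries):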

C = pd.merge(B,A)
print (C)
   siblings  income  injuries  total_scoldings_from_father  ratings
0         3   12000         4                            9     3.89
1         4   34000         5                           22     3.77
2         1   23400         3                           12     3.99
3         3   24330         1                            1     4.89
4         0   12000         1                           12     4.89
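
The two approaches behave differently when B contains an injuries value that is absent from A. The sketch below uses a hypothetical value of 7 and a cut-down B (neither is in the question's data) to show the difference: map leaves NaN, the default inner merge drops the row, and merge with how='left' keeps the row with NaN.

```python
import pandas as pd

A = pd.DataFrame({
    "injuries": [0, 1, 2, 3, 4, 5],
    "ratings": [5, 4.89, 4.34, 3.99, 3.89, 3.77],
})

# Hypothetical data: 7 does not appear in A["injuries"].
B = pd.DataFrame({"siblings": [3, 2], "injuries": [4, 7]})

# map: the unmatched key becomes NaN, and B keeps all of its rows.
mapped = B["injuries"].map(A.set_index("injuries")["ratings"])

# Default merge: inner join on the shared "injuries" column drops the unmatched row.
inner = pd.merge(B, A)

# Left merge: every row of B is kept, and the unmatched row gets NaN in ratings.
left = pd.merge(B, A, on="injuries", how="left")

print(mapped)
print(inner)
print(left)
```

If every row of B must survive, map or a left merge is probably the safer choice. The default merge is fine when all keys are known to exist in A, as in the question.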