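I did some analysis and found a particular pattern, and now I am trying to make some predictions. I have a dataset that predicts the ratings of students who had a given number of accidents in childhood. My prediction matrix looks like this: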
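A (prediction matrix)
injuries  ratings
0         5
1         4.89
2         4.34
3         3.99
4         3.89
5         3.77
My dataset, B, looks like this:
B
siblings  income  injuries  total_scoldings_from_father
3         12000   4         9
4         34000   5         22
1         23400   3         12
3         24330   1         1
0         12000   1         12
Now I want to create a column named predictions in B that matches each injuries value in B against the entries in A and returns the corresponding rating.
Please help.
Also, please suggest a title, since mine is missing everything that matters for future reference.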
Answer 0 (score: 1)
If all the values to be mapped are present in DataFrame A, you can use map:
B['predictions'] = B['injuries'].map(A.set_index('injuries')['ratings'])
print (B)
   siblings  income  injuries  total_scoldings_from_father  predictions
0         3   12000         4                            9         3.89
1         4   34000         5                           22         3.77
2         1   23400         3                           12         3.99
3         3   24330         1                            1         4.89
4         0   12000         1                           12         4.89
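For anyone who wants to run the map approach end to end, here is a minimal self-contained sketch. The two frames are rebuilt from the tables in the question, and the helper name add_predictions is hypothetical, introduced only for illustration. It also shows why the condition above matters: an injuries value that does not appear in A comes back as NaN instead of raising an error.

```python
import pandas as pd

# Lookup table A and dataset B, rebuilt from the tables in the question.
A = pd.DataFrame({
    "injuries": [0, 1, 2, 3, 4, 5],
    "ratings": [5, 4.89, 4.34, 3.99, 3.89, 3.77],
})
B = pd.DataFrame({
    "siblings": [3, 4, 1, 3, 0],
    "income": [12000, 34000, 23400, 24330, 12000],
    "injuries": [4, 5, 3, 1, 1],
    "total_scoldings_from_father": [9, 22, 12, 1, 12],
})


def add_predictions(data, lookup):
    """Hypothetical helper: return a copy of data with a 'predictions' column."""
    # Index the lookup by 'injuries' so that map can use each value as a key.
    rating_by_injuries = lookup.set_index("injuries")["ratings"]
    out = data.copy()
    # Values that are missing from the lookup become NaN.
    out["predictions"] = out["injuries"].map(rating_by_injuries)
    return out


B_pred = add_predictions(B, A)
print(B_pred)
```

Working on a copy leaves B itself unchanged, which is a design choice of this sketch and not something the answer requires.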
Another solution with merge:
C = pd.merge(B,A)
print (C)
   siblings  income  injuries  total_scoldings_from_father  ratings
0         3   12000         4                            9     3.89
1         4   34000         5                           22     3.77
2         1   23400         3                           12     3.99
3         3   24330         1                            1     4.89
4         0   12000         1                           12     4.89
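One caveat with merge is that the default join is an inner join, so any row of B whose injuries value is missing from A is dropped. The sketch below is a possible variation, not part of the original answer: it passes how="left" to keep every row of B and renames ratings to predictions. The last row of B here (injuries of 7) is a hypothetical addition, made up only to show the difference.

```python
import pandas as pd

# Lookup table A, rebuilt from the table in the question.
A = pd.DataFrame({
    "injuries": [0, 1, 2, 3, 4, 5],
    "ratings": [5, 4.89, 4.34, 3.99, 3.89, 3.77],
})
# B from the question, plus one HYPOTHETICAL extra row (injuries = 7)
# that has no matching entry in A.
B = pd.DataFrame({
    "siblings": [3, 4, 1, 3, 0, 2],
    "income": [12000, 34000, 23400, 24330, 12000, 15000],
    "injuries": [4, 5, 3, 1, 1, 7],
    "total_scoldings_from_father": [9, 22, 12, 1, 12, 3],
})

# Default (inner) join: the unmatched row is silently dropped.
inner = pd.merge(B, A, on="injuries")

# Left join: every row of B is kept, and the unmatched row gets NaN.
left = (
    pd.merge(B, A, on="injuries", how="left")
    .rename(columns={"ratings": "predictions"})
)

print(len(inner), len(left))
print(left)
```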