Reinforcement learning: neural networks

Time: 2016-07-01 15:43:56

Tags: neural-network reinforcement-learning

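When using a neural network to generalize over a large state space, what are the input units?

For example, if the state vector is one-dimensional, say a position on the real axis, is there then only one input unit? (Assuming a separate network for each action.)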

1 answer:

Answer 0 (score: 0)

Yes, at least if you use an algorithm like Q-learning or Sarsa, where the function approximator is supposed to learn the Q-function. In your case, if you use one neural network per action, each network must approximate Q(s, a) as a function of the state s for its own fixed action a. Moreover, if the state has dimension 1, each network needs only one input neuron.
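To make the "one input neuron, one network per action" setup concrete, here is a minimal pure-Python sketch. Everything in it is an assumption made for illustration and not part of the question: the class name `TinyQNetwork`, the hidden-layer size, the tanh activation, the two-action set, and the learning rate and discount values. It pairs the networks with a semi-gradient Q-learning update, but Sarsa would only change how the target is computed.

```python
import math
import random


class TinyQNetwork:
    """Hypothetical one-hidden-layer net: scalar state s -> scalar Q(s, a) for one fixed action."""

    def __init__(self, n_hidden=8, seed=0):
        rng = random.Random(seed)
        # One input neuron, so each hidden unit has exactly one input weight.
        self.w1 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, s):
        return [math.tanh(w * s + b) for w, b in zip(self.w1, self.b1)]

    def predict(self, s):
        hidden = self._hidden(s)
        return sum(w * h for w, h in zip(self.w2, hidden)) + self.b2

    def update(self, s, target, lr=0.05):
        """One gradient step on 0.5 * (target - Q(s))**2."""
        hidden = self._hidden(s)
        q = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        err = target - q
        for i, h in enumerate(hidden):
            # Backpropagate through tanh using the output weight before it changes.
            grad_h = self.w2[i] * (1.0 - h * h)
            self.w2[i] += lr * err * h
            self.w1[i] += lr * err * grad_h * s
            self.b1[i] += lr * err * grad_h
        self.b2 += lr * err


ACTIONS = ["left", "right"]  # assumed action set, for illustration only
q_nets = {a: TinyQNetwork(seed=i) for i, a in enumerate(ACTIONS)}


def q_learning_step(s, a, r, s_next, gamma=0.9):
    """Semi-gradient Q-learning: only the network of the action taken is updated."""
    target = r + gamma * max(net.predict(s_next) for net in q_nets.values())
    q_nets[a].update(s, target)
    return target
```

A call such as `q_learning_step(0.1, "left", 1.0, 0.2)` updates only the "left" network. A common alternative design is a single network with one input neuron and one output per action, which shares the hidden features across actions.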