Question

I have this list:

new_tree = {'cues': 'glucose_tol',
  'directions': '<=',
  'thresholds': '122.5',
  'exits': 1.0,
  'children': [{'cues': True},
   {'cues': 'mass_index',
    'directions': '<=',
    'thresholds': '30.8',
    'exits': 1.0,
    'children': [{'cues': 'pedigree',
      'directions': '<=',
      'thresholds': '0.305',
      'exits': 1.0,
      'children': [{'cues': True},
       {'cues': 'diastolic_pb',
        'directions': '<=',
        'thresholds': '77.0',
        'exits': 1,
        'children': [{'cues': True}, 
        {'cues': 'insulin',
         'directions': '<=',
         'thresholds': '480',
         'exits': '0.5',
         'children': [{'cues': True}, {'cues': False}]}]}]}]}]}

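I want to get the path that each data point takes through this tree-shaped list, so that I know where each data point ends up and can then do some calculations.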

I have the data points in df (only 2 data points shown, for illustration):

print(df)

times_pregnant,glucose_tol,diastolic_pb,triceps,insulin,mass_index,pedigree,age,label
6,148,72,35,0,33.6,0.627,50,1
1,85,66,29,0,26.6,0.351,31,0

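The first one will pass through glucose_tol, mass_index, pedigree, and diastolic_pb, and will be classified as True. How can I get these 4 cues from the list that the data point passes through, and save them for later calculations? Any help would be greatly appreciated.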

Answer 1

This looks like a decision tree.

The way it works is that at each step you are either at a final decision ('cues': True or 'cues': False), or you need to make a decision.

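To make a decision, take the field named by 'cues' from the data frame and form a condition from 'directions' and 'thresholds'. The first one is basically if glucose_tol <= 122.5. Each node should have 2 children; I assume the first is the true case and the second is the false case (if you know the domain, this should be obvious to you). You then pick the child according to your decision and continue.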
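As a rough illustration of forming the condition for a single node, here is a minimal sketch. The helper name `node_condition` and the `COMPARATORS` table are made up for this example, and plain dicts stand in for the data frame rows. Only '<=' appears in the question; the other operators are an assumption about what 'directions' might contain.

```python
import operator

# Map the 'directions' strings to comparison functions.
# Only '<=' appears in the question; the other entries are assumptions.
COMPARATORS = {'<=': operator.le, '<': operator.lt,
               '>=': operator.ge, '>': operator.gt}

def node_condition(node, row):
    """Return True if the row satisfies this node's condition."""
    compare = COMPARATORS[node['directions']]
    # Thresholds are stored as strings in the tree, so convert before comparing.
    return compare(float(row[node['cues']]), float(node['thresholds']))

# Root node of the tree (children omitted) and the two rows from the question.
root = {'cues': 'glucose_tol', 'directions': '<=', 'thresholds': '122.5'}
first_row = {'glucose_tol': 148, 'mass_index': 33.6}
second_row = {'glucose_tol': 85, 'mass_index': 26.6}

print(node_condition(root, first_row))   # False, since 148 > 122.5
print(node_condition(root, second_row))  # True, since 85 <= 122.5
```

Because a pandas row also supports `row[column_name]`, the same helper should work on a row taken from df.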

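Probably the simplest approach is to implement a recursive function. Once you have a function that evaluates the tree against one row of data, you can extend it to store whatever you need or find interesting.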
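A minimal sketch of such a recursive function follows. The name `trace_path` is hypothetical, and the child ordering (first child when the condition is true, second when it is false) is the assumption stated above, not something the data confirms. Note that the mass_index node in the question has only one child, so this sketch simply follows it regardless of the condition. The 'exits' field is ignored.

```python
import operator

# Assumption: only '<=' occurs in the question; the rest are for completeness.
COMPARATORS = {'<=': operator.le, '<': operator.lt,
               '>=': operator.ge, '>': operator.gt}

# The tree from the question, written compactly.
new_tree = {'cues': 'glucose_tol', 'directions': '<=', 'thresholds': '122.5', 'exits': 1.0, 'children': [
    {'cues': True},
    {'cues': 'mass_index', 'directions': '<=', 'thresholds': '30.8', 'exits': 1.0, 'children': [
        {'cues': 'pedigree', 'directions': '<=', 'thresholds': '0.305', 'exits': 1.0, 'children': [
            {'cues': True},
            {'cues': 'diastolic_pb', 'directions': '<=', 'thresholds': '77.0', 'exits': 1, 'children': [
                {'cues': True},
                {'cues': 'insulin', 'directions': '<=', 'thresholds': '480', 'exits': '0.5', 'children': [
                    {'cues': True}, {'cues': False}]}]}]}]}]}

def trace_path(node, row, path=None):
    """Walk the tree for one row; return (cues visited, final True/False label)."""
    if path is None:
        path = []
    cue = node['cues']
    if isinstance(cue, bool):  # leaf: 'cues' holds the final decision
        return path, cue
    path.append(cue)
    passed = COMPARATORS[node['directions']](float(row[cue]), float(node['thresholds']))
    children = node['children']
    # Assumption: first child = condition true, second child = condition false.
    # A node with a single child (mass_index here) is followed unconditionally.
    if len(children) == 1:
        next_node = children[0]
    else:
        next_node = children[0] if passed else children[1]
    return trace_path(next_node, row, path)

# The two data points from the question, as plain dicts.
rows = [
    {'times_pregnant': 6, 'glucose_tol': 148, 'diastolic_pb': 72, 'triceps': 35,
     'insulin': 0, 'mass_index': 33.6, 'pedigree': 0.627, 'age': 50, 'label': 1},
    {'times_pregnant': 1, 'glucose_tol': 85, 'diastolic_pb': 66, 'triceps': 29,
     'insulin': 0, 'mass_index': 26.6, 'pedigree': 0.351, 'age': 31, 'label': 0},
]

results = [trace_path(new_tree, row) for row in rows]
print(results[0])  # (['glucose_tol', 'mass_index', 'pedigree', 'diastolic_pb'], True)
print(results[1])  # (['glucose_tol'], True)
```

Under these assumptions the first row visits the 4 cues named in the question. With a real data frame, something like `df.apply(lambda r: trace_path(new_tree, r), axis=1)` should give one (path, label) pair per row, which you can keep for later calculations.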
