Implementation of a decision tree algorithm

Time: 2018-08-20 19:17:05

Tags: python machine-learning decision-tree

When I run this code, which implements the decision tree algorithm, I get the following error:

IndexError: too many indices for array

What I actually want is to assign a partition of a 2-D NumPy array to another 2-D NumPy array, but by default np.append flattens the array. I am new to Python, so this may be a silly question.

Code:

    def gen_decision_tree(data, attribute_list):
        y = data[:,8]
        root = Node(None)
        if(y.all()):
            root.data = y[0]
            return root
        else:
            splitting_attribute = information_gain(data, attribute_list)
            root.data = splitting_attribute
            attribute_list.remove(splitting_attribute)
            data_left = np.array([]).astype(int)
            data_right = np.array([]).astype(int)
            for i in range(0, (len(y) - 1)):
                if(data[i, splitting_attribute] == 0):
                    data_left = np.append(data_left, data[i, :])
                else:
                    data_right = np.append(data_right, data[i, :])
            #print(data_left)
            #print(data_right)
            root.left = gen_decision_tree(data_left, attribute_list)
            root.right = gen_decision_tree(data_right, attribute_list)
            return root

Traceback:

    IndexError
    ----> 1 gen_decision_tree(data, attribute_list)

    ---> 23 root.left = gen_decision_tree(data_left, attribute_list)

    <ipython-input-86-10fd7c8ef9e5> in gen_decision_tree(data, attribute_list)
          1 def gen_decision_tree(data, attribute_list):
    ----> 2     y = data[:,8]
          3     root = Node(None)
          4     if(y.all()):
          5         root.data = y[0]

    IndexError: too many indices for array
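The flattening described above can be reproduced in isolation. The following is only a minimal sketch: the 4x9 toy array and the choice of column 0 as `splitting_attribute` are assumptions made for illustration, not the data from the question. It shows that `np.append` without an `axis` argument returns a 1-D array, that indexing that result with `[:, 8]` raises the same `IndexError`, and that a boolean mask is one possible way to keep each partition 2-D.

```python
import numpy as np

# Hypothetical toy data: 4 rows, 9 columns, with column 8 as the label.
data = np.array([
    [0, 1, 0, 0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 1],
    [1, 1, 1, 0, 0, 0, 0, 0, 0],
])
splitting_attribute = 0  # assumed column index, for illustration only

# Without an axis argument, np.append flattens its inputs,
# so the accumulated rows end up in a single 1-D array.
flat = np.array([]).astype(int)
for row in data:
    if row[splitting_attribute] == 0:
        flat = np.append(flat, row)
print(flat.ndim, flat.shape)

# Two indices on a 1-D array raise the IndexError from the question.
try:
    flat[:, 8]
except IndexError as exc:
    print("IndexError:", exc)

# A boolean mask selects whole rows, so both partitions stay 2-D.
mask = data[:, splitting_attribute] == 0
data_left = data[mask]
data_right = data[~mask]
print(data_left.shape, data_right.shape)
```

With the mask, `data_left[:, 8]` remains valid in a recursive call. The mask also covers every row, whereas the loop over `range(0, (len(y) - 1))` in the question stops one row short of the end.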


0 answers:

No answers