I'm trying to get the code below to implement a decision tree algorithm, but I run into the following error:

    IndexError: too many indices for array

What I actually want is to assign a partition of a 2-D NumPy array to another 2-D NumPy array, but by default the function (np.append in the code below) flattens the array.

    def gen_decision_tree(data, attribute_list):
        y = data[:,8]
        root = Node(None)
        if(y.all()):
            root.data = y[0]
            return root
        else:
            splitting_attribute = information_gain(data, attribute_list)
            root.data = splitting_attribute
            attribute_list.remove(splitting_attribute)
            data_left = np.array([]).astype(int)
            data_right = np.array([]).astype(int)
            for i in range(0, (len(y) - 1)):
                if(data[i, splitting_attribute] == 0):
                    data_left = np.append(data_left, data[i, :])
                else:
                    data_right = np.append(data_right, data[i, :])
            #print(data_left)
            #print(data_right)
            root.left = gen_decision_tree(data_left, attribute_list)
            root.right = gen_decision_tree(data_right, attribute_list)
            return root

The traceback:

    IndexError
    ----> 1 gen_decision_tree(data, attribute_list)
    ---> 23 root.left = gen_decision_tree(data_left, attribute_list)
    <ipython-input-86-10fd7c8ef9e5> in gen_decision_tree(data, attribute_list)
          1 def gen_decision_tree(data, attribute_list):
    ----> 2     y = data[:,8]
          3     root = Node(None)
          4     if(y.all()):
          5         root.data = y[0]
    IndexError: too many indices for array

Since I'm new to Python, this may be a silly question.
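
For reference, here is a minimal sketch of what seems to be going on, using a made-up 4x3 array rather than the real data (the array contents and `split_col` are assumptions for illustration only). Calling `np.append` without an `axis` argument flattens its inputs, so the partition becomes 1-D and a later `data[:, 8]`-style index fails. Selecting rows with a boolean mask is one possible way to keep each partition 2-D.

```python
import numpy as np

# Hypothetical toy data: 4 rows, 3 columns (not the real dataset).
data = np.array([
    [0, 1, 1],
    [1, 0, 0],
    [0, 0, 1],
    [1, 1, 0],
])
split_col = 0  # hypothetical index of the splitting attribute

# np.append with no axis argument flattens its inputs, so rows
# appended one at a time end up in a single 1-D array.
flat = np.append(np.array([]).astype(int), data[0, :])
flat = np.append(flat, data[2, :])
print(flat.ndim, flat.shape)

# Indexing that 1-D array with two indices reproduces the error.
try:
    flat[:, 2]
    raised = False
except IndexError:
    raised = True
print("IndexError raised:", raised)

# A boolean mask selects whole rows, so both partitions stay 2-D.
mask = data[:, split_col] == 0
data_left = data[mask]
data_right = data[~mask]
print(data_left.shape, data_right.shape)
```

The mask also covers every row, whereas `range(0, len(y) - 1)` in the loop above stops one row short of the end.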
WITH LossSummary AS (
    SELECT
        Cat.CustomerCategoryName,
        SUM(OL.Quantity * OL.UnitPrice) AS SumLossesOrdersNotConverted,
        C.CustomerName AS CustName,
        C.CustomerID AS CustID,
        ROW_NUMBER() OVER (
            PARTITION BY Cat.CustomerCategoryName
            ORDER BY SUM(OL.Quantity * OL.UnitPrice) DESC
        ) AS [Order]
    FROM Sales.OrderLines AS OL
    JOIN Sales.Orders AS O ON OL.OrderID = O.OrderID
    JOIN Sales.Customers AS C ON C.CustomerID = O.CustomerID
    JOIN Sales.CustomerCategories AS Cat ON Cat.CustomerCategoryID = C.CustomerCategoryID
    WHERE NOT EXISTS (
        SELECT *
        FROM Sales.Invoices AS I
        WHERE O.OrderID = I.OrderID
    )
    GROUP BY C.CustomerID, C.CustomerName, Cat.CustomerCategoryName
)
SELECT *
FROM LossSummary
WHERE [Order] = 1
;