Question

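Details of the data:

The dataset has 116 columns and 30229 rows and is loaded as a pandas DataFrame. The last column is the dependent variable; all other columns are independent variables.

X and y are float64 and int64, respectively. train and test are numpy ndarray objects.

I defined a function for the Euclidean distance: ((x1 - x2)^2 + (y1 - y2)^2 + ...)^(1/2).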
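For reference, the same formula can be written in vectorized form. This is only a minimal sketch, assuming both points are 1-D numeric sequences of equal length; the name `euclidean_distance` is hypothetical and is not part of the code posted below.

```python
import numpy as np

def euclidean_distance(p, q):
    """Return sqrt(sum((p_i - q_i)^2)) for two equal-length points."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.shape != q.shape:
        raise ValueError("points must have the same shape")
    # Element-wise differences, squared, summed, then square-rooted.
    return float(np.sqrt(np.sum((p - q) ** 2)))

print(euclidean_distance([0, 0], [3, 4]))  # 5.0
```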

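I am stuck on the last line of the code, which raises "IndexError: too many indices for array". The error stays the same even if I just enter train[0] or train[0][0].

Please help me resolve this. Let me know if you need more details.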

Code: `

import numpy as np
import pandas as pd
import math as mt
import matplotlib.pyplot as plt

dataset = pd.read_csv('Quote_Viewed_Doc_Updated.csv')
X = dataset.iloc[:,:-1].values
y = dataset.iloc[:,115].values

#splitting into Test and Train

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size = 1/3, random_state = 0)

# 1) given two data points, calculate the euclidean distance between them
def get_distance(data1, data2):
    points = zip(data1, data2)
    diffs_squared_distance = [pow(a - b, 2) for (a, b) in points]
    return mt.sqrt(sum(diffs_squared_distance))

# reformat train/test datasets for convenience
train = np.array(zip(X_train,y_train))
test = np.array(zip(X_test, y_test))

get_distance(train[0][0], train[1][0])

`

IndexError: too many indices for array
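
A likely cause, assuming the code runs under Python 3: there `zip` returns a lazy iterator, so `np.array(zip(...))` wraps the iterator object itself in a 0-dimensional object array, and any index such as `train[0]` raises this error. Below is a minimal sketch of that behavior and one possible workaround, using small made-up arrays in place of the real data (`X_demo`, `y_demo`, `broken` and `train_demo` are hypothetical names).

```python
import numpy as np

# Made-up stand-ins for X_train / y_train (3 samples, 2 features).
X_demo = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
y_demo = np.array([0, 1, 0])

# In Python 3, zip() is an iterator; NumPy stores it as one opaque object.
broken = np.array(zip(X_demo, y_demo))
print(broken.ndim)  # 0, so broken[0] raises IndexError

# Materializing the pairs in a plain list keeps the
# train[i][0] (features) / train[i][1] (label) indexing.
train_demo = list(zip(X_demo, y_demo))
first_features, first_label = train_demo[0]
print(first_features, first_label)
```

With the pairs held in a list, `get_distance(train[0][0], train[1][0])` would compare the feature vectors of the first two samples. Alternatively, indexing `X_train[0]` and `X_train[1]` directly avoids the zip step altogether.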

0 Answers: