I am learning pandas and need help with the following. I am trying to find the most highly correlated features from a correlation matrix.
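import numpy as np
import pandas as pd

# Iris Dataset
features = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class']
data = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data",
                   header=None,
                   names=features)
# Correlate only the numeric columns ('class' holds strings)
correlation = data.drop(columns='class').corr()
# Keep the strict upper triangle so each pair appears once, then sort descending
upper = np.triu(np.ones(correlation.shape), k=1).astype(bool)
c = correlation.where(upper).stack().sort_values(ascending=False)
highest = c[c > 0.5]
print(highest)
print(highest.index)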
The output of the snippet above is:
petal_length  petal_width     0.962757
sepal_length  petal_length    0.871754
              petal_width     0.817954
dtype: float64
MultiIndex(levels=[['sepal_length', 'sepal_width', 'petal_length', 'petal_width'], ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']],
labels=[[2, 0, 0], [3, 2, 3]])
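Is it possible to convert the output of the `highest` Series into a list in the format shown below?
list = [['petal_length','petal_width',0.962757],['sepal_length','petal_length',0.871754],['sepal_length','petal_width',0.817954]]
In layman's terms, I need both index columns of the Series in the list along with the values.
I tried the following and it works, but it only returns the index pairs. I need the list shown above: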
length = highest.shape[0]
list = []
for i in range(length):
list.append(highest.index[i])
print('list =',list)
Output:
list = [('petal_length', 'petal_width'), ('sepal_length', 'petal_length'), ('sepal_length', 'petal_width')]
Thanks.
Answer 0 (score: 2)
Yes, use:
highest.reset_index().values.tolist()
Output:
[['petal_length', 'petal_width', 0.9627570970509667],
['sepal_length', 'petal_length', 0.8717541573048719],
['sepal_length', 'petal_width', 0.8179536333691635]]
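For anyone who wants to try this without downloading the dataset, here is a minimal sketch. It assumes a hand-built stand-in for `highest` (same index pairs, values rounded to the figures printed in the question), and it also shows a list comprehension over `Series.items()` as one possible alternative to `reset_index()`. The names `via_reset` and `via_items` are just illustrative.

```python
import pandas as pd

# Hand-built stand-in for the `highest` Series from the question
# (rounded values; the index is a two-level MultiIndex of feature pairs).
index = pd.MultiIndex.from_tuples([
    ("petal_length", "petal_width"),
    ("sepal_length", "petal_length"),
    ("sepal_length", "petal_width"),
])
highest = pd.Series([0.962757, 0.871754, 0.817954], index=index)

# Approach from the answer: move both index levels into columns,
# then convert the rows to nested lists.
via_reset = highest.reset_index().values.tolist()

# Alternative: iterate over (index_tuple, value) pairs directly.
via_items = [[a, b, value] for (a, b), value in highest.items()]

print(via_reset)
print(via_items)
```

Both approaches should give the same nested list here. The `reset_index()` route is shorter, while the comprehension makes it easy to round or reorder the elements along the way.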