Why does "sort_values" not work correctly?

Time: 2019-01-09 15:20:08

Tags: python python-3.x pandas

I am trying to print the values in target_playlist. The problem is that I want to sort the rows of target_playlist by the percentuali column, so I used target_playlist.sort_values('percentuali', inplace=True, ascending=False). Before the call to sort_values, the output of:

print("{}".format(target_playlist['percentuali'][i]))

is:

0.7010264012452779
0.19662758090847976
0.6508863154849628
0.557740362863367
0.47418798688188313
0.6634307395184526
0.17661982395954637
0.6334661569944786
0.5226247859195567
0.37647399781797003
0.6107562358792401
0.10866013071895426
0.6259167928556538
0.5107723732317271
0.5107723732317271
0.440188723891383
0.473270990299173
0.5807994015581672
0.45540535868625753
0.4156854080449265
0.5659237264842225
0.5942257114281826
0.5763053500588216
0.43676171660260443
0.6947640279542424
0.37155299947773396
0.6055124707313475
0.6642522917728619
0.6339323841512609
0.6836084778718268
0.4585485761594801
0.7687767193517359
0.7739306342996543
0.6792746883779797
0.5688985142793829
0.5763507447689178
0.6265388222033668
0.5262211637961803
0.631776719351736
0.7016345319242638
0.6549247063300238
0.6218895455057429
0.3926510809451985
0.5081035167373568
0.6149459682682933
0.44069739392952245
0.46799465192894985
0.69161263493496
0.5534053586862575
0.6968509819258842
0.4988988577428972
0.5059165111353879
0.7355655050414504
0.6792746883779797
0.4401208506283063
0.49320548887003335
0.5112768045242271
0.7361528565218765
0.2329438202247191
0.6123902228073447
0.49864712823852325
0.6909989415739581
0.6754433860184025
0.566520509644565
0.37663089180304893
0.6529677236233883
0.6089596366830047
0.7687767193517359
0.6101347817993262
0.7559795411177228

When I print the values after calling sort_values, they are:

Titolo: Possibili Scenari,  Artista:  Cesare Cremonini,  Probabilita: 0.7559795411177228 
Titolo: Shallow,  Artista:  Lady Gaga,  Probabilita: 0.7559795411177228 
Titolo: To the Trees,  Artista:  An Early Bird,  Probabilita: 0.7559795411177228 
Titolo: If You Wanna Love Somebody - Acoustic,  Artista:  Tom Odell,  Probabilita: 0.7559795411177228 
Titolo: Happier - Acoustic,  Artista:  Ed Sheeran,  Probabilita: 0.7559795411177228 
Titolo: Lie With Me,  Artista:  Josiah and the Bonnevilles,  Probabilita: 0.7559795411177228 
Titolo: Jubilee Road,  Artista:  Tom Odell,  Probabilita: 0.7559795411177228 
Titolo: I'll Never Love Again - Film Version,  Artista:  Lady Gaga,  Probabilita: 0.7559795411177228 
Titolo: Rise - Acoustic,  Artista:  Jonas Blue,  Probabilita: 0.7559795411177228 
Titolo: Hold My Girl,  Artista:  George Ezra,  Probabilita: 0.7559795411177228 
Titolo: Love Someone,  Artista:  Lukas Graham,  Probabilita: 0.7559795411177228 
Titolo: Angels,  Artista:  Tom Walker,  Probabilita: 0.7559795411177228 
Titolo: These Days (feat. Jess Glynne, Macklemore & Dan Caplen) - Acoustic,  Artista:  Rudimental,  Probabilita: 0.7559795411177228 
Titolo: Just For Tonight - Acoustic,  Artista:  James Bay,  Probabilita: 0.7559795411177228 
Titolo: Perfect,  Artista:  Ed Sheeran,  Probabilita: 0.7559795411177228 
Titolo: No Roots,  Artista:  Joshua Hyslop,  Probabilita: 0.7559795411177228 
Titolo: Slide,  Artista:  James Bay,  Probabilita: 0.7559795411177228 
Titolo: Be Your Man,  Artista:  Rhys Lewis,  Probabilita: 0.7559795411177228 
Titolo: No Matter What,  Artista:  Calum Scott,  Probabilita: 0.7559795411177228 
Titolo: Woes,  Artista:  Tom Rosenthal,  Probabilita: 0.7559795411177228 
Titolo: Barbed Wire (Acoustic),  Artista:  Tom Grennan,  Probabilita: 0.7559795411177228 
Titolo: Stay Awake with Me,  Artista:  Dan Owen,  Probabilita: 0.7559795411177228 
Titolo: Spent So Long,  Artista:  Jamie Harrison,  Probabilita: 0.7559795411177228 
Titolo: Tummy,  Artista:  Tamino,  Probabilita: 0.7559795411177228 
Titolo: LOVISA,  Artista:  FELIX SANDMAN,  Probabilita: 0.7559795411177228 
Titolo: Girl - Acoustic,  Artista:  SYML,  Probabilita: 0.7559795411177228 
Titolo: Party Of One (feat. Sam Smith),  Artista:  Brandi Carlile,  Probabilita: 0.7559795411177228 
Titolo: Electricity - Acoustic,  Artista:  Silk City,  Probabilita: 0.7559795411177228 
Titolo: Leftovers,  Artista:  Dennis Lloyd,  Probabilita: 0.7559795411177228 
Titolo: Hand That You Hold,  Artista:  Dan Owen,  Probabilita: 0.7559795411177228 
Titolo: Company (feat. Molly Hammar),  Artista:  Paul Rey,  Probabilita: 0.7559795411177228 
Titolo: Too Good At Goodbyes - Edit,  Artista:  Sam Smith,  Probabilita: 0.7559795411177228 
Titolo: Need You Now - Acoustic,  Artista:  Dean Lewis,  Probabilita: 0.7559795411177228 
Titolo: Such A Simple Thing,  Artista:  Ray LaMontagne,  Probabilita: 0.7559795411177228 
Titolo: Acoustic,  Artista:  Billy Raffoul,  Probabilita: 0.7559795411177228 
Titolo: Don’t Matter To Me,  Artista:  Drake,  Probabilita: 0.7559795411177228 
Titolo: when the party's over,  Artista:  Billie Eilish,  Probabilita: 0.7559795411177228 
Titolo: Someone You Loved,  Artista:  Lewis Capaldi,  Probabilita: 0.7559795411177228 
Titolo: Collide,  Artista:  Tom Speight,  Probabilita: 0.7559795411177228 
Titolo: Fading Into Grey - Acoustic,  Artista:  Billy Lockett,  Probabilita: 0.7559795411177228 
Titolo: Never Let You Go (feat. John Newman) - Acoustic Version,  Artista:  Kygo,  Probabilita: 0.7559795411177228 
Titolo: T-Shirts,  Artista:  James Smith,  Probabilita: 0.7559795411177228 
Titolo: In My Head,  Artista:  Peter Manos,  Probabilita: 0.7559795411177228 
Titolo: Where Were You In The Morning?,  Artista:  Shawn Mendes,  Probabilita: 0.7559795411177228 
Titolo: come out and play,  Artista:  Billie Eilish,  Probabilita: 0.7559795411177228 
Titolo: Tear Me Down,  Artista:  Paul Rey,  Probabilita: 0.7559795411177228 
Titolo: Come As You Are,  Artista:  Imaginary Future,  Probabilita: 0.7559795411177228 
Titolo: Consequences - orchestra,  Artista:  Camila Cabello,  Probabilita: 0.7559795411177228 
Titolo: All I Am - Acoustic,  Artista:  Jess Glynne,  Probabilita: 0.7559795411177228 

This is part of the program I am working on:

import tkinter as tk                
from tkinter import font  as tkfont 
import pandas as pd 
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
import spotipy
import spotipy.util as util
from numpy import integer
from tkinter import Radiobutton
sp = spotipy.Spotify() 
from spotipy.oauth2 import SpotifyClientCredentials 
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.cluster import KMeans
import itertools
import threading
import time
import sys
from operator import itemgetter, attrgetter, methodcaller

 target_playlist = pd.DataFrame(newPlaylist_features)

    if(algoritmo_scelto==1):
        pred = c.predict(target_playlist[features])
        p = c.predict_proba(target_playlist[features])
    if(algoritmo_scelto==2):
        pred = knn.predict(target_playlist[features])
        p = knn.predict_proba(target_playlist[features])
    if(algoritmo_scelto==3):
        pred = forest.predict(target_playlist[features])
        p = forest.predict_proba(target_playlist[features])
    if(algoritmo_scelto==4):
        pred = k_means.predict(target_playlist[features])
        p = k_means.predict_proba(target_playlist[features])

    likedSongs = 0
    i = 0

    for prediction in pred:
        target_playlist['percentuali'] = p[i][1]
        print("{}".format(target_playlist['percentuali'][i]))
        i = i +1


    target_playlist.sort_values('percentuali', inplace=True, ascending=False)

    i=0
    for prediction in pred:

        if(prediction == 1):
            print ("Titolo: " + target_playlist["song_title"][i] + ",  Artista:  "+ target_playlist["artist"][i] + ",  Probabilita: {} ".format(target_playlist["percentuali"][i]))
            likedSongs= likedSongs + 1
        i = i +1

Where am I going wrong?

2 Answers:

Answer 0: (score: 1)

In this loop, you set the entire "target_playlist['percentuali']" Series to a single value on every iteration:

i = 0

for prediction in pred:
    target_playlist['percentuali'] = p[i][1]
    print("{}".format(target_playlist['percentuali'][i]))
    i = i +1

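Because of "target_playlist['percentuali'] = p[i][1]", the scalar "p[i][1]" is used as the value for every row. Once the loop finishes, the whole column holds the last value assigned.

As this example shows: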

>>> for i in [0, 1, 2]:
...     print(i)
...     df['this'] = i
...
0
1
2
>>> df
   id   col_1  col_2  col_3  this
0   1    blue     15   True    2
1   2     red     25  False    2
2   3  orange     35  False    2
3   4  yellow     24   True    2
4   5   green     12   True    2

Fix:

I don't know what the object p is, but you should convert the result to a pd.Series. You could replace that whole loop with something like:

target_playlist['percentuali'] = pd.Series(item[1] for item in p)
print(target_playlist['percentuali'])
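If p is the 2-D NumPy array that scikit-learn classifiers return from predict_proba (an assumption, since the question does not show its type), the column can also be assigned by slicing. The sketch below uses made-up data in place of the question's objects and contrasts the overwriting loop with a single per-row assignment:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's objects.
target_playlist = pd.DataFrame({"song_title": ["A", "B", "C"]})
# Each row is [P(class 0), P(class 1)], as predict_proba would return.
p = np.array([[0.3, 0.7], [0.8, 0.2], [0.35, 0.65]])

# Buggy pattern: every pass overwrites the whole column with one scalar.
for i in range(len(p)):
    target_playlist["percentuali"] = p[i][1]
broadcast_result = target_playlist["percentuali"].tolist()  # all rows hold the last value

# Fix: assign the whole second column at once, one value per row.
target_playlist["percentuali"] = p[:, 1]
per_row_result = target_playlist["percentuali"].tolist()

print(broadcast_result)
print(per_row_result)
```

This matches the symptom in the question: every printed row shows 0.7559795411177228, which is the last value in the first listing.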

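After you call sort_values on the DataFrame, the values are still not printed in descending order, because you refer to the rows by index label, e.g. (0, 1, 2). Sorting reorders the rows but each row keeps its original label.

A quick fix is to reset the index; see the example below: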

>>> df.sort_values('col_2', inplace=True, ascending=False)
>>> df
   id   col_1  col_2  col_3
2   3  orange     35  False
1   2     red     25  False
3   4  yellow     24   True
0   1    blue     15   True
4   5   green     12   True
>>> df['col_2'][0]
15
>>> df.reset_index(inplace=True)
>>> df['col_2'][0]
35
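As a small sketch of the same point with made-up data: .loc looks a row up by its label, .iloc by its position, and reset_index(drop=True) renumbers the rows without keeping the old labels as an extra "index" column. The frame and column names here are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"col_1": ["blue", "red", "orange"], "col_2": [15, 25, 35]})
sorted_df = df.sort_values("col_2", ascending=False)

# Label 0 still belongs to the row that was first before sorting.
by_label = sorted_df["col_2"].loc[0]
# Position 0 is the first row after sorting.
by_position = sorted_df["col_2"].iloc[0]

# drop=True discards the old labels instead of adding them as a column.
renumbered = sorted_df.reset_index(drop=True)

print(by_label, by_position)
print(renumbered)
```

Using .iloc avoids modifying the frame at all, while reset_index(drop=True) keeps the existing label-based code working.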

Iterating over DataFrame rows

Instead of referring to rows by index, you can iterate over them like this:

for _, row in df.iterrows():
    print("Title: {}, Artist: {}, Probability: {}".format(
        row['song_title'], row['artist'], row['percentuali']
    ))
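One thing to watch for: pred is a separate array, so it keeps its original order when the DataFrame is sorted. A possible way around that, sketched here with made-up data and a hypothetical "pred" column name, is to store the predictions in the frame so they move with their rows, then filter and sort:

```python
import pandas as pd

# Hypothetical data standing in for target_playlist, the probabilities and pred.
playlist = pd.DataFrame({
    "song_title": ["Song A", "Song B", "Song C", "Song D"],
    "artist": ["Artist 1", "Artist 2", "Artist 3", "Artist 4"],
})
playlist["percentuali"] = [0.70, 0.20, 0.65, 0.45]
playlist["pred"] = [1, 0, 1, 0]

# Keep only the songs predicted as liked, highest probability first.
liked = playlist[playlist["pred"] == 1].sort_values("percentuali", ascending=False)

lines = []
for row in liked.itertuples(index=False):
    lines.append("Titolo: {}, Artista: {}, Probabilita: {}".format(
        row.song_title, row.artist, row.percentuali))
print("\n".join(lines))

liked_songs = len(liked)
```

Because every value lives in the same row, no counter or index lookup is needed.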

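Answer 1: (score: 1)

In addition to the problem foxy pointed out, there is a good chance that all the top elements will have the same probability, since class 1 is assigned to every element whose probability is above a given threshold. If you remove the if prediction == 1 check, you will see the probabilities decrease across all the predictions.

There is also a bug in your code: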

i=0
for prediction in pred:
    if(prediction == 1):
        print ("Titolo: " + target_playlist["song_title"][i] + ",  Artista:  "+ target_playlist["artist"][i] + ",  Probabilita: {} ".format(target_playlist["percentuali"][i]))
        likedSongs= likedSongs + 1
    i = i +1   # this should be indented inside the if

Using enumerate makes this kind of bug easy to avoid:

for i, prediction in enumerate(pred):
    # now i is incremented automatically
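A complete version of that loop might look like the sketch below. The lists are made-up stand-ins for pred and the song titles; enumerate supplies the counter, so there is no separate increment line to misplace.

```python
pred = [1, 0, 1]                         # hypothetical predictions
titles = ["Song A", "Song B", "Song C"]  # hypothetical titles, same order as pred

liked_songs = 0
liked_titles = []
for i, prediction in enumerate(pred):
    # i advances on every pass, whether or not the branch below runs.
    if prediction == 1:
        liked_titles.append(titles[i])
        liked_songs += 1

print(liked_songs, liked_titles)
```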