I have this DataFrame:
df
artist track pos neg neu
0 Sufjan Stevens Should Have Known Better 0.07 0.93 0.0
8 Radiohead Daydreaming 0.05 0.95 0.0
1 Sufjan Stevens To Be Alone With You 0.05 0.95 0.0
5 Radiohead Desert Island Disk 0.08 0.92 0.0
11 Elliott Smith Between the Bars 0.03 0.97 0.0
7 Aphex Twin Avril 14th 1.00 0.00 0.0
2 Jeff Buckley Hallelujah 0.39 0.61 0.0
4 Sufjan Stevens Casimir Pulaski Day 0.09 0.91 0.0
9 Sufjan Stevens The Only Thing 0.09 0.91 0.0
3 Sufjan Stevens Death with Dignity 0.03 0.97 0.0
6 Radiohead Codex 1.00 0.00 0.0
10 Radiohead You And Whose Army? 0.00 1.00 0.0
I sort it by closeness to input_value = 0.8:
v = df[['pos', 'neg', 'neu']].values
df.iloc[np.lexsort(np.abs(v - input_value).T)]
which produces:
artist track pos neg neu
4 Sufjan Stevens Casimir Pulaski Day 0.09 0.91 0.0
9 Sufjan Stevens The Only Thing 0.09 0.91 0.0
5 Radiohead Desert Island Disk 0.08 0.92 0.0
0 Sufjan Stevens Should Have Known Better 0.07 0.93 0.0
1 Sufjan Stevens To Be Alone With You 0.05 0.95 0.0
8 Radiohead Daydreaming 0.05 0.95 0.0
3 Sufjan Stevens Death with Dignity 0.03 0.97 0.0
11 Elliott Smith Between the Bars 0.03 0.97 0.0
2 Jeff Buckley Hallelujah 0.39 0.61 0.0
6 Radiohead Codex 1.00 0.00 0.0
7 Aphex Twin Avril 14th 1.00 0.00 0.0
10 Radiohead You And Whose Army? 0.00 1.00 0.0
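But given input_label = 'neg', I want to add a condition: if input_label == 'neg', then the neg value must be the highest value row-wise. Rows that do not satisfy the condition should be dropped, ending up with: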
artist track pos neg neu
4 Sufjan Stevens Casimir Pulaski Day 0.09 0.91 0.0
9 Sufjan Stevens The Only Thing 0.09 0.91 0.0
5 Radiohead Desert Island Disk 0.08 0.92 0.0
0 Sufjan Stevens Should Have Known Better 0.07 0.93 0.0
1 Sufjan Stevens To Be Alone With You 0.05 0.95 0.0
8 Radiohead Daydreaming 0.05 0.95 0.0
3 Sufjan Stevens Death with Dignity 0.03 0.97 0.0
11 Elliott Smith Between the Bars 0.03 0.97 0.0
2 Jeff Buckley Hallelujah 0.39 0.61 0.0
10 Radiohead You And Whose Army? 0.00 1.00 0.0
How can I do this?
Answer 0 (score: 0)
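# sort by closeness to input_value, as in the question
# (.values hands np.lexsort a plain NumPy array rather than a DataFrame)
v = df.iloc[:, -3:].values
df = df.iloc[np.lexsort(np.abs(v - input_value).T)]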
You can use df.query here, which simplifies things a bit.
result = df.query('neg > pos and neg > neu'); result
artist track pos neg neu
4 Sufjan Stevens Casimir Pulaski Day 0.09 0.91 0.0
9 Sufjan Stevens The Only Thing 0.09 0.91 0.0
5 Radiohead Desert Island Disk 0.08 0.92 0.0
0 Sufjan Stevens Should Have Known Better 0.07 0.93 0.0
8 Radiohead Daydreaming 0.05 0.95 0.0
1 Sufjan Stevens To Be Alone With You 0.05 0.95 0.0
11 Elliott Smith Between the Bars 0.03 0.97 0.0
3 Sufjan Stevens Death with Dignity 0.03 0.97 0.0
2 Jeff Buckley Hallelujah 0.39 0.61 0.0
10 Radiohead You And Whose Army? 0.00 1.00 0.0
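The query string above hard-codes 'neg'. If the label comes from a variable, one option is to compare each row's idxmax against input_label. This is a minimal sketch on a few rows copied from the question. The score_cols list is an assumption about which columns hold the scores. On ties idxmax returns the first matching column, so the result can differ from the strict comparisons in the query string.

```python
import pandas as pd

# A few rows copied from the question's frame (original index labels kept)
df = pd.DataFrame(
    {
        "artist": ["Sufjan Stevens", "Aphex Twin", "Jeff Buckley", "Radiohead", "Radiohead"],
        "track": ["Should Have Known Better", "Avril 14th", "Hallelujah",
                  "Codex", "You And Whose Army?"],
        "pos": [0.07, 1.00, 0.39, 1.00, 0.00],
        "neg": [0.93, 0.00, 0.61, 0.00, 1.00],
        "neu": [0.0, 0.0, 0.0, 0.0, 0.0],
    },
    index=[0, 7, 2, 6, 10],
)

score_cols = ["pos", "neg", "neu"]  # assumption: these are the score columns
input_label = "neg"

# idxmax(axis=1) gives, for each row, the name of the column with the largest value
top_label = df[score_cols].idxmax(axis=1)

# keep only the rows whose largest score sits in the requested column
result = df[top_label == input_label]
print(result)
```

Rows 7 and 6 are dropped here because their largest value is in pos. The remaining rows keep whatever order df already had.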
An alternative solution using np.argmax:
mask = np.argmax(df.iloc[:, -3:].values, 1) == 1
mask
array([ True, True, True, True, True, True, True, False, True,
True, True, False], dtype=bool)
result = df[mask]; result
artist track pos neg neu
11 Elliott Smith Between the Bars 0.03 0.97 0.0
2 Jeff Buckley Hallelujah 0.39 0.61 0.0
0 Sufjan Stevens Should Have Known Better 0.07 0.93 0.0
4 Sufjan Stevens Casimir Pulaski Day 0.09 0.91 0.0
9 Sufjan Stevens The Only Thing 0.09 0.91 0.0
5 Radiohead Desert Island Disk 0.08 0.92 0.0
1 Sufjan Stevens To Be Alone With You 0.05 0.95 0.0
3 Sufjan Stevens Death with Dignity 0.03 0.97 0.0
10 Radiohead You And Whose Army? 0.00 1.00 0.0
8 Radiohead Daydreaming 0.05 0.95 0.0
You can sort the result by its index with sort_index:
result.sort_index()
artist track pos neg neu
0 Sufjan Stevens Should Have Known Better 0.07 0.93 0.0
1 Sufjan Stevens To Be Alone With You 0.05 0.95 0.0
2 Jeff Buckley Hallelujah 0.39 0.61 0.0
3 Sufjan Stevens Death with Dignity 0.03 0.97 0.0
4 Sufjan Stevens Casimir Pulaski Day 0.09 0.91 0.0
5 Radiohead Desert Island Disk 0.08 0.92 0.0
8 Radiohead Daydreaming 0.05 0.95 0.0
9 Sufjan Stevens The Only Thing 0.09 0.91 0.0
10 Radiohead You And Whose Army? 0.00 1.00 0.0
11 Elliott Smith Between the Bars 0.03 0.97 0.0
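Putting the two steps together, the sort and the filter can be wrapped in one helper. This is only a sketch under assumptions: the function name closest_with_top_label and its parameters are hypothetical and not part of pandas or NumPy, and the sample rows are a subset copied from the question.

```python
import numpy as np
import pandas as pd


def closest_with_top_label(df, score_cols, input_label, input_value):
    """Sort rows by closeness to input_value, then keep the rows whose
    largest score is in the input_label column. (Hypothetical helper.)"""
    v = df[score_cols].to_numpy()
    # np.lexsort uses the LAST key as the primary sort key, so with the
    # columns ordered pos, neg, neu the distance in neu is compared first,
    # then neg, then pos.
    order = np.lexsort(np.abs(v - input_value).T)
    ordered = df.iloc[order]
    # look up the column position instead of hard-coding 1 for 'neg'
    label_pos = score_cols.index(input_label)
    mask = np.argmax(ordered[score_cols].to_numpy(), axis=1) == label_pos
    return ordered[mask]


# A few rows copied from the question's frame (original index labels kept)
df = pd.DataFrame(
    {
        "artist": ["Sufjan Stevens", "Aphex Twin", "Jeff Buckley", "Radiohead", "Radiohead"],
        "track": ["Should Have Known Better", "Avril 14th", "Hallelujah",
                  "Desert Island Disk", "You And Whose Army?"],
        "pos": [0.07, 1.00, 0.39, 0.08, 0.00],
        "neg": [0.93, 0.00, 0.61, 0.92, 1.00],
        "neu": [0.0, 0.0, 0.0, 0.0, 0.0],
    },
    index=[0, 7, 2, 5, 10],
)

result = closest_with_top_label(df, ["pos", "neg", "neu"], "neg", 0.8)
print(result)  # index order: 5, 0, 2, 10
```

Because the mask is computed on the already sorted frame, the filtered rows stay in closeness order and no extra sort_index call is needed unless index order is wanted.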