Sorting each DataFrame in a dictionary of DataFrames

Asked: 2019-04-05 19:19:31

Tags: python pandas dictionary

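Thanks to @Woody Pride's answer here: https://stackoverflow.com/a/19791302/5608428, I have achieved about 95% of what I set out to do.

For context, that answer builds a dictionary of sub-DataFrames from one large df.

All I need to do now is sort each DataFrame in the dictionary. It's a small thing, but I can't find an answer here or on Google.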

import pandas as pd
import numpy as np
import itertools

def points(row):
    if row['Ob1'] > row['Ob2']:
        val = 2
    else:
        val = 1
    return val

#create some data with Names column
data = pd.DataFrame({'Names': ['Joe', 'John', 'Jasper', 'Jez'] *4, \
                     'Ob1' : np.random.rand(16), 'Ob2' : np.random.rand(16)})

#create list of unique pairs
comboNames = list(itertools.combinations(data.Names.unique(), 2))

#create a data frame dictionary to store your data frames
DataFrameDict = {elem : pd.DataFrame for elem in comboNames}

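for key in DataFrameDict.keys():
    # .copy() gives each entry its own data, so adding a column later does not raise SettingWithCopyWarning
    DataFrameDict[key] = data[data.Names.isin(key)].copy()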

#Add test calculated column
for tbl in DataFrameDict:
    DataFrameDict[tbl]['Test'] = DataFrameDict[tbl].apply(points, axis=1)

#############################
#Checking test and sorts
##############################

#access df's to print head
for tbl in DataFrameDict:
    print(DataFrameDict[tbl].head())
    print()

#access df's to print summary  
for tbl in DataFrameDict:    
    print(str(tbl[0])+" vs "+str(tbl[1])+": "+str(DataFrameDict[tbl]['Ob2'].sum()))

print()

#trying to sort each df   
for tbl in DataFrameDict:
    #Doesn't work
    DataFrameDict[tbl].sort_values(['Ob1'])
    #mistakenly deleted other attempts (facepalm)

for tbl in DataFrameDict:
    print(DataFrameDict[tbl].head())
    print()

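The code runs, but no matter what I try, none of the dfs ends up sorted. I can access each df to print it and so on, but .sort_values() has no effect.

As an aside, using tuples of names as the keys for the dfs was simply the easiest thing to do. Is there a better way?

Many thanks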

1 Answer:

Answer 0: (score: 1)

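It looks like you just need to assign the sorted DataFrame back into the dict. By default, sort_values returns a new sorted DataFrame and leaves the original unchanged, so the result is discarded unless you store it: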

for tbl in DataFrameDict:
    DataFrameDict[tbl] = DataFrameDict[tbl].sort_values(['Ob1'])
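
For illustration, here is a minimal self-contained sketch of the same idea. It uses a small fixed data set in place of the random values from the question. The names `frames`, `sorted_frames`, and `unsorted_after_call` are made up for this example. It first shows that calling sort_values on its own changes nothing, then shows two ways to keep the sorted result: rebuilding the dict, or passing `inplace=True`.

```python
import itertools

import pandas as pd

# Small deterministic stand-in for the question's random data (values are made up).
data = pd.DataFrame({
    "Names": ["Joe", "John", "Jasper", "Jez"] * 2,
    "Ob1": [0.9, 0.1, 0.5, 0.3, 0.2, 0.8, 0.4, 0.6],
})

# One sub-DataFrame per unique pair of names; .copy() makes each one independent of `data`.
pairs = list(itertools.combinations(data["Names"].unique(), 2))
frames = {pair: data[data["Names"].isin(pair)].copy() for pair in pairs}

# sort_values returns a new DataFrame, so calling it alone leaves the dict entry untouched.
key = ("Joe", "John")
frames[key].sort_values("Ob1")
unsorted_after_call = frames[key]["Ob1"].tolist()

# Option 1: build a new dict holding the sorted results.
sorted_frames = {pair: df.sort_values("Ob1") for pair, df in frames.items()}

# Option 2: sort each stored DataFrame in place.
for df in frames.values():
    df.sort_values("Ob1", inplace=True)
```

Reassigning (or rebuilding the dict) is usually the easier form to read and to chain with other operations. `inplace=True` avoids the reassignment but returns None, so its result should not be assigned to anything.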