OR and GROUPBY operations on keys in a nested dictionary

Asked: 2018-06-25 13:53:48

Tags: python-3.x pandas

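I have a dictionary nested inside a dictionary. I want to perform an OR operation across the keys ('filter', 'filteer') stored at index 0 of the dictionary. Based on that result, I want to apply a groupby operation to a specific column at index 1 of the dictionary. For example, if brand (conditions[0]) == AMBI (conditions[8]) OR manufacturer (conditions[1]) == AMBI (conditions[8]), I want to return the matching DataFrame and then perform a groupby operation on one of its columns.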

My code:

import csv
import pandas as pd
import sys
class sample:
    def create_df(self, f):
        self.z=pd.read_csv(f)
    def get_resultant_df(self, list_cols):
        self.data_frame = self.z[list_cols[:]]
    def process_df(self, df, conditions):
        resultant_df = self.data_frame
        if conditions[2] == 'equals':
            new_df =resultant_df[resultant_df[conditions[1]] == conditions[3]]
            return new_df
        elif conditions[2] == 'contains':
            new_df = resultant_df[resultant_df[conditions[1]].str.contains(conditions[3])]
            return new_df
        elif conditions[2] == 'not equals':
            new_df = resultant_df[resultant_df[conditions[1]] != conditions[3]]
            return new_df
        elif conditions[2] == 'startswith':
            new_df = resultant_df[resultant_df[conditions[1]].str.startswith(conditions[3])]
            return new_df
        elif conditions[2] == 'in':
            new_df = resultant_df[resultant_df[conditions[1]].isin(resultant_df[conditions[3]])]
            return new_df
        elif conditions[2] == 'not in':
            new_df = resultant_df[~resultant_df[conditions[1]].isin(resultant_df[conditions[3]])]
            return new_df
        elif conditions[2] == 'group':
            new_df = list(resultant_df.groupby(conditions[0])[conditions[1]])
            return new_df
        elif conditions[2] == 'specific':
            new_df = resultant_df.loc[resultant_df[conditions[0]] == conditions[8]]
            return new_df
        elif conditions[2] == 'same':
            new_df = resultant_df[(resultant_df[conditions[0]] == conditions[8]) & (resultant_df[conditions[1]] == conditions[8])]
            return new_df
        elif conditions[2]=='trail':
            new_df={0:{'filter'{'filter1':resultant_df.loc[resultant_df[conditions[0]] == conditions[8]]},'filteer':{'filter1':resultant_df.loc[resultant_df[conditions[0]] == conditions[8]]}},
                    1:{'group':{resultant_df.groupby(new_df[0][filter])}}}
            return new_df


if __name__ =='__main__':
    sample = sample()

    sample.create_df("/home/purpletalk/GrammarandProductReviews.csv")
    df = sample.get_resultant_df(['brand', 'reviews.id','manufacturer','reviews.title','reviews.username','id','dateAdded','reviews.rating'])
    new_df = sample.process_df(df, ['brand','manufacturer','trail','Windex', 'size', 'equal',8,700,'AMBI'])
    print (new_df[1][group])

Can someone please help me? The code above raises an error, and I would like to know how to perform the OR operation.
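For reference, here is a minimal sketch of one way the OR filter followed by a groupby might look. It is not a confirmed answer to the question. The sample rows, the helper name `filter_or_then_group`, and the choice of `reviews.rating` as the aggregated column are assumptions made for illustration; only the column names come from the code above. It combines two boolean masks with `|` and builds the nested dictionary in two steps, since a dict literal cannot read its own entries while it is still being constructed.

```python
import pandas as pd

# Hypothetical stand-in for GrammarandProductReviews.csv: only the column
# names come from the question, the rows are made up for illustration.
df = pd.DataFrame({
    "brand": ["AMBI", "Windex", "Other", "AMBI"],
    "manufacturer": ["AMBI", "MakerX", "AMBI", "Other"],
    "reviews.rating": [5, 4, 3, 1],
})


def filter_or_then_group(frame, col_a, col_b, value, group_col):
    """Keep rows where col_a == value OR col_b == value, then group them."""
    # `|` is the element-wise OR for boolean Series. Each comparison needs
    # its own parentheses because `|` binds tighter than `==`.
    mask = (frame[col_a] == value) | (frame[col_b] == value)
    filtered = frame[mask]
    # Build the nested dict in two steps: the groupby needs `filtered`,
    # which cannot be read back out of a dict literal still being built.
    result = {0: {"filter": filtered}}
    result[1] = {"group": filtered.groupby(group_col)}
    return result


result = filter_or_then_group(df, "brand", "manufacturer", "AMBI", "brand")
print(result[0]["filter"])
print(result[1]["group"]["reviews.rating"].mean())
```

Note that the dictionary keys are strings, so they have to be quoted when reading the result back (`result[1]["group"]`, not `result[1][group]`).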

0 answers:

No answers