Querying a pandas DataFrame with an object

Time: 2019-08-02 14:58:36

Tags: python pandas dataframe

I have a pandas DataFrame and a class that shares some of its fields. Example:

DataFrame:

import pandas

customer_dict = {"name": ["matthew", "mark", "luke", "john", "john"],
                 "series_number": [2, 2, 5, 8, 8],
                 "personality": ["intj", "entp", "intp", "enfj", "intj"],
                 "classification": ["good", "bad", "bad", "good", "bad"]}

customer_df = pandas.DataFrame(customer_dict)
customer_df.head()
      name  series_number personality classification
0  matthew              2        intj           good
1     mark              2        entp            bad
2     luke              5        intp            bad
3     john              8        enfj           good
4     john              8        intj            bad

Class:

class Customer():
    def __init__(self, name, series_number, personality):
        self.name = name
        self.series_number = series_number
        self.personality = personality

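Note that the class has no classification attribute, but its other attributes have the same names as the DataFrame's columns.

If I have an object of this class, I would like to be able to search the DataFrame for the rows that match it, so that I can get its classification: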


customer = Customer("john", 8, "enfj")

customer_df[customer]

Expected result:

      name  series_number personality classification
3     john              8        enfj           good

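Is there a simple way to do this?

4 answers:

Answer 0: (score: 0)

You can select rows of a DataFrame with several conditions. Each condition must be wrapped in parentheses, and & means AND.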

condition = (customer_df['name']=='john') & (customer_df['series_number']==8) & (customer_df['personality']=='enfj')

customer_df = customer_df[condition]
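
The parentheses matter because Python's & binds more tightly than ==. The following is a minimal sketch on a small made-up frame (the column names a and b are illustrative and not from the question) that contrasts the parenthesized form with the unparenthesized one:

```python
import pandas as pd

# Illustrative data only; "a" and "b" are hypothetical column names.
df = pd.DataFrame({"a": [1, 2, 3], "b": [1, 0, 3]})

# Parenthesized: each comparison yields a boolean Series, then & combines them.
with_parens = df[(df["a"] == 1) & (df["b"] == 1)]

# Unparenthesized: parsed as df["a"] == (1 & df["b"]) == 1, a chained
# comparison that needs the truth value of a whole Series and so fails.
try:
    df[df["a"] == 1 & df["b"] == 1]
    failed = False
except ValueError:
    failed = True
```

Only the parenthesized version selects a row here; the other one raises before any filtering happens.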

Answer 1: (score: 0)

You need to create boolean masks:

mask_name = customer_df['name'] == customer.name
mask_series_number = customer_df['series_number'] == customer.series_number
mask_personality = customer_df['personality'] == customer.personality

Then use the masks to query:

customer_df.loc[mask_name & mask_series_number & mask_personality]

Output:

    name    series_number   personality classification
3   john    8               enfj        good
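
If the class grows more attributes, writing one mask per field gets repetitive. As a rough sketch (the helper name match_rows is hypothetical, and it assumes every attribute to compare is stored on the instance under the same name as its column), the masks can be built in a loop and combined:

```python
from functools import reduce
import operator

import pandas as pd

customer_df = pd.DataFrame({
    "name": ["matthew", "mark", "luke", "john", "john"],
    "series_number": [2, 2, 5, 8, 8],
    "personality": ["intj", "entp", "intp", "enfj", "intj"],
    "classification": ["good", "bad", "bad", "good", "bad"],
})


class Customer:
    def __init__(self, name, series_number, personality):
        self.name = name
        self.series_number = series_number
        self.personality = personality


def match_rows(df, obj):
    """Hypothetical helper: rows whose columns equal obj's same-named attributes."""
    # Attributes that are not column names are ignored.
    masks = [df[key] == value
             for key, value in vars(obj).items()
             if key in df.columns]
    if not masks:
        return df.iloc[0:0]  # nothing to compare on: return an empty frame
    # AND all the boolean masks together.
    return df[reduce(operator.and_, masks)]


customer = Customer("john", 8, "enfj")
result = match_rows(customer_df, customer)
```

Because the masks are ordinary boolean Series, the original index (3 in this case) is preserved in the result.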

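Answer 2: (score: 0)

Instead of writing lengthy masks, you can write a simple function that builds the query from your class instance. This even works with optional attributes, since any attribute set to None is skipped.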

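def clsQuery(obj):
    # {!r} quotes strings but leaves numbers bare, so integer columns still match
    return ' & '.join(["{}=={!r}".format(key, value)
                       for key, value in obj.__dict__.items()
                       if value is not None])

# assumes a Customer whose constructor also accepts and stores classification
customer = Customer("john", 8, "enfj", 'good')
customer_df.query(clsQuery(customer))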

Output:

     name   series_number   personality classification
3   john        8                enfj           good
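
One detail worth checking with query strings built this way is how each value is quoted. The sketch below (illustrative only) shows that wrapping every value in quotes turns the number 8 into the string '8', which is unlikely to match an integer column, whereas formatting with repr via {!r} quotes strings and leaves numbers bare:

```python
import pandas as pd

customer_df = pd.DataFrame({
    "name": ["matthew", "mark", "luke", "john", "john"],
    "series_number": [2, 2, 5, 8, 8],
    "personality": ["intj", "entp", "intp", "enfj", "intj"],
    "classification": ["good", "bad", "bad", "good", "bad"],
})

# Always quoting: the integer 8 becomes the string '8' in the expression.
quoted = "{}=='{}'".format("series_number", 8)

# repr-based formatting: numbers stay bare, strings get quotes.
bare = "{}=={!r}".format("series_number", 8)
text = "{}=={!r}".format("name", "john")

# Combine the repr-based pieces and run them through DataFrame.query.
expr = " & ".join([text, bare])
result = customer_df.query(expr)
```

Here expr selects both rows for john with series number 8; adding the personality term would narrow it to one.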

You can also define the function as a method of the class for easier access:

class Customer():
    def __init__(self, name, series_number, personality, classification=None):
        self.name = name
        self.series_number = series_number
        self.personality = personality
        self.classification = classification  # stored so it can take part in the query

    def query(self):
        # {!r} quotes strings but leaves numbers bare; attributes set to None are skipped
        return ' & '.join(["{}=={!r}".format(key, value)
                           for key, value in self.__dict__.items()
                           if value is not None])


customer = Customer("john", 8, "enfj", 'good')
customer_df.query(customer.query())
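
An alternative that avoids building a query string altogether is an inner merge against a one-row frame made from the object. This is only a sketch under the assumption that attribute names match column names; the helper name to_filter_frame is hypothetical:

```python
import pandas as pd

customer_df = pd.DataFrame({
    "name": ["matthew", "mark", "luke", "john", "john"],
    "series_number": [2, 2, 5, 8, 8],
    "personality": ["intj", "entp", "intp", "enfj", "intj"],
    "classification": ["good", "bad", "bad", "good", "bad"],
})


class Customer:
    def __init__(self, name, series_number, personality, classification=None):
        self.name = name
        self.series_number = series_number
        self.personality = personality
        self.classification = classification


def to_filter_frame(obj):
    """Hypothetical helper: one-row DataFrame from the attributes that are not None."""
    return pd.DataFrame({key: [value]
                         for key, value in vars(obj).items()
                         if value is not None})


customer = Customer("john", 8, "enfj")

# With no "on" argument, merge joins on the columns both frames share.
# Note that merge returns a fresh 0-based index, not the original row label.
matches = customer_df.merge(to_filter_frame(customer), how="inner")
```

The trade-off is that the original row label (3) is lost, so the masks or query approaches are preferable when the index matters.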

Answer 3: (score: 0)

Try this; give classification a default value:

class Customer():
    def __init__(self, name, series_number, personality, classification=None):
        self.name = name
        self.series_number = series_number
        self.personality = personality
        self.classification = classification

# initialize the object
cust = Customer("john", 8, "enfj")

customer_df[(customer_df['name']==cust.name)
            &(customer_df['series_number']==cust.series_number)
            &(customer_df['personality']==cust.personality)]

    name    series_number   personality classification
3   john    8                    enfj   good
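
Since the goal in the question is the classification itself and not the whole row, the filtered frame can be reduced to a single value. The following is a minimal sketch; the helper name lookup_classification and its handling of zero or multiple matches are assumptions, not something stated in the answers:

```python
import pandas as pd

customer_df = pd.DataFrame({
    "name": ["matthew", "mark", "luke", "john", "john"],
    "series_number": [2, 2, 5, 8, 8],
    "personality": ["intj", "entp", "intp", "enfj", "intj"],
    "classification": ["good", "bad", "bad", "good", "bad"],
})


class Customer:
    def __init__(self, name, series_number, personality):
        self.name = name
        self.series_number = series_number
        self.personality = personality


def lookup_classification(df, cust):
    """Hypothetical helper: classification of the single matching row, or None."""
    matches = df[(df["name"] == cust.name)
                 & (df["series_number"] == cust.series_number)
                 & (df["personality"] == cust.personality)]
    values = matches["classification"].tolist()
    if not values:
        return None  # no row matches this customer
    if len(values) > 1:
        raise ValueError("more than one row matches this customer")
    return values[0]


found = lookup_classification(customer_df, Customer("john", 8, "enfj"))
missing = lookup_classification(customer_df, Customer("paul", 1, "istj"))
```

Returning None for no match and raising on duplicates is one possible policy; a different one may suit data where duplicates are expected.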