es.normalize_entity error: variable not found in entity

Time: 2018-07-29 15:28:00

Tags: featuretools

I am using the Featuretools documentation to learn about entity sets, and the following piece of code currently fails with KeyError: 'Variable: device not found in entity'

import featuretools as ft
data = ft.demo.load_mock_customer()
customers_df = data["customers"]
customers_df
sessions_df = data["sessions"]
sessions_df.sample(5)
transactions_df = data["transactions"]
transactions_df.sample(10)
products_df = data["products"]
products_df
### Creating an entity set 
es = ft.EntitySet(id="transactions")
### Adding entities
es = es.entity_from_dataframe(entity_id="transactions", dataframe=transactions_df, index="transaction_id", time_index="transaction_time", variable_types={"product_id": ft.variable_types.Categorical})
es
es["transactions"].variables
es =  es.entity_from_dataframe(entity_id="products",dataframe=products_df,index="product_id")
es
### Adding new relationship

new_relationship = ft.Relationship(es["products"]["product_id"],
                                   es["transactions"]["product_id"]) 
es = es.add_relationship(new_relationship)
es

### Creating entity from existing table
es = es.normalize_entity(base_entity_id="transactions",
        new_entity_id="sessions",
        index = "session_id",
        additional_variables=["device","customer_id","zip_code"])

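This follows the page at https://docs.featuretools.com/loading_data/using_entitysets.html

From the API of es.normalize_entity, it appears that the function should create a new entity 'sessions' with index 'session_id' and the other three variables, but the error is:

C:\Users\s_belvi\AppData\Local\Continuum\Anaconda2\lib\site-packages\featuretools\entityset\entity.pyc in _get_variable(self, variable_id)
    250                 return v
    251
--> 252         raise KeyError("Variable: %s not found in entity" % (variable_id))
    253
    254     @property

KeyError: 'Variable: device not found in entity'

Do we need to create the entity "sessions" separately before using es.normalize_entity? It looks like there is a small syntax mistake somewhere in the flow.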

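1 Answer:

Answer 0: (score: 0)

The error here arises because device is not a column in your transactions_df. The "transactions" table referenced on that page of the documentation has more columns than the one returned by demo.load_mock_customer in its dictionary form. You can get the rest of the columns by using the return_single_table argument. Here is a full example of normalize_entity, only slightly modified from the code you tried: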

import featuretools as ft
data = ft.demo.load_mock_customer(return_single_table=True)

es = ft.EntitySet(id="Mock Customer")
es = es.entity_from_dataframe(entity_id="transactions", 
                              dataframe=data, 
                              index="transaction_id", 
                              time_index="transaction_time", 
                              variable_types={"product_id": ft.variable_types.Categorical})

es = es.normalize_entity(base_entity_id="transactions",
        new_entity_id="sessions",
        index = "session_id",
        additional_variables=["device","customer_id","zip_code"])

This returns an EntitySet with two entities and one relationship:

Entityset: Mock Customer
  Entities:
    transactions [Rows: 500, Columns: 8]
    sessions [Rows: 35, Columns: 5]
  Relationships:
    transactions.session_id -> sessions.session_id
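
To make the cause of the KeyError more concrete, here is a minimal plain-Python sketch of the idea behind normalizing a table. It is not Featuretools' implementation: normalize_rows is a hypothetical helper and the rows are made up for illustration. It assumes that the columns named in additional_variables are moved out of the base table into the new one, which is why each of them has to exist in the base table first, and why no separate "sessions" entity needs to be created beforehand.

```python
# Plain-Python sketch of the idea behind normalize_entity.
# NOT Featuretools' implementation; normalize_rows is a hypothetical helper.

def normalize_rows(rows, index, additional_variables):
    """Split rows (a list of dicts) into a base table and a new table keyed by index."""
    moved = list(additional_variables)
    available = set(rows[0]) if rows else set()
    for name in [index] + moved:
        if name not in available:
            # Mirrors the failure in the question: a column must already
            # exist in the base table before it can be moved out of it.
            raise KeyError("Variable: %s not found in entity" % name)

    new_table = {}
    base_table = []
    for row in rows:
        key = row[index]
        # Keep the first set of values seen for each distinct index value.
        new_table.setdefault(key, {name: row[name] for name in [index] + moved})
        # The base table keeps the index column as a foreign key
        # but drops the columns that were moved.
        base_table.append({k: v for k, v in row.items() if k not in moved})
    return base_table, list(new_table.values())


# Made-up rows for illustration only.
transactions = [
    {"transaction_id": 1, "session_id": 10, "device": "desktop", "amount": 5.0},
    {"transaction_id": 2, "session_id": 10, "device": "desktop", "amount": 7.5},
    {"transaction_id": 3, "session_id": 11, "device": "mobile", "amount": 2.0},
]

base, sessions = normalize_rows(transactions, "session_id", ["device"])
```

Calling normalize_rows on rows that lack a device column raises the same kind of KeyError as in the question, which is what happens when the base table is the transactions DataFrame from the dictionary form of the demo data.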