Dropping categories from a Dask DataFrame?

Asked: 2018-03-26 19:48:20

Tags: dataframe dask

Is it possible to drop some of the categories when reading partitioned data into a Dask DataFrame?

For example, I have Parquet data partitioned as follows:
events/year=2017/month=09/day=01/hour=01/customer=a.com/xxxx.parquet
events/year=2017/month=09/day=01/hour=02/customer=a.com/xxxx.parquet
events/year=2017/month=09/day=01/hour=01/customer=a.com/xxxx.parquet

I read it with:

df = dd.read_parquet('./events/24.100/year=*/month=*/day=*/hour=*/customer=*/*.parquet')

After reading, hour and customer show up as categories in my data:

Dask DataFrame Structure:
                   url referrer session_id              ts             hour         customer
npartitions=24
                object   object     object  datetime64[ns]  category[known]  category[known]
                   ...      ...        ...             ...              ...              ...
                   ...      ...        ...             ...              ...              ...
Dask Name: read-parquet, 24 tasks

I want to drop hour but keep customer. How can I do that?

0 answers:

No answers yet