I have the following dataset:
import pandas as pd
from datetime import datetime
import numpy as np
date_rng = pd.date_range(start='2020-07-01', end='2020-07-10', freq='D')
l1 = [np.nan, np.nan, "local_max", np.nan, np.nan, "local_min", np.nan, np.nan, "local_max", np.nan]
l2 = [np.nan, np.nan, "local_max", np.nan, np.nan, "local_min", np.nan, np.nan, "local_max", "local_min"]
df = pd.DataFrame({
'date':date_rng,
'value':l1,
'group':'a'
})
df2 = pd.DataFrame({
'date':date_rng,
'value':l1,
'group':'b'
})
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
df = pd.concat([df, df2], ignore_index=True)
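I want to compute features, for example the number of local_min and local_max entries per group, and store them in a new data frame. I can compute the features themselves, but I have not found an elegant way to apply this per group. The desired output is: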
columns = ["group", "local_min", "local_max"]
df_features = pd.DataFrame([["a", 1, 2],
["b", 1, 3],],
columns=columns)
df_features
Any help would be much appreciated!
Answer 0: (score: 1)
df.groupby works for this:
df.groupby(['group','value']).count()
Output:
                 date
group value
a     local_max     2
      local_min     1
b     local_max     2
      local_min     1
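To get from this long-form result to one row per group, with local_min and local_max as columns, one possible follow-up is to unstack the value level. This is a minimal sketch and not part of the original answer: the frame is rebuilt with pd.concat so the block runs on its own, and df_features is simply the name used in the question.

```python
import numpy as np
import pandas as pd

# Rebuild the question's data (both groups use l1, as in the question).
date_rng = pd.date_range(start="2020-07-01", end="2020-07-10", freq="D")
l1 = [np.nan, np.nan, "local_max", np.nan, np.nan,
      "local_min", np.nan, np.nan, "local_max", np.nan]
df = pd.concat(
    [pd.DataFrame({"date": date_rng, "value": l1, "group": g}) for g in ("a", "b")],
    ignore_index=True,
)

# Count rows per (group, value); groupby drops NaN keys by default.
counts = df.groupby(["group", "value"]).size()

# Move the 'value' level into columns; fill_value=0 covers labels
# that never occur in a given group.
df_features = (
    counts.unstack("value", fill_value=0)
    .reindex(columns=["local_min", "local_max"], fill_value=0)
    .rename_axis(columns=None)
    .reset_index()
)
print(df_features)
```

The reindex step only fixes the column order and guarantees that both columns exist even if one label is absent from the data.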
Answer 1: (score: 0)
Try using a pivot table:
pd.pivot_table(df, index='group', columns='value', aggfunc='count')
           date
value local_max local_min
group
a             2         1
b             2         1
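The result above carries an extra date level in its columns. A possible variation, sketched here under the assumption that a flat table like the one in the question is wanted, is to pass values='date' so that only one column level is produced, then reset the index. The name features is hypothetical, and the data is rebuilt so the block is self-contained.

```python
import numpy as np
import pandas as pd

# Rebuild the question's data (both groups use l1, as in the question).
date_rng = pd.date_range(start="2020-07-01", end="2020-07-10", freq="D")
l1 = [np.nan, np.nan, "local_max", np.nan, np.nan,
      "local_min", np.nan, np.nan, "local_max", np.nan]
df = pd.concat(
    [pd.DataFrame({"date": date_rng, "value": l1, "group": g}) for g in ("a", "b")],
    ignore_index=True,
)

# Naming a single 'values' column avoids the extra 'date' column level;
# fill_value=0 replaces missing group/label combinations with zero.
features = (
    pd.pivot_table(df, index="group", columns="value",
                   values="date", aggfunc="count", fill_value=0)
    .rename_axis(columns=None)
    .reset_index()
)
print(features)
```

Here the columns come out in alphabetical order (local_max before local_min); select them explicitly if a different order matters.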