Merging DataFrames on a condition

Time: 2017-01-18 10:56:44

Tags: python pandas

I have 2 DataFrames.

df1 has the columns: person_id, day, flag

df2 has the columns: person_id, day_start, day_end

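I want to add a column num_flags to df2. For each row of df2 it should hold the sum of the flag column over the df1 rows with the same person_id whose day falls strictly inside the interval, i.e. day_start < day < day_end.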
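To pin down the requirement, here is a minimal sketch with made-up data (the values, the integer `day` column, and the helper name `count_flags` are assumptions, not part of the original question). The row-wise `apply` is the slow baseline that a merge-based solution should reproduce.

```python
import pandas as pd

# Hypothetical sample data, not from the original question.
df1 = pd.DataFrame({
    "person_id": [1, 1, 1, 2, 2],
    "day":       [1, 3, 5, 2, 9],
    "flag":      [1, 1, 1, 1, 1],
})
df2 = pd.DataFrame({
    "person_id": [1, 2],
    "day_start": [0, 2],
    "day_end":   [4, 10],
})

def count_flags(row):
    """Sum flag over df1 rows of the same person with day_start < day < day_end."""
    mask = (
        (df1["person_id"] == row["person_id"])
        & (df1["day"] > row["day_start"])
        & (df1["day"] < row["day_end"])
    )
    return df1.loc[mask, "flag"].sum()

# Row-wise baseline: easy to read, but slow on large frames.
expected = df2.assign(num_flags=df2.apply(count_flags, axis=1))
print(expected)
```

With this data, person 1 gets the flags on days 1 and 3, and person 2 only gets the flag on day 9, because day 2 sits on the boundary and the bounds are strict.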

What is the fastest way to do this without complicated loops? I am looking for a quick solution along the lines of merge.

1 Answer:

Answer 0: (score: 1)

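>>> import pandas as pd
>>> # Pair every df1 row with every df2 interval of the same person.
>>> df = pd.merge(df1, df2, on="person_id", how="outer")
>>> # Strict bounds: day_start < day < day_end (pandas >= 1.3; older versions take inclusive=False).
>>> df["lies_between"] = df.day.between(df.day_start, df.day_end, inclusive="neither")
>>> # Sum the flag column (named "flag" in df1) per person, split by inside/outside the interval.
>>> x = pd.pivot_table(df, values="flag", columns="lies_between", index="person_id", aggfunc="sum")
>>> x.reset_index(drop=False, inplace=True)
>>> x[["person_id", True]]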
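The snippet above aggregates per person_id, so it assumes each person has a single interval in df2. A hedged variant of the same merge-then-filter idea is sketched below: it aggregates per df2 row and fills intervals without matching flags with 0. The function name `add_num_flags`, the helper column `interval_id`, and the sample data are assumptions made for illustration.

```python
import pandas as pd

def add_num_flags(df1, df2):
    """Return a copy of df2 with num_flags = sum of flag where day_start < day < day_end."""
    # Give every interval its own id so that several intervals per person stay separate.
    intervals = df2.reset_index(drop=True)
    intervals["interval_id"] = intervals.index  # hypothetical helper column

    # One row per (interval, flag row) pair of the same person.
    merged = intervals.merge(df1, on="person_id", how="inner")

    # Strict bounds, written as plain comparisons so it runs on any pandas version.
    inside = (merged["day"] > merged["day_start"]) & (merged["day"] < merged["day_end"])
    sums = merged.loc[inside].groupby("interval_id")["flag"].sum()

    out = df2.copy()
    # Intervals with no matching flag get 0 instead of NaN.
    out["num_flags"] = sums.reindex(intervals["interval_id"], fill_value=0).to_numpy()
    return out

# Hypothetical sample data.
df1 = pd.DataFrame({
    "person_id": [1, 1, 1, 2, 2],
    "day":       [1, 3, 5, 2, 9],
    "flag":      [1, 1, 1, 1, 1],
})
df2 = pd.DataFrame({
    "person_id": [1, 1, 2, 3],
    "day_start": [0, 2, 2, 0],
    "day_end":   [4, 6, 10, 5],
})

result = add_num_flags(df1, df2)
print(result)
```

The merge produces one row for every flag of a person paired with every interval of that person, so memory use grows with that product. For very large frames it may be worth processing persons in chunks.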

This may also help:

>>> help(pd.Series.between)
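The range check used in the answer is `Series.between`. A small sketch of how it behaves follows, with made-up values. In pandas 1.3 and later the `inclusive` parameter takes the strings "both", "neither", "left" or "right"; the strict case below is written with plain comparisons so that it does not depend on the pandas version.

```python
import pandas as pd

days = pd.Series([1, 2, 3, 4, 5])

# Default behaviour: both endpoints are included.
closed = days.between(2, 4)

# Strict bounds (2 < day < 4) as plain comparisons.
strict = (days > 2) & (days < 4)

# The bounds may also be Series, compared element by element, as in the answer.
lower = pd.Series([0, 3, 3, 0, 6])
upper = pd.Series([2, 3, 9, 9, 9])
per_row = days.between(lower, upper)

print(closed.tolist())
print(strict.tolist())
print(per_row.tolist())
```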