Filter a DataFrame by specific dates

Time: 2021-03-27 20:34:00

Tags: python pandas dataframe pandas-groupby

I have the DataFrame below, with dates ranging from 2016-01-01 to 2021-03-27:

timestamp   close   circulating_supply  issuance_native
0   2016-01-01  0.944695    7.389026e+07    26070.31250
1   2016-01-02  0.931646    7.391764e+07    27383.90625
2   2016-01-03  0.962863    7.394532e+07    27675.78125
3   2016-01-04  0.944515    7.397274e+07    27420.62500
4   2016-01-05  0.950312    7.400058e+07    27839.21875

I want to filter this DataFrame by month and day, so I can see the circulating supply on December 31 of each year.

Here are the dtypes of the DataFrame:

timestamp             datetime64[ns]
close                        float64
circulating_supply           float64
issuance_native              float64
dtype: object

I can extract a single row with this:

ts = pd.to_datetime('2016-12-31')

df.loc[df['timestamp'] == ts]

But I have had no luck passing a list of datetimes to df.loc[].

The result should look like this, with one row for December 31 of each year:

timestamp   close   circulating_supply  issuance_native
0   2016-12-31  0.944695    7.389026e+07    26070.31250
1   2017-12-31  0.931646    7.391764e+07    27383.90625
2   2018-12-31  0.962863    7.394532e+07    27675.78125
3   2019-12-31  0.944515    7.397274e+07    27420.62500
4   2020-12-31  0.950312    7.400058e+07    27839.21875

This is the closest I have gotten, but I get the warning shown after the code:

#query dataframe for the circulating supply at the end of the year
circulating_supply = df.query("timestamp == '2016-12-31' or timestamp =='2017-12-31' or timestamp =='2018-12-31' or timestamp =='2019-12-31' or timestamp =='2020-12-31' or timestamp =='2021-03-01'")
circulating_supply.drop(columns=['close', 'issuance_native'], inplace=True)
circulating_supply.copy()
circulating_supply.head()

/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/frame.py:4308: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return super().drop(

2 Answers:

Answer 0 (score: 0)

Try something like this:

end_of_year = [
    pd.to_datetime(ts)
    for ts in [
        "2016-12-31",
        "2017-12-31",
        "2018-12-31",
        "2019-12-31",
        "2020-12-31",
        "2021-03-01",
    ]
]
end_of_year_df = df.loc[df["timestamp"].isin(end_of_year), :]
circulating_supply = end_of_year_df.drop(columns=["close", "issuance_native"])
circulating_supply.head()
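
Since the question asks to filter by month and day, another option is to build a boolean mask from the `.dt.month` and `.dt.day` accessors instead of listing each date by hand. The following is a minimal sketch, not the answerer's method: the DataFrame built here is a hypothetical stand-in with placeholder values, and the names `is_dec_31` and `eoy_supply` are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the question's DataFrame: one row per day.
df = pd.DataFrame(
    {"timestamp": pd.date_range("2016-01-01", "2021-03-27", freq="D")}
)
df["circulating_supply"] = range(len(df))  # placeholder values, not real data

# Boolean mask: True where the month is 12 and the day is 31.
is_dec_31 = (df["timestamp"].dt.month == 12) & (df["timestamp"].dt.day == 31)

# Select the matching rows and only the columns of interest.
# .copy() makes the result an independent DataFrame.
eoy_supply = df.loc[is_dec_31, ["timestamp", "circulating_supply"]].copy()

print(eoy_supply)
```

This picks up every December 31 in the data without hard-coding the years. It does not include the extra 2021-03-01 row that the hard-coded list contains, so that date would still have to be added separately if it is needed.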

Answer 1 (score: 0)

I was able to solve this by ignoring the warning I got when calling .drop() on the result of my df.query:

#query dataframe for the circulating supply at the end of the year
circulating_supply = df.query("timestamp == '2016-12-31' or timestamp =='2017-12-31' or timestamp =='2018-12-31' or timestamp =='2019-12-31' or timestamp =='2020-12-31' or timestamp =='2021-03-01'")

circulating_supply.drop(columns=['close', 'issuance_native'], inplace=True)
circulating_supply.copy() #not sure if this did anything
circulating_supply.head()

#add the column 
yearly_issuance['EOY Supply'] = circulating_supply['circulating_supply'].values

yearly_issuance.head()
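
A note on the `.copy()` line: `DataFrame.copy()` returns a new DataFrame, so calling it without assigning the result has no effect on `circulating_supply`. In pandas versions that emit SettingWithCopyWarning, one way to avoid the warning rather than ignore it is to copy the query result immediately and use the non-in-place form of `drop`. The sketch below assumes a small hypothetical DataFrame with placeholder values and a shortened query.

```python
import pandas as pd

# Hypothetical stand-in data; the values are placeholders.
df = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(["2016-12-30", "2016-12-31", "2017-12-31"]),
        "close": [1.0, 2.0, 3.0],
        "circulating_supply": [10.0, 20.0, 30.0],
        "issuance_native": [0.1, 0.2, 0.3],
    }
)

# Copy the query result right away and keep the returned object,
# so later changes act on an independent DataFrame.
circulating_supply = df.query(
    "timestamp == '2016-12-31' or timestamp == '2017-12-31'"
).copy()

# drop() without inplace=True returns a new DataFrame; assign it back.
circulating_supply = circulating_supply.drop(columns=["close", "issuance_native"])

print(circulating_supply)
```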