Question

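I am trying to determine the appropriate format to use for the DataFrame I obtained, but I cannot find anything that works.

The problem is that the column mixes monthly and annual figures, and the annual figures are reported with a zero-padded "zeroth" month. For example, annual nominal GDP is reported as 2014-00 rather than the usual 2014-01.

So, when I use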

df['end_of_month'] = pandas.to_datetime(df['end_of_month'], format="%Y-%m")

I get:

ValueError: time data 2014-00 doesn't match format specified

For reference, here is the DataFrame:

   end_of_month  nominal_gdp
0       2014-00    2260005.0
1       2015-00    2398280.0
2       2016-00    2490617.0
3       2017-00    2662836.0
4       2018-00    2842883.0
5       2018-09     726352.0
6       2018-10          NaN
7       2018-11          NaN
8       2018-12     754904.0
9       2019-01          NaN
10      2019-02          NaN
11      2019-03     712514.0
12      2019-04          NaN
13      2019-05          NaN
14      2019-06     698044.0
15      2019-07          NaN
16      2019-08          NaN
17      2019-09     722831.0
18      2019-10          NaN
19      2019-11          NaN

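For anyone who is interested or may face a similar problem, the data was obtained from the Hong Kong Monetary Authority (HKMA) through its open API program. For more information, see HKMA's documentation.

In particular, this issue arises when using the economic statistics dataset, which can be found on the following page of the documentation.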

Answer 1

It seems I have managed to find a solution to the problem. Here is the link to where I found the workaround: Handling multiple datetime formats with pd.to_datetime

Here is the line I used:

df['end_of_month'] = pandas.to_datetime(df['end_of_month'], format='%Y-%m', errors='coerce').fillna(pandas.to_datetime(df['end_of_month'], format='%Y-00', errors='coerce'))

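It simply "fills" the coerced rows using a different format. The first call parses the regular year-month values and, because of errors='coerce', turns the annual values such as 2014-00 into NaT. The second call parses those remaining rows with "%Y-00", which treats the "-00" as literal text to be matched and ignored. The zero-padded zeroth month carries no meaning for an annual-frequency value, so only the year is read.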
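A minimal sketch of the two-pass approach, run on a small subset of the rows from the question. The `is_annual` column is a hypothetical addition that is not part of the original line: once converted, an annual value such as 2014-00 becomes 2014-01-01 and can no longer be told apart from a genuine January value, so it may be worth flagging those rows first.

```python
import pandas as pd

# Small sample copied from the DataFrame in the question.
df = pd.DataFrame({
    "end_of_month": ["2014-00", "2015-00", "2018-09", "2018-12", "2019-03"],
    "nominal_gdp": [2260005.0, 2398280.0, 726352.0, 754904.0, 712514.0],
})

raw = df["end_of_month"]

# Hypothetical helper column: flag annual rows before the "-00" marker is lost.
df["is_annual"] = raw.str.endswith("-00")

# Pass 1: regular year-month strings; "YYYY-00" rows are coerced to NaT.
monthly = pd.to_datetime(raw, format="%Y-%m", errors="coerce")

# Pass 2: "-00" is matched as literal text, so only the annual rows parse here.
annual = pd.to_datetime(raw, format="%Y-00", errors="coerce")

# Fill the gaps left by pass 1 with the results of pass 2.
df["end_of_month"] = monthly.fillna(annual)
print(df)
```

Splitting the two passes into named variables is only for readability; the behaviour should be the same as the one-liner above.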

pandas.to_datetime: identifying the correct format when using the HKMA's open API
