Pandas / datetime issue

Time: 2020-06-02 14:22:47

Tags: python python-3.x pandas

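I have a pandas dataset in which I am trying to combine two columns: one (df['IssueDatetime']) is correctly formatted as a datetime, and the other (df['forecastTime']) contains only the day and hour, in %d/%H form: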

            IssueDatetime                   Regions forecastTime WindDirSpeed
0     2019-01-01 06:00:00                EAST COAST        01/06         NW25
1     2019-01-01 06:00:00                EAST COAST        01/15         SW15
2     2019-01-01 06:00:00                EAST COAST        02/00         SE25
3     2019-01-01 06:00:00                EAST COAST        02/06      SE35-45
4     2019-01-01 06:00:00                EAST COAST        02/15         SW40
...                   ...                       ...          ...          ...
12292 2019-12-30 06:00:00  SOUTHEASTERN GRAND BANKS        01/00       N15-20
12293 2019-12-30 06:00:00  SOUTHWESTERN GRAND BANKS        30/06      NW15-20
12294 2019-12-30 06:00:00  SOUTHWESTERN GRAND BANKS        31/00          N25
12295 2019-12-30 06:00:00  SOUTHWESTERN GRAND BANKS        31/15       N15-20
12296 2019-12-30 06:00:00  SOUTHWESTERN GRAND BANKS        01/00     VRB10-15

Is it possible to combine df['IssueDatetime'] with df['forecastTime'] so that the result looks like this:

            IssueDatetime                   Regions         forecastTime WindDirSpeed
0     2019-01-01 06:00:00                EAST COAST  2019-01-01 06:00:00         NW25
1     2019-01-01 06:00:00                EAST COAST  2019-01-01 15:00:00         SW15
2     2019-01-01 06:00:00                EAST COAST  2019-01-02 00:00:00         SE25
3     2019-01-01 06:00:00                EAST COAST  2019-01-02 06:00:00      SE35-45

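The problem arises when combining the columns at the end of a month, where the forecast day rolls over into the next month. Any suggestions would be helpful.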

2 answers:

Answer 0: (score: 0)

Try this:

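import pandas as pd

df['IssueDatetime'] = pd.to_datetime(df['IssueDatetime'])
# Only day and hour are parsed; the missing year and month default to 1900-01
df['forecastTime'] = pd.to_datetime(df['forecastTime'], format='%d/%H')
# Swap the placeholder year for 2019 (note: the column is now strings, not datetimes)
df['forecastTime'] = df['forecastTime'].astype(str).str.replace('1900', '2019')
print(df)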

        IssueDatetime                   Regions         forecastTime WindDirSpeed
0 2019-01-01 06:00:00                EAST COAST  2019-01-01 06:00:00         NW25
1 2019-01-01 06:00:00                EAST COAST  2019-01-01 15:00:00         SW15
2 2019-01-01 06:00:00                EAST COAST  2019-01-02 00:00:00         SE25
3 2019-01-01 06:00:00                EAST COAST  2019-01-02 06:00:00      SE35-45
4 2019-01-01 06:00:00                EAST COAST  2019-01-02 15:00:00         SW40
5 2019-12-30 06:00:00  SOUTHEASTERN GRAND BANKS  2019-01-01 00:00:00       N15-20
6 2019-12-30 06:00:00  SOUTHWESTERN GRAND BANKS  2019-01-30 06:00:00      NW15-20
7 2019-12-30 06:00:00  SOUTHWESTERN GRAND BANKS  2019-01-31 00:00:00          N25
8 2019-12-30 06:00:00  SOUTHWESTERN GRAND BANKS  2019-01-31 15:00:00       N15-20
9 2019-12-30 06:00:00  SOUTHWESTERN GRAND BANKS  2019-01-01 00:00:00     VRB10-15
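
Note that rows 5 to 9 above end up in January 2019 even though they were issued on 2019-12-30. The short sketch below is not part of the original answer; it only illustrates why that happens, using three hypothetical sample values. Parsing with format='%d/%H' supplies no year or month, so pandas falls back to the defaults (1900, January), and replacing the year string leaves the month untouched.

```python
import pandas as pd

# Three illustrative "dd/HH" values (hypothetical sample, not the full dataset)
parsed = pd.to_datetime(pd.Series(['01/06', '30/06', '31/15']), format='%d/%H')

# Year and month are not in the input, so they default to 1900 and January
print(parsed.dt.year.unique(), parsed.dt.month.unique())

# Replacing the year text fixes the year but still leaves every row in January
as_text = parsed.astype(str).str.replace('1900', '2019')
print(as_text.tolist())
```

So this approach only gives the desired result when every forecast falls in January of the same year as its issue time.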

Answer 1: (score: 0)

This is similar to the previous answer, but with two modifications:

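  • The forecast dates are now built explicitly from the issue timestamps, with only the day and hour values replaced
  • For issue dates near the end of a month, relativedelta ensures the forecast carries over into the next month (I assume this is what you want?)
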
import pandas as pd
from dateutil.relativedelta import relativedelta

#replicating your data
issuetimes = ['2019-01-01 06:00:00']*5 + ['2019-12-30 06:00:00']*5
forecasts = ['01/06','01/15','02/00','02/06','02/15',
             '01/00','30/06','31/00','31/15','01/00',]

def replace_days_hours(row):
    row['forecastTime'] = row['IssueDatetime'].replace(day=row['forecastTime'].day,
                                                       hour=row['forecastTime'].hour,)
    if row['forecastTime'] < row['IssueDatetime']:
        row['forecastTime'] += relativedelta(months=1)
    return row

df = pd.DataFrame({'IssueDatetime':issuetimes,'forecastTime':forecasts})
df['IssueDatetime'] = pd.to_datetime(df['IssueDatetime'])
df['forecastTime'] = pd.to_datetime(df['forecastTime'], format='%d/%H')
df = df.apply(replace_days_hours,axis=1)

Output:

        IssueDatetime        forecastTime
0 2019-01-01 06:00:00 2019-01-01 06:00:00
1 2019-01-01 06:00:00 2019-01-01 15:00:00
2 2019-01-01 06:00:00 2019-01-02 00:00:00
3 2019-01-01 06:00:00 2019-01-02 06:00:00
4 2019-01-01 06:00:00 2019-01-02 15:00:00
5 2019-12-30 06:00:00 2020-01-01 00:00:00
6 2019-12-30 06:00:00 2019-12-30 06:00:00
7 2019-12-30 06:00:00 2019-12-31 00:00:00
8 2019-12-30 06:00:00 2019-12-31 15:00:00
9 2019-12-30 06:00:00 2020-01-01 00:00:00
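
For larger frames, the row-wise apply could probably be replaced with vectorized operations. The following is a minimal sketch along those lines, not the answerer's code. It assumes that a forecast day smaller than the issue day always belongs to the next month (a slightly simpler rule than the full timestamp comparison above), and the names parts, month_start and rolls_over are purely illustrative.

```python
import pandas as pd

# Same sample data as in the answer above
issuetimes = ['2019-01-01 06:00:00'] * 5 + ['2019-12-30 06:00:00'] * 5
forecasts = ['01/06', '01/15', '02/00', '02/06', '02/15',
             '01/00', '30/06', '31/00', '31/15', '01/00']

df = pd.DataFrame({'IssueDatetime': pd.to_datetime(issuetimes),
                   'forecastTime': forecasts})

# Split "dd/HH" into integer day and hour columns
parts = df['forecastTime'].str.split('/', expand=True).astype(int)
day, hour = parts[0], parts[1]

# Midnight on the first day of each issue month
month_start = df['IssueDatetime'].dt.to_period('M').dt.to_timestamp()

# Assumption: a forecast day earlier than the issue day means "next month"
rolls_over = day < df['IssueDatetime'].dt.day
month_start = month_start.where(~rolls_over,
                                month_start + pd.DateOffset(months=1))

# Month start + (day - 1) days + hour hours gives the full forecast timestamp
df['forecastTime'] = (month_start
                      + pd.to_timedelta(day - 1, unit='D')
                      + pd.to_timedelta(hour, unit='h'))
print(df)
```

On the sample data this should reproduce the output shown above, including the rollover of the 01/00 forecasts issued on 2019-12-30 into January 2020.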