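How do I handle exceptions when populating a Pandas DataFrame?

Time: 2019-06-18 18:47:03

Tags: python pandas api dataframe try-catch

I am trying to populate a dataframe with historical hourly weather data, which I fetch by calling the DarkSky API. However, some fields are occasionally missing, and this raises a KeyError.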

The API returns the following information for each hour:

'summary': 'Mostly cloudy throughout the day.',
'icon': 'partly-cloudy-day',
'data': [{
    'time': 1528354800,
    'summary': 'Partly Cloudy',
    'icon': 'partly-cloudy-night',
    'precipIntensity': 0,
    'precipProbability': 0,
    'temperature': 12.94,
    'apparentTemperature': 12.94,
    'dewPoint': 9.36,
    'humidity': 0.79,
    'pressure': 1011.4,
    'windSpeed': 2.69,
    'windGust': 2.69,
    'windBearing': 252,
    'cloudCover': 0.33,
    'uvIndex': 0,
    'visibility': 13.818}]

So when populating the dataframe I get a KeyError, because sometimes precipIntensity does not appear and there is a field called precipType instead.

This is how I try to populate the dataframe:

VICTORIA = 48.407326, -123.329773
# month, day and daily_weather are defined in surrounding code that is not shown
dt = datetime(2018, month, day).isoformat()
weather = forecast('APIKEY', *VICTORIA, time=dt)
weather.refresh(units='si')
for hour in weather['hourly']['data']:
    daily_weather = daily_weather.append(
        {'time': hour['time'],
         'realtime': datetime.fromtimestamp(hour['time']),
         'summary': hour['summary'],
         'icon': hour['icon'],
         'precipIntensity': hour['precipIntensity'],
         'precipProbability': hour['precipProbability'],
         'temperature': hour['temperature'],
         'apparentTemperature': hour['apparentTemperature'],
         'dewPoint': hour['dewPoint'],
         'humidity': hour['humidity'],
         'pressure': hour['pressure'],
         'windSpeed': hour['windSpeed'],
         'windBearing': hour['windBearing'],
         'cloudCover': hour['cloudCover'],
         'uvIndex': hour['uvIndex'],
         'visibility': hour['visibility'],
         }, ignore_index=True)

I tried to use a try / except statement to handle the exception, as follows:

for hour in weather['hourly']['data']:
    daily_weather = daily_weather.append(
        {'time': hour['time'],
         'realtime': datetime.fromtimestamp(hour['time']),
         'summary': hour['summary'],
         'icon': hour['icon'],
         'temperature': hour['temperature'],
         'apparentTemperature': hour['apparentTemperature'],
         'dewPoint': hour['dewPoint'],
         'humidity': hour['humidity'],
         'pressure': hour['pressure'],
         'windSpeed': hour['windSpeed'],
         'windBearing': hour['windBearing'],
         'cloudCover': hour['cloudCover'],
         'uvIndex': hour['uvIndex'],
         'visibility': hour['visibility'],
         }, ignore_index=True)
    try:
        daily_weather = daily_weather.append(
            {'precipIntensity': hour['precipIntensity'],
             'precipProbability': hour['precipProbability']},
            ignore_index=True)
    except KeyError:
        daily_weather = daily_weather.append(
            {'precipType': hour['precipType']},
            ignore_index=True)

But the precipType field ends up in otherwise empty rows instead of sitting in the same row as the other fields:

[Image: Dataframe Output]

I would appreciate some advice on how to use exception handling while populating the dataframe. Thanks.

1 answer:

Answer 0: (score: 0)

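Your code calls append twice per hour, and each call creates a separate row in the output. Keep the dict for each row in a local variable, fill it in completely, and only then append it to the list.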
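To illustrate the difference, here is a minimal sketch that uses plain Python lists instead of a dataframe. The hour record and the names two_appends and one_append are made up for the illustration; they are not from the question.

```python
# Hypothetical hourly record in which the optional precipType field is present.
hour = {'time': 1528354800, 'temperature': 12.94, 'precipType': 'rain'}

# Two appends per hour: the optional field lands in a row of its own.
two_appends = []
two_appends.append({'time': hour['time'], 'temperature': hour['temperature']})
two_appends.append({'precipType': hour['precipType']})

# One append per hour: build the complete row first, then add it once.
one_append = []
row = {'time': hour['time'], 'temperature': hour['temperature']}
row['precipType'] = hour['precipType']
one_append.append(row)

print(len(two_appends))  # 2 rows for a single hour
print(len(one_append))   # 1 row for a single hour
```

The same thing happens with a dataframe: every append adds a new row, so a field appended separately can never join the row created by the earlier call.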

For readability, I would also suggest that you avoid try / except here and do a direct if check instead. You can even automate the check for several optional fields.

Example (untested), written with a loop over the optional fields to keep it tidy:

import pandas as pd
from datetime import datetime

rows = []  # one dict per hour; the dataframe is built once at the end
for hour in weather['hourly']['data']:
    row = {
        'time': hour['time'],
        'realtime': datetime.fromtimestamp(hour['time']),
        'summary': hour['summary'],
        'icon': hour['icon'],
        'temperature': hour['temperature'],
        'apparentTemperature': hour['apparentTemperature'],
        'dewPoint': hour['dewPoint'],
        'humidity': hour['humidity'],
        'pressure': hour['pressure'],
        'windSpeed': hour['windSpeed'],
        'windBearing': hour['windBearing'],
        'cloudCover': hour['cloudCover'],
        'uvIndex': hour['uvIndex'],
        'visibility': hour['visibility'],
    }
    # Optional fields: copy each one only if the API returned it
    for field in ('precipIntensity', 'precipProbability', 'precipType'):
        if field in hour:
            row[field] = hour[field]
    rows.append(row)

# Keys missing from a row become missing values in the dataframe
daily_weather = pd.DataFrame(rows)
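
If you would rather have every row carry the same keys, a variation on the same idea is to use dict.get, which returns None when a key is absent. The sketch below is only an illustration: build_row, REQUIRED and OPTIONAL are hypothetical names, the required list is a shortened subset, and the two sample records are invented rather than taken from the API.

```python
# Illustrative subset of fields that are assumed to be always present.
REQUIRED = ('time', 'summary', 'temperature')
# Fields that the API may leave out.
OPTIONAL = ('precipIntensity', 'precipProbability', 'precipType')


def build_row(hour):
    """Return one flat dict for an hourly record, with None for absent optional fields."""
    # Indexing raises KeyError here if a required field is missing.
    row = {key: hour[key] for key in REQUIRED}
    for key in OPTIONAL:
        row[key] = hour.get(key)  # None when the key is absent
    return row


# Invented sample records: the first lacks precipType, the second lacks the other two.
hours = [
    {'time': 1528354800, 'summary': 'Partly Cloudy', 'temperature': 12.94,
     'precipIntensity': 0, 'precipProbability': 0},
    {'time': 1528358400, 'summary': 'Light Rain', 'temperature': 12.1,
     'precipType': 'rain'},
]

rows = [build_row(hour) for hour in hours]
```

Passing rows to pd.DataFrame(rows) once, after the loop, should then give one row per hour, with missing values wherever a field was absent.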