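I have a JSON file that contains weather data. There are 30 days of data, with 24 hours for each day. Each day contains data for the day (e.g. sunrise), and each hour contains data for that hour (e.g. precipitation).
I am trying to create a DataFrame that contains one row per hour. Each row should also contain columns holding the data for that day.
Here is my code: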
weather_list = []
for day_dict in weather['data'][1]:
    weather_dict = {}
    for day_key, day_val in day_dict.items():
        if not isinstance(day_val, list):
            # Daily info
            weather_dict[day_key] = day_val
        else:
            if len(day_val) == 1:
                # Astronomy i.e. sunrise sunset for the day
                weather_dict.update(day_val[0])
            else:
                # Hourly weather info
                for hour in day_val:
                    for hour_key, hour_val in hour.items():
                        if isinstance(hour_val, list):
                            if len(hour_val) == 1:
                                # weatherIconUrl and weatherDesc
                                # Maybe be more than one value in the future
                                weather_dict[hour_key] = hour_val[0]['value']
                        else:
                            weather_dict[hour_key] = hour_val
                    weather_list.append(weather_dict)
weather_df = pd.DataFrame(weather_list)
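This code works in that it adds the daily data, but when it adds the hourly data every row is the same, i.e. it contains 24 rows for hour 23.
Below is a sample of my data.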
{\"data\":{\"request\":[{\"type\":\"City\",\"query\":\"Cowansville, Canada\"}],\"weather\":[{\"date\":\"2009-07-04\",\"astronomy\":[{\"sunrise\":\"04:09 AM\",\"sunset\":\"07:42 PM\",\"moonrise\":\"05:59 PM\",\"moonset\":\"01:27 AM\"}],\"maxtempC\":\"17\",\"maxtempF\":\"63\",\"mintempC\":\"16\",\"mintempF\":\"60\",\"totalSnow_cm\":\"0.0\",\"sunHour\":\"11.5\",\"uvIndex\":\"0\",\"hourly\":[{\"time\":\"0\",\"tempC\":\"16\",\"tempF\":\"60\",\"windspeedMiles\":\"5\",\"windspeedKmph\":\"8\",\"winddirDegree\":\"197\",\"winddir16Point\":\"SSW\",\"weatherCode\":\"353\",\"weatherIconUrl\":[{\"value\":\"http://cdn.worldweatheronline.net/images/wsymbols01_png_64/wsymbol_0025_light_rain_showers_night.png\"}],\"weatherDesc\":[{\"value\":\"Light rain shower\"}],\"precipMM\":\"0.4\",\"humidity\":\"98\",\"visibility\":\"10\",\"pressure\":\"1005\",\"cloudcover\":\"87\",\"HeatIndexC\":\"16\",\"HeatIndexF\":\"60\",\"DewPointC\":\"15\",\"DewPointF\":\"60\",\"WindChillC\":\"16\",\"WindChillF\":\"60\",\"WindGustMiles\":\"10\",\"WindGustKmph\":\"15\",\"FeelsLikeC\":\"16\",\"FeelsLikeF\":\"60\"},{\"time\":\"100\",\"tempC\":\"16\",\"tempF\":\"60\",\"windspeedMiles\":\"5\",\"windspeedKmph\":\"8\",\"winddirDegree\":\"201\",\"winddir16Point\":\"SSW\",\"weatherCode\":\"353\",\"weatherIconUrl\":[{\"value\":\"http://cdn.worldweatheronline.net/images/wsymbols01_png_64/wsymbol_0025_light_rain_showers_night.png\"}],\"weatherDesc\":[{\"value\":\"Light rain 
shower\"}],\"precipMM\":\"1.2\",\"humidity\":\"99\",\"visibility\":\"9\",\"pressure\":\"1005\",\"cloudcover\":\"91\",\"HeatIndexC\":\"16\",\"HeatIndexF\":\"60\",\"DewPointC\":\"15\",\"DewPointF\":\"60\",\"WindChillC\":\"16\",\"WindChillF\":\"60\",\"WindGustMiles\":\"9\",\"WindGustKmph\":\"15\",\"FeelsLikeC\":\"16\",\"FeelsLikeF\":\"60\"},{\"time\":\"200\",\"tempC\":\"16\",\"tempF\":\"60\",\"windspeedMiles\":\"5\",\"windspeedKmph\":\"9\",\"winddirDegree\":\"204\",\"winddir16Point\":\"SSW\",\"weatherCode\":\"356\",\"weatherIconUrl\":[{\"value\":\"http://cdn.worldweatheronline.net/images/wsymbols01_png_64/wsymbol_0026_heavy_rain_showers_night.png\"}],\"weatherDesc\":[{\"value\":\"Moderate or heavy rain shower\"}],\"precipMM\":\"2.0\",\"humidity\":\"99\",\"visibility\":\"8\",\"pressure\":\"1005\",\"cloudcover\":\"96\",\"HeatIndexC\":\"16\",\"HeatIndexF\":\"60\",\"DewPointC\":\"16\",\"DewPointF\":\"60\",\"WindChillC\":\"16\",\"WindChillF\":\"60\",\"WindGustMiles\":\"9\",\"WindGustKmph\":\"15\",\"FeelsLikeC\":\"16\",\"FeelsLikeF\":\"60\"}
Answer 0 (score: 0)
The problem was that I was updating the same dictionary that was already in the list. I needed to create a deep copy of the object so that I wasn't referencing the same object.
I refactored the code to look like this:
import copy

import pandas as pd


def flatten_json(y):
    # Flatten nested dicts/lists into a single dict with '_'-joined keys
    out = {}

    def flatten(x, name=''):
        if type(x) is dict:
            for a in x:
                flatten(x[a], name + a + '_')
        elif type(x) is list:
            i = 0
            for a in x:
                flatten(a, name + str(i) + '_')
                i += 1
        else:
            out[name[:-1]] = x

    flatten(y)
    return out


weather_list = []
for day_dict in weather['data'][1]:
    # Create a dictionary that only contains the day data
    day_data = copy.deepcopy(day_dict)
    del day_data['hourly']
    day_data = flatten_json(day_data)
    for hour_dict in day_dict['hourly']:
        hour_data = flatten_json(hour_dict)
        day_data.update(hour_data)
        # Append a copy so each row is an independent snapshot
        day_hour_data = copy.deepcopy(day_data)
        weather_list.append(day_hour_data)
weather_df = pd.DataFrame(weather_list)
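To illustrate the aliasing problem this answer describes, here is a minimal sketch using a made-up two-hour sample. The `day` dictionary and its values are hypothetical and heavily trimmed; they are not taken from the question's data. Appending the same dictionary on every iteration stores several references to one object, so every row ends up showing the last hour, whereas appending a copy gives each row its own snapshot.

```python
import copy

# Hypothetical, heavily trimmed sample: one day with two hourly records.
day = {
    "date": "2009-07-04",
    "hourly": [
        {"time": "0", "tempC": "16"},
        {"time": "100", "tempC": "17"},
    ],
}

# Buggy pattern: one dictionary is reused and appended repeatedly.
shared = {"date": day["date"]}
buggy_rows = []
for hour in day["hourly"]:
    shared.update(hour)
    buggy_rows.append(shared)  # appends a reference, not a snapshot

# Fixed pattern: append an independent copy for each hour.
row = {"date": day["date"]}
fixed_rows = []
for hour in day["hourly"]:
    row.update(hour)
    fixed_rows.append(copy.deepcopy(row))

print([r["time"] for r in buggy_rows])  # ['100', '100']
print([r["time"] for r in fixed_rows])  # ['0', '100']
```

For flat dictionaries like these, `row.copy()` or `dict(row)` would also be enough; `copy.deepcopy` only makes a difference when the values are themselves mutable containers such as lists or nested dictionaries.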