I have to create a large dictionary for my measurement data. So far, my (simplified) code looks like this:
i = 0
for i in range(len(station_data_files_pandas)):  # range(0, 299)
    station_data_f_pandas = station_data_files_pandas[i]
    station_id = str(int(station_data_f_pandas["STATIONS_ID"][0]))
    Y_RR = station_data_f_pandas["MO_RR"].resample("A").apply(very_sum)
    # creating the dictionary layer for the annual data in this dictionary
    anual_data = {
        "Y_RR": Y_RR
    }
    # creating the dictionary layer for the monthly data in this dictionary
    montly_data = {
        "MO_RR"
    }
    # creating the dictionary layer for every station. Every station has monthly and annual data
    station = {
        "montly_data": montly_data,
        "anual_data": anual_data
    }
    # creating the dictionary layer where the station data can be looked up by station id
    station_data_dic = {
        station_id: station
    }
    # creating the final layer of the dictionary
    station_data_dictionary = {
        "station_data": station_data_dic
    }
This is the output:
station_data_dictionary
Out[387]:
{'station_data': {'4706': {'montly_data': {'MO_RR'}, # "4706" is the id from the last element in station_data_files_pandas
'anual_data': {'Y_RR': YearMonth
# YearMonth is the index...
# I actually wanted the Index just to show yyyy-mm ...
1981-12-31 1164.3
1982-12-31 852.4
1983-12-31 826.5
1984-12-31 798.8
1985-12-31 NaN
1986-12-31 NaN
1987-12-31 NaN
1988-12-31 NaN
1989-12-31 NaN
1990-12-31 1101.1
1991-12-31 892.4
1992-12-31 802.1
1993-12-31 873.5
1994-12-31 842.7
1995-12-31 962.0
1996-12-31 NaN
1997-12-31 927.9
1998-12-31 NaN
1999-12-31 NaN
2000-12-31 997.8
2001-12-31 986.3
2002-12-31 1117.6
2003-12-31 690.8
2004-12-31 NaN
2005-12-31 NaN
2006-12-31 NaN
2007-12-31 NaN
2008-12-31 NaN
2009-12-31 NaN
2010-12-31 NaN
Freq: A-DEC, Name: MO_RR, dtype: float64}}}}
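As you can see, my output contains only one "sheet". It should contain 300.
I assume my code overwrites the data on each pass through the loop, so in the end the output is a single sheet built from the last element of station_data_files_pandas. How can I fix this? Is my approach perhaps completely wrong?
When it is done, it has to be accessible like this: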
station_data_dictionary["station_data"]["403"]["anual_data"]["Y_RR"]
station_data_dictionary["station_data"]["573"]["anual_data"]["Y_RR"]
station_data_dictionary["station_data"]["96"]["anual_data"]["Y_RR"]
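...and so on.
As you can see, when I look up different things in the dictionary, the only part that changes is my station_id.
Note: There is one question with exactly the same title, but it did not help me at all...
Answer 0 (score: 1)
I have not tested this because I don't have your data, but it should produce the dictionary you need. The only changes are at the top and the bottom: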
station_data_dictionary = {
    "station_data": {}
}
for i in range(len(station_data_files_pandas)):  # range(0, 299)
    station_data_f_pandas = station_data_files_pandas[i]
    station_id = str(int(station_data_f_pandas["STATIONS_ID"][0]))
    Y_RR = station_data_f_pandas["MO_RR"].resample("A").apply(very_sum)
    # creating the dictionary layer for the annual data in this dictionary
    anual_data = {
        "Y_RR": Y_RR
    }
    # creating the dictionary layer for the monthly data in this dictionary
    montly_data = {
        "MO_RR"
    }
    # creating the dictionary layer for every station. Every station has monthly and annual data
    station = {
        "montly_data": montly_data,
        "anual_data": anual_data
    }
    # add this station under its id instead of rebuilding the outer dictionary
    station_data_dictionary["station_data"][station_id] = station
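Note that you don't need a statement like i = 0 before the for loop, because the loop initializes the variable for you.
Also, the "station_data" layer of the dictionary seems redundant, since it is the only key at that level, but you included it in your desired output, so I kept it.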
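To see why the original loop ends up with a single station, here is a minimal sketch of the same pattern using plain dictionaries in place of the pandas data. The `records` list and its values are made up for illustration; only the nesting mirrors the code above.

```python
# Hypothetical stand-in for station_data_files_pandas: one small dict per station.
records = [
    {"STATIONS_ID": 403.0, "Y_RR": [1164.3, 852.4]},
    {"STATIONS_ID": 573.0, "Y_RR": [826.5, 798.8]},
    {"STATIONS_ID": 96.0, "Y_RR": [1101.1, 892.4]},
]

# Overwriting: a brand-new dict is bound to the same name on every iteration,
# so only the last station survives the loop.
for record in records:
    station_id = str(int(record["STATIONS_ID"]))
    overwritten = {
        "station_data": {station_id: {"anual_data": {"Y_RR": record["Y_RR"]}}}
    }

# Accumulating: create the container once, then add one key per iteration.
accumulated = {"station_data": {}}
for record in records:
    station_id = str(int(record["STATIONS_ID"]))
    accumulated["station_data"][station_id] = {"anual_data": {"Y_RR": record["Y_RR"]}}

print(list(overwritten["station_data"]))  # ['96']
print(list(accumulated["station_data"]))  # ['403', '573', '96']
```

The second loop is the same change as in the answer: the outer dictionary exists before the loop starts, and each iteration only assigns into it.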
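Answer 1 (score: 1)
Please try the code below. Also, if you need the dictionary to keep its entries in the order you added them, you have to use OrderedDict from the collections package.
That way, when you print the dictionary or iterate over its data, the entries come out in the order they were added in the code below.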
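As a small aside on ordering, the sketch below compares a plain dict with an OrderedDict. The station ids are made up. Since Python 3.7 a plain dict also preserves insertion order, so OrderedDict is mainly needed on older interpreters or when order-sensitive comparison is wanted.

```python
from collections import OrderedDict

plain = {}
ordered = OrderedDict()
for station_id in ["403", "573", "96"]:  # made-up station ids
    plain[station_id] = None
    ordered[station_id] = None

# Both keep insertion order on Python 3.7+.
print(list(plain))    # ['403', '573', '96']
print(list(ordered))  # ['403', '573', '96']

# One practical difference: OrderedDict equality is order-sensitive,
# plain dict equality is not.
same_ordered = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)
same_plain = {"a": 1, "b": 2} == {"b": 2, "a": 1}
print(same_ordered)  # False
print(same_plain)    # True
```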
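Note: I assumed station_data_files_pandas is a list rather than a dictionary, which is why I changed the for loop "signature" to iterate over the elements directly. If I am wrong and the variable is actually a dictionary whose keys are the integers from your for loop, you can iterate over its items instead: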
for k, v in station_data_files_pandas.items():
    # now k carries the integer you were using before.
    # and v carries station_data_f_pandas
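The two cases can be compared side by side in a short sketch. The strings here are hypothetical placeholders for the real DataFrames; the point is only that enumerate() on a list and .items() on an integer-keyed dictionary yield the same (index, element) pairs.

```python
# Hypothetical stand-ins; strings replace the real DataFrames.
as_list = ["frame_a", "frame_b", "frame_c"]
as_dict = {0: "frame_a", 1: "frame_b", 2: "frame_c"}

# List: iterate over the elements directly, or use enumerate() if the index is needed.
pairs_from_list = []
for i, frame in enumerate(as_list):
    pairs_from_list.append((i, frame))

# Dictionary: .items() yields (key, value) pairs.
pairs_from_dict = []
for k, v in as_dict.items():
    pairs_from_dict.append((k, v))

print(pairs_from_list == pairs_from_dict)  # True
```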
import collections

station_data_dictionary = collections.OrderedDict()
# for i in range(len(station_data_files_pandas)):  # range(0, 299)
# iterating over the elements directly instead
for station_data_f_pandas in station_data_files_pandas:
    # This is not needed anymore
    # station_data_f_pandas = station_data_files_pandas[i]
    # station_id = str(int(station_data_f_pandas["STATIONS_ID"][0]))
    # You could directly convert to string
    station_id = str(int(station_data_f_pandas["STATIONS_ID"][0]))
    Y_RR = station_data_f_pandas["MO_RR"].resample("A").apply(very_sum)
    MO_RR = None  # placeholder: something goes here
    # creating the dictionary layer for the annual data in this dictionary
    anual_data = {
        "Y_RR": Y_RR
    }
    # creating the dictionary layer for the monthly data in this dictionary
    montly_data = {
        # "MO_RR"
        # You can't have just a key in your dictionary, you need to assign a value to it.
        "MO_RR": MO_RR
    }
    # creating the dictionary layer for every station. Every station has monthly and annual data
    station = {
        "montly_data": montly_data,
        "anual_data": anual_data
    }
    # creating the dictionary layer where the station data can be looked up by station id
    station_data_dic = {
        station_id: station
    }
    # creating the final layer of the dictionary
    # station_data_dictionary = {
    #     "station_data": station_data_dic
    # }
    # Why use {"apparently_useless_id_layer": {"actual_id_info": "data"}}
    # instead of {"actual_info_id": "data"} ?
    station_data_dictionary[station_id] = station
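One detail about the "MO_RR" comment in the code above is worth illustrating: in Python, braces around a bare value do not raise an error, they create a set. The sketch below uses made-up numbers as the monthly values.

```python
# Braces with only a bare value build a set, not a dictionary.
only_key = {"MO_RR"}
print(type(only_key).__name__)  # set

# With a key/value pair the same braces build a dictionary.
with_value = {"MO_RR": [12.5, 30.1]}  # made-up monthly values
print(type(with_value).__name__)  # dict
print(with_value["MO_RR"])        # [12.5, 30.1]

# A set cannot be indexed by key, so a later lookup would fail.
try:
    only_key["MO_RR"]
    lookup_failed = False
except TypeError:
    lookup_failed = True
print(lookup_failed)  # True
```

This matches the `'montly_data': {'MO_RR'}` entry in the question's output, which is the repr of a one-element set.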