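I'm using pandasdmx version 0.7.0. I have successfully used OECD datasets that have a frequency dimension, but for other datasets that lack one, such as the Fossil Fuel Support dataset, I cannot create a pandas DataFrame, as shown below: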
from pandasdmx import Request

# http://stats.oecd.org/sdmx-json/data/FFS_BRA/all/all
oecd = Request('OECD')
try:
    data_response = oecd.data(resource_id='FFS_BRA', key='all/all')
except UnicodeDecodeError:
    pass
except KeyError:
    pass
else:
    oecd_data = data_response.data
    print(oecd_data.dim_at_obs)
    series_list = list(oecd_data.series)
    print(len(series_list))
    print(series_list[0].key)
    print(set(s.key.FUEL for s in oecd_data.series))
    fuel = (s for s in oecd_data.series if s.key.FUEL == 'HARDCOAL')
    df = data_response.write(fuel)
print("completed ...")
Output:
TIME_PERIOD
25
SeriesKeyTuple(MEA='BRA_DT_03', FUEL='HARDCOAL', IND='CSE',INC='CONSUMPTION', STG='GENER', MEC='DT', LVL='FED')
{'NONBIOJETK', 'LIGNITE', 'HARDCOAL', 'NAPHTHA', 'NONBIODIES', 'CRUDEOIL', 'RESFUEL', 'NATGAS', 'NGL', 'LPG', 'NONBIOGASO'}
  File "stackoverflow_dataframe.py", line 23, in <module>
    df = data_response.write(fuel)
  File "pandasdmx\api.py", line 635, in write
    return self._writer.write(source=source, **kwargs)
  File "pandasdmx\writer\data2pandas.py", line 109, in write
    reverse_obs, fromfreq, parse_time))
  File "pandasdmx\writer\data2pandas.py", line 107, in <genexpr>
    series_list = list(s for s in self.iter_pd_series(
  File "pandasdmx\writer\data2pandas.py", line 228, in iter_pd_series
    obs_values, index=series_index, name=series.key)
UnboundLocalError: local variable 'series_index' referenced before assignment
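Is this the right approach for datasets without a frequency dimension, or have I simply coded it wrong?
Answer 0 (score: 0)
The official answer from the author of pandasdmx (0.7.0) is to call the write function with the parse_time argument set to False. write then does not try to build a DateTime index and uses a string index instead.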
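The traceback is consistent with a local variable that is only assigned inside frequency-specific branches. The toy function below reproduces that pattern in plain Python. It is only an illustration: build_index and its branches are hypothetical and are not pandasdmx's actual code.

```python
def build_index(periods, freq=None, parse_time=True):
    """Hypothetical stand-in for an index builder (not pandasdmx code)."""
    if parse_time:
        # The index is only assigned for frequencies the function knows about.
        if freq == 'A':
            index = [int(p) for p in periods]
        elif freq == 'Q':
            index = [tuple(p.split('-')) for p in periods]
        # No else branch: with freq=None, index is never assigned here.
    else:
        # Skipping time parsing keeps the period labels as plain strings.
        index = list(periods)
    return index


try:
    build_index(['2010', '2011'])  # no frequency, time parsing requested
except UnboundLocalError as exc:
    print('failed:', exc)

# With parse_time=False the branch that assigns a string index is taken.
print(build_index(['2010', '2011'], parse_time=False))
```

In this sketch, turning off time parsing avoids the unassigned variable in the same way that parse_time=False avoids the error in write.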
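Because pandasdmx returns a wide-format table that uses SDMX IDs as column names and data values, and because I was under time pressure, I wrote my own code that returns a long-format table (which I can adjust as needed), with the option of using either SDMX IDs or full descriptions. This solution has been tested on 835 OECD datasets!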
from pandasdmx import Request

# http://stats.oecd.org/sdmx-json/data/FFS_BRA/all/all
oecd = Request('OECD')
try:
    data_response = oecd.data(resource_id='FFS_BRA', key='all/all')
except UnicodeDecodeError:
    pass
except KeyError:
    pass
else:
    oecd_data = data_response.data
    # parse_time=False: write() keeps TIME_PERIOD as a string index
    # instead of trying to build a DateTime index.
    df = data_response.write(oecd_data.series, parse_time=False)
    df.to_csv('FFS_BRA_solution1.csv')
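The long-format code mentioned in the answer is not shown in this thread. As a rough sketch of the idea only, the snippet below reshapes a wide table into a long one with plain pandas. It assumes the wide table has one column per series, with the dimension IDs as column levels and TIME_PERIOD as a string index. The sample values, the to_long helper, and the FUEL_LABELS mapping are all made up for illustration.

```python
import pandas as pd

# Synthetic wide table (made-up numbers): one column per series,
# dimension IDs as column levels, TIME_PERIOD strings as the index.
columns = pd.MultiIndex.from_tuples(
    [('HARDCOAL', 'DT'), ('NATGAS', 'DT')], names=['FUEL', 'MEC'])
wide = pd.DataFrame(
    [[1.0, 2.0], [3.0, 4.0]],
    index=pd.Index(['2010', '2011'], name='TIME_PERIOD'),
    columns=columns)

# Hypothetical ID-to-description lookup, keyed by dimension name.
FUEL_LABELS = {'FUEL': {'HARDCOAL': 'Hard coal', 'NATGAS': 'Natural gas'}}


def to_long(wide_df, labels=None):
    """Return one row per (series key, period) with a 'value' column."""
    dims = list(wide_df.columns.names)
    # Transpose so the dimension levels become ordinary columns,
    # then melt the period columns into rows.
    long_df = wide_df.T.reset_index().melt(
        id_vars=dims, var_name=wide_df.index.name, value_name='value')
    # Optionally swap SDMX IDs for full descriptions.
    for dim, mapping in (labels or {}).items():
        long_df[dim] = long_df[dim].replace(mapping)
    return long_df


long_ids = to_long(wide)                 # keeps SDMX IDs
long_labels = to_long(wide, FUEL_LABELS) # uses descriptions for FUEL
print(long_labels)
```

Passing labels=None keeps the SDMX IDs, which mirrors the choice between IDs and full descriptions described in the answer.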