I'm sure this is a quick and easy question for you all! It's about pandas and DataFrames.
Basically, I just need to create a DataFrame containing GDP data using the already-imported pandas library. A dictionary named links holds the CSV files with the data. The actual assignment reads as follows:
"The dictionary links contains the CSV files with all the data. The value of the key GDP is the file that contains the GDP data. The value of the key unemployment contains the unemployment data."
links={'GDP':'https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/PY0101EN/projects/coursera_project/clean_gdp.csv',\'unemployment':'https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/PY0101EN/projects/coursera_project/clean_unemployment.csv'}
Question 1: Create a dataframe that contains the GDP data and display the first five rows of the dataframe. Use the dictionary links and the function pd.read_csv to create a Pandas dataframe containing the GDP data. Hint: links["GDP"] contains the path or name of the file.
links={'GDP':'https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/PY0101EN/projects/coursera_project/clean_gdp.csv'}
links_frame=pd.DataFrame(links, index=[0])
path_csv=links
df=pd.read_csv(path_csv)
The cell In [52] gives me the following error:
What am I doing wrong?! I'm sure it's something simple, but since I'm very new to this, any help is appreciated! Thanks everyone :)
ValueError                                Traceback (most recent call last)
in
      2 links_frame=pd.DataFrame(links, index=[0])
      3 path_csv=links
----> 4 df=pd.read_csv(path_csv)

/opt/conda/envs/Python36/lib/python3.6/site-packages/pandas/io/parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, delim_whitespace, low_memory, memory_map, float_precision)
    700                     skip_blank_lines=skip_blank_lines)
    701 
--> 702         return _read(filepath_or_buffer, kwds)
    703 
    704     parser_f.__name__ = name

/opt/conda/envs/Python36/lib/python3.6/site-packages/pandas/io/parsers.py in _read(filepath_or_buffer, kwds)
    411     compression = _infer_compression(filepath_or_buffer, compression)
    412     filepath_or_buffer, _, compression, should_close = get_filepath_or_buffer(
--> 413         filepath_or_buffer, encoding, compression)
    414     kwds['compression'] = compression
    415 

/opt/conda/envs/Python36/lib/python3.6/site-packages/pandas/io/common.py in get_filepath_or_buffer(filepath_or_buffer, encoding, compression, mode)
    230     if not is_file_like(filepath_or_buffer):
    231         msg = "Invalid file path or buffer object type: {_type}"
--> 232         raise ValueError(msg.format(_type=type(filepath_or_buffer)))
    233 
    234     return filepath_or_buffer, None, compression, False
ValueError: Invalid file path or buffer object type: <class 'dict'>
Answer 0 (score: 0)
You can use urllib in Python:
import urllib.request
df = pd.read_csv(urllib.request.urlopen("https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/PY0101EN/projects/coursera_project/clean_gdp.csv"))
Or, using the dictionary key instead of the full URL:
df = pd.read_csv(urllib.request.urlopen(links['GDP']))
Hope this helps.
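As a side note, pd.read_csv can also read an http(s) URL directly, so the urllib step above is optional. A minimal offline sketch of the same lookup pattern, where io.StringIO stands in for the remote clean_gdp.csv (the two sample rows are illustrative, not the real file):

```python
import io
import pandas as pd

# pd.read_csv expects a single path, URL, or file-like object -- not a dict.
# io.StringIO stands in for the remote clean_gdp.csv so this runs offline.
links = {'GDP': io.StringIO("date,level-current\n1948,274.8\n1949,272.8\n")}

# Index the dict with the key first, then pass that one value to read_csv.
df = pd.read_csv(links['GDP'])
print(df)
```

Passing the whole dict, as in the question, is exactly what raises the ValueError: read_csv only knows how to open one path or buffer at a time.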
Answer 1 (score: 0)
First of all, there is a typo in your links dictionary: there is a \ before 'unemployment'.
As I understand your question, you want to do something like this:
import pandas as pd
links={'GDP':'https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/PY0101EN/projects/coursera_project/clean_gdp.csv','unemployment':'https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/PY0101EN/projects/coursera_project/clean_unemployment.csv'}
df_objects = {}
for key in links:
    df_objects[key] = pd.read_csv(links[key])

for key in df_objects:
    print(df_objects[key])
Output:
date level-current level-chained change-current change-chained
0 1948 274.8 2020.0 -0.7 -0.6
1 1949 272.8 2008.9 10.0 8.7
2 1950 300.2 2184.0 15.7 8.0
3 1951 347.3 2360.0 5.9 4.1
4 1952 367.7 2456.1 6.0 4.7
.. ... ... ... ... ...
64 2012 16155.3 15354.6 3.6 1.8
65 2013 16691.5 15612.2 4.4 2.5
66 2014 17427.6 16013.3 4.0 2.9
67 2015 18120.7 16471.5 2.7 1.6
68 2016 18624.5 16716.2 4.2 2.2
[69 rows x 5 columns]
date unemployment
0 1948 3.750000
1 1949 6.050000
2 1950 5.208333
3 1951 3.283333
4 1952 3.025000
.. ... ...
64 2012 8.075000
65 2013 7.358333
66 2014 6.158333
67 2015 5.275000
68 2016 4.875000
[69 rows x 2 columns]
My solution is quite simple and self-explanatory.
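Since the assignment also asks to display the first five rows, DataFrame.head() does that directly. A sketch using an inline sample with the same columns as clean_gdp.csv (the rows are copied from the output above, and io.StringIO stands in for pd.read_csv(links['GDP']) so the snippet runs without network access):

```python
import io
import pandas as pd

# Inline sample with the same layout as clean_gdp.csv, standing in for
# pd.read_csv(links['GDP']) so the example runs offline.
csv_text = """date,level-current,level-chained,change-current,change-chained
1948,274.8,2020.0,-0.7,-0.6
1949,272.8,2008.9,10.0,8.7
1950,300.2,2184.0,15.7,8.0
1951,347.3,2360.0,5.9,4.1
1952,367.7,2456.1,6.0,4.7
"""

df = pd.read_csv(io.StringIO(csv_text))
print(df.head())  # head() returns the first five rows by default
```

With the real file you would simply write pd.read_csv(links['GDP']).head(), which is the expected answer to Question 1.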