I have a list of values obtained from the IMDb API. The list comes from a dictionary, and the dictionary (defined as dct) looks like this:
{'data': {'akas': ["Dave 'Gruber' Allen",
'Dave Gruber Allen',
"David 'Gruber' Allen",
'David Gruber Allen',
'Dave Gruber',
'The Higgins Boys and Gruber',
'The Naked Trucker'],
'birth info': {'birth place': 'Naperville, Illinois, USA'},
'filmography': [{'actor': [<Movie id:8050858[http] title:_"Ski Master Academy ()" (None)_>,
<Movie id:7116704[http] title:_"It's a Beach Thing" (2018)_>,
<Movie id:5016504[http] title:_"Preacher" (2018)_>,
<Movie id:4847134[http] title:_"Mighty Magiswords (2017-2018)" (None)_>,
<Movie id:6196406[http] title:_Boy Band (2018)_>,
<Movie id:4061080[http] title:_"Love (2016-2018)" (None)_>,
<Movie id:5511512[http] title:_"Trial & Error" (2017)_>,
<Movie id:2758770[http] title:_"Star vs. the Forces of Evil (2016-2017)" (None)_>,
<Movie id:5909786[http] title:_"Tween Fest" (2016)_>,
<Movie id:1289401[http] title:_Ghostbusters: Answer the Call (2016)_>,
...
To get everything under actor, enter
In: dct['data']['filmography'][0]['actor']
Out: [<Movie id:8050858[http] title:_"Ski Master Academy ()" (None)_>,
<Movie id:7116704[http] title:_"It's a Beach Thing" (2018)_>,
<Movie id:5016504[http] title:_"Preacher" (2018)_>,
<Movie id:4847134[http] title:_"Mighty Magiswords (2017-2018)" (None)_>,
<Movie id:6196406[http] title:_Boy Band (2018)_>,
<Movie id:4061080[http] title:_"Love (2016-2018)" (None)_>,
<Movie id:5511512[http] title:_"Trial & Error" (2017)_>,
<Movie id:2758770[http] title:_"Star vs. the Forces of Evil (2016-2017)" (None)_>,
<Movie id:5909786[http] title:_"Tween Fest" (2016)_>,
<Movie id:1289401[http] title:_Ghostbusters: Answer the Call (2016)_>,
<Movie id:2176287[http] title:_"Comedy Bang! Bang!" (2016)_>,
<Movie id:2624370[http] title:_"Granite Flats" (2015)_>,
<Movie id:4574708[http] title:_"W/ Bob and David" (2015)_>,
<Movie id:4548442[http] title:_Thrilling Adventure Hour Live (2015)_>,
The output is just a list. However, if I try to turn it into a Series with pd.Series(dct['data']['filmography'][0]['actor']), I get this error:
KeyError: 0
Why does this happen?
Edit: here is the code I used to get the dictionary:
from imdb import IMDb
import pandas as pd
ia = IMDb()
people = ia.search_person('Dave Allen')
people[0]
dct = ia.get_person_filmography('0020405')
Answer 0 (score: 1)
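A related question led me to think that the elements in the list are the problem. According to the documentation, pd.Series requires its data to be
array-like, dict, or scalar value
So... try this: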
In [79]: new_list = []
In [80]: for item in dct['data']['filmography'][0]['actor']:
...: new_list.append(str(item))
...:
In [81]: df = pd.Series(new_list)
In [82]: df.head()
Out[82]:
0 Ski Master Academy ()
1 It's a Beach Thing
2 Preacher
3 Mighty Magiswords (2017-2018)
4 Boy Band
dtype: object
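I would also welcome a more detailed explanation of why this happens. I noticed that when the elements of your original list are converted to strings in new_list, they actually look different: only the characters between the inner quotes (the title) end up in the string for each item. My guess is that the elements of dct['data']['filmography'][0]['actor'] are somehow not ordinary list-type elements (?).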
KeyError Traceback (most recent call last)
/anaconda3/lib/python3.6/site-packages/IPython/core/formatters.py in __call__(self, obj)
700 type_pprinters=self.type_printers,
701 deferred_pprinters=self.deferred_printers)
--> 702 printer.pretty(obj)
703 printer.flush()
704 return stream.getvalue()
/anaconda3/lib/python3.6/site-packages/IPython/lib/pretty.py in pretty(self, obj)
398 if cls is not object \
399 and callable(cls.__dict__.get('__repr__')):
--> 400 return _repr_pprint(obj, self, cycle)
401
402 return _default_pprint(obj, self, cycle)
/anaconda3/lib/python3.6/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
693 """A pprint that just redirects to the normal repr function."""
694 # Find newlines and replace them with p.break_()
--> 695 output = repr(obj)
696 for idx,output_line in enumerate(output.splitlines()):
697 if idx:
/anaconda3/lib/python3.6/site-packages/pandas/core/base.py in __repr__(self)
78 Yields Bytestring in Py2, Unicode String in py3.
79 """
---> 80 return str(self)
81
82
/anaconda3/lib/python3.6/site-packages/pandas/core/base.py in __str__(self)
57
58 if compat.PY3:
---> 59 return self.__unicode__()
60 return self.__bytes__()
61
/anaconda3/lib/python3.6/site-packages/pandas/core/series.py in __unicode__(self)
1064
1065 self.to_string(buf=buf, name=self.name, dtype=self.dtype,
-> 1066 max_rows=max_rows, length=show_dimensions)
1067 result = buf.getvalue()
1068
/anaconda3/lib/python3.6/site-packages/pandas/core/series.py in to_string(self, buf, na_rep, float_format, header, index, length, dtype, name, max_rows)
1108 float_format=float_format,
1109 max_rows=max_rows)
-> 1110 result = formatter.to_string()
1111
1112 # catch contract violations
/anaconda3/lib/python3.6/site-packages/pandas/io/formats/format.py in to_string(self)
257
258 fmt_index, have_header = self._get_formatted_index()
--> 259 fmt_values = self._get_formatted_values()
260
261 if self.truncate_v:
/anaconda3/lib/python3.6/site-packages/pandas/io/formats/format.py in _get_formatted_values(self)
247 values_to_format = self.tr_series._formatting_values()
248 return format_array(values_to_format, None,
--> 249 float_format=self.float_format, na_rep=self.na_rep)
250
251 def to_string(self):
/anaconda3/lib/python3.6/site-packages/pandas/io/formats/format.py in format_array(values, formatter, float_format, na_rep, digits, space, justify, decimal)
1820 space=space, justify=justify, decimal=decimal)
1821
-> 1822 return fmt_obj.get_result()
1823
1824
/anaconda3/lib/python3.6/site-packages/pandas/io/formats/format.py in get_result(self)
1840
1841 def get_result(self):
-> 1842 fmt_values = self._format_strings()
1843 return _make_fixed_width(fmt_values, self.justify)
1844
/anaconda3/lib/python3.6/site-packages/pandas/io/formats/format.py in _format_strings(self)
1886 fmt_values.append(float_format(v))
1887 else:
-> 1888 fmt_values.append(u' {v}'.format(v=_format(v)))
1889
1890 return fmt_values
/anaconda3/lib/python3.6/site-packages/pandas/io/formats/format.py in _format(x)
1868 else:
1869 # object dtype
-> 1870 return u'{x}'.format(x=formatter(x))
1871
1872 vals = self.values
/anaconda3/lib/python3.6/site-packages/pandas/io/formats/format.py in <lambda>(x)
1855 formatter = (
1856 self.formatter if self.formatter is not None else
-> 1857 (lambda x: pprint_thing(x, escape_chars=('\t', '\r', '\n'))))
1858
1859 def _format(x):
/anaconda3/lib/python3.6/site-packages/pandas/io/formats/printing.py in pprint_thing(thing, _nest_lvl, escape_chars, default_escapes, quote_strings, max_seq_items)
220 result = _pprint_seq(thing, _nest_lvl, escape_chars=escape_chars,
221 quote_strings=quote_strings,
--> 222 max_seq_items=max_seq_items)
223 elif isinstance(thing, compat.string_types) and quote_strings:
224 if compat.PY3:
/anaconda3/lib/python3.6/site-packages/pandas/io/formats/printing.py in _pprint_seq(seq, _nest_lvl, max_seq_items, **kwds)
116 for i in range(min(nitems, len(seq))): # handle sets, no slicing
117 r.append(pprint_thing(
--> 118 next(s), _nest_lvl + 1, max_seq_items=max_seq_items, **kwds))
119 body = ", ".join(r)
120
/anaconda3/lib/python3.6/site-packages/imdb/utils.py in __getitem__(self, key)
1471 # Handle key aliases.
1472 key = self.keys_alias.get(key, key)
-> 1473 rawData = self.data[key]
1474 if key in self.keys_tomodify and \
1475 self.modFunct not in (None, modNull):
KeyError: 0
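
A note on what the traceback above seems to show. The error is not raised while building the Series but while displaying it: the chain starts in IPython's formatter, goes through the Series __repr__, and ends with pandas' pprint_thing treating each element as a sequence, calling next() on it, and landing in __getitem__ in imdb/utils.py with key 0. That is consistent with Python's legacy iteration protocol: an object that defines __getitem__ but no __iter__ can still be passed to iter(), and next() then tries obj[0], obj[1], and so on. For a dict-like container, obj[0] is a missing key. The sketch below reproduces that mechanism with a hypothetical stand-in class (FakeMovie is an assumption for illustration, not the real IMDbPY class), and shows why converting to str first avoids it.

```python
class FakeMovie:
    """Hypothetical stand-in for a dict-like container such as an IMDbPY Movie.

    It defines __getitem__ and __len__ but no __iter__, which is the
    combination the traceback points at. This is NOT the real IMDbPY class.
    """

    def __init__(self, title):
        self.data = {"title": title}

    def __getitem__(self, key):
        # Dict-style lookup: only string keys such as "title" exist.
        return self.data[key]

    def __len__(self):
        return len(self.data)

    def __str__(self):
        return self.data["title"]


movie = FakeMovie("Preacher")

# iter() succeeds because __getitem__ exists (legacy sequence protocol),
# so code that probes with iter() and len() can mistake this for a sequence.
iterator = iter(movie)

try:
    next(iterator)  # internally evaluates movie[0]
    error = None
except KeyError as exc:
    error = exc  # the missing key is the integer 0

# Converting each element to a plain string first leaves nothing for the
# formatter to iterate over.
movies = [movie, FakeMovie("Boy Band")]
titles = [str(m) for m in movies]
print(titles)
```

With real data, the same idea is what the answer's loop does: pd.Series([str(m) for m in dct['data']['filmography'][0]['actor']]) stores plain strings, so the display step has no container-like objects to walk into.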