Question

My DataFrame has a column that holds lists of dicts. When I run

data.stations.apply(lambda x: x)[5]

the output is:

[{'id': 245855,
  'outlets': [{'connector': 13, 'id': 514162, 'power': 0},
   {'connector': 3, 'id': 514161, 'power': 0},
   {'connector': 7, 'id': 514160, 'power': 0}]},
 {'id': 245856,
  'outlets': [{'connector': 13, 'id': 514165, 'power': 0},
   {'connector': 3, 'id': 514164, 'power': 0},
   {'connector': 7, 'id': 514163, 'power': 0}]},
 {'id': 245857,
  'outlets': [{'connector': 13, 'id': 514168, 'power': 0},
   {'connector': 3, 'id': 514167, 'power': 0},
   {'connector': 7, 'id': 514166, 'power': 0}]}]

So it looks like a list of 3 dicts.

When I run

data.stations.apply(lambda x: x[0] )[5]

it does what it should:

{'id': 245855,
 'outlets': [{'connector': 13, 'id': 514162, 'power': 0},
  {'connector': 3, 'id': 514161, 'power': 0},
  {'connector': 7, 'id': 514160, 'power': 0}]}

However, when I select the second or third element, it does not work:

data.stations.apply(lambda x: x[1])[5]

This raises an error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-118-1210ba659690> in <module>()
----> 1 data.stations.apply(lambda x: x[1])[5]

~\AppData\Local\Continuum\Anaconda3\envs\geo2\lib\site-packages\pandas\core\series.py in apply(self, func, convert_dtype, args, **kwds)
   2549             else:
   2550                 values = self.asobject
-> 2551                 mapped = lib.map_infer(values, f, convert=convert_dtype)
   2552 
   2553         if len(mapped) and isinstance(mapped[0], Series):

pandas/_libs/src/inference.pyx in pandas._libs.lib.map_infer()

<ipython-input-118-1210ba659690> in <lambda>(x)
----> 1 data.stations.apply(lambda x: x[1])[5]

IndexError: list index out of range

Why? It should just give me the second element.

Answer 1

The reason is probably simple: the lists in different rows may not all have the same length. Consider an example:

import pandas as pd

# Each row holds a list of dicts; the lists differ in length (3, 2, 1, 3).
data = pd.DataFrame({'stations': [[{'1':2,'3':4},{'1':2,'3':4},{'1':2,'3':4}],
                                  [{'1':2,'3':4},{'1':2,'3':4}],
                                  [{'1':2,'3':4}],
                                  [{'1':2,'3':4},{'1':2,'3':4},{'1':2,'3':4}]]
                     })

                                         stations
0  [{'1': 2, '3': 4}, {'1': 2, '3': 4}, {'1': 2, ...
1               [{'1': 2, '3': 4}, {'1': 2, '3': 4}]
2                                 [{'1': 2, '3': 4}]
3  [{'1': 2, '3': 4}, {'1': 2, '3': 4}, {'1': 2, ...

If you run:

data['stations'].apply(lambda x: x[0])[3]

you get:

{'1': 2, '3': 4}

But if you run:

data['stations'].apply(lambda x: x[1])[3]

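you get IndexError: list index out of range, because the list in the third row (index 2) has only one element. apply calls the lambda on every row before [3] selects from the result, so the short row raises the error even though the list at index 3 has three elements. I hope that clears up your doubt.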
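
One way to find the rows behind the error, and to avoid it, is to check each list's length before indexing. The sketch below reuses the example frame from this answer. The names lengths and second are illustrative, and returning None for rows that are too short is an assumed choice, not something the question asks for.

```python
import pandas as pd

# Same example frame as above: list lengths are 3, 2, 1, 3.
data = pd.DataFrame({'stations': [
    [{'1': 2, '3': 4}, {'1': 2, '3': 4}, {'1': 2, '3': 4}],
    [{'1': 2, '3': 4}, {'1': 2, '3': 4}],
    [{'1': 2, '3': 4}],
    [{'1': 2, '3': 4}, {'1': 2, '3': 4}, {'1': 2, '3': 4}],
]})

# Count the entries in each row's list to spot the short rows.
lengths = data['stations'].apply(len)

# Guarded access: take the second element only where it exists,
# otherwise return None (an assumed placeholder for short rows).
second = data['stations'].apply(lambda x: x[1] if len(x) > 1 else None)
```

Here lengths shows which rows are too short for x[1], and second[3] returns the second dict of that row without apply failing on the shorter row at index 2.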
