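Say I have a pandas DataFrame in which one column holds lists of keys. How can I create another column holding the values that correspond to those keys?
Here is a minimal example with the DataFrame and the dictionary declared: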
import pandas as pd
ex = pd.DataFrame({'a': [1, 2, 3], 'b': [[1, 2, 3], [3, 2, 1], [2, 1, 3]]})
ex.head()
   a          b
0  1  [1, 2, 3]
1  2  [3, 2, 1]
2  3  [2, 1, 3]
din = {1: 'A', 2:'B', 3:'C'}
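How can I create another column that uses the dictionary to map every value in each list of column b?
For example, I would like to end up with something like this: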
   a          b          c
0  1  [1, 2, 3]  [A, B, C]
1  2  [3, 2, 1]  [C, B, A]
2  3  [2, 1, 3]  [B, A, C]
Normally, when the column does not contain lists, I would do this with the map function, as follows:
ex['c'] = ex['b'].map(din)
However, because column b holds lists rather than the keys themselves, I get this error:
TypeError Traceback (most recent call last)
<ipython-input-44-d5b753372a81> in <module>()
----> 1 ex['c'] = ex['b'].map(din)
/usr/local/lib/python3.6/dist-packages/pandas/core/series.py in map(self, arg, na_action)
2348 if isinstance(arg, Series):
2349 # arg is a Series
-> 2350 indexer = arg.index.get_indexer(values)
2351 new_values = algorithms.take_1d(arg._values, indexer)
2352 else:
/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in get_indexer(self, target, method, limit, tolerance)
2682 target = target.astype(object)
2683 return this.get_indexer(target, method=method, limit=limit,
-> 2684 tolerance=tolerance)
2685
2686 if not self.is_unique:
/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in get_indexer(self, target, method, limit, tolerance)
2700 'backfill or nearest reindexing')
2701
-> 2702 indexer = self._engine.get_indexer(target._values)
2703
2704 return _ensure_platform_int(indexer)
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_indexer()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.lookup()
TypeError: unhashable type: 'list'
Answer 0 (score: 1)
Because each value in the column is a list, you cannot use map directly.
You need to map each value inside the list, like this:
ex['c']=ex['b'].apply(lambda x: [din.get(v) for v in x])
   a          b          c
0  1  [1, 2, 3]  [A, B, C]
1  2  [3, 2, 1]  [C, B, A]
2  3  [2, 1, 3]  [B, A, C]
Or, as Zero suggested:
ex['c'] = ex['b'].apply(lambda L: list(map(din.get, L)))
Or, as jezrael suggested:
ex['c'] = [list(map(din.get, x)) for x in ex['b']]
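One detail of the variants above: din.get returns None for any key that is missing from the dictionary, so unmapped values are silently replaced by None. The following is a minimal sketch of that behaviour in plain Python. The helper name map_keys and its keep_unmapped flag are hypothetical and only for illustration; they are not part of pandas.

```python
din = {1: 'A', 2: 'B', 3: 'C'}

def map_keys(keys, mapping, keep_unmapped=False):
    """Look up each key in `mapping` (hypothetical helper, not a pandas API)."""
    if keep_unmapped:
        # dict.get(k, k) falls back to the key itself when it is missing
        return [mapping.get(k, k) for k in keys]
    # dict.get(k) falls back to None when the key is missing
    return [mapping.get(k) for k in keys]

# 4 is deliberately absent from din
with_none = map_keys([1, 4, 3], din)
with_original = map_keys([1, 4, 3], din, keep_unmapped=True)

print(with_none)      # ['A', None, 'C']
print(with_original)  # ['A', 4, 'C']
```

It could be applied to the column in the same way as the lambdas above, for example ex['c'] = ex['b'].apply(lambda x: map_keys(x, din)).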
Answer 1 (score: 1)
A more pandas-style way:
ex['c']=ex['b'].apply(lambda x: pd.Series(x).map(din).tolist())
print(ex)
Output:
   a          b          c
0  1  [1, 2, 3]  [A, B, C]
1  2  [3, 2, 1]  [C, B, A]
2  3  [2, 1, 3]  [B, A, C]
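Your code does not work because map looks up each element of the column as a dictionary key, and here each element is a whole list rather than a single key. You can use apply to map the values inside each list instead.
Or: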
ex['c']=list(map(lambda i: list(map(din.get,i)),ex['b']))
Or, as @jezrael mentioned:
ex['c']=list(map(lambda i: [din.get(a) for a in i],ex['b']))
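The TypeError from the question can be reproduced without pandas, which may make the cause easier to see. This is a small sketch using the same din dictionary; the variable names row, error_message and mapped are only illustrative. Using a list as a dictionary key fails because lists are not hashable, while looking up each element of the list works.

```python
din = {1: 'A', 2: 'B', 3: 'C'}
row = [3, 2, 1]  # one cell of column b

# Looking up the whole list as a key fails: lists are unhashable
try:
    din[row]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Looking up each element of the list works
mapped = [din.get(v) for v in row]

print(error_message)
print(mapped)  # ['C', 'B', 'A']
```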