I am trying to flatten a list of lists stored in a DataFrame column:
var var2
0 9122532.0 [[458182615.0], [79834910.0]]
1 79834910.0 [[458182615.0], [9122532.0]]
2 458182615.0 [[79834910.0], [9122532.0]]
I want:
var var2
0 9122532.0 [458182615.0, 79834910.0]
1 79834910.0 [458182615.0, 9122532.0]
2 458182615.0 [79834910.0, 9122532.0]
Applying
from itertools import chain
sample8['var2'] = sample8['var2'].apply(chain.from_iterable).apply(list)
gives me:
var1 var2
0 9122532.0 [[, 4, 5, 8, 1, 8, 2, 6, 1, 5, ., 0, ], [, 7, ...
1 79834910.0 [[, 4, 5, 8, 1, 8, 2, 6, 1, 5, ., 0, ], [, 9, ...
2 458182615.0 [[, 7, 9, 8, 3, 4, 9, 1, 0, ., 0, ], [, 9, 1, ...
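One possible reading of this output: each cell seems to hold strings that merely look like lists (for example '[458182615.0]'), so chain.from_iterable walks over the characters of each string. The sketch below assumes exactly that; the frame `sample` and the helper `flatten_cell` are hypothetical names used only for illustration, and the real data may be stored differently.

```python
import ast
from itertools import chain

import pandas as pd

# Hypothetical reconstruction: every inner element is a string such as
# "[458182615.0]" rather than an actual Python list.
sample = pd.DataFrame({
    "var": [9122532.0, 79834910.0],
    "var2": [["[458182615.0]", "[79834910.0]"],
             ["[458182615.0]", "[9122532.0]"]],
})

def flatten_cell(cell):
    # Turn string elements into real lists, leave real lists untouched,
    # then flatten exactly one level of nesting.
    parsed = [ast.literal_eval(x) if isinstance(x, str) else x for x in cell]
    return list(chain.from_iterable(parsed))

sample["var2"] = sample["var2"].apply(flatten_cell)
print(sample)
```

If the cells already contain real nested lists, the parsing step does nothing and only the flattening applies.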
Answer 0 (score: 4)
Data:
In [162]: df
Out[162]:
var var2
0 9122532.0 [[458182615.0], [79834910.0]]
1 79834910.0 [[458182615.0], [9122532.0]]
2 458182615.0 [[79834910.0], [9122532.0]]
Solution: use np.ravel():
In [163]: df['var2'] = df['var2'].apply(np.ravel)
In [164]: df
Out[164]:
var var2
0 9122532.0 [458182615.0, 79834910.0]
1 79834910.0 [458182615.0, 9122532.0]
2 458182615.0 [79834910.0, 9122532.0]
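Note that np.ravel returns a NumPy array for each cell, even though the printed frame looks like it holds lists. If plain Python lists are needed, a minimal sketch (assuming the cells contain real nested lists of floats) is to call .tolist() on each result:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "var": [9122532.0, 79834910.0, 458182615.0],
    "var2": [[[458182615.0], [79834910.0]],
             [[458182615.0], [9122532.0]],
             [[79834910.0], [9122532.0]]],
})

# np.ravel flattens each nested cell into a 1-D array;
# .tolist() converts that array back into a plain list of floats.
df["var2"] = df["var2"].apply(lambda cell: np.ravel(cell).tolist())
print(df)
```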
Answer 1 (score: 2)
Consider the dataframe df
import numpy as np
import pandas as pd

df = pd.DataFrame(dict(
    var=[9122532.0, 79834910.0, 458182615.0],
    var2=[[[458182615.0], [79834910.0]],
          [[458182615.0], [9122532.0]],
          [[79834910.0], [9122532.0]]]
))
print(df)
var var2
0 9122532.0 [[458182615.0], [79834910.0]]
1 79834910.0 [[458182615.0], [9122532.0]]
2 458182615.0 [[79834910.0], [9122532.0]]
np.concatenate
You can apply np.concatenate:
df.assign(var2=df.var2.apply(np.concatenate))
var var2
0 9122532.0 [458182615.0, 79834910.0]
1 79834910.0 [458182615.0, 9122532.0]
2 458182615.0 [79834910.0, 9122532.0]
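Without apply
This requires every cell to have the same 2 x 1 shape. The reshape can be adapted to another shape, but the approach still requires all cells to share one consistent shape.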
df.assign(var2=np.array(df.var2.tolist()).reshape(-1, 2).tolist())
var var2
0 9122532.0 [458182615.0, 79834910.0]
1 79834910.0 [458182615.0, 9122532.0]
2 458182615.0 [79834910.0, 9122532.0]
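When the cells do not share a shape, the reshape approach no longer applies. A minimal alternative sketch, using the standard library's itertools.chain instead of NumPy; the frame `ragged` below is made-up data for illustration only:

```python
from itertools import chain

import pandas as pd

# Made-up frame whose cells have different lengths and nesting widths.
ragged = pd.DataFrame({
    "var": [1.0, 2.0],
    "var2": [[[10.0], [20.0, 30.0]],
             [[40.0]]],
})

# Flatten one level per cell; no common shape is needed.
flat = ragged.assign(
    var2=[list(chain.from_iterable(cell)) for cell in ragged["var2"]]
)
print(flat)
```

This keeps each cell as a plain Python list and avoids building an intermediate array.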
Timing
Small data
Large data
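The timing results themselves are not reproduced here. A rough sketch for measuring the three approaches yourself follows; the frame sizes, repetition count, and function names are arbitrary choices, and results will vary by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

small = pd.DataFrame(dict(
    var=[9122532.0, 79834910.0, 458182615.0],
    var2=[[[458182615.0], [79834910.0]],
          [[458182615.0], [9122532.0]],
          [[79834910.0], [9122532.0]]]
))
# "Large" is an arbitrary choice here: the small frame repeated 1000 times.
large = pd.concat([small] * 1000, ignore_index=True)

def with_ravel(d):
    return d.assign(var2=d.var2.apply(np.ravel))

def with_concatenate(d):
    return d.assign(var2=d.var2.apply(np.concatenate))

def with_reshape(d):
    # Requires every cell to have the same 2 x 1 shape.
    return d.assign(var2=np.array(d.var2.tolist()).reshape(-1, 2).tolist())

timings = {}
for label, frame in [("small", small), ("large", large)]:
    for func in (with_ravel, with_concatenate, with_reshape):
        seconds = timeit.timeit(lambda: func(frame), number=10)
        timings[(label, func.__name__)] = seconds
        print(f"{label:5s} {func.__name__:17s} {seconds:.4f} s for 10 runs")
```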