I have a two-level MultiIndex (I, M) and want to update all rows for a single i \in I at once.
This is my dataframe:
>>> result.head(n=10)
Out[9]:
          FINLWT21
i INCAGG
0 1            NaN
  7            NaN
  9            NaN
  5            NaN
  3            NaN
1 1            NaN
  7            NaN
  9            NaN
  5            NaN
  3            NaN
This is what I want to fill in, the output of sample.groupby(field).sum():
           FINLWT21
INCAGG
1       8800809.719
3       9951002.611
5       9747721.721
7       7683066.990
9      11091861.692
I thought the right command would be result.loc[i] = sample.groupby(field).sum(), but result does not end up filled in the way I want.
How can I update all the "inner indices" at the same time?
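Answer 0 (score: 1)
You want to use pd.IndexSlice. It returns an object that can be used for slicing with loc.
result.sort_index(inplace=True)
slc = pd.IndexSlice[i, :]
result.loc[slc, :] = sample.groupby(field).sum()
result.sort_index(inplace=True) -> slicing with pd.IndexSlice requires the index to be sorted. Note that a bare result.sort_index() returns a sorted copy and leaves result unchanged.
slc = pd.IndexSlice[i, :] -> syntax for building a generic slicer that selects the group labelled i in the first level of a pd.MultiIndex with 2 levels.
result.loc[slc, :] = ... -> assigns through the slice.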
import pandas as pd
import numpy as np
# Empty frame whose rows are every (outer, inner) label combination
result = pd.DataFrame([], columns=['FINLWT21'],
                      index=pd.MultiIndex.from_product([[0, 1], [1, 7, 9, 5, 3]]))
# Sort in place so the MultiIndex can be sliced
result.sort_index(inplace=True)
slc = pd.IndexSlice[0, :]
result.loc[slc, :] = [1, 2, 3, 4, 5]
print(result)
    FINLWT21
0 1        1
  3        2
  5        3
  7        4
  9        5
1 1      NaN
  3      NaN
  5      NaN
  7      NaN
  9      NaN
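The demo above assigns a plain list. When the right-hand side is a groupby result instead, its single-level index may not line up with the two-level index of result, so one option is to reorder the totals to match the inner labels and assign the raw values. This is only a minimal sketch: the sample data is made up, the column name FINLWT21 and the level names i and INCAGG are taken from the question, and result is assumed to be a float frame filled with NaN.

```python
import numpy as np
import pandas as pd

field = "INCAGG"

# Hypothetical raw data; the real `sample` from the question is not shown.
sample = pd.DataFrame({
    field: [1, 3, 5, 7, 9, 1, 3],
    "FINLWT21": [10.0, 20.0, 30.0, 40.0, 50.0, 1.0, 2.0],
})

# Target frame: every (i, INCAGG) combination, initially NaN.
result = pd.DataFrame(
    np.nan,
    columns=["FINLWT21"],
    index=pd.MultiIndex.from_product([[0, 1], [1, 7, 9, 5, 3]],
                                     names=["i", field]),
)
result = result.sort_index()  # sort before slicing the MultiIndex

i = 0
totals = sample.groupby(field)["FINLWT21"].sum()

# Slice out the rows of outer group i and read their inner labels.
slc = pd.IndexSlice[i, :]
inner = result.loc[slc, :].index.get_level_values(field)

# Reorder the totals to match those labels, then assign plain values
# so that no index alignment takes place.
result.loc[slc, "FINLWT21"] = totals.reindex(inner).to_numpy()
print(result)
```

Because a NumPy array carries no index, the order produced by reindex alone decides which row receives which total.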
Answer 1 (score: 0)
This is the kind of function I may have been looking for:
from warnings import warn

import pandas as pd


def _assign_multi_index(dest, k, v, inplace=True, bool_nan=False):
    """
    assigns v to dest[k] inplace, doing a "sensible" multi-index alignment, raising
    a ValueError if no alignment is achieved.
    I'm not sure if there's a better way to do this, or a reason not to do it
    the way it's currently written.
    """
    if not inplace:
        raise NotImplementedError()
    if k in dest:
        warn("key '{}' already exists, continue with caution!".format(k))
    v_names = v.index.names
    dest_names = dest.index.names
    if all(n in dest_names for n in v_names):
        if len(v_names) < len(dest_names):
            # if need be, drop some index levels temporarily in dest
            dropped_names = [n for n in dest_names if n not in v_names]
            dest.reset_index(dropped_names, inplace=True)
        v.index = v.index.reorder_levels([n for n in dest_names if n in v_names])  # just to be safe
    else:
        raise ValueError("index levels do not match dest.")
    dest[k] = v
    # restore the original index levels if need be
    if dest.index.names != dest_names:
        dest.reset_index(inplace=True)
        dest.set_index(dest_names, inplace=True)
    # NaN never compares equal to anything, so `bool_nan != np.nan` was always
    # True; pd.isnull expresses the intended "a fill value was given" check
    if not pd.isnull(bool_nan) and v.dtype.name == 'bool' and dest[k].dtype.name != 'bool':
        # this happens when nans had to be inserted, let's convert nans
        dest_k = dest[k].copy()
        dest_k[pd.isnull(dest_k)] = bool_nan
        dest[k] = dest_k.astype(bool)
Answer 2 (score: 0)
It turns out the best approach is to add the outer index level to the data being assigned. The following works as expected:
data = sample.groupby(field).sum()
data['index'] = i
result.loc[i] = data.reset_index().set_index(['index', field])
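The same idea can be tried end to end with the sketch below. It builds the two-level index on the aggregated data as in the answer, but writes the values back with DataFrame.update, which is a swapped-in alternative to the direct result.loc[i] assignment and aligns on the full index. The sample values are hypothetical, and the outer level is named i here rather than index.

```python
import numpy as np
import pandas as pd

field = "INCAGG"

# Hypothetical raw data standing in for the question's `sample`.
sample = pd.DataFrame({
    field: [1, 3, 5, 7, 9, 1],
    "FINLWT21": [10.0, 20.0, 30.0, 40.0, 50.0, 5.0],
})

# Target frame with the same (i, INCAGG) layout as in the question.
result = pd.DataFrame(
    np.nan,
    columns=["FINLWT21"],
    index=pd.MultiIndex.from_product([[0, 1], [1, 7, 9, 5, 3]],
                                     names=["i", field]),
)

i = 1
data = sample.groupby(field).sum()

# Give the aggregated data the same two index levels as `result`.
data["i"] = i
data = data.reset_index().set_index(["i", field])

# update() aligns on the full (i, INCAGG) index and overwrites in place;
# rows of `result` that are missing from `data` keep their NaN.
result.update(data)
print(result)
```

Since update matches rows by their full index tuple, result does not need to be sorted first.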