我有df:
arrays = [np.array(['bar', 'bar', 'bar', 'bar', 'bar', 'bar', 'baz', 'baz', 'baz', 'baz', 'baz', 'baz', 'foo', 'foo', 'foo', 'foo', 'foo', 'foo']),
np.array(['one','two'] * 9),
np.array([1,2,3] * 6)]
df = pd.DataFrame(np.random.randn(18,2), index=arrays)
col0 col1
bar one 1 0.872359 -1.115871
two 2 -0.937908 -0.528563
one 3 -0.118874 0.286595
two 1 -0.507698 1.364643
one 2 1.507611 1.379498
two 3 -1.398019 -1.603056
baz one 1 1.498263 0.412380
two 2 -0.930022 -1.483657
one 3 -0.438157 1.465089
two 1 0.161887 1.346587
one 2 0.167086 1.246322
two 3 0.276344 -1.206415
foo one 1 -0.045389 -0.759927
two 2 0.087999 -0.435753
one 3 -0.232054 -2.221466
two 1 -1.299483 1.697065
one 2 0.612211 -1.076738
two 3 -1.482573 0.907826
现在我要创建“新”列:
for 'bar'
if index.level(2) > 1
"NEW" = col1
else
"NEW" = col2
对于'baz'与> 2相同
对于'foo'与> 3相同
没有Py循环怎么做?
答案 0 :(得分:2)
您可以按级别使用get_level_values
选择索引值,然后使用新列numpy.where
:
#if possible use dictionary
d = {'bar':1, 'baz':2, 'foo':3}
m = df.index.get_level_values(2) > df.rename(d).index.get_level_values(0)
df['NEW'] = np.where(m, df.col1, df.col2)
有关更一般的解决方案,请使用Series.rank
:
a = df.index.get_level_values(2)
b = df.index.get_level_values(0).to_series().rank(method='dense')
df['NEW'] = np.where(a > b, df.col1, df.col2)
详情:
print (b)
bar 1.0
bar 1.0
bar 1.0
bar 1.0
bar 1.0
bar 1.0
baz 2.0
baz 2.0
baz 2.0
baz 2.0
baz 2.0
baz 2.0
foo 3.0
foo 3.0
foo 3.0
foo 3.0
foo 3.0
foo 3.0
dtype: float64
答案 1 :(得分:2)
我认为你可以使用df.factorize
来避免在df.level(0)上循环:
In [40]: thresh = pd.factorize(df.index.get_level_values(0).values)[0] + 1
In [41]: mask = df.index.get_level_values(2) > thresh
In [42]: df['NEW'] = np.where(mask, df.col1, df.col2)
In [43]: df
Out[43]:
col1 col2 NEW
bar one 1 0.247222 -0.270104 -0.270104
two 2 0.429196 -1.385352 0.429196
one 3 0.782293 -1.565623 0.782293
two 1 0.392214 1.023960 1.023960
one 2 -1.628410 -0.484275 -1.628410
two 3 0.256757 0.529373 0.256757
baz one 1 -0.568608 -0.776577 -0.776577
two 2 2.142408 -0.815413 -0.815413
one 3 0.860080 0.501965 0.860080
two 1 -0.267029 -0.025360 -0.025360
one 2 0.187145 -0.063436 -0.063436
two 3 0.351296 -2.050649 0.351296
foo one 1 0.704941 0.176698 0.176698
two 2 -0.380353 1.027745 1.027745
one 3 -1.337364 -0.568359 -0.568359
two 1 -0.588601 -0.800426 -0.800426
one 2 1.513358 -0.616237 -0.616237
two 3 0.244831 1.027109 1.027109