This code shows the DataFrame I want to create:
import pandas as pd
df = pd.DataFrame(index=pd.date_range(start='4/1/2012', periods=10))
df['foo'] = 7
df['what_i_want'] = [0,0,0,0,1,2,3,0,0,0]
The result is:
            foo  what_i_want
2012-04-01    7            0
2012-04-02    7            0
2012-04-03    7            0
2012-04-04    7            0
2012-04-05    7            1
2012-04-06    7            2
2012-04-07    7            3
2012-04-08    7            0
2012-04-09    7            0
2012-04-10    7            0
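I'm trying to come up with a way to create these 1, 2, ..., n sequences on arbitrary slices of a Series, i.e. something like df['2012-04-05':'2012-04-07'] = magic_function(), but I'm not sure how to do this without using a loop.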
Answer 0 (score: 4)
IIUC, you can slice with loc and assign a range.
df['what_i_want'] = 0
df.loc['2012-04-05':'2012-04-07', 'what_i_want'] = range(1, 4)
df
            foo  what_i_want
2012-04-01    7            0
2012-04-02    7            0
2012-04-03    7            0
2012-04-04    7            0
2012-04-05    7            1
2012-04-06    7            2
2012-04-07    7            3
2012-04-08    7            0
2012-04-09    7            0
2012-04-10    7            0
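The hard-coded range(1, 4) works here because the slice happens to cover exactly three rows. For an arbitrary slice, the length can be read from the slice itself. The following is a minimal sketch of that idea; the helper name number_slice and its parameters are hypothetical, not part of pandas.

```python
import numpy as np
import pandas as pd


def number_slice(df, start, stop, column):
    """Hypothetical helper: write 1..n into `column` for the rows in start..stop."""
    # .loc label slicing includes both endpoints
    rows = df.loc[start:stop].index
    # Build a counter exactly as long as the slice, so the lengths always match
    df.loc[rows, column] = np.arange(1, len(rows) + 1)
    return df


df = pd.DataFrame(index=pd.date_range(start='4/1/2012', periods=10))
df['foo'] = 7
df['what_i_want'] = 0

# The same helper can be applied to several slices of different lengths
number_slice(df, '2012-04-05', '2012-04-07', 'what_i_want')
number_slice(df, '2012-04-09', '2012-04-10', 'what_i_want')
print(df)
```

Initialising the column to 0 first keeps it an integer column; rows outside the numbered slices are left untouched.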
Answer 1 (score: 2)
First extract the index of the slice, then build a new Series from a range of that length and reindex it to the full index:
idx = df.loc['2012-04-05':'2012-04-07'].index
df['new'] = pd.Series(range(1, len(idx)+1), index=idx).reindex(df.index, fill_value=0)
Or assign a range directly, but then it is necessary to replace the NaN values and convert to int:
l = len(df.loc['2012-04-05':'2012-04-07'].index)
df.loc['2012-04-05':'2012-04-07', 'new'] = range(1, l+1)
df['new'] = df['new'].fillna(0).astype(int)
print(df)
            foo  new
2012-04-01    7    0
2012-04-02    7    0
2012-04-03    7    0
2012-04-04    7    0
2012-04-05    7    1
2012-04-06    7    2
2012-04-07    7    3
2012-04-08    7    0
2012-04-09    7    0
2012-04-10    7    0
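Answer 2 (score: 0)
You can do it like this:
# Assign through a single .loc call; chained indexing may write to a temporary copy
df.loc['2012-04-08':'2012-04-10', 'what_i_want'] = \
    df.loc['2012-04-08':'2012-04-10'].apply(lambda x: 1, axis=1).cumsum()
This maps every selected row to 1 and then takes the cumulative sum, which yields 1, 2, ..., n over the slice.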
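The cumulative-sum idea can also be expressed without apply, using a boolean mask over the whole index. This is only a sketch of an alternative and not what the answer above does; the variable name mask is an arbitrary choice.

```python
import pandas as pd

df = pd.DataFrame(index=pd.date_range(start='4/1/2012', periods=10))
df['foo'] = 7

# Boolean mask that is True only for rows inside the slice
# (label slicing with .loc includes both endpoints)
mask = pd.Series(False, index=df.index)
mask.loc['2012-04-05':'2012-04-07'] = True

# The running count reaches 1, 2, 3 inside the slice;
# where() then resets every row outside the slice to 0
df['what_i_want'] = mask.astype(int).cumsum().where(mask, 0)
print(df)
```

Because the mask spans the full index, the column is created in one step with no NaN values to fill afterwards.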