I have a dataframe that looks like this:
df.ix[1:3]
Val endDay startDay
1 2.20 1996-04-01 1996-03-31
2 5.15 1997-04-05 1997-04-01
Each row starts at 9 am on startDay and runs until 8 am on endDay.
I am looking for the following output:
startDay Hour Val
1996-03-31 9 2.20
1996-03-31 10 2.20
........
1996-03-31 24 2.20
1996-04-01 1 2.20
........
1996-04-01 7 2.20
1996-04-01 8 2.20
1997-04-01 9 5.15
1997-04-01 10 5.15
........
1997-04-01 24 5.15
1997-04-05 1 5.15
........
1997-04-05 7 5.15
1997-04-05 8 5.15
I only used ..... to stand for the consecutive hours from 11 to 23 and from 2 to 6. I am not sure how to do this kind of stacking in Python.
Answer 0 (score: 2)
After creating the list of datetimes, just use unnesting:
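import pandas as pd

# One hourly DatetimeIndex per row: 09:00 on startDay through 08:00 on endDay
# (startDay and endDay are assumed to be strings here)
df['day']=[pd.date_range(x+' 09:00:00',y+' 08:00:00',freq='H') for x , y in zip(df.startDay,df.endDay)]
# unnesting is a user-defined helper (not shown here) that expands each list-like entry into its own row
yourdf=unnesting(df,['day']).drop_duplicates('day')
yourdf
Out[909]:
day Val endDay startDay
1 1996-03-31 09:00:00 2.20 1996-04-01 1996-03-31
1 1996-03-31 10:00:00 2.20 1996-04-01 1996-03-31
1 1996-03-31 11:00:00 2.20 1996-04-01 1996-03-31
1 1996-03-31 12:00:00 2.20 1996-04-01 1996-03-31
...
Note that I did not split day into separate hour and date columns here; that can be done with
yourdf.day.dt.hour; yourdf.day.dt.date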
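The unnesting helper is not defined in this answer. As a rough, self-contained sketch of the same idea, the built-in DataFrame.explode (pandas 0.25 or later) can stand in for it. The names long_df, Hour and Date are made up for illustration, and the rule that midnight is reported as hour 24 of the previous date is an assumption read off the desired output in the question, not something the answer states.

```python
import pandas as pd

# Sample data copied from the question; the dates are kept as strings.
df = pd.DataFrame(
    {
        "Val": [2.20, 5.15],
        "endDay": ["1996-04-01", "1997-04-05"],
        "startDay": ["1996-03-31", "1997-04-01"],
    },
    index=[1, 2],
)

# One list of hourly timestamps per row: 09:00 on startDay through 08:00 on endDay.
df["day"] = pd.Series(
    [
        list(pd.date_range(f"{start} 09:00:00", f"{end} 08:00:00",
                           freq=pd.Timedelta(hours=1)))
        for start, end in zip(df["startDay"], df["endDay"])
    ],
    index=df.index,
)

# explode plays the role of the unnesting helper: one row per timestamp.
long_df = df.explode("day").reset_index(drop=True)
long_df["day"] = pd.to_datetime(long_df["day"])

# Assumed convention: midnight counts as hour 24 of the previous date.
hour = long_df["day"].dt.hour
is_midnight = hour == 0
long_df["Hour"] = hour.where(~is_midnight, 24)
shift = pd.to_timedelta(is_midnight.astype(int), unit="D")
long_df["Date"] = (long_df["day"] - shift).dt.date

print(long_df[["Date", "Hour", "Val"]].head())
```

If hours 0 to 23 are acceptable, the last block can be dropped and yourdf.day.dt.hour and yourdf.day.dt.date used directly, as in the answer.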