I have the following DataFrame:
dt binary
2016-01-01 00:00:00 False
2016-01-01 00:00:01 False
2016-01-01 00:00:02 False
2016-01-01 00:00:03 False
2016-01-01 00:00:04 True
2016-01-01 00:00:05 True
2016-01-01 00:00:06 True
2016-01-01 00:00:07 False
2016-01-01 00:00:08 False
2016-01-01 00:00:09 True
2016-01-01 00:00:10 True
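I want to total the time that elapses while binary is True. I'm sharing my solution, which does the job, but something tells me there should be a simpler way, since this is such a basic operation on time series data. Note that the data is most likely equidistant, but I can't rely on that.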
df['binary_grp'] = (df.binary.diff(1) != False).astype(int).cumsum()
# Throw away False values
df = df[df.binary]
groupby = df.groupby('binary_grp')
df = pd.DataFrame({'timespan': groupby.dt.last() - groupby.dt.first()})
return df.timespan.sum().total_seconds() / 60.0
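The trickiest part is probably the first line. It assigns an incrementing number to each consecutive block of equal values. Here is what the data looks like afterwards: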
dt binary binary_grp
2016-01-01 00:00:00 False 1
2016-01-01 00:00:01 False 1
2016-01-01 00:00:02 False 1
2016-01-01 00:00:03 False 1
2016-01-01 00:00:04 True 2
2016-01-01 00:00:05 True 2
2016-01-01 00:00:06 True 2
2016-01-01 00:00:07 False 3
2016-01-01 00:00:08 False 3
2016-01-01 00:00:09 True 4
2016-01-01 00:00:10 True 4
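A minimal sketch of that labeling step on its own, using a made-up boolean Series. It compares each value with the previous one via shift and ne, which should produce the same labels as the diff-based line but is not that line verbatim; the names flags, starts_block and block_id are only for illustration.

```python
import pandas as pd

# Hypothetical sample: two False, three True, one False
flags = pd.Series([False, False, True, True, True, False])

# A row starts a new block when it differs from the previous row.
# The first row is compared with NaN, so it always starts a block.
starts_block = flags.ne(flags.shift())

# The running count of block starts serves as the block label
block_id = starts_block.cumsum()

print(block_id.tolist())  # [1, 1, 2, 2, 2, 3]
```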
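Is there a better way to achieve this? I assume this code performs well; my concern is readability.
Answer 0: (score: 2)
In my opinion, your solution is fine.
Another solution:
Compare the values with their shifted counterparts using ne, then take the cumsum to get the groups.
After filtering, you can use apply and select the first and last value of each group with iloc:
df['binary_grp'] = (df.binary.ne(df.binary.shift())).cumsum()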
df = df[df.binary]
s = df.groupby('binary_grp')['dt'].apply(lambda x: x.iloc[-1] - x.iloc[0])
print (s)
binary_grp
2 00:00:02
4 00:00:01
Name: dt, dtype: timedelta64[ns]
all_time = s.sum().total_seconds() / 60.0
print (all_time)
0.05
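For anyone who wants to run this end to end, here is a self-contained sketch that rebuilds the sample data from the question and wraps the steps in a function. The function name true_duration and its parameters are hypothetical, and it assumes the rows are already sorted by time.

```python
import pandas as pd

# Rebuild the sample data from the question (one row per second)
df = pd.DataFrame({
    "dt": pd.Timestamp("2016-01-01") + pd.to_timedelta(list(range(11)), unit="s"),
    "binary": [False, False, False, False, True, True, True,
               False, False, True, True],
})


def true_duration(frame, flag_col="binary", time_col="dt"):
    """Sum (last - first) timestamp over each consecutive run of True."""
    flags = frame[flag_col]
    # Label consecutive runs of equal values: 1, 2, 3, ...
    run_id = flags.ne(flags.shift()).cumsum()
    # Keep only the True rows and group them by their run label
    times = frame.loc[flags, time_col].groupby(run_id[flags])
    return (times.last() - times.first()).sum()


total = true_duration(df)
print(total.total_seconds() / 60.0)  # 0.05
```

Note that this convention measures from the first to the last True timestamp in each run, so a run consisting of a single True row contributes zero.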
In your solution, the DataFrame isn't necessary if you only need all_time:
groupby = df.groupby('binary_grp')
s = groupby.dt.last() - groupby.dt.first()
all_time = s.sum().total_seconds() / 60.0
print (all_time)
0.05
But if you do need a DataFrame, you can create one from the Series s with to_frame:
df1 = s.to_frame('timestamp')
print (df1)
timestamp
binary_grp
2 00:00:02
4 00:00:01
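One detail worth keeping in mind when converting a summed Timedelta to minutes: the seconds attribute holds only the seconds component of the duration (whole days are stored separately), whereas total_seconds() returns the full duration. A small sketch with a made-up duration longer than one day:

```python
import pandas as pd

# Hypothetical total: one day and thirty seconds
long_total = pd.Timedelta(days=1, seconds=30)

print(long_total.seconds)          # 30 (the day is not included)
print(long_total.total_seconds())  # 86430.0 (the whole duration)
```

For the sample data the two agree, because the total is only a few seconds.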
Answer 1: (score: 2)
IIUC:
You want the total time, across the whole series, during which binary is True.
However, we have to make some choices or assumptions:
dt binary
0 2016-01-01 00:00:00 False
1 2016-01-01 00:00:01 False
2 2016-01-01 00:00:02 False
3 2016-01-01 00:00:03 False
4 2016-01-01 00:00:04 True # <- This is where time starts
5 2016-01-01 00:00:05 True
6 2016-01-01 00:00:06 True
7 2016-01-01 00:00:07 False # <- And ends here. So this would
8 2016-01-01 00:00:08 False # be 00:00:07 - 00:00:04 or 3 seconds
9 2016-01-01 00:00:09 True # <- Starts again
10 2016-01-01 00:00:10 True # <- But ends here because
# I don't have another Timestamp
Under these assumptions, we can use diff, multiplication, and sum:
df.dt.diff().shift(-1).mul(df.binary).sum()
Timedelta('0 days 00:00:04')
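A self-contained sketch of the same idea on the sample data. It selects the intervals with a boolean mask instead of multiplying by the flag column, which is a swapped-in variant that avoids relying on how a given pandas version handles timedelta-by-boolean multiplication; the name to_next is only for illustration.

```python
import pandas as pd

# Rebuild the sample data from the question (one row per second)
df = pd.DataFrame({
    "dt": pd.Timestamp("2016-01-01") + pd.to_timedelta(list(range(11)), unit="s"),
    "binary": [False, False, False, False, True, True, True,
               False, False, True, True],
})

# Time from each row to the next row; the last row has no successor (NaT)
to_next = df["dt"].diff().shift(-1)

# Keep only the intervals that start on a True row; sum() skips NaT
total = to_next[df["binary"]].sum()

print(total)  # 0 days 00:00:04
```

Because each interval is measured individually, this does not depend on the timestamps being equidistant.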
We can then combine this idea with groupby:
# Use xor and cumsum to identify change in True to False and False to True
grps = (df.binary ^ df.binary.shift()).cumsum()
mask = df.binary.groupby(grps).first()
df.dt.diff().shift(-1).groupby(grps).sum()[mask]
binary
1 00:00:03
3 00:00:01
Name: dt, dtype: timedelta64[ns]
Or without the mask:
pd.concat([df.dt.diff().shift(-1).groupby(grps).sum(), mask], axis=1)
dt binary
binary
0 00:00:04 False
1 00:00:03 True
2 00:00:02 False
3 00:00:01 True
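A runnable sketch of the per-run breakdown, assuming the same sample data. It labels runs with ne and shift rather than xor, so the run labels come out as 2 and 4 instead of 1 and 3; the durations themselves should match. The names run_id, to_next and per_run are only for illustration.

```python
import pandas as pd

# Rebuild the sample data from the question (one row per second)
df = pd.DataFrame({
    "dt": pd.Timestamp("2016-01-01") + pd.to_timedelta(list(range(11)), unit="s"),
    "binary": [False, False, False, False, True, True, True,
               False, False, True, True],
})

flags = df["binary"]

# Label consecutive runs of equal values: 1, 2, 3, ...
run_id = flags.ne(flags.shift()).cumsum()

# Time from each row to the next row (NaT for the last row)
to_next = df["dt"].diff().shift(-1)

# Sum the intervals that start on a True row, one total per run
per_run = to_next[flags].groupby(run_id[flags]).sum()

print(per_run)
```

Here the first True run covers 3 seconds and the second covers 1 second, for the same 4-second total as before.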