I need to group my data by website and get the average number of views within specific date ranges. My data looks like this:
date website amount_views
1/1/2021 a 23
1/2/2021 a 17
1/3/2021 a 10
1/4/2021 a 25
1/5/2021 a 2
1/1/2021 b 12
1/2/2021 b 7
1/3/2021 b 5
1/4/2021 b 17
1/5/2021 b 2
So I need to see the average for websites a and b in two date ranges: 1/1/2021 - 1/3/2021 (pre) and 1/3/2021 - 1/5/2021 (post). The desired output is:
date website avg_amount_views
pre a 31.5
post a 35.6
pre b 15.5
post b 22.6
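Answer 0 (score: 4)
You can use np.where with Series.between on the date column to assign a pre/post status, then group by that status and website and take the mean.
In one line (though less readable):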
import numpy as np
import pandas as pd

df['date'] = pd.to_datetime(df['date'])
df.groupby([np.where(df['date'].between('1/1/2021', '1/3/2021'), 'pre', 'post'),
            'website'])['amount_views'].mean().to_frame('mean')
Step by step (more readable):
df['date']=pd.to_datetime(df['date'])
df['status']=np.where(df['date'].between('1/1/2021','1/3/2021'),'pre','post')
df.groupby(['status','website'])['amount_views'].mean().to_frame('mean')
                     mean
status website
post   a        13.500000
       b         9.500000
pre    a        16.666667
       b         8.000000
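Note that the question's two ranges share 1/3/2021, while np.where gives each row a single label, so 1/3/2021 counts only toward 'pre' above. If that date should count in both averages, one option is to filter each range separately. This is only a sketch: the `ranges` dict and the `result` name are my own illustration, and the sample frame is rebuilt from the question's table.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["1/1/2021", "1/2/2021", "1/3/2021", "1/4/2021", "1/5/2021"] * 2,
    "website": ["a"] * 5 + ["b"] * 5,
    "amount_views": [23, 17, 10, 25, 2, 12, 7, 5, 17, 2],
})
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")

# Hypothetical mapping of label -> (start, end); the ranges overlap on 2021-01-03.
ranges = {
    "pre": ("2021-01-01", "2021-01-03"),
    "post": ("2021-01-03", "2021-01-05"),
}

parts = []
for label, (start, end) in ranges.items():
    in_range = df["date"].between(start, end)  # inclusive on both ends
    avg = (df[in_range]
           .groupby("website", as_index=False)["amount_views"]
           .mean()
           .rename(columns={"amount_views": "avg_amount_views"}))
    avg.insert(0, "date", label)  # tag every row with its range label
    parts.append(avg)

result = pd.concat(parts, ignore_index=True)
print(result)
```

Because each range is filtered on its own, a row may contribute to more than one average, which a single label column cannot express.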
Answer 1 (score: 3)
Use pandas.Grouper with the freq parameter set to 'W', which means weekly.
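The code for this answer is not shown here, so the following is only a sketch of what grouping with pd.Grouper might look like, assuming the date column has been converted to datetime. The `weekly` name and the rebuilt sample frame are my own. In pandas, 'W' means weeks ending on Sunday, so for this data 1/1 through 1/3/2021 fall in one bin and 1/4 through 1/5/2021 in the next.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["1/1/2021", "1/2/2021", "1/3/2021", "1/4/2021", "1/5/2021"] * 2,
    "website": ["a"] * 5 + ["b"] * 5,
    "amount_views": [23, 17, 10, 25, 2, 12, 7, 5, 17, 2],
})
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")

# Group by calendar week (ending Sunday) and website, then average the views.
weekly = (df.groupby([pd.Grouper(key="date", freq="W"), "website"])["amount_views"]
            .mean()
            .reset_index(name="avg_amount_views"))
print(weekly)
```

The bins are labelled with the week-ending date rather than 'pre' and 'post', and this only lines up with the requested ranges when they happen to follow week boundaries.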
Answer 2 (score: 3)
Use:
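import numpy as np
import pandas as pd

dates = pd.to_datetime(df['date'])
# Conditions are checked in order, so 1/3/2021 (in both ranges) is labelled 'pre'.
# The explicit string default replaces np.select's integer default of 0.
labels = np.select([dates.between('1/1/2021', '1/3/2021'),
                    dates.between('1/3/2021', '1/5/2021')],
                   ['pre', 'pos'], default='other')
new_df = (df.groupby(['website', labels])
            .amount_views
            .mean()
            .rename_axis(['website', 'date'])
            .reset_index(name='avg_amount_views'))
print(new_df)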
  website date  avg_amount_views
0       a  pos         13.500000
1       a  pre         16.666667
2       b  pos          9.500000
3       b  pre          8.000000
Answer 3 (score: 3)
You can use pd.cut to define 'pre' and 'post':
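import pandas as pd

df['date'] = pd.to_datetime(df['date'])  # pd.cut needs real datetimes to compare against Timestamp bins
# right=False makes each bin [left, right), so 1/3/2021 falls into 'post'
grp = pd.cut(df['date'], bins=[pd.Timestamp(2021, 1, 1),
                               pd.Timestamp(2021, 1, 3),
                               pd.Timestamp(2021, 1, 6)], labels=['pre', 'post'],
             right=False)
df.groupby([grp, 'website'])['amount_views'].agg(['mean', 'count']).reset_index()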
Output:
   date website       mean  count
0   pre       a  20.000000      2
1   pre       b   9.500000      2
2  post       a  12.333333      3
3  post       b   8.000000      3
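With these bins 1/3/2021 lands in 'post', which is why the counts are 2 and 3. If you would rather have 1/3/2021 in 'pre', as in Answer 0, one way is to move the middle edge to 1/4/2021 and keep right=False. A sketch under that assumption, with the sample frame rebuilt from the question and `out` as an illustrative name:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["1/1/2021", "1/2/2021", "1/3/2021", "1/4/2021", "1/5/2021"] * 2,
    "website": ["a"] * 5 + ["b"] * 5,
    "amount_views": [23, 17, 10, 25, 2, 12, 7, 5, 17, 2],
})
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")

# Bins are [left, right) because right=False, so a middle edge of 2021-01-04
# puts 2021-01-03 in 'pre' instead of 'post'.
grp = pd.cut(df["date"],
             bins=[pd.Timestamp(2021, 1, 1),
                   pd.Timestamp(2021, 1, 4),
                   pd.Timestamp(2021, 1, 6)],
             labels=["pre", "post"],
             right=False)

out = (df.groupby([grp, "website"], observed=True)["amount_views"]
         .agg(["mean", "count"])
         .reset_index())
print(out)
```

Keeping the count column next to the mean makes it easy to confirm which side of the boundary each date ended up on.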