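I have some code that generates a DataFrame. I would like to select every nth row within each group (here, each Fruit), as illustrated by df1 and df2 below.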
import pandas as pd
data = [['Orange',11], ['Orange',12], ['Orange',13], ['Orange',14],
['Orange',15], ['Orange',16], ['Orange',17], ['Orange',18],
['StrawBerry',22], ['StrawBerry',23], ['StrawBerry',24],
['StrawBerry',25], ['StrawBerry',26], ['StrawBerry',27]]
df = pd.DataFrame(data,columns=['Fruit', 'Score'])
df
#Here I start from the 1st row, then move to the fourth, then the seventh,
#and so forth, by Fruit.
Output1data = [['Orange',11], ['Orange',14], ['Orange',17],
['StrawBerry',22], ['StrawBerry',25]]
df1 = pd.DataFrame(Output1data,columns=['Fruit','Score'])
df1
#Here I start from the second row, then move to the fifth, then the
#eighth, and so forth, by Fruit.
Output2data = [['Orange',12], ['Orange',15], ['Orange',18],
['StrawBerry',23], ['StrawBerry',26]]
df2 = pd.DataFrame(Output2data,columns=['Fruit','Score'])
df2
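Is there a way to do this using the group category? I need to select every nth row within each group from a given starting point.
Thank you very much. I really appreciate it.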
Answer 0: (score: 5)
Use GroupBy.cumcount, which numbers the rows within each group starting at 0, and take it modulo 3 to build a helper Series. Then filter by boolean indexing:
s = df.groupby('Fruit')['Fruit'].cumcount() % 3
print (s)
0     0
1     1
2     2
3     0
4     1
5     2
6     0
7     1
8     0
9     1
10    2
11    0
12    1
13    2
dtype: int64
df1 = df[s == 0]
print (df1)
         Fruit  Score
0       Orange     11
3       Orange     14
6       Orange     17
8   StrawBerry     22
11  StrawBerry     25
df2 = df[s == 1]
print (df2)
         Fruit  Score
1       Orange     12
4       Orange     15
7       Orange     18
9   StrawBerry     23
12  StrawBerry     26
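If the starting row and the step need to vary, the same cumcount idea can be wrapped in a small helper. This is only a sketch: the function name every_nth_per_group and its parameters group_col, step, and start are made up for illustration and are not part of pandas. It assumes start is a 0-based position within each group.

```python
import pandas as pd

def every_nth_per_group(df, group_col, step, start=0):
    """Keep rows at within-group positions start, start + step, start + 2*step, ...

    Hypothetical helper for illustration; positions are 0-based.
    """
    # Position of each row inside its own group: 0, 1, 2, ...
    pos = df.groupby(group_col).cumcount()
    # Keep rows at or after `start` that fall on a multiple of `step` from it.
    mask = (pos >= start) & ((pos - start) % step == 0)
    return df[mask]

data = [['Orange', 11], ['Orange', 12], ['Orange', 13], ['Orange', 14],
        ['Orange', 15], ['Orange', 16], ['Orange', 17], ['Orange', 18],
        ['StrawBerry', 22], ['StrawBerry', 23], ['StrawBerry', 24],
        ['StrawBerry', 25], ['StrawBerry', 26], ['StrawBerry', 27]]
df = pd.DataFrame(data, columns=['Fruit', 'Score'])

df1 = every_nth_per_group(df, 'Fruit', step=3, start=0)  # 1st, 4th, 7th row per Fruit
df2 = every_nth_per_group(df, 'Fruit', step=3, start=1)  # 2nd, 5th, 8th row per Fruit
```

The original index is preserved, as in the output above; call reset_index(drop=True) on the result if a fresh 0..n-1 index is preferred.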
Answer 1: (score: 2)
You can try this code and adjust the parameters (start and step):
$query = new Posts();
if (count(Auth::user()->tags) > 0) {
$query = $query->whereHas('tags', function ($q) {
$i = 0;
foreach (Auth::user()->tags as $tag) {
if ($i == 0) {
$q->where('title', '=', $tag->title);
} else {
$q->orWhere('title', '=', $tag->title);
}
$i++;
}
});
}
$posts = $query->where('isTemplate', true)->orderBy($key, $order)->paginate(15);