Select every nth observation by group in pandas

Time: 2018-08-21 09:47:25

Tags: python pandas

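I have some code that generates a DataFrame. I would like to select every nth row within each group (the Fruit column), as shown in df1 and df2 below.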

    import pandas as pd
    data = [['Orange',11], ['Orange',12], ['Orange',13], ['Orange',14],
            ['Orange',15], ['Orange',16], ['Orange',17], ['Orange',18], 
            ['StrawBerry',22], ['StrawBerry',23], ['StrawBerry',24], 
            ['StrawBerry',25], ['StrawBerry',26], ['StrawBerry',27]]
    df = pd.DataFrame(data,columns=['Fruit', 'Score'])
    df

    # Expected output 1: within each Fruit, start from the 1st row,
    # then take the 4th, the 7th, and so on.
    Output1data = [['Orange',11], ['Orange',14], ['Orange',17], 
                  ['StrawBerry',22], ['StrawBerry',25]]
    df1 = pd.DataFrame(Output1data,columns=['Fruit','Score'])
    df1

    # Expected output 2: within each Fruit, start from the 2nd row,
    # then take the 5th, the 8th, and so on.
    Output2data = [['Orange',12], ['Orange',15], ['Orange',18], 
                   ['StrawBerry',23], ['StrawBerry',26]]
    df2 = pd.DataFrame(Output2data,columns=['Fruit','Score'])
    df2

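Is there a way to do this using the group category? I need to select every nth row from a given starting point.

Many thanks. I really appreciate it.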

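2 answers:

Answer 0: (score: 5)

Use GroupBy.cumcount with modulo 3 to build a helper Series that numbers the rows within each group as 0, 1, 2, 0, 1, 2, ..., then filter by boolean indexing: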

s = df.groupby('Fruit')['Fruit'].cumcount() % 3
print (s)
0     0
1     1
2     2
3     0
4     1
5     2
6     0
7     1
8     0
9     1
10    2
11    0
12    1
13    2
dtype: int64

df1 = df[s == 0]
print (df1)
         Fruit  Score
0       Orange     11
3       Orange     14
6       Orange     17
8   StrawBerry     22
11  StrawBerry     25

df2 = df[s == 1]
print (df2)
         Fruit  Score
1       Orange     12
4       Orange     15
7       Orange     18
9   StrawBerry     23
12  StrawBerry     26
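
The same idea can be wrapped in a small helper so that the step and the starting position are parameters. This is a minimal sketch, not part of the original answer: the function name every_nth_by_group and its arguments n and start are hypothetical names chosen for illustration, and start is assumed to be a 0-based position within each group.

```python
import pandas as pd

def every_nth_by_group(df, group_col, n, start=0):
    """Return every n-th row within each group, beginning at
    0-based position `start` (hypothetical helper for illustration)."""
    # Position of each row inside its own group: 0, 1, 2, ...
    pos = df.groupby(group_col).cumcount()
    # Keep rows at positions start, start + n, start + 2n, ...
    mask = (pos >= start) & ((pos - start) % n == 0)
    return df[mask]

data = [['Orange', 11], ['Orange', 12], ['Orange', 13], ['Orange', 14],
        ['Orange', 15], ['Orange', 16], ['Orange', 17], ['Orange', 18],
        ['StrawBerry', 22], ['StrawBerry', 23], ['StrawBerry', 24],
        ['StrawBerry', 25], ['StrawBerry', 26], ['StrawBerry', 27]]
df = pd.DataFrame(data, columns=['Fruit', 'Score'])

df1 = every_nth_by_group(df, 'Fruit', n=3, start=0)
df2 = every_nth_by_group(df, 'Fruit', n=3, start=1)
print(df1['Score'].tolist())  # [11, 14, 17, 22, 25]
print(df2['Score'].tolist())  # [12, 15, 18, 23, 26]
```

Because the filter is a boolean mask on the original frame, the original index and row order are preserved, as in the outputs above.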

Answer 1: (score: 2)

You can try this code and adjust the parameters (start and step).

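
The start and step parameters mentioned in this answer map naturally onto positional slicing within each group. The following is a minimal sketch under that assumption and is not necessarily the answerer's code: the variable names start and step are taken from the answer text, while the approach (slicing each group with iloc and concatenating the pieces) is an illustration.

```python
import pandas as pd

data = [['Orange', 11], ['Orange', 12], ['Orange', 13], ['Orange', 14],
        ['Orange', 15], ['Orange', 16], ['Orange', 17], ['Orange', 18],
        ['StrawBerry', 22], ['StrawBerry', 23], ['StrawBerry', 24],
        ['StrawBerry', 25], ['StrawBerry', 26], ['StrawBerry', 27]]
df = pd.DataFrame(data, columns=['Fruit', 'Score'])

start, step = 1, 3  # 0-based start position and step within each group

# Slice each group positionally, then stitch the pieces back together.
# sort=False keeps the groups in order of first appearance.
parts = [group.iloc[start::step] for _, group in df.groupby('Fruit', sort=False)]
result = pd.concat(parts)
print(result['Score'].tolist())  # [12, 15, 18, 23, 26]
```

With start = 0 the same code yields the rows of df1 from the question; with start = 1 it yields df2.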