Trying to find the 5 largest values per month using groupby

Posted: 2017-10-22 23:12:17

Tags: python pandas group-by

I am trying to show the top three values of nc_type for each month. I tried using n_largest, but so far it has not worked.

Original data:

      area nc_type    occurred_date
0  Filling       x  12/23/2015 0:00
1  Filling       f  12/22/2015 0:00
2  Filling       s   9/11/2015 0:00
3  Filling       f   2/17/2016 0:00
4  Filling       s    5/3/2016 0:00
5  Filling       g   8/29/2016 0:00
6  Filling       f    9/9/2016 0:00
7  Filling       a    6/1/2016 0:00

Transformed to:

Transformed data:

2 answers:

Answer 0: (score: 1)

Scenario 1
MultiIndex Series

occurred_date  nc_type
1.0            x           3
               y           4
               z          13
               w          24
               f          34
12.0           d          18
               g          10
               w          44
               a          27
               g          42
Name: test, dtype: int64

Call sort_values + groupby + head:

df.sort_values(ascending=False).groupby(level=0).head(2)

occurred_date  nc_type
12.0           w          44
               g          42
1.0            f          34
               w          24
Name: test, dtype: int64

Change head(2) to head(5) for your case.
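To try this locally, the Series above can be rebuilt by hand. The following is a minimal sketch: the variable name s and the way the index is constructed are assumptions for illustration, and only the values are copied from the printout above.

```python
import pandas as pd

# Rebuild the example Series; values and labels are copied from the printout.
index = pd.MultiIndex.from_arrays(
    [
        [1.0] * 5 + [12.0] * 5,
        list("xyzwf") + list("dgwag"),
    ],
    names=["occurred_date", "nc_type"],
)
s = pd.Series([3, 4, 13, 24, 34, 18, 10, 44, 27, 42], index=index, name="test")

# Sort every value in descending order, then keep the first 2 rows per month.
# head() preserves the sorted row order, so month 12.0 comes first here.
top2 = s.sort_values(ascending=False).groupby(level=0).head(2)
print(top2)
```

Because head keeps the globally sorted order, the result is ordered by value rather than by month; append .sort_index(level=0, sort_remaining=False) if the months should appear in order.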

Alternatively, expanding on my comment, you can use nlargest:

df.groupby(level=0).nlargest(2).reset_index(level=0, drop=True)

occurred_date  nc_type
1.0            f          34
               w          24
12.0           w          44
               g          42
Name: test, dtype: int64
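The reset_index call is needed because nlargest on a grouped Series prepends the group key to the index by default. A small sketch of that effect, assuming the same hand-built Series as in the printout (the names s, raw_top2 and top2 are illustrative):

```python
import pandas as pd

index = pd.MultiIndex.from_arrays(
    [[1.0] * 5 + [12.0] * 5, list("xyzwf") + list("dgwag")],
    names=["occurred_date", "nc_type"],
)
s = pd.Series([3, 4, 13, 24, 34, 18, 10, 44, 27, 42], index=index, name="test")

# With the default group_keys=True the group key becomes an extra
# outermost index level, so occurred_date appears twice.
raw_top2 = s.groupby(level=0).nlargest(2)
print(raw_top2.index.names)

# Dropping the outermost level restores the original two-level index.
top2 = raw_top2.reset_index(level=0, drop=True)
print(top2)
```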

Scenario 2
3-column DataFrame

   occurred_date nc_type  value
0            1.0       x      3
1            1.0       y      4
2            1.0       z     13
3            1.0       w     24
4            1.0       f     34
5           12.0       d     18
6           12.0       g     10
7           12.0       w     44
8           12.0       a     27
9           12.0       g     42

You can use sort_values + groupby + head:

df.sort_values(['occurred_date', 'value'], 
        ascending=[True, False]).groupby('occurred_date').head(2)

   occurred_date nc_type  value
4            1.0       f     34
3            1.0       w     24
7           12.0       w     44
9           12.0       g     42

Change head(2) to head(5) for your scenario.
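A self-contained version of this scenario might look like the sketch below. The DataFrame is rebuilt from the printout above; the variable name top2 is an assumption.

```python
import pandas as pd

# Rebuild the 3-column frame shown in the printout.
df = pd.DataFrame(
    {
        "occurred_date": [1.0] * 5 + [12.0] * 5,
        "nc_type": list("xyzwf") + list("dgwag"),
        "value": [3, 4, 13, 24, 34, 18, 10, 44, 27, 42],
    }
)

# Sort by month ascending and by value descending, then take the
# first 2 rows of each month; all original columns are kept.
top2 = (
    df.sort_values(["occurred_date", "value"], ascending=[True, False])
    .groupby("occurred_date")
    .head(2)
)
print(top2)
```

Unlike nlargest on a single column, this keeps whole rows, which is convenient when other columns such as nc_type are needed in the result.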

Scenario 3
MultiIndex DataFrame

                       test
occurred_date nc_type      
1.0           x           3
              y           4
              z          13
              w          24
              f          34
12.0          d          18
              g          10
              w          44
              a          27
              g          42

Use nlargest on the test column:

df.groupby(level=0).test.nlargest(2)\
              .reset_index(level=0, drop=True)

occurred_date  nc_type
1.0            f          34
               w          24
12.0           w          44
               g          42
Name: test, dtype: int64
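Note that the result is a Series, not a DataFrame. If a DataFrame is wanted, one option is to convert it back with to_frame, as in this sketch (the frame is rebuilt from the printout above; top2 and top2_frame are illustrative names):

```python
import pandas as pd

# Rebuild the MultiIndex DataFrame with its single column "test".
df = pd.DataFrame(
    {
        "occurred_date": [1.0] * 5 + [12.0] * 5,
        "nc_type": list("xyzwf") + list("dgwag"),
        "test": [3, 4, 13, 24, 34, 18, 10, 44, 27, 42],
    }
).set_index(["occurred_date", "nc_type"])

# Select the column, take the 2 largest values per month, and drop the
# duplicated group-key level that nlargest adds.
top2 = df.groupby(level=0)["test"].nlargest(2).reset_index(level=0, drop=True)

# Convert the resulting Series back into a one-column DataFrame.
top2_frame = top2.to_frame()
print(top2_frame)
```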

Answer 1: (score: 1)

I would add group_keys=False, which stops the group key from being prepended to the result index:

df.groupby('occurred_date', group_keys=False).nlargest(3)

occurred_date  nc_type
1.0            f          34
               w          24
               z          13
12.0           w          44
               g          42
               a          27
Name: value, dtype: int64
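Putting the pieces together for the raw data in the question could look roughly like the sketch below. It rests on assumptions the thread does not confirm: that the value to rank is the number of occurrences of each nc_type per calendar month, and that months are pooled across years (as the 1.0 and 12.0 keys above suggest). The names raw, counts and top3 are hypothetical.

```python
import pandas as pd

# The eight rows from the question.
raw = pd.DataFrame(
    {
        "area": ["Filling"] * 8,
        "nc_type": list("xfsfsgfa"),
        "occurred_date": [
            "12/23/2015 0:00", "12/22/2015 0:00", "9/11/2015 0:00",
            "2/17/2016 0:00", "5/3/2016 0:00", "8/29/2016 0:00",
            "9/9/2016 0:00", "6/1/2016 0:00",
        ],
    }
)
raw["occurred_date"] = pd.to_datetime(raw["occurred_date"], format="%m/%d/%Y %H:%M")

# Assumption: count how often each nc_type occurs in each calendar month.
counts = (
    raw.groupby([raw["occurred_date"].dt.month, "nc_type"])
    .size()
    .rename("value")
)

# Keep the 3 largest counts per month; group_keys=False avoids a
# duplicated occurred_date level in the result.
top3 = counts.groupby(level=0, group_keys=False).nlargest(3)
print(top3)
```

With only eight sample rows every count is 1, so nothing is filtered out here; on the full data set the same call would trim each month to its three most frequent nc_type values.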