R, dplyr: order entries within groups

Time: 2018-04-26 14:32:20

Tags: r dplyr

Consider the following example data:

PSE YEAR    ID
1   2014    3
3   2007    1
3   2004    4
3   2002    4
1   2013    3
3   2006    1
2   2016    2
2   2017    2
1   2015    3
3   2003    4
  

Edit (this was wrong): If I use dplyr arrange(PSE, YEAR, ID), I get

Corrected: If I use dplyr arrange(PSE, ID, YEAR), I get

PSE YEAR    ID
1   2013    3
1   2014    3
1   2015    3
2   2016    2
2   2017    2
3   2006    1 
3   2007    1
3   2002    4
3   2003    4
3   2004    4

What I want is:

PSE YEAR    ID
1   2013    3
1   2014    3
1   2015    3
2   2016    2
2   2017    2
3   2002    4
3   2003    4
3   2004    4
3   2006    1
3   2007    1

It is probably easy, but somehow I can't get the sorting right.
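One way to read the wanted order: within each PSE, the IDs are ordered by their earliest YEAR, and the rows of each ID are then ordered by YEAR. The following is only a sketch under that assumption. The helper column first_year and the tie-break on ID (when two IDs share the same earliest YEAR) are my own additions, not something the data dictates.

```r
library(dplyr)

# Example data from above
df <- tibble(
  PSE  = c(1, 3, 3, 3, 1, 3, 2, 2, 1, 3),
  YEAR = c(2014, 2007, 2004, 2002, 2013, 2006, 2016, 2017, 2015, 2003),
  ID   = c(3, 1, 4, 4, 3, 1, 2, 2, 3, 4)
)

sorted <- df %>%
  group_by(PSE, ID) %>%
  mutate(first_year = min(YEAR)) %>%   # hypothetical helper: earliest YEAR of each ID within a PSE
  ungroup() %>%
  arrange(PSE, first_year, ID, YEAR) %>%  # ID only breaks ties between equal first_year values
  select(-first_year)

print(sorted)
```

For PSE 3 this puts ID 4 (earliest YEAR 2002) before ID 1 (earliest YEAR 2006), which plain arrange(PSE, ID, YEAR) cannot do because it compares the ID values themselves.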

Hmm, somehow it works now, but not with my original data. Below is a small data sample where the error occurs.

structure(list(PSE = c(11280282, 11280282, 11280282, 11280282, 
11630646, 11630646, 11630646, 11630646, 11630646, 11630646, 11630646, 
11630646, 11630646, 12106438, 12106438, 12106438, 12106438, 12106438, 
12106438, 12106438, 12106438, 12106438, 12106438, 12106438, 12106438, 
12106438, 12106438, 12106438, 12106438, 12106438, 12106438, 12106438, 
12106438, 12106438, 12106438, 12106438, 12106438, 12106438, 12106438, 
12106438, 12106438, 12106438, 12335813, 12335813, 12335813, 12335813, 
12475347, 12475347, 12475347, 12475347, 12475347, 12475347, 12475347, 
12475347, 12475347, 12475347, 12475347, 12475347, 12475347, 12475347, 
12475347, 12475347, 12475347, 12475347, 12475347, 12475347, 12475347, 
12475347, 12912674, 12912674, 12912674, 12912674), YEAR = c(20152, 
20161, 20162, 20171, 20101, 20102, 20111, 20112, 20121, 20122, 
20131, 20132, 20141, 20042, 20042, 20051, 20051, 20052, 20052, 
20061, 20061, 20062, 20062, 20071, 20071, 20072, 20072, 20081, 
20081, 20082, 20082, 20091, 20092, 20101, 20101, 20101, 20102, 
20102, 20102, 20111, 20111, 20111, 20152, 20171, 20171, 20171, 
20102, 20102, 20111, 20111, 20112, 20112, 20121, 20121, 20122, 
20122, 20131, 20131, 20132, 20132, 20141, 20141, 20142, 20142, 
20162, 20162, 20171, 20171, 20022, 20031, 20032, 20041), ID = c(8171, 
8171, 8171, 8171, 4801, 4801, 4801, 4801, 4801, 4801, 4801, 4801, 
4801, 1646, 1758, 1646, 1758, 1646, 1758, 1646, 1758, 1646, 1758, 
1646, 1758, -1, 3465, -1, 3465, -1, 3465, 3465, 3465, 3465, 4712, 
4770, 3465, 4712, 4770, 3465, 4712, 4770, 8217, 9236, 9237, 9238, 
4947, 4952, 4947, 4952, 5433, 5568, 5433, 5568, 5433, 5568, 5433, 
5568, 5433, 5568, 5433, 5568, 5433, 5568, 6797, 6813, 7123, 7215, 
474, 474, 474, 474)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -72L), .Names = c("PSE", "YEAR", "ID"))

The problem is with PSE 12106438.

This is what I get:

12106438    20072   -1
12106438    20081   -1
12106438    20082   -1
12106438    20042   1646
12106438    20051   1646
12106438    20052   1646
12106438    20061   1646
12106438    20062   1646
12106438    20071   1646
12106438    20042   1758
12106438    20051   1758
12106438    20052   1758
12106438    20061   1758
12106438    20062   1758
12106438    20071   1758
12106438    20072   3465
12106438    20081   3465
12106438    20082   3465
12106438    20091   3465
12106438    20092   3465
12106438    20101   3465
12106438    20102   3465
12106438    20111   3465
12106438    20101   4712
12106438    20102   4712
12106438    20111   4712
12106438    20101   4770
12106438    20102   4770
12106438    20111   4770

What I want is:

12106438    20042   1646
12106438    20051   1646
12106438    20052   1646
12106438    20061   1646
12106438    20062   1646
12106438    20071   1646
12106438    20042   1758
12106438    20051   1758
12106438    20052   1758
12106438    20061   1758
12106438    20062   1758
12106438    20071   1758
12106438    20072   -1
12106438    20081   -1
12106438    20082   -1
12106438    20072   3465
12106438    20081   3465
12106438    20082   3465
12106438    20091   3465
12106438    20092   3465
12106438    20101   3465
12106438    20102   3465
12106438    20111   3465
12106438    20101   4712
12106438    20102   4712
12106438    20111   4712
12106438    20101   4770
12106438    20102   4770
12106438    20111   4770
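The wanted order for PSE 12106438 seems to follow one rule: IDs sorted by the first YEAR in which they appear, with the smaller ID first when two IDs start in the same YEAR (so -1 lands before 3465, but after 1646 and 1758). Below is a rough check of that reading. The rows are retyped from the dput() sample for this PSE only, and the names sub, first_years and sorted_sub are just illustrative.

```r
library(dplyr)

# Rows for PSE 12106438 only, retyped from the dput() sample above
sub <- tibble(
  PSE  = 12106438,
  YEAR = c(20042, 20042, 20051, 20051, 20052, 20052, 20061, 20061,
           20062, 20062, 20071, 20071, 20072, 20072, 20081, 20081,
           20082, 20082, 20091, 20092, 20101, 20101, 20101, 20102,
           20102, 20102, 20111, 20111, 20111),
  ID   = c(1646, 1758, 1646, 1758, 1646, 1758, 1646, 1758,
           1646, 1758, 1646, 1758, -1, 3465, -1, 3465,
           -1, 3465, 3465, 3465, 3465, 4712, 4770, 3465,
           4712, 4770, 3465, 4712, 4770)
)

# Earliest YEAR per ID; assumed tie-break: smaller ID first
first_years <- sub %>%
  group_by(ID) %>%
  summarise(first_year = min(YEAR)) %>%
  arrange(first_year, ID)

# Order rows by each ID's position in that lookup table, then by YEAR
sorted_sub <- sub %>%
  arrange(match(ID, first_years$ID), YEAR)

print(first_years)
print(sorted_sub, n = 29)
```

If the assumed tie-break is not what is intended (for example, order of first appearance instead of ID value), only the arrange(first_year, ID) step would need to change.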

0 answers:

No answers