data.table sum and subset

Time: 2015-05-12 02:10:21

Tags: r sum filtering data.table

I have a data.table that I want to aggregate:

library(data.table)
dt1 <- data.table(year=c("2001","2001","2001","2002","2002","2002","2002"),
                  group=c("a","a","b","a","a","b","b"), 
                  amt=c(20,40,20,35,30,28,19))

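I'd like to sum amt by year and group, and then keep only the groups whose total amt across all years is greater than 100.

I've got the data.table sum working: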

dt1[, sum(amt),by=list(year,group)]

   year group V1
1: 2001     a 60
2: 2001     b 20
3: 2002     a 65
4: 2002     b 47

I'm having trouble with the final filtering step.

The end result I'm looking for is:

   year group V1
1: 2001     a 60
2: 2002     a 65

a) 60 + 65 > 100
b) 20 + 47 <= 100
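The arithmetic above can be checked directly by totalling amt per group. This is a minimal sketch; the names `totals`, `total` and `keep` are illustrative and not part of the original question.

```r
library(data.table)

dt1 <- data.table(year  = c("2001","2001","2001","2002","2002","2002","2002"),
                  group = c("a","a","b","a","a","b","b"),
                  amt   = c(20,40,20,35,30,28,19))

# Total amt per group across all years (a: 125, b: 67)
totals <- dt1[, .(total = sum(amt)), by = group]

# Flag the groups that pass the > 100 threshold (hypothetical column name)
totals[, keep := total > 100]
print(totals)
```

Only group a passes the threshold, which is why the desired result contains just the two rows for group a.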

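Any ideas on how to achieve this would be great.

I had a look at this data.table sum by group and return row with max value, and was wondering whether there is an equally elegant solution to my problem.

4 Answers:

Answer 0: (score: 12)

A one-liner in data.table: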

dt1[, lapply(.SD,sum), by=list(year,group)][, if (sum(amt) > 100) .SD, by=group]

#   group year amt
#1:     a 2001  60
#2:     a 2002  65
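For readers who find the chained call dense, the same idea can be sketched in two named steps. The intermediate name `by_year_group` and the explicit `amt = sum(amt)` are illustrative choices, not part of the answer above.

```r
library(data.table)

dt1 <- data.table(year  = c("2001","2001","2001","2002","2002","2002","2002"),
                  group = c("a","a","b","a","a","b","b"),
                  amt   = c(20,40,20,35,30,28,19))

# Step 1: one row per (year, group), with amt summed
by_year_group <- dt1[, .(amt = sum(amt)), by = .(year, group)]

# Step 2: for each group, return its rows (.SD) only when the group's
# total exceeds 100. An `if` without `else` yields NULL otherwise,
# so groups that fail the test contribute no rows.
result <- by_year_group[, if (sum(amt) > 100) .SD, by = group]
print(result)
```

Because the second step groups by `group`, that column comes first in the output, followed by `year` and `amt`.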

Answer 1: (score: 3)

You can do it like this:

norm

Which gives:


Answer 2: (score: 3)

This may not be an ideal solution, but I would do it in a few steps:

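# Sum amt per (year, group)
dt2 <- dt1[, sum(amt), by=list(year,group)]
# Flag each group whose total across all years exceeds 100
dt3 <- dt1[, sum(amt)>100, by=list(group)]
# Keep only the rows of dt2 that belong to a flagged group
dt_result <- dt2[group %in% dt3[V1==TRUE]$group,]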

Answer 3: (score: 2)

Here's a two-liner. First find the subset of groups you want, then aggregate only those groups:

big_groups <- dt1[,sum(amt),by=group][V1>100]$group
dt1[group%in%big_groups,sum(amt),by=list(year,group)]
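The two-liner returns columns in the order year, group, V1, which matches the layout requested in the question, whereas the one-liner from Answer 0 returns group, year, amt. If the exact layout matters, the one-liner's output could be rearranged as sketched below; the names `one_liner` and `two_liner` are illustrative.

```r
library(data.table)

dt1 <- data.table(year  = c("2001","2001","2001","2002","2002","2002","2002"),
                  group = c("a","a","b","a","a","b","b"),
                  amt   = c(20,40,20,35,30,28,19))

# Answer 0's approach: columns come back as group, year, amt
one_liner <- dt1[, lapply(.SD, sum), by = list(year, group)][
  , if (sum(amt) > 100) .SD, by = group]

# Answer 3's approach: columns come back as year, group, V1
big_groups <- dt1[, sum(amt), by = group][V1 > 100]$group
two_liner  <- dt1[group %in% big_groups, sum(amt), by = list(year, group)]

# Reorder and rename by reference so both results share one layout
setcolorder(one_liner, c("year", "group", "amt"))
setnames(one_liner, "amt", "V1")

print(one_liner)
print(two_liner)
```

Both `setcolorder` and `setnames` modify the table in place, so no reassignment is needed.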