R

Date: 2016-02-12 17:53:34

Tags: r statistics

I have a data set like this:

id <- 1:12
b <- c(0,0,1,2,0,1,1,2,2,0,2,2)
c <- rep(NA,3)
d <- rep(NA,3)

df <-data.frame(id,b)
newdf <- data.frame(c,d)

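I want to do a simple calculation: if x == 1 or x == 2, count those values and write how many 1s and 2s there are into this data set. However, I don't want to count over the whole data set; I want my function to count in consecutive blocks of four rows.

I would like to get a result like this: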

> newdf
  one two
1   1   1
2   2   1
3   0   3

I tried many variations, but none of them worked.

afonk <- function(x) {
  ifelse(x==1 | x==2, x, newdf <- (x[1]+x[2]))
} 

afonk(newdf$one)

lapply(newdf, afonk)

Thanks in advance!

Ismail

2 answers:

Answer 0: (score: 2)

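We can use dcast from data.table. Create a grouping variable with %/%, count the rows per group and value of b, and then use dcast to reshape the counts from 'long' to 'wide' format.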

library(data.table)
# after dcast the columns are grp, 0, 1, 2; keep only the counts of 1 and 2
dcast(setDT(df)[,.N ,.(grp=(id-1)%/%4+1L, b)],
      grp~b, value.var='N', fill =0)[,c("1","2"), with=FALSE]
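
To see what the grouping expression does on its own, here is a minimal base R sketch. It only illustrates the integer-division step, assuming the ids run from 1 upward without gaps as in the question.

```r
# Illustration only: how integer division assigns ids to blocks of four
id <- 1:12

# ids 1-4 give 0, ids 5-8 give 1, ids 9-12 give 2; adding 1 makes the groups start at 1
grp <- (id - 1) %/% 4 + 1L
grp
#  [1] 1 1 1 1 2 2 2 2 3 3 3 3
```

Subtracting 1 before dividing is what keeps id 4 in the first block instead of pushing it into the second.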

Or a slightly more compact version would use length as the fun.aggregate:

res <- dcast(setDT(df)[,list((id-1)%/%4+1L, b)][b!=0], 
                    V1~b, length)[,V1:=NULL][]
res
#   1 2
#1: 1 1
#2: 2 1
#3: 0 3

If we need the column names to be 'one', 'two':

library(english)
names(res) <- as.character(english(as.numeric(names(res))))
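
If installing the english package is not an option, a small lookup vector can do the same renaming in base R. This is a sketch, not part of the original answer: number_names is a hypothetical helper, and res here is a plain data.frame standing in for the dcast result.

```r
# Stand-in for the dcast result above, with columns named "1" and "2"
res <- data.frame("1" = c(1, 2, 0), "2" = c(1, 1, 3), check.names = FALSE)

# Hypothetical lookup vector; extend it if other values can occur in b
number_names <- c("1" = "one", "2" = "two")

# Look up each column name and drop the names of the lookup result
names(res) <- unname(number_names[names(res)])
names(res)
# [1] "one" "two"
```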

Answer 1: (score: 2)

Fun with base R:

# counting function
countnum <- function(x,num){
  sum(x == num)
}

# make list of groups of 4
df$group <- rep(1:ceiling(nrow(df)/4),each = 4)[1:nrow(df)]
dfl <- split(df$b,f = df$group)

# make data frame of counts
newdf <- data.frame(one = sapply(dfl,countnum,1),
                    two = sapply(dfl,countnum,2))

Edit based on comment:

# make list of groups of 4
df$group <- rep(1:ceiling(nrow(df)/4),each = 4)[1:nrow(df)]

table(subset(df, b != 0L)[c("group", "b")])

Which you prefer depends on what type of result you need. A table will work for a small visual count, and you can likely pull the data out of the table, but if it is as simple as your example, you might opt for the data.frame.
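
As a sketch of pulling the data out of the table, as.data.frame.matrix() keeps the wide layout, with one row per group. The names tab and counts are illustrative, and the column renaming assumes that only the values 1 and 2 remain after dropping the zeros.

```r
# Rebuild the example data from the question
id <- 1:12
b <- c(0, 0, 1, 2, 0, 1, 1, 2, 2, 0, 2, 2)
df <- data.frame(id, b)

# make list of groups of 4
df$group <- rep(1:ceiling(nrow(df) / 4), each = 4)[1:nrow(df)]

# contingency table of group by b, ignoring the zeros
tab <- table(subset(df, b != 0)[c("group", "b")])

# convert the table to a data frame without reshaping it to long format
counts <- as.data.frame.matrix(tab)
names(counts) <- c("one", "two")
counts
#   one two
# 1   1   1
# 2   2   1
# 3   0   3
```

Calling as.data.frame(tab) instead would return the long format, with one row per group and value combination.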