Question

I have a dataset that looks as follows:

library(data.table)
set.seed(1)
DF <- data.table(panelID = sample(50,50),                                                    # Creates a panel ID
                      Country = c(rep("A",30),rep("B",50), rep("C",20)),                      
                      Group = c(rep(1,20),rep(2,20),rep(3,20),rep(4,20),rep(5,20)),
                      Time = rep(seq(as.Date("2010-01-03"), length=20, by="1 month") - 1,5),
                      norm = round(runif(100)/10,2),
                      Income = sample(100,100),
                      Happiness = sample(10,10),
                      Sex = round(rnorm(10,0.75,0.3),2),
                      Age = round(rnorm(10,0.75,0.3),2),
                      Educ = round(rnorm(10,0.75,0.3),2))           
DF <- as.data.table(DF)                                                 # Make sure it is a data.table
DF[, uniqueID := .I]                                                    # Add a unique ID
cols = sapply(DF, is.numeric)                                           # Check numerical columns
DFm <- melt(DF[, cols, with = FALSE][, !"uniqueID"], id = "panelID")    # https://stackoverflow.com/questions/57406654/speeding-up-a-function/57407959#57407959
DFm[, value := c(NA, diff(value)), by = .(panelID, variable)]           # https://stackoverflow.com/questions/57406654/speeding-up-a-function/57407959#57407959
DF <- dcast(DFm, panelID + rowidv(DFm, cols = c("panelID", "variable")) ~ variable, value.var = "value") # ""
DF <- DF[DF[, !Reduce(`&`, lapply(.SD , is.na)), .SDcols = 3:ncol(DF)]] # Removes T1 for which there is no difference

What I would like to do now is very simple: I want to store the mean of each column in a single column.

I tried:

mean_of_differences <- DF [, mean(sapply(.SD, is.numeric), na.rm=TRUE)]   
mean_of_differences <- DF[,.SD[mean(sapply(.SD, is.numeric), na.rm=TRUE)]]

But I cannot seem to get it right. I end up with either NA or an error.

What am I overlooking?
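
One possible reading, offered as a sketch rather than a confirmed answer: sapply(.SD, is.numeric) returns one TRUE/FALSE per column, so wrapping it in mean() gives the fraction of columns that are numeric, not the mean of each column. The example below uses a small hypothetical table (toy) in place of the differenced DF, and the output column names "variable" and "mean" are chosen only for illustration.

```r
library(data.table)

# Hypothetical toy table standing in for the differenced DF (an assumption)
toy <- data.table(panelID = c(1, 1, 2),
                  norm    = c(0.1, NA, 0.3),
                  Income  = c(10, 20, 30))

# This averages a logical vector (is each column numeric?),
# so it does not return per-column means
share_numeric <- toy[, mean(sapply(.SD, is.numeric), na.rm = TRUE)]

# Mean of every column except the ID, returned as a one-row data.table
value_cols <- setdiff(names(toy), "panelID")
col_means  <- toy[, lapply(.SD, mean, na.rm = TRUE), .SDcols = value_cols]

# Reshape to long form: one row per variable, all means in a single column
mean_of_differences <- melt(col_means,
                            measure.vars  = names(col_means),
                            variable.name = "variable",
                            value.name    = "mean")
```

If only the single column is needed, mean_of_differences[, .(mean)] keeps it as a one-column data.table; whether the variable names should be retained alongside it depends on the intended use.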

Creating a single-column data frame using another dataset

0 answers: