Creating a function for a data frame with other functions inside it

Time: 2018-02-22 15:02:08

Tags: r function dataframe

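I am creating a function in which I need to define other functions. I am trying to repeat a calculation for all the columns that I transform inside the function.

It looks like this: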

multi_choice<- function(data,var1){

newfreq <- function(data,var1,var2){
T<-table(data[[var1]],data[[var2]])
T1<-as.data.frame.matrix(T)
T1[,"Industry"]<-row.names(T1)
T1
}

lst1 <- lapply(names(Q21[,2:ncol(Q21)]), newfreq)
lst1 <- lst1[!sapply(lst1, is.null)]


merge.all <- function(x, y) {
    merge(x, y, all = TRUE, by = "Industry")
}

T3 <- Reduce(merge.all, lst1)



T3[,"N"]<- apply(T3[,2:ncol(T3)],1,max)

T4<-rbind(c("All",colSums(T3[,2:ncol(T3)])),T3)  


T4[,2:ncol(T4)]<- sapply(T4[,2:ncol(T4)],as.numeric)



  for(col in names(T4)[c(-1,-ncol(T4))]){
    T4[col]=(T4[col]*100)/(T4[,ncol(T4)])

  }


  for(t in names(T4)[c(-1,-ncol(T4))]){
    T4[t]=ifelse(T4[,ncol(T4)]<5,"--",paste(round(T4[,t],0),"%"))}

    T4
} 

I created the function "newfreq" to do the calculation for all columns of Q21. For example, when I run it for a single column, Q21_1, it gives:

> newfreq(Q21,"Q7_1","Q21_1")
                           Too expensive                   Industry
Banking/Financial Services             0 Banking/Financial Services
Chemicals                              0                  Chemicals
Consumer Goods                         0             Consumer Goods
Energy                                 0                     Energy
High Tech                              1                  High Tech
Insurance/Reinsurance                  0      Insurance/Reinsurance
Life Sciences                          0              Life Sciences
Logistics                              0                  Logistics
Mining & Metals                        1            Mining & Metals
Other Manufacturing                    0        Other Manufacturing
Other Non-Manufacturing                1    Other Non-Manufacturing
Retail & Wholesale                     0         Retail & Wholesale
Services (Non-Financial)               2   Services (Non-Financial)
Transportation Equipment               1   Transportation Equipment
> 

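When I test each operation inside it on specific values, everything works. But the function as a whole gives an error. Any ideas on how to make it more compact?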

dput(Q21)
structure(list(Q7_1 = structure(c(5L, 5L, 14L, 1L, 9L, 13L, 1L, 
3L, 13L, 13L, 13L, 12L, 2L, 11L, 13L, 10L, 11L, 1L, 4L, 5L, 5L, 
4L, 5L, 9L, 2L, 4L, 13L, 10L, 13L, 13L, 11L, 1L, 11L, 5L, NA, 
1L, 9L, 3L, 1L, 5L, NA, 2L, NA, 6L, 14L, NA, NA, 14L, 8L, 11L, 
8L, 12L, 13L, NA, 3L, 11L, 11L, NA, 10L, 6L, 5L, 13L, 13L), .Label = c("Banking/Financial Services", 
"Chemicals", "Consumer Goods", "Energy", "High Tech", "Insurance/Reinsurance", 
"Life Sciences", "Logistics", "Mining & Metals", "Other Manufacturing", 
"Other Non-Manufacturing", "Retail & Wholesale", "Services (Non-Financial)", 
"Transportation Equipment"), class = "factor"), Q21_1 = structure(c(NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
1L, NA, NA, 1L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1L, NA, 
NA, NA, NA, 1L, NA, NA, NA, 1L, NA, NA, NA, NA, 1L, NA), .Label = "Too expensive", class = "factor"), 
    Q21_4 = structure(c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, 1L, NA, 1L, NA, NA, NA, NA, 
    NA, 1L, NA, NA, NA, NA, NA, 1L, NA, NA, NA, NA, 1L, NA, NA, 
    NA, 1L, NA, NA, NA, 1L, 1L, NA), .Label = "Inflexible", class = "factor"), 
    Q21_5 = structure(c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, 1L, NA, 1L, NA, 1L, NA, NA, 
    NA, 1L, NA, NA, NA, NA, NA, 1L, 1L, NA, NA, NA, 1L, NA, NA, 
    NA, 1L, NA, NA, 1L, NA, 1L, NA), .Label = "Outdated", class = "factor"), 
    Q21_6 = structure(c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, 1L, NA, NA, 1L, 1L, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, 1L, 1L, NA, NA, NA, 1L, NA, NA, 
    NA, 1L, NA, NA, NA, 1L, 1L, NA), .Label = "Wrong tools", class = "factor"), 
    Q21_7 = structure(c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, 1L, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, 1L, NA, NA, NA, NA, 1L, NA, NA, 
    NA, 1L, NA, NA, 1L, NA, NA, NA), .Label = "Low utilization rates", class = "factor")), .Names = c("Q7_1", 
"Q21_1", "Q21_4", "Q21_5", "Q21_6", "Q21_7"), class = c("data.table", 
"data.frame"), row.names = c(NA, -63L))

1 Answer:

Answer 0: (score: 0)

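Two suggestions. a) You can define newfreq in the global environment rather than inside the multi_choice function. b) Avoid naming a variable "T": in R, T is a built-in shorthand for the logical value TRUE, so reassigning it masks that value and can cause confusion.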
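On point (b): in base R, `T` and `F` are ordinary variables that start out bound to `TRUE` and `FALSE`, whereas `TRUE` and `FALSE` themselves are reserved words and cannot be reassigned. A minimal sketch of the masking, assuming a made-up vector `answers` that is not part of the question's data:

```r
# TRUE is a reserved word; T is just a variable that base R binds to TRUE.
answers <- c("yes", "no", "yes")   # hypothetical data, for illustration only

before <- isTRUE(T)                # T still refers to the base value

T <- table(answers)                # masks the base T in the current environment
during <- isTRUE(T)                # T is now a table object, not a logical

rm(T)                              # drop the mask; lookup falls back to base
after <- isTRUE(T)                 # T is the logical value again
```

Any code that relies on `T` meaning `TRUE` while the mask is in place would see the table instead, which is why a name such as `T0` is the safer choice.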

Now, since you did not provide the desired output, I am guessing at what you might need.

newfreq <- function(data,var2){
  T0<-table(data[["Q7_1"]],data[[var2]])
  T1<-as.data.frame.matrix(T0)
  T1[,"Industry"]<-row.names(T1)
  T1
}

multi_choice<- function(data,var1){

  lst1 <- lapply(names(data[,2:ncol(data)]), function(x) newfreq(data,x))
  lst1 <- lst1[!sapply(lst1, is.null)]

  merge.all <- function(x, y) {
    merge(x, y, all = TRUE, by = "Industry")
  }

  T3 <- Reduce(merge.all, lst1)

  T3[,"N"]<- apply(T3[,2:ncol(T3)],1,max)

  T4<-rbind(c("All",colSums(T3[,2:ncol(T3)])),T3)  


  T4[,2:ncol(T4)]<- sapply(T4[,2:ncol(T4)],as.numeric)

  for(col in names(T4)[c(-1,-ncol(T4))]){
    T4[col]=(T4[col]*100)/(T4[,ncol(T4)])

  }

  for(t in names(T4)[c(-1,-ncol(T4))]){
    T4[t]=ifelse(T4[,ncol(T4)]<5,"--",paste(round(T4[,t],0),"%"))}

  T4
} 

> multi_choice(Q21,"Q21_1")
                     Industry Too expensive Inflexible Outdated Wrong tools Low utilization rates  N
1                         All          50 %       67 %     83 %        75 %                  42 % 12
2  Banking/Financial Services            --         --       --          --                    --  1
3                   Chemicals            --         --       --          --                    --  1
4              Consumer Goods            --         --       --          --                    --  1
5                      Energy            --         --       --          --                    --  0
6                   High Tech            --         --       --          --                    --  2
7       Insurance/Reinsurance            --         --       --          --                    --  1
8               Life Sciences            --         --       --          --                    --  0
9                   Logistics            --         --       --          --                    --  1
10            Mining & Metals            --         --       --          --                    --  1
11        Other Manufacturing            --         --       --          --                    --  0
12    Other Non-Manufacturing            --         --       --          --                    --  1
13         Retail & Wholesale            --         --       --          --                    --  0
14   Services (Non-Financial)            --         --       --          --                    --  2
15   Transportation Equipment            --         --       --          --                    --  1
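The change that matters in the code above is most likely the anonymous function passed to `lapply`. `lapply(x, f)` calls `f` with each element of `x` as its first argument, so in the question's version each column name was bound to `data` and the remaining arguments were left missing. The sketch below reproduces this on a small made-up data frame and then combines the per-column tables with `Reduce` and `merge` in the same way. The names `toy`, `grp`, `q_a`, `q_b` and `count_by` are illustrative assumptions, not part of the question.

```r
# Hypothetical toy data: one grouping column and two single-label answer columns.
toy <- data.frame(
  grp = factor(c("A", "A", "B", "B", "C")),
  q_a = factor(c("Cost", NA, "Cost", NA, NA)),
  q_b = factor(c(NA, "Speed", "Speed", NA, "Speed"))
)

# Cross-tabulate the grouping column against one answer column
# and keep the group labels as a regular column for merging.
count_by <- function(data, group_col, answer_col) {
  tab <- as.data.frame.matrix(table(data[[group_col]], data[[answer_col]]))
  tab[["grp"]] <- row.names(tab)
  tab
}

# Passing count_by directly binds each column name to `data`,
# so the call fails because the other arguments are missing.
direct <- tryCatch(
  lapply(c("q_a", "q_b"), count_by),
  error = function(e) "error"
)

# Wrapping the call makes explicit which argument each column name fills.
tables <- lapply(c("q_a", "q_b"), function(col) count_by(toy, "grp", col))

# Merge the per-column tables pairwise on the shared key column.
combined <- Reduce(function(x, y) merge(x, y, by = "grp", all = TRUE), tables)
```

Here `combined` has one row per level of `grp` and one count column per answer column, which is the shape the percentage and suppression steps in `multi_choice` then operate on.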

If the result above is not the output you want, please provide one. Hope this helps.