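I am writing a function inside which I need to define other functions. I am trying to repeat a calculation for every column that I transform inside the function.
It looks like this: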
multi_choice<- function(data,var1){
  newfreq <- function(data,var1,var2){
    T<-table(data[[var1]],data[[var2]])
    T1<-as.data.frame.matrix(T)
    T1[,"Industry"]<-row.names(T1)
    T1
  }
  lst1 <- lapply(names(Q21[,2:ncol(Q21)]), newfreq)
  lst1 <- lst1[!sapply(lst1, is.null)]
  merge.all <- function(x, y) {
    merge(x, y, all = TRUE, by = "Industry")
  }
  T3 <- Reduce(merge.all, lst1)
  T3[,"N"]<- apply(T3[,2:ncol(T3)],1,max)
  T4<-rbind(c("All",colSums(T3[,2:ncol(T3)])),T3)
  T4[,2:ncol(T4)]<- sapply(T4[,2:ncol(T4)],as.numeric)
  for(col in names(T4)[c(-1,-ncol(T4))]){
    T4[col]=(T4[col]*100)/(T4[,ncol(T4)])
  }
  for(t in names(T4)[c(-1,-ncol(T4))]){
    T4[t]=ifelse(T4[,ncol(T4)]<5,"--",paste(round(T4[,t],0),"%"))
  }
  T4
}
I created the function "newfreq" to perform the calculation on all columns of Q21. For example, when I run it on a single column, Q21_1, it gives:
> newfreq(Q21,"Q7_1","Q21_1")
                           Too expensive                   Industry
Banking/Financial Services             0 Banking/Financial Services
Chemicals                              0                  Chemicals
Consumer Goods                         0             Consumer Goods
Energy                                 0                     Energy
High Tech                              1                  High Tech
Insurance/Reinsurance                  0      Insurance/Reinsurance
Life Sciences                          0              Life Sciences
Logistics                              0                  Logistics
Mining & Metals                        1            Mining & Metals
Other Manufacturing                    0        Other Manufacturing
Other Non-Manufacturing                1    Other Non-Manufacturing
Retail & Wholesale                     0         Retail & Wholesale
Services (Non-Financial)               2   Services (Non-Financial)
Transportation Equipment               1   Transportation Equipment
>
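Every operation inside it works when I test it with specific values, but the function as a whole gives an error. Any ideas on how to fix it and make it more compact?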
dput(Q21)
structure(list(Q7_1 = structure(c(5L, 5L, 14L, 1L, 9L, 13L, 1L,
3L, 13L, 13L, 13L, 12L, 2L, 11L, 13L, 10L, 11L, 1L, 4L, 5L, 5L,
4L, 5L, 9L, 2L, 4L, 13L, 10L, 13L, 13L, 11L, 1L, 11L, 5L, NA,
1L, 9L, 3L, 1L, 5L, NA, 2L, NA, 6L, 14L, NA, NA, 14L, 8L, 11L,
8L, 12L, 13L, NA, 3L, 11L, 11L, NA, 10L, 6L, 5L, 13L, 13L), .Label = c("Banking/Financial Services",
"Chemicals", "Consumer Goods", "Energy", "High Tech", "Insurance/Reinsurance",
"Life Sciences", "Logistics", "Mining & Metals", "Other Manufacturing",
"Other Non-Manufacturing", "Retail & Wholesale", "Services (Non-Financial)",
"Transportation Equipment"), class = "factor"), Q21_1 = structure(c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
1L, NA, NA, 1L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1L, NA,
NA, NA, NA, 1L, NA, NA, NA, 1L, NA, NA, NA, NA, 1L, NA), .Label = "Too expensive", class = "factor"),
Q21_4 = structure(c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, 1L, NA, 1L, NA, NA, NA, NA,
NA, 1L, NA, NA, NA, NA, NA, 1L, NA, NA, NA, NA, 1L, NA, NA,
NA, 1L, NA, NA, NA, 1L, 1L, NA), .Label = "Inflexible", class = "factor"),
Q21_5 = structure(c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, 1L, NA, 1L, NA, 1L, NA, NA,
NA, 1L, NA, NA, NA, NA, NA, 1L, 1L, NA, NA, NA, 1L, NA, NA,
NA, 1L, NA, NA, 1L, NA, 1L, NA), .Label = "Outdated", class = "factor"),
Q21_6 = structure(c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, 1L, NA, NA, 1L, 1L, NA, NA,
NA, NA, NA, NA, NA, NA, NA, 1L, 1L, NA, NA, NA, 1L, NA, NA,
NA, 1L, NA, NA, NA, 1L, 1L, NA), .Label = "Wrong tools", class = "factor"),
Q21_7 = structure(c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, 1L, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, 1L, NA, NA, NA, NA, 1L, NA, NA,
NA, 1L, NA, NA, 1L, NA, NA, NA), .Label = "Low utilization rates", class = "factor")), .Names = c("Q7_1",
"Q21_1", "Q21_4", "Q21_5", "Q21_6", "Q21_7"), class = c("data.table",
"data.frame"), row.names = c(NA, -63L))
Answer 0 (score: 0)
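Two suggestions. a) You can define newfreq in the global environment instead of inside the multi_choice function. b) Avoid naming a variable "T": in R, T is a predefined shorthand for the logical value TRUE, so assigning to it masks that shorthand and can lead to confusing bugs.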
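To illustrate point b), here is a minimal sketch showing that T is an ordinary variable that can be overwritten, unlike the reserved word TRUE. The toy vector and the names masked and restored are my own, chosen only for this example.

```r
# T and F are ordinary variables that base R binds to TRUE and FALSE;
# unlike TRUE and FALSE, they are not reserved words.
before <- isTRUE(T)              # T still refers to base R's TRUE here

# Assigning a table to T (as the original newfreq does) masks the shorthand.
T <- table(c("a", "b", "a"))
masked <- isTRUE(T)              # T is now a table, so this is FALSE

# Removing our own T makes base R's T visible again.
rm(T)
restored <- isTRUE(T)
```

Any later code in the same scope that relies on T meaning TRUE would silently receive the table instead, which is why a name such as T0 is safer.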
Now, since you did not provide the desired output, I am guessing at what you might need.
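newfreq <- function(data,var2){
  # Cross-tabulate industry (Q7_1) against one answer column
  T0<-table(data[["Q7_1"]],data[[var2]])
  T1<-as.data.frame.matrix(T0)
  T1[,"Industry"]<-row.names(T1)
  T1
}
multi_choice<- function(data,var1){
  # Note: var1 is not used in the body
  # Build one frequency table per column after the first (Q7_1)
  lst1 <- lapply(names(data[,2:ncol(data)]), function(x) newfreq(data,x))
  lst1 <- lst1[!sapply(lst1, is.null)]
  merge.all <- function(x, y) {
    merge(x, y, all = TRUE, by = "Industry")
  }
  # Merge the per-column tables into one wide table keyed by Industry
  T3 <- Reduce(merge.all, lst1)
  # N is the largest count in each row
  T3[,"N"]<- apply(T3[,2:ncol(T3)],1,max)
  # Prepend an "All" row of column totals, then restore numeric columns
  T4<-rbind(c("All",colSums(T3[,2:ncol(T3)])),T3)
  T4[,2:ncol(T4)]<- sapply(T4[,2:ncol(T4)],as.numeric)
  # Convert counts to percentages of N
  for(col in names(T4)[c(-1,-ncol(T4))]){
    T4[col]=(T4[col]*100)/(T4[,ncol(T4)])
  }
  # Show "--" where N is below 5, otherwise the rounded percentage
  for(t in names(T4)[c(-1,-ncol(T4))]){
    T4[t]=ifelse(T4[,ncol(T4)]<5,"--",paste(round(T4[,t],0),"%"))
  }
  T4
}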
> multi_choice(Q21,"Q21_1")
                     Industry Too expensive Inflexible Outdated Wrong tools Low utilization rates  N
1                         All          50 %       67 %     83 %        75 %                  42 % 12
2  Banking/Financial Services            --         --       --          --                    --  1
3                   Chemicals            --         --       --          --                    --  1
4              Consumer Goods            --         --       --          --                    --  1
5                      Energy            --         --       --          --                    --  0
6                   High Tech            --         --       --          --                    --  2
7       Insurance/Reinsurance            --         --       --          --                    --  1
8               Life Sciences            --         --       --          --                    --  0
9                   Logistics            --         --       --          --                    --  1
10            Mining & Metals            --         --       --          --                    --  1
11        Other Manufacturing            --         --       --          --                    --  0
12    Other Non-Manufacturing            --         --       --          --                    --  1
13         Retail & Wholesale            --         --       --          --                    --  0
14   Services (Non-Financial)            --         --       --          --                    --  2
15   Transportation Equipment            --         --       --          --                    --  1
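Since you also asked for something more compact, here is a minimal sketch of the same lapply-then-Reduce pattern on made-up data. The toy data frame, its column names and the helper count_by_group are hypothetical and only serve as an illustration. The key point is that lapply() hands each column name to the function as a single argument, so the remaining arguments have to be supplied through a wrapper; passing newfreq directly to lapply, as in your original code, is most likely what triggered the error.

```r
# Hypothetical toy data: one grouping column plus two tick-box columns,
# where NA means the box was not ticked (same shape as Q21).
toy <- data.frame(
  group = factor(c("A", "A", "B", "B", "B")),
  q1    = factor(c("Yes", NA, "Yes", "Yes", NA), levels = "Yes"),
  q2    = factor(c(NA, "No", "No", NA, NA), levels = "No")
)

# Cross-tabulate one answer column against the grouping column.
count_by_group <- function(data, group_var, answer_var) {
  counts <- as.data.frame.matrix(table(data[[group_var]], data[[answer_var]]))
  counts[["Group"]] <- row.names(counts)
  counts
}

# Every column except the grouping column is an answer column.
answer_vars <- setdiff(names(toy), "group")

# lapply() passes only the column name, so the anonymous function
# supplies the data and the grouping column explicitly.
tables <- lapply(answer_vars, function(v) count_by_group(toy, "group", v))

# Fold the per-column tables into one wide table keyed by "Group".
wide <- Reduce(function(x, y) merge(x, y, by = "Group", all = TRUE), tables)
```

An equivalent call without the anonymous function would be lapply(answer_vars, count_by_group, data = toy, group_var = "group"), because named arguments are matched first and each column name then fills the remaining answer_var parameter.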
If the multi_choice output above is not what you want, please post the output you expect. Hope this helps.