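I pulled a CSV file from a URL and cleaned it up. I am now trying to write a function that takes two inputs, a vector and a number, and reports the percentage of values in the vector that are below that number. The vector will be dfStates$jul2011, and the number will be the mean of dfStates$jul2011. Can anyone point me in the right direction for writing this function?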
readStates <- read.csv(url("http://www2.census.gov/programs-surveys/popest/tables/2010-2011/state/totals/nst-est2011-01.csv")) #Assign readStates
readStates[6:10] <- list(NULL) #Remove Variables x.4-x.8
readStates <- readStates[-c(1:8, 60:67),] #Remove Rows
colnames(readStates)[1:5] <- c("stateName", "base2010", "base2011", "jul2010", "jul2011") #Fix Column Names
dfStates <- as.data.frame(sapply(readStates,gsub,pattern="\\.",replacement="")) #Remove Periods
dfStates <- as.data.frame(sapply(dfStates,gsub,pattern=",",replacement="")) #Remove Commas
dfStates$base2010 <- as.numeric(as.character(dfStates$base2010)) #Set as Numeric
dfStates$base2011 <- as.numeric(as.character(dfStates$base2011)) #Set as Numeric
dfStates$jul2010 <- as.numeric(as.character(dfStates$jul2010)) #Set as Numeric
dfStates$jul2011 <- as.numeric(as.character(dfStates$jul2011)) #Set as Numeric
Answer 0 (score: 0)
You can take the mean of your vector:
me<-mean(dfStates$jul2011)
Then copy the values into a separate vector, keep only those below the mean, and count them. The percentage follows from the two counts.
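temp_vec <- dfStates$jul2011 #Copy the vector
len <- length(temp_vec) #Total number of values
temp_vec2 <- temp_vec[temp_vec < me] #Keep only values below the mean
len_temp_vec2 <- length(temp_vec2) #Count of values below the mean
per <- (len_temp_vec2 / len) * 100 #Percentage of values below the mean
per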
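Since the question asks for a function of a vector and a number, the steps above could be wrapped up roughly as follows. This is only a sketch: the name percent_below and the choice to ignore NA values are assumptions of mine, not something from the original post.

```r
# Hypothetical helper (name and NA handling are assumptions, not from the post):
# returns the percentage of values in `x` that are strictly below `threshold`.
percent_below <- function(x, threshold) {
  stopifnot(is.numeric(x), is.numeric(threshold), length(threshold) == 1)
  # `x < threshold` is a logical vector; its mean is the share of TRUE values.
  # na.rm = TRUE drops missing values from both the count and the total.
  mean(x < threshold, na.rm = TRUE) * 100
}

# Small made-up vector to check the behaviour
sample_values <- c(10, 20, 30, 40)
percent_below(sample_values, mean(sample_values)) # mean is 25; 2 of 4 values are below -> 50
```

With the cleaned data from the question, the call would presumably be percent_below(dfStates$jul2011, mean(dfStates$jul2011)). Taking the mean of the logical comparison avoids the intermediate vectors and length calls, but it should give the same result as the step-by-step version whenever the column has no missing values.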