I have a data frame that looks like this, but it is large, so I can't do anything by hand:
Bank Country KeyItem Year Value
A AU Income 2010 1000
A AU Income 2011 1130
A AU Income 2012 1160
B USA Depth 2010 10000
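What I want is a function where I can choose the Bank, the KeyItem, and the starting year, and which returns a data frame whose values are expressed as a percentage of the first value. Like this: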
Bank Country KeyItem Year Value
A AU Income 2010 100
A AU Income 2011 113
A AU Income 2012 116
Thanks in advance!
Answer 0 (score: 3)
Here is a data.table solution, which should be fast and memory-efficient.
DF <- read.table(text="Bank Country KeyItem Year Value
A AU Income 2010 1000
A AU Income 2011 1130
A AU Income 2012 1160
B USA Depth 2010 10000", header=TRUE, stringsAsFactors=FALSE)
library(data.table)
DT <- as.data.table(DF)
setkey(DT, Bank, KeyItem, Year)
DT[J("A", "Income")] #all entries where Bank is "A", and KeyItem is "Income"
DT[J("A", "Income")][Year >= 2010] #only those with year >= your year
DT[J("A", "Income")][Year >= 2010, Value/Value[1]] # result as vector
DT[J("A", "Income")][Year >= 2010, list(Value/Value[1])] # result as data.table
> DT[J("A", "Income")][Year >= 2010, pct := Value/Value[1]][] # result as data.table with all columns; the trailing [] prints it, and pct is added to the joined copy, not to DT
Bank KeyItem Country Year Value pct
1: A Income AU 2010 1000 1.00
2: A Income AU 2011 1130 1.13
3: A Income AU 2012 1160 1.16
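The steps above can be wrapped into the function the question asks for. The following is only a sketch: the name `index_to_first` and its arguments are made up for illustration, it filters with ordinary logical conditions instead of the keyed `J()` join, and it multiplies by 100 so the first value becomes 100, as in the desired output.

```r
library(data.table)

# Sample data from the question, built directly as a data.table.
DT <- data.table(
  Bank    = c("A", "A", "A", "B"),
  Country = c("AU", "AU", "AU", "USA"),
  KeyItem = c("Income", "Income", "Income", "Depth"),
  Year    = c(2010, 2011, 2012, 2010),
  Value   = c(1000, 1130, 1160, 10000)
)

# Hypothetical helper (not part of the original answer).
index_to_first <- function(dt, bank, key_item, from_year) {
  # Subsetting returns a new table, so the caller's table is left untouched.
  # Sorting by Year makes Value[1] the earliest remaining year.
  out <- dt[Bank == bank & KeyItem == key_item & Year >= from_year][order(Year)]
  # Rescale so that the first remaining value equals 100.
  out[, Value := 100 * Value / Value[1]]
  out[]
}

index_to_first(DT, "A", "Income", 2010)
```

Because the helper works on a subset, `DT` itself keeps its original values.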
Answer 1 (score: 2)
I turn to the plyr package for exactly this kind of task:
library( "plyr" )
ddply( df, c("Bank", "KeyItem"), function(x) {
base <- x[ min( x$Year ) == x$Year, "Value" ]
x$Value <- 100 * x$Value / base
return( x[ , c("Country", "Year", "Value") ] )
})
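The call above rescales every Bank/KeyItem group from its earliest year and does not take a starting year. As a rough sketch of how a starting year could be added, here is a variant that swaps plyr for base R's `ave()`; the function name `index_all_groups` and the `from_year` argument are assumptions for illustration, not part of the answer.

```r
# Sample data from the question.
df <- data.frame(
  Bank    = c("A", "A", "A", "B"),
  Country = c("AU", "AU", "AU", "USA"),
  KeyItem = c("Income", "Income", "Income", "Depth"),
  Year    = c(2010, 2011, 2012, 2010),
  Value   = c(1000, 1130, 1160, 10000),
  stringsAsFactors = FALSE
)

# Hypothetical helper: index every Bank/KeyItem group to 100 at its first
# year on or after `from_year`.
index_all_groups <- function(df, from_year) {
  recent <- df[df$Year >= from_year, ]
  # Sort so that the first element of each group is its earliest year.
  recent <- recent[order(recent$Bank, recent$KeyItem, recent$Year), ]
  # ave() applies FUN within each Bank/KeyItem group and keeps row order.
  recent$Value <- ave(as.numeric(recent$Value), recent$Bank, recent$KeyItem,
                      FUN = function(v) 100 * v / v[1])
  recent
}

index_all_groups(df, 2010)
```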
Answer 2 (score: 2)
Try the following (df is your data frame):
Selection criteria:
bank <- "A"
keyItem <- "Income"
year <- 2011
Create the subset:
dat <- subset(df, Bank == bank & KeyItem == keyItem & Year >= year)
Compute the percentages:
dat$Value <- dat$Value / dat$Value[1] * 100
As a function:
myfun <- function(df, bank, keyItem, year) {
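# keep only the requested bank and key item, from the chosen year onward
dat <- df[df$Bank == bank & df$KeyItem == keyItem & df$Year >= year, ]
# express each value as a percentage of the first remaining value
dat$Value <- dat$Value / dat$Value[1] * 100
dat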
}
myfun(df, "A", "Income", 2011)
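The function above takes `dat$Value[1]` as the base, which is the starting year's value only if the rows are already sorted by Year, and it returns NA values silently when nothing matches. A slightly more defensive sketch follows; the name `index_from_year` and the error message are made up for illustration.

```r
# Sample data from the question.
df <- read.table(text = "Bank Country KeyItem Year Value
A AU Income 2010 1000
A AU Income 2011 1130
A AU Income 2012 1160
B USA Depth 2010 10000", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical defensive variant of the function above.
index_from_year <- function(df, bank, keyItem, year) {
  dat <- df[df$Bank == bank & df$KeyItem == keyItem & df$Year >= year, ]
  # Fail loudly instead of returning an empty or NA-filled result.
  if (nrow(dat) == 0) {
    stop("no rows match the requested Bank, KeyItem and start year")
  }
  # Sort by Year so the first row really is the base year.
  dat <- dat[order(dat$Year), ]
  dat$Value <- dat$Value / dat$Value[1] * 100
  dat
}

index_from_year(df, "A", "Income", 2011)
```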