我遇到了一个棘手的问题,试图在一个数据集中多次解决因趋势导致的差异......
我的数据结构如下
x <- read.table(text = "
STA YEAR VALUE
a 1968 457
a 1970 565
a 1972 489
a 1974 500
a 1976 700
a 1978 650
a 1980 659
b 1968 457
b 1970 565
b 1972 350
b 1974 544
b 1976 678
b 1978 650
b 1980 690
c 1968 457
c 1970 565
c 1972 500
c 1974 600
c 1976 678
c 1978 670
c 1980 750 " , header = T)
我试图返回类似这样的内容
STA R-sq
a n1
b n2
c n3
其中n#是原始集合中位置数据的相应r平方值....
我试过了
fit <- lm(VALUE ~ YEAR + STA, data = x)
为主数据集中的VALUE提供多年来每个站的VALUE年度趋势模型....
任何帮助都将非常感谢....我真的很难过这个,我知道这只是对R问题的熟悉。
答案 0 :(得分:2)
要为每组VALUE
获得YEAR
〜STA
的r平方,您可以执行此previous answer,稍微修改并插入您的值:
# assuming x is your data frame (make sure you don't have Hmisc loaded, it will interfere)
models_x <- dlply(x, "STA", function(df)
summary(lm(VALUE ~ YEAR, data = df)))
# extract the r.squared values
rsqds <- ldply(1:length(models_x), function(x) models_x[[x]]$r.squared)
# give names to rows and col
rownames(rsqds) <- unique(x$STA)
colnames(rsqds) <- "rsq"
# have a look
rsqds
rsq
a 0.6286064
b 0.5450413
c 0.8806604
编辑:根据mnel的建议,这里有更有效的方法将r平方值转换为漂亮的表格(无需添加行和列名称):
# starting with models_x from above
rsqds <- data.frame(rsq =sapply(models_x, '[[', 'r.squared'))
# starting with just the original data in x, this is great:
rsqds <- ddply(x, "STA", summarize, rsq = summary(lm(VALUE ~ YEAR))$r.squared)
STA rsq
1 a 0.6286064
2 b 0.5450413
3 c 0.8806604
答案 1 :(得分:1)
#first load the data.table package
library(data.table)
#transform your dataframe to a datatable (I'm using your example)
x<- as.data.table(x)
#calculate all the metrics needed (r^2, F-distribution and so on)
x[,list(r2=summary(lm(VALUE~YEAR))$r.squared ,
f=summary(lm(VALUE~YEAR))$fstatistic[1] ),by=STA]
STA r2 f
1: a 0.6286064 8.462807
2: b 0.5450413 5.990009
3: c 0.8806604 36.897258
答案 2 :(得分:0)
只有一个r平方值,而不是三个..请编辑你的问题
# store the output
y <- summary( lm( VALUE ~ YEAR + STA , data = x ) )
# access the attributes of `y`
attributes( y )
y$r.squared
y$adj.r.squared
y$coefficients
y$coefficients[,1]
# or are you looking to run three separate
# lm() functions on 'a' 'b' and 'c' ..where this would be the first?
y <- summary( lm( VALUE ~ YEAR , data = x[ x$STA %in% 'a' , ] ) )
# access the attributes of `y`
attributes( y )
y$r.squared
y$adj.r.squared
y$coefficients
y$coefficients[,1]