Group-wise R-squared in linear regression

Time: 2018-03-12 18:16:57

Tags: r subset linear-regression lm

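I computed a linear regression using all the elements of my dataset (24), and the resulting model is IP2. Now I want to know how well that model fits for each individual country in my dataset (R-squared only; I am not interested in the slope or intercept). The ugly way to do it is below, and I would need to repeat it 200 times: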

Country <- c("A","A","A","A","A","A","A","A","A","A","A","A","B","B","B","B","B","B","B","B","B","B","B","B")
IP <- c(55,56,59,63,67,69,69,73,74,74,79,87,0,22,24,26,26,31,37,41,43,46,46,47)
IP2 <- c(46,47,49,50,53,55,53,57,60,57,58,63,0,19,20,21,22,25,26,28,29,30,31,31)
summary(lm(IP[Country=="A"] ~ IP2[Country=="A"]))
summary(lm(IP[Country=="B"] ~ IP2[Country=="B"]))

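Is there a way to compute both R-squared values at once? I tried "Linear Regression and group by in R" as well as some other posts ("Fitting several regression models with dplyr"), but it did not work: I get the same coefficients for all four groups I am working with. Any ideas about what I am doing wrong or how to solve the problem? Thanks.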

2 answers:

Answer 0: (score: 1)

A couple of options in base R:

c(by(data.frame(IP, IP2), Country, function(x) summary(lm(x))$r.sq))
#         A         B 
# 0.9451881 0.9496636 

sapply(split(data.frame(IP, IP2), Country), function(x) summary(lm(x))$r.sq)
#         A         B 
# 0.9451881 0.9496636 
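Both options above call lm() on a data frame, in which case the first column (IP) is treated as the response and the remaining columns as predictors. As a minimal sketch of the same idea with the formula written out explicitly, the data can be collected into one data frame and each subset passed through the data argument. The names dat and r2_by_country are illustrative and not part of the original answer.

```r
# Same data as in the question
Country <- rep(c("A", "B"), each = 12)
IP  <- c(55,56,59,63,67,69,69,73,74,74,79,87,0,22,24,26,26,31,37,41,43,46,46,47)
IP2 <- c(46,47,49,50,53,55,53,57,60,57,58,63,0,19,20,21,22,25,26,28,29,30,31,31)

# 'dat' is an illustrative name for the combined data frame
dat <- data.frame(Country, IP, IP2)

# Fit one model per country. data = d makes the formula look up IP and IP2
# inside the current subset rather than in the global vectors.
r2_by_country <- sapply(
  split(dat, dat$Country),
  function(d) summary(lm(IP ~ IP2, data = d))$r.squared
)

r2_by_country
#         A         B 
# 0.9451881 0.9496636 
```

Passing data = d is what keeps the groups apart. If the formula instead resolved to the full-length vectors in the global environment, every group would be fitted on all 24 rows, which may be one reason for getting identical coefficients in every group.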

Answer 1: (score: 0)

You can do this with the split function followed by mapply.

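  • split takes a vector and turns it into a list of k elements, where k is (in this case) the number of distinct levels of Country.
  • mapply lets us loop over several inputs in parallel.
  • getR2 is a simple function that takes two inputs, fits a model, and extracts the R^2 value.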
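To make the first two points concrete, here is a small toy illustration that is separate from the regression data. The vectors x and g and the names parts and sums_of_squares are made up for this sketch.

```r
# Toy inputs (hypothetical, only to show the mechanics)
x <- c(1, 2, 3, 4)
g <- c("a", "a", "b", "b")

# split() returns a named list with one element per level of g
parts <- split(x, g)
parts$a  # 1 2
parts$b  # 3 4

# mapply() walks the two lists in parallel: first the "a" pair, then the "b" pair
sums_of_squares <- mapply(function(u, v) sum(u * v), parts, parts)
sums_of_squares
#  a  b 
#  5 25 
```

Because the result keeps the list names, the output of mapply is already labelled by group.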

Code example below:

Country <- c("A","A","A","A","A","A","A","A","A","A","A","A","B","B","B","B","B","B","B","B","B","B","B","B")
IP <- c(55,56,59,63,67,69,69,73,74,74,79,87,0,22,24,26,26,31,37,41,43,46,46,47)
IP2 <- c(46,47,49,50,53,55,53,57,60,57,58,63,0,19,20,21,22,25,26,28,29,30,31,31)

ip_split = split(IP,Country)
ip2_split = split(IP2,Country)

getR2 = function(ip,ip2){
  model = lm(ip~ip2)
  return(summary(model)$r.squared)
}

r2.values = mapply(getR2,ip_split,ip2_split)

r2.values
#>         A         B 
#> 0.9451881 0.9496636
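Since only R-squared is needed, and each model has a single predictor plus an intercept, R^2 equals the squared Pearson correlation between the two variables. A minimal sketch of that shortcut, which avoids fitting the models at all (r2_cor is an illustrative name, and this is an alternative to getR2 rather than part of the original answer):

```r
# Same data as above
Country <- rep(c("A", "B"), each = 12)
IP  <- c(55,56,59,63,67,69,69,73,74,74,79,87,0,22,24,26,26,31,37,41,43,46,46,47)
IP2 <- c(46,47,49,50,53,55,53,57,60,57,58,63,0,19,20,21,22,25,26,28,29,30,31,31)

# Squared correlation per country; valid only for one predictor with an intercept
r2_cor <- mapply(function(ip, ip2) cor(ip, ip2)^2,
                 split(IP, Country), split(IP2, Country))

r2_cor
#         A         B 
# 0.9451881 0.9496636 
```

This shortcut does not carry over to models with several predictors or without an intercept; in those cases the summary(lm(...))$r.squared route is the one to use.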