Error in terms.formula: '.' in formula and no 'data' argument for linear regression

Time: 2018-07-08 09:13:12

Tags: r regression linear-regression lm

I am writing some simple code for multiple linear regression in R. The code is below.

dataset$State = factor (dataset$State, 
                        levels = c ('New York','California','Florida'), 
                        labels = c ('1','2','3') ) 
#Splitting the dataset 
library(caTools) 
set.seed(123) 
split = sample.split(dataset$Profit, SplitRatio = 0.8) 
training_set = subset(dataset$Profit, split == TRUE)
test_set = subset(dataset$Profit, split == FALSE) 

#Fitting Multiple Linear Regression to the Training set 
regressor = lm(formula = Profit ~ ., data = training_set)

But running it produces this error:

Error in terms.formula(formula, data = data) : 
  '.' in formula and no 'data' argument

Why does it give this error?


Dataset: https://drive.google.com/drive/folders/1M5HAKs1s2ABYMEzVYMwWUaATlCw2ayZC?usp=sharing

1 answer:

Answer 0 (score: 1)

Thanks, I can reproduce this. The lines responsible are:

training_set = subset(dataset$Profit, split == TRUE)
test_set = subset(dataset$Profit, split == FALSE)

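The problem is in the subsetting. subset(dataset$Profit, ...) keeps only the Profit column, so training_set is a plain numeric vector rather than a data frame, and the '.' in the formula has no columns to expand to. Subset the whole data frame instead by replacing those two lines with: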

training_set = subset(dataset, subset = split)
test_set = subset(dataset, subset = !split)

With training_set being a data frame again, the fit runs:

lm(formula = Profit ~ ., data = training_set)

#Call:
#lm(formula = Profit ~ ., data = training_set)
#
#Coefficients:
#    (Intercept)        R.D.Spend   Administration  Marketing.Spend  
#      4.965e+04        7.986e-01       -2.942e-02        3.268e-02  
#         State2           State3  
#      1.213e+02        2.376e+02
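
To see the difference between the two subsetting calls without downloading the dataset, here is a minimal sketch. Everything in it is an assumption made for illustration: the data frame is synthetic (made-up values, with column names copied from the output above), and a base-R sample() call stands in for caTools::sample.split so the snippet has no package dependency.

```r
# Synthetic stand-in for the asker's dataset; the values are made up.
set.seed(123)
n <- 50
dataset <- data.frame(
  R.D.Spend       = runif(n, 0, 150000),
  Administration  = runif(n, 50000, 180000),
  Marketing.Spend = runif(n, 0, 450000),
  State           = factor(sample(c("1", "2", "3"), n, replace = TRUE))
)
dataset$Profit <- 50000 + 0.8 * dataset$R.D.Spend + rnorm(n, sd = 5000)

# Base-R substitute for caTools::sample.split (an assumption for this sketch):
# a logical vector that marks 80% of the rows as training rows.
split <- seq_len(n) %in% sample(n, size = 0.8 * n)

# Subsetting a single column returns a plain numeric vector.
bad_train <- subset(dataset$Profit, split == TRUE)
print(class(bad_train))

# Subsetting the data frame keeps all columns.
good_train <- subset(dataset, subset = split)
print(class(good_train))

# With a vector as `data`, '.' has no columns to expand to.
# tryCatch returns the condition object instead of stopping the script.
bad_fit <- tryCatch(lm(Profit ~ ., data = bad_train), error = function(e) e)
print(inherits(bad_fit, "error"))

# With a data frame, '.' expands to every column except Profit.
good_fit <- lm(Profit ~ ., data = good_train)
print(names(coef(good_fit)))
```

Checking class(training_set) or str(training_set) right after the split is a quick way to catch this kind of mistake before calling lm.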
