I have to perform a nonlinear multiple regression with data that looks like the following:
ID  Customer  Country  Industry    Machine-type  Service hours
1   A         China    mass        A1            120
2   B         Europe   customized  A2            400
3   C         US       mass        A1            60
4   D         Rus      mass        A3            250
5   A         China    mass        A2            480
6   B         Europe   customized  A1            300
7   C         US       mass        A4            250
8   D         Rus      customized  A2            260
9   A         China    Customized  A2            310
10  B         Europe   mass        A1            110
11  C         US       Customized  A4            40
12  D         Rus      customized  A2            80
Dependent variable: Service hours
Independent variables: Customer, Country, Industry, Machine type
I ran a linear regression, but because the assumption of linearity does not hold, I have to perform a nonlinear regression.
I know nonlinear regression can be done with the nls function. How do I add the categorical variables to the nonlinear regression so that I get the statistical summary in R?
Column names after adding dummies:
ID Customer.a Customer.b Customer.c Customer.d Country.China Country.Europe Country.Rus Country.US Industry.customized industry.Customized Industry.mass Machine type.A1 Machine type.A2 Machine type.A3 Service hours
1 1 0 0 0 1 0 0 0 0 0 1 1 0 0 120
2 0 1 0 0 0 1 0 0 1 0 0 0 1 0 400
3 0 0 1 0 0 0 0 1 0 0 1 0 0 1 60
4 0 0 0 1 0 0 1 0 0 0 1 1 0 0 250
5 1 0 0 0 1 0 0 0 1 0 0 0 0 1 480
6 0 1 0 0 0 1 0 0 0 1 0 1 0 0 300
7 0 0 1 0 0 0 0 1 0 0 1 0 0 1 250
8 0 0 0 1 0 0 1 0 1 0 0 0 1 0 260
9 1 0 0 0 1 0 0 0 0 0 1 0 1 0 210
10 0 1 0 0 0 1 0 0 1 0 0 0 1 0 110
11 0 0 1 0 0 0 0 1 0 0 1 0 0 1 40
12 0 0 0 1 0 0 1 0 0 0 1 1 0 0 80
Answer 0 (score: 0)
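How you deal with categorical predictors depends on the number of levels the predictor can take.
For a predictor such as gender that can take only 2 values (male or female), you can simply represent it as a binary (1, 0) variable.
For predictors with more than 2 levels, we use 1-of-k dummy coding, where k is the number of levels that particular variable takes. See the dummies package for helpful functions.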
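To make the coding concrete, here is a minimal sketch that builds the dummy columns with base R's `model.matrix()` instead of the dummies package (a swapped-in alternative, not the method named above). The six-row data frame is a cut-down copy of the question's data, and the names `full` and `reduced` are only illustrative.

```r
# Small illustrative data frame (first six rows of the question's data)
df <- data.frame(
  Country = factor(c("China", "Europe", "US", "Rus", "China", "Europe")),
  Service.hours = c(120, 400, 60, 250, 480, 300)
)

# Full 1-of-k coding: one 0/1 column per level, no intercept column
full <- model.matrix(~ Country - 1, data = df)

# k-1 coding: the first level becomes the reference and is absorbed
# by the intercept column
reduced <- model.matrix(~ Country, data = df)

print(colnames(full))
print(colnames(reduced))
```

The full coding gives one column per level, so each row sums to 1. If all k columns are used together with an intercept, the columns are perfectly collinear, which is why one level is usually dropped as the reference.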
After this, you can fit the model using a formula such as:
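# nls() needs named parameters in the formula and a starting value for each one
nls(Service.hours ~ b0 + b1 * predictor1 + b2 * predictor2 + bN * predictorN, data = df, start = list(b0 = 0, b1 = 0, b2 = 0, bN = 0))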
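As far as I know, `nls()` expects the parameters to be written out in the formula and given starting values, so a runnable version of the call might look like the sketch below. It uses the question's 12 rows with two hand-built dummies. The parameter names `b0`, `b1`, `b2`, the choice of `mass` and `A2` as the indicators, and the lower-casing of "Customized" are all assumptions made for illustration.

```r
# Data from the question; Industry is lower-cased so that "Customized" and
# "customized" count as one level (an assumption about the intended coding)
df <- data.frame(
  Industry = tolower(c("mass", "customized", "mass", "mass", "mass", "customized",
                       "mass", "customized", "Customized", "mass", "Customized",
                       "customized")),
  Machine.type = c("A1", "A2", "A1", "A3", "A2", "A1",
                   "A4", "A2", "A2", "A1", "A4", "A2"),
  Service.hours = c(120, 400, 60, 250, 480, 300, 250, 260, 310, 110, 40, 80)
)

# Hand-built 0/1 dummies; every other level falls into the reference group
df$mass <- as.numeric(df$Industry == "mass")
df$A2   <- as.numeric(df$Machine.type == "A2")

# Named parameters (b0, b1, b2) plus a starting value for each of them
fit <- nls(Service.hours ~ b0 + b1 * mass + b2 * A2,
           data = df,
           start = list(b0 = 0, b1 = 0, b2 = 0))

print(summary(fit))

# This mean function is linear in the parameters, so lm() gives the same estimates
fit_lm <- lm(Service.hours ~ mass + A2, data = df)
print(coef(fit_lm))
```

Because the right-hand side here is still linear in `b0`, `b1` and `b2`, the fit only reproduces `lm()`. To get a genuinely nonlinear regression you would replace it with a nonlinear mean function of the same dummies, for example `exp(b0 + b1 * mass + b2 * A2)`, and pick starting values that suit that form.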