I am building a DataFrame row by row and then running a regression on it. Simplified, the code is:
using DataFrames
using GLM
df = DataFrame(response = Number[])
for i in 1:10
df = vcat(df, DataFrame(response = rand()))
end
fit(LinearModel, @formula(response ~ 1), df)
I get this error:
ERROR: LoadError: MethodError: Cannot `convert` an object of type Array{Number,1} to an object of type GLM.LmResp
This may have arisen from a call to the constructor GLM.LmResp(...),
since type constructors fall back to convert methods.
Stacktrace:
[1] fit(::Type{GLM.LinearModel}, ::Array{Float64,2}, ::Array{Number,1}) at ~/.julia/v0.6/GLM/src/lm.jl:140
[2] #fit#44(::Dict{Any,Any}, ::Array{Any,1}, ::Function, ::Type{GLM.LinearModel}, ::StatsModels.Formula, ::DataFrames.DataFrame) at ~/.julia/v0.6/StatsModels/src/statsmodel.jl:72
[3] fit(::Type{GLM.LinearModel}, ::StatsModels.Formula, ::DataFrames.DataFrame) at ~/.julia/v0.6/StatsModels/src/statsmodel.jl:66
[4] include_from_node1(::String) at ./loading.jl:576
[5] include(::String) at ./sysimg.jl:14
while loading ~/test.jl, in expression starting on line 10
The call to the linear regression is very similar to the regression in "Introducing Julia":
linearmodel = fit(LinearModel, @formula(Y1 ~ X1), anscombe)
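What is wrong?
Answer 0 (score: 0)
After a few hours, I realized that GLM needs a concrete element type, and Number is an abstract type (although the documentation for GLM.LmResp said almost nothing about this at the time of writing, only "Encapsulates the response for a linear model"). The solution is to change the declaration to a concrete type such as Float64: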
using DataFrames
using GLM
df = DataFrame(response = Float64[])
for i in 1:10
df = vcat(df, DataFrame(response = rand()))
end
fit(LinearModel, @formula(response ~ 1), df)
Output:
StatsModels.DataFrameRegressionModel{GLM.LinearModel{GLM.LmResp{Array{Float64,1}},GLM.DensePredChol{Float64,Base.LinAlg.Cholesky{Float64,Array{Float64,2}}}},Array{Float64,2}}
Formula: response ~ +1
Coefficients:
Estimate Std.Error t value Pr(>|t|)
(Intercept) 0.408856 0.0969961 4.21518 0.0023
The type must be concrete. For example, the abstract type Real, as in
df = DataFrame(response = Real[])
fails with a more helpful error message:
ERROR: LoadError: `float` not defined on abstractly-typed arrays; please convert to a more specific type
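To see in advance whether a column would trip this check, one can inspect the element type of the underlying vector. The following is a minimal sketch using only Base and assuming Julia 1.x, where the predicate is isconcretetype (on Julia 0.6, the version in the stack trace above, the equivalent is isleaftype). The names abstract_col and concrete_col are hypothetical stand-ins for the response column.

```julia
# Sketch assuming Julia 1.x; on Julia 0.6 use `isleaftype` instead of `isconcretetype`.
# `abstract_col` and `concrete_col` are hypothetical stand-ins for the response column.
abstract_col = Number[rand() for _ in 1:10]   # element type Number (abstract)
concrete_col = Float64[rand() for _ in 1:10]  # element type Float64 (concrete)

# The values are Float64 in both vectors; only the declared element type differs.
abstract_ok = isconcretetype(eltype(abstract_col))  # false
concrete_ok = isconcretetype(eltype(concrete_col))  # true

println("Number column concrete?  ", abstract_ok)
println("Float64 column concrete? ", concrete_ok)
```

The check looks at the container's declared element type, not at the values stored in it, which is why a Number[] column filled entirely with Float64 values is still rejected.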
Alternatively, you can convert to Real after building the data frame:
using DataFrames
using GLM
df = DataFrame(response = Number[])
for i in 1:10
df = vcat(df, DataFrame(response = rand()))
end
df2 = DataFrame(response = map(Real, df[:response]))
fit(LinearModel, @formula(response ~ 1), df2)
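This works because mapping Real over the column leaves each Float64 value unchanged, and map then gives the result the concrete element type Float64: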
julia> typeof(df2[:response])
Array{Float64,1}
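The narrowing depends on every value in the column already being a Float64. Below is a small sketch of that behavior using only Base and assuming Julia 1.x. The vectors v and mixed are hypothetical stand-ins for the response column, and the broadcast Float64.(v) is offered as an alternative that does not rely on the values sharing one type; it is not part of the original answer.

```julia
# Sketch assuming Julia 1.x and Base only.
# `v` and `mixed` are hypothetical stand-ins for df[:response].
v = Number[0.25, 0.5, 0.75]      # abstract element type, every value is a Float64

narrowed = map(Real, v)          # Real(x) returns x unchanged; map narrows the eltype
explicit = Float64.(v)           # converts each element to Float64 explicitly

println(eltype(narrowed))        # Float64
println(eltype(explicit))        # Float64

mixed = Number[1, 2.5]           # one Int and one Float64
still_abstract = map(Real, mixed)
forced = Float64.(mixed)

println(eltype(still_abstract))  # Real, which is still abstract
println(eltype(forced))          # Float64
```

If the column might hold a mix of integer and floating-point values, the explicit conversion is the safer of the two, since map(Real, ...) would hand GLM an abstractly typed array again.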
I filed an issue with GLM to improve the error message.