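Handling alternative-specific NA values in mlogit

Time: 2019-07-08 17:07:14

Tags: r mlogit

In mode choice models it is common for a variable to vary across alternatives (a "generic variable") yet be undefined for some modes. For example, bus and light rail have a transit fare, but a fare is undefined for driving and cycling. Note that the fare is not zero; it simply does not exist.

I am trying to make this work with the mlogit package for R. In this MWE, I assert that price is undefined for fishing from the beach. This results in a singularity error.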

library(mlogit)
#> Warning: package 'mlogit' was built under R version 3.5.2
#> Loading required package: Formula
#> Loading required package: zoo
#> 
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#> 
#>     as.Date, as.Date.numeric
#> Loading required package: lmtest

data("Fishing", package = "mlogit")
Fishing$price.beach <- NA
Fish <- mlogit.data(Fishing, varying = c(2:9), shape = "wide", choice = "mode")
head(Fish)
#>            mode   income     alt   price  catch chid
#> 1.beach   FALSE 7083.332   beach      NA 0.0678    1
#> 1.boat    FALSE 7083.332    boat 157.930 0.2601    1
#> 1.charter  TRUE 7083.332 charter 182.930 0.5391    1
#> 1.pier    FALSE 7083.332    pier 157.930 0.0503    1
#> 2.beach   FALSE 1250.000   beach      NA 0.1049    2
#> 2.boat    FALSE 1250.000    boat  10.534 0.1574    2

mlogit(mode ~ catch + price | income, data = Fish, na.action = na.omit)
#> Error in solve.default(H, g[!fixed]): system is computationally singular: reciprocal condition number = 3.92205e-24

Created on 2019-07-08 by the reprex package (v0.2.1)

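The same error occurs when price is moved to the alternative-specific part of the formula. I suspect the problem lies in the na.action argument, but I cannot find any documentation on it beyond the basic documentation stub:

na.action: a function which indicates what should happen when the data contains NAs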

There do not seem to be any examples showing how this argument is used or what it produces. There is a related unanswered question here.

1 Answer:

Answer 0: (score: 0)

There seem to be a few things going on.

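I am not entirely sure how na.action = na.omit works internally, but it sounds like it drops entire rows. I have always found it better to do this explicitly.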
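To make the row-dropping concrete, here is a minimal base R sketch on a made-up long-format data frame (the `toy` data, its column values, and the `chosen_per_chid` name are assumptions for illustration only). It shows what `na.omit()` does to a plain data frame; whether mlogit applies `na.action` in exactly this way internally is not something I have confirmed.

```r
# Toy long-format data: two choice occasions (chid), three alternatives each.
# All names and values here are made up for illustration.
toy <- data.frame(
  chid  = rep(1:2, each = 3),
  alt   = rep(c("beach", "boat", "pier"), times = 2),
  mode  = c(TRUE, FALSE, FALSE,   # occasion 1: beach was chosen
            FALSE, TRUE, FALSE),  # occasion 2: boat was chosen
  price = c(NA, 10, 12, NA, 20, 25)
)

# na.omit() removes every row that contains at least one NA,
# so both beach rows disappear
kept <- na.omit(toy)

# Number of chosen alternatives left in each choice occasion
chosen_per_chid <- tapply(kept$mode, kept$chid, sum)
print(chosen_per_chid)
```

After the beach rows are gone, the first occasion has no chosen alternative left, while the second still has exactly one.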

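When you drop entire rows, you end up with choice occasions in which no choice was made, namely those where the dropped alternative (beach) was the one chosen. That will not work. Remember, we are dealing with logit-type probabilities. Moreover, if no choice is made, no information is gained, so we need to drop these choice occasions entirely. With both steps combined, I can run the model you propose.

Here is a commented working example: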

library(mlogit)

# Read in the data
data("Fishing", package = "mlogit")

# Set price for the beach option to NA
Fishing$price.beach <- NA

# Scale income
Fishing$income <- Fishing$income / 10000

# Turn into 'mlogit' data
fish <- mlogit.data(Fishing, varying = c(2:9), shape = "wide", choice = "mode")

# Explicitly drop the alts with NA in price
fish <- fish[fish$alt != "beach", ]

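# Dropping the beach rows also means that we now have choice occasions where no
# choice was made and we need to get rid of these as well.
# matrix(..., nrow = 3) puts each choice occasion in one column (three remaining
# alternatives per occasion, rows ordered by occasion), so colSums() counts the
# chosen alternatives per occasion
fish$choice_made <- rep(colSums(matrix(fish$mode, nrow = 3)), each = 3)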

fish <- fish[fish$choice_made == 1, ]

fish <- mlogit.data(fish, shape = "long", alt.var = "alt", choice = "mode")

# Run an MNL model
mnl <- mlogit(mode ~ catch + price | income, data = fish)
summary(mnl)
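The choice_made step in the example relies on every choice occasion having exactly three rows stored contiguously. As a hedged alternative, the sketch below uses base R's `ave()` to count chosen alternatives per occasion by grouping on the occasion id, which does not depend on row order or on a fixed number of alternatives. The `toy` data frame and its values are made up; only the column names `chid`, `alt`, and `mode` mirror the question's output.

```r
# Made-up long-format data with an unequal number of alternatives per occasion
toy <- data.frame(
  chid = c(1, 1, 2, 2, 2, 3, 3),
  alt  = c("boat", "pier", "boat", "charter", "pier", "boat", "charter"),
  mode = c(FALSE, FALSE, FALSE, TRUE, FALSE, TRUE, FALSE)
)

# Count the chosen alternatives within each choice occasion;
# ave() returns the group total repeated on every row of that group
toy$choice_made <- ave(as.integer(toy$mode), toy$chid, FUN = sum)

# Keep only occasions in which exactly one alternative was chosen
toy_kept <- toy[toy$choice_made == 1, ]
print(toy_kept)
```

Here occasion 1 is removed because none of its remaining alternatives was chosen, and occasions 2 and 3 are kept.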

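In general, when working with these models, I find it very useful to do all data transformations before running the model rather than relying on features such as na.action.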