I want to run heckit to correct my results for sample selection bias. Here is my code:
ht <- heckit(participation ~ log(friends + 2) + log(followers + 2) + subjectivity.bnk, delta_polarity ~ subjectivity.bnk, df)
My IV is subjectivity.bnk and my DV is delta_polarity. I use friends and followers as my Heckman instruments. When I run the code, I get the following error:
Error in binaryChoice(formula, ..., userLogLik = loglik, weights = weights) :
the left hand side of the 'formula' has to contain exactly two levels (e.g. FALSE and TRUE)
Here is a sample of my data.
Thanks
Answer 0: (score: 1)
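The observations (rows) that have 0 for participation also have NA for subjectivity.bnk. Rows with missing values are dropped, so when you include subjectivity.bnk in the first part (the selection equation), you end up with only 1 for participation, and the left-hand side no longer has two levels: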
library(sampleSelection)
df = read.delim("ht.csv - ht.csv.tsv",stringsAsFactors=FALSE)
#participation is good
table(df$participation)
  0   1
117 382
# see that those with NA for subjectivity.bnk all have 0 for participation
table(df$participation,is.na(df$subjectivity.bnk))
    FALSE TRUE
  0     0  117
  1   382    0
#this works
ht <- heckit(participation ~ log(friends+2) +
log(followers+2), delta_polarity ~ subjectivity.bnk , df)
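
To see the mechanism without the original file, here is a minimal base-R sketch. The data frame toy is hypothetical: it only mimics the pattern in the tables above (subjectivity.bnk is NA whenever participation is 0). It also assumes that incomplete rows are dropped the way R's default na.action (na.omit) drops them, which is what model.frame does here; it does not call heckit itself.

```r
# Hypothetical toy data: subjectivity.bnk is observed only for participants
set.seed(1)
n <- 20
toy <- data.frame(
  participation = rep(c(0, 1), each = n / 2),
  friends       = rpois(n, 50),
  followers     = rpois(n, 80)
)
toy$subjectivity.bnk <- ifelse(toy$participation == 1, runif(n), NA)

# Selection formula WITH subjectivity.bnk: rows containing NA are dropped
mf_bad <- model.frame(
  participation ~ log(friends + 2) + log(followers + 2) + subjectivity.bnk,
  data = toy
)
table(mf_bad$participation)   # only one level of participation survives

# Selection formula WITHOUT subjectivity.bnk: both levels are kept
mf_ok <- model.frame(
  participation ~ log(friends + 2) + log(followers + 2),
  data = toy
)
table(mf_ok$participation)
```

In the first model frame only the participation == 1 rows remain, so a binary-choice model has nothing to distinguish. In the second, both 0 and 1 are present, which matches the call that works above.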