"Undefined columns selected" when using the General Social Survey in R

Asked: 2016-05-20 17:38:54

Tags: r indexing subset

I am trying to analyze the General Social Survey in R, using a script I found online:

https://github.com/ajdamico/asdfree/blob/master/General%20Social%20Survey/cumulative%20cross-sectional%20-%20analysis%20examples.R

However, I keep getting this message and cannot find a solution:

  

Error in `[.data.frame`(frame, , j, drop = drop) : undefined columns selected

Here is the code I am using, slightly adapted from the link above:

options(digits = 8)
library(foreign) 
library(survey)  
library(memisc)
options( survey.lonely.psu = "adjust")

GSS.CS.file.location <- "http://gss.norc.org/documents/spss/GSS_spss.zip"

tf <- tempfile() ; td <- tempdir()
download.file(GSS.CS.file.location, tf, mode = "wb")
fn <- unzip(tf, exdir = td, overwrite = T)
print( fn[grep("sav$", fn)] )

dat.pov<-as.data.set(spss.system.file(fn[grep("sav$", fn)]))
z <- dat.pov
rm(dat.pov)
gc()
dat.pov <- z
rm(z)
gc()

save(dat.pov, file = "dat.pov.rda")
load("dat.pov.rda")
nrow(dat.pov)
ncol(dat.pov)
head(dat.pov)

Narrow the data frame down to the variables I need:

KeepVars<- c("oversamp", "formwt", "wtssall", "sampcode", "sample", "sex", 
             "age","region","nateduc","nateducy", "nateducz", "natefare", 
             "natefarey","natefarez","race","res16","income","partyid",
             "polviews","educ","degree", "eqwlth","helpful","fair","trust",
             "jobfind","class","rank","satfin", "finalter","finrela","unemp",
             "getahead","parsol","kidssol","helppoor")

dat.pov2 <- dat.pov[,KeepVars]

Any help is appreciated! Thanks.

1 answer:

Answer 0 (score: 3)

If you check whether each of those KeepVars is "in" the column names, you should see where the error is:

> KeepVars %in% names(dat.pov)
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE
[14] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[27]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE

You can even negate the logical vector to select the names that are not %in% the column names:

> KeepVars[ ! KeepVars %in% names(dat.pov)]
[1] "natefare"  "natefarey" "natefarez"
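
The same check can also be written with setdiff(), which returns the requested names that are absent from the column names. Below is a minimal sketch on a toy data frame; the object `toy` and its columns are made up for illustration and merely stand in for dat.pov.

```r
# Toy stand-in for dat.pov; the columns are hypothetical.
toy <- data.frame(sex = c(1, 2), age = c(30, 41), natfare = c(1, 3))

# Names we would like to keep; "natefare" is deliberately misspelled.
keep <- c("sex", "age", "natefare")

# Requested names that do not exist as columns.
missing_vars <- setdiff(keep, names(toy))
print(missing_vars)

# Subset only on the names that do exist, so `[` does not raise
# "undefined columns selected".
toy2 <- toy[, intersect(keep, names(toy)), drop = FALSE]
print(names(toy2))
```

Subsetting on the intersection silently drops the misspelled names, so it is probably best used together with the setdiff() report rather than instead of it.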

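Note: I did not go through the double assignment and gc() operations that Anthony needs on his severely memory-limited laptop. (Doing that on a machine with 32 GB of RAM makes no sense, but I strongly suspect it may make a difference in his situation.)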

This returns the names that contain "nate":

> names(dat.pov)[ grepl("nate", names(dat.pov))]
[1] "natenvir" "nateduc"  "natenrgy" "natenviy" "nateducy" "natenviz" "nateducz"

This lists the names that contain "fare":

> names(dat.pov)[ grepl("fare", names(dat.pov))]
 [1] "natfare"  "natfarey" "natfarez" "farewhts" "farejews" "fareblks" "fareasns"
 [8] "farehsps" "fareso"   "workfare" "lessfare" "immfare"  "aidsfare" "welfare1"
[15] "welfare2" "welfare3" "welfare4" "welfare5" "welfare6"

That last character vector appears to be where you will find the correct spellings of the misspelled names.
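
Assuming the intended columns are natfare, natfarey and natfarez (the three names in that output that differ from the requested ones by a single letter), the fix is to correct the spelling in KeepVars before subsetting. The sketch below shows one way to find such near matches automatically with utils::adist(), which computes edit distances between strings. The `toy` data frame is hypothetical and stands in for dat.pov; the nearest name by edit distance is only a suggestion and should be checked against the GSS codebook.

```r
# Toy stand-in for dat.pov; only the column names matter here.
toy <- data.frame(sex = 1, natfare = 2, natfarey = 3, natfarez = 4, welfare1 = 5)

keep <- c("sex", "natefare", "natefarey", "natefarez")
bad  <- setdiff(keep, names(toy))          # requested names that do not exist

# Edit-distance matrix: one row per bad name, one column per existing column name.
d <- utils::adist(bad, names(toy))

# For each bad name, take the existing column name with the smallest distance.
nearest <- names(toy)[apply(d, 1, which.min)]

# Replace the misspelled entries with their suggested corrections.
keep_fixed <- keep
keep_fixed[match(bad, keep)] <- nearest
print(keep_fixed)

# The subset now succeeds because every requested column exists.
toy2 <- toy[, keep_fixed]
```

With the real data, the equivalent manual fix would be to replace "natefare", "natefarey" and "natefarez" in KeepVars with the corrected names and rerun dat.pov[, KeepVars].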