Exporting the Labelled double <dbl+lbl> data type

Asked: 2017-07-10 16:04:28

Tags: r tidyverse tibble

After importing an SPSS .sav file, the resulting tibble prints as follows:

# A tibble: 88,528 x 7
       CRY12    CRYOX7   INDS07M  INECAC05    SOC10M    URESMC     GOR9D
   <dbl+lbl> <dbl+lbl> <dbl+lbl> <dbl+lbl> <dbl+lbl> <dbl+lbl>     <chr>
 1       997       578        NA        31        NA        11 E12000009
 2       921       926        NA        30        NA        11 E12000009
 3       921       926        NA        31        NA        11 E12000009
 4       372       372        NA        25        NA        11 E12000009
 5       372       372        17         1      2211        11 E12000009
 6       372       372        NA        34        NA        11 E12000009
 7       921       926        18         2      3411        11 E12000009
 8       921       926        NA        34        NA        11 E12000009
 9       997       392        NA        25        NA        11 E12000009
10       997       392         3         1      2136        11 E12000009
# ... with 88,518 more rows

If I ask to see only the SOC10M column, R reports the variable as <Labelled double> and shows me the labels:

> df$SOC10M[1:10]
<Labelled double>
 [1]   NA   NA   NA   NA 2211   NA 3411   NA   NA 2136

    Labels:
     value                                                        label
        -9                                               Does not apply
        -8                                                    No answer
      1115                   1115  'Chief executives and Snr officials'
      1116                 1116  'Elected officers and representatives'
      1121      1121  'Production mngrs and directors in manufacturing'
      1122       1122  'Production mngrs and directors in construction'
      1123  1123  'Production mngrs and directors in mining and energy'

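I can't find any documentation specific to this data type.

I want to export this to a CSV that always contains the label rather than the value. (That is, where appropriate, the CSV should contain strings instead of numbers.)

Is this possible?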

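1 Answer:

Answer 0 (score: 4)

I think you can find the documentation for this data type in the haven package, which bridges the gap between SPSS and R.

I put this example together from the documentation; I hope it is self-explanatory.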

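# install.packages(c("haven", "tibble"), dependencies = TRUE)
library(haven)
library(tibble)  # provides tibble(); loaded explicitly so the example is self-contained

# Labelled numeric and character vectors: the values, then a named vector of labels
x1 <- labelled(c(1, NA, 5, 3, 5), c(Good = 1, Bad = 5))
x2 <- labelled(c("M", "F", NA, "F", "M"), c(Male = "M", Female = "F"))

df <- tibble(x1, x2)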
df
#> # A tibble: 5 x 2
#>          x1        x2
#>   <dbl+lbl> <chr+lbl>
#> 1         1         M
#> 2        NA         F
#> 3         5      <NA>
#> 4         3         F
#> 5         5         M

# Similar to what you are doing: subset a labelled column
df$x1[1:3]
#> <Labelled double>
#> [1]  1 NA  5
#> 
#> Labels:
#>  value label
#>      1  Good
#>      5   Bad 

zap_labels(df$x1[1:3])
#> [1]  1 NA  5

as_factor(df$x2[1:3])
#> [1] Male   Female <NA>  
#> Levels: Female Male

zap_labels(df)
#> # A tibble: 5 x 2
#>      x1    x2
#>   <dbl> <chr>
#> 1     1     M
#> 2    NA     F
#> 3     5  <NA>
#> 4     3     F
#> 5     5     M
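
The example above strips or converts labels but stops short of the CSV export the question asks for. Here is a minimal sketch of one way to do that step, reusing the same toy data rather than the asker's real file. The output path `out_file` is a hypothetical placeholder, and base R's `write.csv()` is just one possible writer.

```r
library(haven)
library(tibble)

# Toy data mirroring the example above (not the asker's real data).
df <- tibble(
  x1 = labelled(c(1, NA, 5, 3, 5), c(Good = 1, Bad = 5)),
  x2 = labelled(c("M", "F", NA, "F", "M"), c(Male = "M", Female = "F"))
)

# as_factor() on a data frame converts every labelled column to a factor.
# levels = "default" uses the label where one exists and the raw value otherwise.
df_labels <- as_factor(df, levels = "default")

# Factors are written out as their level text, so the CSV holds strings.
out_file <- tempfile(fileext = ".csv")  # hypothetical output path
write.csv(df_labels, out_file, row.names = FALSE)
```

Values without a label (such as the 3 in `x1`) are kept as their raw value. If both code and label are wanted in the file, `levels = "both"` should prefix each label with its value instead.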