I have imported a .dta file (Stata) into R. The dataset contains two variables with the following structure:
$ S003 :Class 'labelled' atomic [1:341271] 392 392 392 392 392 392 392 392 392 392 ...
.. ..- attr(*, "label")= chr "Country/region"
.. ..- attr(*, "format.stata")= chr "%8.0g"
.. ..- attr(*, "labels")= Named num [1:199] NA NA NA NA NA 4 8 12 16 20 ...
.. .. ..- attr(*, "names")= chr [1:199] "Missing; Unknown" "Not asked in survey" "Not applicable" "No answer" ...
$ S003A :Class 'labelled' atomic [1:341271] 392 392 392 392 392 392 392 392 392 392 ...
.. ..- attr(*, "label")= chr "Country/regions [with split ups]"
.. ..- attr(*, "format.stata")= chr "%8.0g"
.. ..- attr(*, "labels")= Named num [1:199] NA NA NA NA NA 4 8 12 16 20 ...
.. .. ..- attr(*, "names")= chr [1:199] "Missing; Unknown" "Not asked in survey" "Not applicable" "No answer" ...
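In R, the entries in these columns show up as numbers (e.g. 8), each of which stands for a country/region name.
I would like to extract the country name as a string and add it to my data frame as a new column. Something like this: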
df <- extract(df, countryname, into = c("countryname"), "([a-zA-Z]+)")
Any suggestions?