Dynamically changing the data types of a data frame

Time: 2016-12-06 00:57:04

Tags: r

I have a set of data frames, one per country, each containing 3 variables (year, AI, OAD). An example for Zimbabwe is shown below:

>str(dframe_Zimbabwe_1955_1970)
'data.frame':   16 obs. of  3 variables:
 $ year: chr  "1955" "1956" "1957" "1958" ...
 $ AI  : chr  "11.61568161" "11.34114927" "11.23639317" "11.18841409" ...
 $ OAD : chr  "5.740789488" "5.775882473" "5.800441036" "5.822536579" ...

I am trying to change the data types of the variables in the data frame to the following, so that I can model a linear fit with lm(dframe_Zimbabwe_1955_1970$AI ~ dframe_Zimbabwe_1955_1970$year).

>str(dframe_Zimbabwe_1955_1970) 
'data.frame':   16 obs. of  3 variables:
 $ year: int  1955 1956 1957 1958 ...
 $ AI  : num  11.61568161 11.34114927 11.23639317 11.18841409 ...
 $ OAD : num  5.740789488 5.775882473 5.800441036 5.822536579 ...

The following static code successfully changes AI from character (chr) to numeric (num).

dframe_Zimbabwe_1955_1970$AI <- as.numeric(dframe_Zimbabwe_1955_1970$AI)

But when I try to automate the code, AI remains character (chr):

countries <- c('Zimbabwe', 'Afghanistan', ...) 

for (country in countries) {
  assign(paste('dframe_',country,'_1955_1970$AI', sep=''), eval(parse(text = paste('as.numeric(dframe_',country,'_1955_1970$AI)', sep=''))))
}

Can you tell me what I might be doing wrong?

Thanks.

2 answers:

Answer 0: (score: 2)

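Your code will not run as written, but it will with a few edits. Besides the missing parenthesis and the wrong sep, you cannot use $'column name' in the name passed to assign, but you do not need it anyway: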

for (country in countries) {
  new_val <- get(paste( 'dframe_',country,'_1955_1970', sep=''))
  new_val[] <- lapply(new_val, as.numeric)  # the '[]' on LHS keeps dataframe
  assign(paste('dframe_',country,'_1955_1970', sep=''), new_val)
  remove(new_val)
}
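
As a minimal sketch of why the original loop leaves AI unchanged: assign() treats its first argument as a literal object name, so a string containing $ creates a new, oddly named object instead of updating a column. The data frame df_demo below is a made-up stand-in, not one of the asker's objects.

```r
# df_demo is a hypothetical stand-in for one of the country data frames
df_demo <- data.frame(AI = c("11.61568161", "11.34114927"),
                      stringsAsFactors = FALSE)

# The whole string is used as a variable name; `$` is not interpreted
assign("df_demo$AI", as.numeric(df_demo$AI))

class(df_demo$AI)          # the column itself is untouched
exists("df_demo$AI")       # a separate object with that literal name now exists
class(get("df_demo$AI"))   # the converted values ended up in that object
```

This is why the loop above fetches the whole data frame with get(), modifies it, and writes the whole data frame back with assign().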

Proof that it works:

dframe_Zimbabwe_1955_1970 <- data.frame(year = c("1955", "1956", "1957"),
                                        AI = c("11.61568161", "11.34114927", "11.23639317"),
                                        OAD = c("5.740789488", "5.775882473", "5.800441036"),
                                        stringsAsFactors = FALSE)
str(dframe_Zimbabwe_1955_1970)
'data.frame':   3 obs. of  3 variables:
 $ year: chr  "1955" "1956" "1957"
 $ AI  : chr  "11.61568161" "11.34114927" "11.23639317"
 $ OAD : chr  "5.740789488" "5.775882473" "5.800441036"

countries <- 'Zimbabwe'
for (country in countries) {
  new_val <- get(paste('dframe_',country,'_1955_1970', sep=''))
  new_val[] <- lapply(new_val, as.numeric)  # the '[]' on LHS keeps dataframe
  assign(paste('dframe_',country,'_1955_1970', sep=''), new_val)
  remove(new_val)
}

str(dframe_Zimbabwe_1955_1970)
'data.frame':   3 obs. of  3 variables:
 $ year: num  1955 1956 1957
 $ AI  : num  11.6 11.3 11.2
 $ OAD : num  5.74 5.78 5.8
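
Note that as.numeric gives year the type num, whereas the question asked for int. One possible sketch, assuming every column holds cleanly formatted numbers, is to let type.convert() from the utils package choose the type per column; as.is = TRUE keeps any non-numeric column as character instead of turning it into a factor. This is an alternative to the as.numeric call above, not part of the original answer.

```r
dframe_Zimbabwe_1955_1970 <- data.frame(
  year = c("1955", "1956", "1957"),
  AI   = c("11.61568161", "11.34114927", "11.23639317"),
  OAD  = c("5.740789488", "5.775882473", "5.800441036"),
  stringsAsFactors = FALSE)

# Convert each column separately; whole-number strings become integer,
# decimal strings become double
dframe_Zimbabwe_1955_1970[] <- lapply(dframe_Zimbabwe_1955_1970,
                                      type.convert, as.is = TRUE)

str(dframe_Zimbabwe_1955_1970)
```

Inside the loop, the same call could replace lapply(new_val, as.numeric) if integer years matter.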

Answer 1: (score: 1)

Purists will consider it rather ugly code, but perhaps this:

for (country in countries) {
  new_val <- get(paste('dframe_',country,'_1955_1970', sep=''))
  new_val[] <- lapply(new_val, as.numeric)  # the '[]' on LHS keeps dataframe
  assign(paste('dframe_',country,'_1955_1970', sep=''), new_val)
}

Using get('obj_name') is considered cleaner than eval(parse(text=...)). If you gather these data frames into a list, they become much easier to handle.
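
A rough sketch of the list-based approach, assuming the data frames already exist under the naming scheme from the question. The Zimbabwe values come from the example above; the Afghanistan values are invented placeholders so that the loop has two inputs, and the names frame_names, frames and fits are chosen here for illustration.

```r
dframe_Zimbabwe_1955_1970 <- data.frame(
  year = c("1955", "1956", "1957"),
  AI   = c("11.61568161", "11.34114927", "11.23639317"),
  OAD  = c("5.740789488", "5.775882473", "5.800441036"),
  stringsAsFactors = FALSE)
# Placeholder values only, not real data
dframe_Afghanistan_1955_1970 <- data.frame(
  year = c("1955", "1956", "1957"),
  AI   = c("2.0", "2.5", "2.7"),
  OAD  = c("4.0", "5.0", "6.0"),
  stringsAsFactors = FALSE)

countries <- c("Zimbabwe", "Afghanistan")
frame_names <- paste0("dframe_", countries, "_1955_1970")

# Collect the existing objects into one named list
frames <- mget(frame_names)

# Convert every column of every data frame
frames <- lapply(frames, function(df) {
  df[] <- lapply(df, as.numeric)
  df
})

# Fit one model per country, using the data argument instead of `$`
fits <- lapply(frames, function(df) lm(AI ~ year, data = df))
coef(fits[["dframe_Zimbabwe_1955_1970"]])
```

With everything in one list, no get(), assign() or eval(parse(...)) is needed after the initial mget(), and the fitted models stay keyed by the same names.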