Using a for loop to go through several data.frames and apply a function to them

Date: 2014-09-14 12:17:48

Tags: r function dataframe

I want to loop through a couple of variables. Something is going wrong, but I can't figure out what.

# education1, ..., trade1 are data frames with the same structure
# for example:
> education1
  year   country  BeiJing  TianJin    HeBei    ShanXi NeiMengGu LiaoNing     JiLin HeiLongJiang  ShangHai
1 2001  52920.47 1036.893 1975.061 1881.812  449.3267  198.8855 1551.363  361.7969     401.1776  3607.987
2 2002  65876.57 1367.643 2256.970 2329.523  648.7870  281.0629 1836.396  480.6575     499.5945  4228.255
3 2003  89227.20 1305.015 2841.379 4097.265  917.7571  497.8728 2560.662  616.4603     638.3171  5536.286
4 2004  92656.06 2282.841 3655.690 3853.677  416.8565  279.0049 2063.824  478.8450    -935.1350  7194.885
5 2005 167115.37 3464.530 4378.463 6926.047 2344.5597 1428.8180 5162.982 1258.4900    1108.1822  9837.540
6 2006 218827.79 4576.187 4971.573 9220.174 2918.0159 2340.6012 7613.243 1820.5200    1237.1062 11767.648

# I want to change the column names of all the data frames, say:
name <- c('a','b','c','d',.....)

for (x in vector(education1,fir1,inflation1,lq1,nonstatein1,patent1,tax1,trade1)) {
    names(x) <- name  # name is the vector of new column names defined before the loop
}

When I run it, the code above reports no error, but the column names of the data frames are not changed.

1 Answer:

Answer 0 (score: 1)

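In R, objects are copied on assignment, so inside the loop you are modifying a copy of each data.frame, not a reference to the original object.

In fact, if you do this: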

df <- data.frame(x=1,y=2)
v <- list(df)
names(v[[1]]) <- c('a','b')

> df
  x y
1 1 2

> v
[[1]]
  a b
1 1 2

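As you can see, df has not been modified; only the copy stored in the list v has the new names.

You can do what you are trying to do with the get and assign functions, or with eval(parse()) (the latter is generally discouraged). For example: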
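The same thing happens in a for loop like the one in the question: the loop variable holds a copy of each element. A minimal sketch, assuming two made-up data frames (df_a and df_b are illustrative names, not taken from the question):

```r
# Two illustrative data frames (hypothetical names)
df_a <- data.frame(x = 1, y = 2)
df_b <- data.frame(x = 3, y = 4)

for (x in list(df_a, df_b)) {
  # x is a copy of the current list element, so renaming its
  # columns does not affect df_a or df_b
  names(x) <- c("a", "b")
}

# Both originals still have the column names x and y
names(df_a)
names(df_b)
```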

df1 <- data.frame(x=1,y=2)
df2 <- data.frame(x=3,y=4)
df3 <- data.frame(x=5,y=6)

newnames <- c('a','b')

# using get+assign
for(x in c('df1','df2','df3')){
  # get the object corresponding to name contained in x
  # N.B. tmp is a copy of the original object, not a reference to it
  tmp <- get(x)        
  # replace the col names of tmp
  names(tmp) <- newnames 
  # assign tmp to the variable corresponding to the name contained in x
  assign(x,tmp)      
}

# using eval+parse
for(x in c('df1','df2','df3')){
  # evaluate the expression: names(<text contained in x>) <- name
  eval(parse(text=paste0('names(',x,') <-','newnames')))
}

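By the way, you can avoid having to iterate over variables in the environment (which is bad practice) by storing the data.frames in a list from the start.
For example: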

dataframes <- list()
dataframes$df1 <- data.frame(x=1,y=2)
dataframes$df2 <- data.frame(x=3,y=4)
dataframes$df3 <- data.frame(x=5,y=6)

# or if you prefer
# dataframes <- list(df1=data.frame(x=1,y=2),
#                    df2=data.frame(x=3,y=4),
#                    df3=data.frame(x=5,y=6))

newnames <- c('a','b')

# using for loop
for(x in names(dataframes)){
  names(dataframes[[x]]) <- newnames 
}

# using lapply
dataframes <- lapply(dataframes, FUN=function(x) {
  names(x) <- newnames
  return(x)
})