Rename columns from a lookup table

Time: 2016-01-11 15:47:58

Tags: r

I'm trying to rename the columns of a data frame using matches from a lookup table.

oldvars = c("mpg", "cyl" , "disp",  "hp", "drat", "wt", "qsec", "vs", "am", "gear", "carb")
newvars = c("Miles Per Gallon", "Cycle", "Displacement", "Horsepower", "Distance Rating", 
"Working Time", "Quick Second", "Versus", "America", "Gears", "Carbohydrates")

lookup = data.frame(oldvars, newvars)
mycars = mtcars

I want to use the lookup table to match oldvars and change them to newvars, so that names(mycars) outputs "Miles Per Gallon", "Cycle", "Displacement", "Horsepower", "Distance Rating", "Working Time", "Quick Second", "Versus", "America", "Gears", "Carbohydrates"

I tried using colnames to change the names, but it doesn't read the variables the way I expected. The following

for(i in 1:length(newvars)) {
  colnames(mycars)[oldvars[i]] = newvars[i]
} 

outputs only NAs

3 answers:

Answer 0: (score: 7)

If you know they are in the same order (as they are in your example), then you can simply do

names(mycars) = newvars

If they are in a different order (I've swapped cyl and mpg):

oldvars = c("cyl" ,"mpg",  "disp",  "hp", "drat", "wt", "qsec", "vs", "am", "gear", "carb")
newvars = c( "Cycle", "Miles Per Gallon", "Displacement", "Horsepower", "Distance Rating", 
    "Working Time", "Quick Second", "Versus", "America", "Gears", "Carbohydrates")

## rebuild the lookup so it reflects the reordered vectors
lookup = data.frame(oldvars, newvars, stringsAsFactors = FALSE)

Then match is your friend for working out the correct order:

## just demonstrating match, you don't need to run this
match(names(mycars), lookup$oldvars)
# [1]  2  1  3  4  5  6  7  8  9 10 11
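
As a standalone illustration of what match returns, here is a minimal sketch on toy vectors. The names x and tbl and their values are made up for this example and are not part of the question's data.

```r
# Toy vectors (hypothetical, for illustration only)
x   <- c("b", "a", "c")
tbl <- c("a", "b", "c", "d")

# For each element of x, match() gives its position in tbl
idx <- match(x, tbl)
idx
# [1] 2 1 3

# Indexing tbl by those positions reproduces x
tbl[idx]
# [1] "b" "a" "c"

# A value that is absent from tbl yields NA
match("z", tbl)
# [1] NA
```

This is why indexing lookup$newvars by match(names(mycars), lookup$oldvars) returns the new names in the column order of the data frame.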

So the assignment can be done with

names(mycars) = lookup$newvars[match(names(mycars), lookup$oldvars)]

If you don't like the names() = paradigm, setNames works too:

mycars = setNames(mycars, lookup$newvars[match(names(mycars), lookup$oldvars)])
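
A related variant, not part of the answer above, is to store the lookup as a named character vector and index it by the current column names instead of calling match. The sketch below assumes every column name appears in oldvars; rename_map is a hypothetical name chosen for this example.

```r
oldvars <- c("cyl", "mpg", "disp", "hp", "drat", "wt", "qsec", "vs", "am", "gear", "carb")
newvars <- c("Cycle", "Miles Per Gallon", "Displacement", "Horsepower", "Distance Rating",
             "Working Time", "Quick Second", "Versus", "America", "Gears", "Carbohydrates")

mycars <- mtcars

# Named character vector: names are the old column names, values are the new ones
# (rename_map is a hypothetical name)
rename_map <- setNames(newvars, oldvars)

# Index the map by the current column names; unname() drops the old names
names(mycars) <- unname(rename_map[names(mycars)])

names(mycars)[1:2]
# [1] "Miles Per Gallon" "Cycle"
```

If a column name were missing from the map, this indexing would produce NA for that column, so the approach only suits a complete lookup.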

Answer 1: (score: 0)

Solved it with a double loop

for(i in 1:length(newvars)) {
  for(z in 1:length(newvars)) {
    if(colnames(mycars)[i] == oldvars[z]) {
      colnames(mycars)[i] = newvars[z]
    }
  }
}

Extremely inefficient, but it gets the job done
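
For comparison, the inner loop can be replaced by a logical comparison, which leaves a single pass over the lookup. This is a minimal sketch rather than the answerer's code, and it assumes no new name is identical to an old name that is processed later (otherwise a column could be renamed twice).

```r
oldvars <- c("mpg", "cyl", "disp", "hp", "drat", "wt", "qsec", "vs", "am", "gear", "carb")
newvars <- c("Miles Per Gallon", "Cycle", "Displacement", "Horsepower", "Distance Rating",
             "Working Time", "Quick Second", "Versus", "America", "Gears", "Carbohydrates")

mycars <- mtcars

for (i in seq_along(oldvars)) {
  # Logical index: TRUE where the current column name equals oldvars[i]
  colnames(mycars)[colnames(mycars) == oldvars[i]] <- newvars[i]
}

colnames(mycars)[1:3]
# [1] "Miles Per Gallon" "Cycle"            "Displacement"
```

The index here is a logical vector over the column names, not the old name itself used as a subscript.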

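Answer 2: (score: 0)

Posting this as an answer only to expand on my comment.

Under the correct answer by @Gregor Thomas, I suggested reversing the arguments in the match() call to handle the case where the matched oldvars are not contiguous starting from the first element of names(mycars). A full example follows, since I have the space here.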

mycars = head(mtcars, 2)    

oldvars = c("cyl" ,"mpg",  "wt",  "foo") #note change to variable selection in lookup
newvars = c( "Cycle", "Miles Per Gallon", "Weight", "bar")
name_match = match(names(mycars), oldvars)
name_match
# [1]  2  1 NA NA NA  3 NA NA NA NA NA

#After omitting the `NA` elements, the match vector no longer properly aligns with
#the names(mycars) vector

names(mycars)[na.omit(name_match)] = newvars[!is.na(name_match)]
names(mycars) 
# [1] "Miles Per Gallon" "Cycle"            NA
# [4] "hp"               "drat"             "wt"
# [7] "qsec"             "vs"               "am"
# [10] "gear"             "carb"

#instead, reverse the arguments in match() to find the data.frame names that appear in the lookup

name_match = match(oldvars, names(mycars))
name_match
# [1]  2  1  6 NA
names(mycars)[na.omit(name_match)] = newvars[!is.na(name_match)]
names(mycars)
# [1] "Miles Per Gallon" "Cycle"            "disp"
# [4] "hp"               "drat"             "Weight"
# [7] "qsec"             "vs"               "am"
# [10] "gear"             "carb"
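
The reversed match() call can be wrapped in a small function so that a partial lookup, or one containing names absent from the data frame, is handled in one place. This is only a sketch: rename_from_lookup is a hypothetical helper name and does not exist in base R.

```r
# rename_from_lookup is a hypothetical helper, not a base R function
rename_from_lookup <- function(df, old, new) {
  stopifnot(length(old) == length(new))
  pos   <- match(old, names(df))  # position of each old name in df, NA if absent
  found <- !is.na(pos)            # which lookup entries exist in df
  names(df)[pos[found]] <- new[found]
  df
}

mycars <- head(mtcars, 2)
mycars <- rename_from_lookup(
  mycars,
  old = c("cyl", "mpg", "wt", "foo"),
  new = c("Cycle", "Miles Per Gallon", "Weight", "bar")
)

names(mycars)
# [1] "Miles Per Gallon" "Cycle"            "disp"
# [4] "hp"               "drat"             "Weight"
# [7] "qsec"             "vs"               "am"
# [10] "gear"             "carb"
```

Columns missing from the lookup keep their original names, and lookup entries such as "foo" that match no column are ignored.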