R: Indexing data frame columns by a range of names

Time: 2015-02-16 22:19:17

Tags: r indexing dataframe range columnname

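I have a large number of huge data frames. They often contain a group of similarly named columns that appear consecutively. Here is a simplified version of such a data frame: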

> tmp <- data.frame(ID = 1:25,
    Item1 = sample(x = 1:4, size = 25, replace = TRUE),
    Item2 = sample(x = 1:4, size = 25, replace = TRUE),
    Item3 = sample(x = 1:4, size = 25, replace = TRUE),
    Item4 = sample(x = 1:4, size = 25, replace = TRUE),
    Item5 = sample(x = 1:4, size = 25, replace = TRUE),
    Item6 = sample(x = 1:4, size = 25, replace = TRUE),
    Item7 = sample(x = 1:4, size = 25, replace = TRUE),
    Quest = rep(x = 20, times = 25))

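I need a way to index these columns by a range of names rather than by their position. Suppose I need to index the columns Item4 through Item7. I can do the following: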

> tmp[ , c("Item4", "Item5", "Item6", "Item7")]

That is not practical when there are hundreds of similarly named columns. I would like to do something like this:

> tmp[ , c("Item4":"Item7")]

But it throws an error:

Error in "Item4":"Item7" : NA/NaN argument
In addition: Warning messages:
1: In `[.data.frame`(tmp, , c("Item4":"Item7")) :
  NAs introduced by coercion
2: In `[.data.frame`(tmp, , c("Item4":"Item7")) :
  NAs introduced by coercion

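In addition, I would like to use this kind of indexing to modify the columns' attributes, as I do here with the first approach of listing every column name: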

> labels.Item4to7 <- c("Disagree", "Somewhat disagree",
  "Somewhat agree", "Agree")
> tmp[ , c("Item4", "Item5", "Item6", "Item7")] <- lapply(tmp[ , c("Item4",
  "Item5", "Item6", "Item7")], factor, labels = labels.Item4to7)

but with the range of column names specified as Item4:Item7.

Thanks in advance.

2 answers:

Answer 0: (score: 2)

You can build the column names with paste0:

tmp[, paste0("Item", 4:7)]
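As a minimal sketch of how this could carry over to the factor conversion in the question, the generated name vector can be stored once and used on both sides of the assignment. The small data frame and its values are stand-ins for illustration, and the explicit levels = 1:4 is an addition that the question's code does not have. This approach assumes the columns share a common prefix with a numeric suffix.

```r
# Simplified stand-in for the data frame in the question (values are illustrative)
tmp <- data.frame(ID = 1:25,
                  Item3 = rep(1:4, length.out = 25),
                  Item4 = rep(1:4, length.out = 25),
                  Item5 = rep(4:1, length.out = 25),
                  Item6 = rep(1:4, length.out = 25),
                  Item7 = rep(4:1, length.out = 25),
                  Quest = rep(20, times = 25))

# Build the column names once and reuse them
items <- paste0("Item", 4:7)
labels.Item4to7 <- c("Disagree", "Somewhat disagree",
                     "Somewhat agree", "Agree")

# levels = 1:4 is an addition: it keeps the labels aligned with the codes
# even if one of the values happens not to occur in a column
tmp[items] <- lapply(tmp[items], factor,
                     levels = 1:4, labels = labels.Item4to7)

str(tmp[items])
```

Without the levels argument, factor() derives the levels from the values present in each column, so a column that is missing one of the four codes would not match the four labels.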

Answer 1: (score: 2)

Use the which function:

tmp[, which(names(tmp) == "Item4"):which(names(tmp) == "Item7")]
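The lookup could also be wrapped in a small helper so it is not repeated. In the sketch below, col_range is a hypothetical name (not a base R function); it uses match() to find the two positions and stops if either name is missing. The data frame is a shortened stand-in. For extraction only, base R's subset() also accepts a name range in its select argument.

```r
# Hypothetical helper: positions of all columns from `from` to `to`, inclusive
col_range <- function(df, from, to) {
  pos <- match(c(from, to), names(df))
  if (anyNA(pos)) {
    stop("column not found: ",
         paste(c(from, to)[is.na(pos)], collapse = ", "))
  }
  pos[1]:pos[2]
}

# Shortened stand-in for the data frame in the question
tmp <- data.frame(ID = 1:5, Item1 = 1:5, Item4 = 1:5, Item5 = 5:1,
                  Item6 = 1:5, Item7 = 5:1, Quest = 20)

idx <- col_range(tmp, "Item4", "Item7")
names(tmp)[idx]

# subset() evaluates `select` with column names standing in for positions,
# so a name range works there directly
names(subset(tmp, select = Item4:Item7))
```

Both forms depend on the physical column order: they return whatever sits between the two named columns, regardless of how those columns are named.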

The values of Item4 through Item7 can then be changed as follows:

labels.Item4to7 <- c("Disagree", "Somewhat disagree",
  "Somewhat agree", "Agree")
tmp[, which(names(tmp) == "Item4"):which(names(tmp) == "Item7")] <-
   lapply(tmp[, which(names(tmp) == "Item4"):which(names(tmp) == "Item7")],
   factor, labels = labels.Item4to7)