Dynamic column names for subsetting

Date: 2014-08-03 05:33:38

Tags: r subset

I am trying to subset a data frame based on a dynamic column name.

My hard-coded version looks like this:

persona_subset <- subset(newList, subset = (finalList$flag_analoggers =='1'))

I would like to create a list of variables and loop over it, substituting a different column name each time. For example:

# Run 1
persona_subset <- subset(newList, subset = (finalList$flag_1 =='1'))
# Run 2
persona_subset <- subset(newList, subset = (finalList$flag_2 =='1'))
# Run n...
persona_subset <- subset(newList, subset = (finalList$flag_n =='1'))

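However, every time I put a variable in there I get a "'subset' must be logical" error. I also tried putting the variable name into a string, but then I don't get the correct subset of the data.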

col_location <- paste("finalList$",toString(x))
persona_subset <- subset(newList, subset = (col_location =='1'))

How can I iterate over this list of variables dynamically?

1 answer:

Answer 0 (score: 3)

If I were in your position, I would avoid subset and approach the problem like this.

xy <- data.frame(vals1 = runif(9), vals2 = runif(9), a = sample(1:3, 9, replace = TRUE),
                 b = sample(1:3, 9, replace = TRUE), c = sample(1:3, 9, replace = TRUE),
                 d = sample(1:3, 9, replace = TRUE))

# Columns to iterate over: every column whose name does not contain "vals"
iterate.vals <- names(xy)[!grepl("vals", names(xy))]
sapply(iterate.vals, FUN = function(x) {
  # x is a column name held as a string; keep the rows where that column equals 1
  print(xy[xy[, x] == 1, ])
  # Run
})

      vals1     vals2 a b c d
2 0.6165867 0.3728094 1 1 2 1
3 0.2962395 0.9669952 1 3 1 2
7 0.5657228 0.7200541 1 3 2 3
8 0.7793529 0.8391430 1 1 1 1
      vals1     vals2 a b c d
1 0.6028678 0.9178560 2 1 1 3
2 0.6165867 0.3728094 1 1 2 1
5 0.7234325 0.8426445 2 1 1 1
6 0.5637070 0.1895586 2 1 2 3
8 0.7793529 0.8391430 1 1 1 1
      vals1     vals2 a b c d
1 0.6028678 0.9178560 2 1 1 3
3 0.2962395 0.9669952 1 3 1 2
4 0.9293780 0.3459115 2 2 1 3
5 0.7234325 0.8426445 2 1 1 1
8 0.7793529 0.8391430 1 1 1 1
      vals1     vals2 a b c d
2 0.6165867 0.3728094 1 1 2 1
5 0.7234325 0.8426445 2 1 1 1
8 0.7793529 0.8391430 1 1 1 1
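
Applied to the column names in the question, the same idea might look roughly like the sketch below. The data frame contents are made up for illustration (only the names finalList and flag_* come from the question), and it assumes the flags are stored as the character values "0" and "1". It also shows why the pasted string did not work: it is only text, so comparing it with '1' never looks at the column.

```r
# Hypothetical data: the values below are invented for illustration only.
finalList <- data.frame(
  id     = 1:6,
  flag_1 = c("1", "0", "1", "0", "0", "1"),
  flag_2 = c("0", "0", "1", "1", "0", "0"),
  stringsAsFactors = FALSE
)

# Collect the flag columns by name pattern.
flag_cols <- grep("^flag_", names(finalList), value = TRUE)

# A pasted string is just text: this compares "finalList$flag_1" with "1"
# and yields a single FALSE, not one logical value per row.
paste0("finalList$", flag_cols[1]) == "1"

# `[[` accepts the column name as a string, so it can be used inside a loop.
persona_subsets <- lapply(flag_cols, function(col) {
  finalList[finalList[[col]] == "1", , drop = FALSE]
})
names(persona_subsets) <- flag_cols

# One subset per flag column, retrievable by name.
persona_subsets$flag_2
```

Keeping the results in a named list, rather than overwriting persona_subset on every pass, is one possible design choice; it leaves each run available afterwards. If a flag column can contain NA, wrapping the condition in which() would avoid rows of NA in the result.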