I am trying to subset a data frame based on a dynamic column name.
My hard-coded version looks like this:
persona_subset <- subset(newList, subset = (finalList$flag_analoggers =='1'))
I would like to create a list of variables and loop over it, substituting a different column name each time. For example:
# Run 1
persona_subset <- subset(newList, subset = (finalList$flag_1 =='1'))
# Run 2
persona_subset <- subset(newList, subset = (finalList$flag_2 =='1'))
# Run n...
persona_subset <- subset(newList, subset = (finalList$flag_n =='1'))
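However, whenever I put a variable in there, I get the error "'subset' must be logical". I tried building the column name as a string, but that does not return the correct subset of the data.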
col_location <- paste("finalList$",toString(x))
persona_subset <- subset(newList, subset = (col_location =='1'))
How can I iterate over this list of variables dynamically?
Answer 0 (score: 3)
If I were in your position, I would avoid subset and approach the problem like this.
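# Example data: two value columns plus four integer columns (a-d) to filter on
xy <- data.frame(vals1 = runif(9), vals2 = runif(9),
                 a = sample(1:3, 9, replace = TRUE),
                 b = sample(1:3, 9, replace = TRUE),
                 c = sample(1:3, 9, replace = TRUE),
                 d = sample(1:3, 9, replace = TRUE))
# Names of every column except the "vals" ones
iterate.vals <- names(xy)[!grepl("vals", names(xy))]
# For each column name, print the rows where that column equals 1;
# invisible() keeps sapply's return value from being echoed after the prints
invisible(sapply(iterate.vals, FUN = function(x) {
  print(xy[xy[, x] == 1, ])
  # Run whatever you need on this subset here
}))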
      vals1     vals2 a b c d
2 0.6165867 0.3728094 1 1 2 1
3 0.2962395 0.9669952 1 3 1 2
7 0.5657228 0.7200541 1 3 2 3
8 0.7793529 0.8391430 1 1 1 1
      vals1     vals2 a b c d
1 0.6028678 0.9178560 2 1 1 3
2 0.6165867 0.3728094 1 1 2 1
5 0.7234325 0.8426445 2 1 1 1
6 0.5637070 0.1895586 2 1 2 3
8 0.7793529 0.8391430 1 1 1 1
      vals1     vals2 a b c d
1 0.6028678 0.9178560 2 1 1 3
3 0.2962395 0.9669952 1 3 1 2
4 0.9293780 0.3459115 2 2 1 3
5 0.7234325 0.8426445 2 1 1 1
8 0.7793529 0.8391430 1 1 1 1
      vals1     vals2 a b c d
2 0.6165867 0.3728094 1 1 2 1
5 0.7234325 0.8426445 2 1 1 1
8 0.7793529 0.8391430 1 1 1 1
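Mapped back onto the names in the question, the same idea might look like the sketch below. The paste() attempt fails because it only builds the text "finalList$ flag_1" (paste also inserts a space by default), and comparing that text to '1' gives a single FALSE instead of one logical value per row. Indexing with [[ ]] looks the column up by its name as a string. The data here is made up for illustration: I am assuming finalList and newList have the same number of rows in the same order, and that the flags are stored as the character "1". The id and score columns, flag_cols and persona_subsets are hypothetical names, not from your code.

```r
# Hypothetical stand-ins for the asker's data frames (contents are invented)
finalList <- data.frame(
  id = 1:6,
  flag_1 = c("1", "0", "1", "0", NA, "1"),
  flag_2 = c("0", "0", "1", "1", "1", "0"),
  stringsAsFactors = FALSE
)
newList <- data.frame(id = 1:6, score = c(10, 20, 30, 40, 50, 60))

# Collect every column whose name starts with "flag_"
flag_cols <- grep("^flag_", names(finalList), value = TRUE)

# Build one subset of newList per flag column.
# finalList[[col]] fetches the column named by the string in col;
# which() turns the logical vector into row positions and skips NA results.
persona_subsets <- lapply(flag_cols, function(col) {
  newList[which(finalList[[col]] == "1"), , drop = FALSE]
})
names(persona_subsets) <- flag_cols

# Inspect the subset for a single flag
print(persona_subsets[["flag_1"]])
```

Keeping the results in a named list means each run stays available afterwards, instead of persona_subset being overwritten on every iteration.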