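Selecting elements in a loop in R

Time: 2017-07-12 16:56:57

Tags: r list for-loop

I tried searching the forum for an answer to this question but could not find one. I want to loop over the unique values of a data frame column (IN_FID) and add the values of another column (NEAR_FID) associated with each value (there may be one or more) to a list. The IN_FID is then added to a list as well. If a value in NEAR_FID has already been seen earlier in the process, the IN_FID is not added to the list. I know I have not included it in the code, but ideally I would also like to loop over the IN_FID values in random rather than sequential order. What am I doing wrong in this code?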

eagle
   IN_FID NEAR_FID
1       2        1
2       2        2
3       2        3
4       8        4
5       9        2
6       9        7
7       9        8
8       9        9
9      16        2
10     16       11
11     21       12

p.good = list()
p.bad = list()
INFIDS = unique(eagle$IN_FID)
NEARFIDS = unique(eagle$NEAR_FID)
t.used = NEARFIDS

for (i in INFIDS) {
sub = eagle[eagle$IN_FID == i, ]
x = sub$NEAR_FID
if (all(x) %in% t.used){
    p.good = c(p.good, i)
    t.used[t.used != all(x)]

} else { 
    p.bad = c(p.bad, i)
}

The desired output is:

p.good
[1] 2 8 21  (because NEAR_FID of 2 is present in 9 and 16)
p.bad
[1] 9 16
t.used
= empty because it will have used the values during the loop

2 answers:

Answer 0 (score: 1)

You can use the function duplicated():

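# Rows whose NEAR_FID already appeared in an earlier row
index_dup = which(duplicated(eagle$NEAR_FID))

# IN_FIDs that own at least one repeated NEAR_FID
p.bad = unique(eagle$IN_FID[index_dup])

# Flag every row whose IN_FID is in p.bad (vectorised, no loop needed)
is_bad = eagle$IN_FID %in% p.bad

# Logical indexing stays correct when no row is bad;
# negative indexing with an empty index would return nothing
p.good = unique(eagle$IN_FID[!is_bad])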
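
To see which rows duplicated() flags for this data, here is a minimal sketch. The data.frame() call is an assumption about how eagle was built; only the values come from the question's printout.

```r
# Rebuild the example data from the question (construction is assumed)
eagle <- data.frame(
  IN_FID   = c(2, 2, 2, 8, 9, 9, 9, 9, 16, 16, 21),
  NEAR_FID = c(1, 2, 3, 4, 2, 7, 8, 9, 2, 11, 12)
)

# duplicated() is FALSE for the first occurrence of a value, TRUE for repeats
dup <- duplicated(eagle$NEAR_FID)
index_dup <- which(dup)   # positions of the repeated NEAR_FID values

# IN_FIDs that own a repeated NEAR_FID
p.bad <- unique(eagle$IN_FID[index_dup])

print(index_dup)
print(p.bad)
```

Because only repeats are flagged, the first IN_FID in row order that contains a shared NEAR_FID (here the value 2) stays out of p.bad.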

For randomization, you can shuffle the row order of the data and then apply the code above again:

eagle_random <- eagle[sample(1:nrow(eagle)), ]
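
As a rough sketch of applying the same steps to the shuffled data, the logic can be wrapped in a function. The name split_by_duplicates is hypothetical, and the data frame is rebuilt from the question's printout.

```r
# Rebuild the example data from the question (construction is assumed)
eagle <- data.frame(
  IN_FID   = c(2, 2, 2, 8, 9, 9, 9, 9, 16, 16, 21),
  NEAR_FID = c(1, 2, 3, 4, 2, 7, 8, 9, 2, 11, 12)
)

# Hypothetical helper: the duplicated() approach, using logical indexing
split_by_duplicates <- function(df) {
  bad  <- unique(df$IN_FID[duplicated(df$NEAR_FID)])
  good <- unique(df$IN_FID[!(df$IN_FID %in% bad)])
  list(good = good, bad = bad)
}

set.seed(1)                                   # makes the shuffle reproducible
eagle_random <- eagle[sample(nrow(eagle)), ]  # shuffle the rows
res <- split_by_duplicates(eagle_random)
print(res)
```

With this data, whichever of IN_FID 2, 9, or 16 has its NEAR_FID == 2 row first in the shuffled order ends up in good, while 8 and 21 are always good. Note that after shuffling, the outcome depends on row order rather than on the order of whole IN_FID groups, so it may not match a group-by-group loop on other data.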

Answer 1 (score: 0)

Instead of lists, declare them as vectors:

p.good = NULL
p.bad = NULL

INFIDS = unique(eagle$IN_FID)
NEARFIDS = unique(eagle$NEAR_FID)
t.used = NEARFIDS

Instead of min:max, iterate over the elements of the vector with for (i in INFIDS):

library(dplyr)  # provides %>% and filter()

for (i in INFIDS) {
    x = (eagle %>% filter(IN_FID == i))$NEAR_FID   # combine into single statement
    if (all(x %in% t.used)) {    # was all(x) %in% t.used before
        p.good = c(p.good, i)
        t.used = t.used[!(t.used %in% x)]  # was t.used != all(x)
    } else {
        p.bad = c(p.bad, i)
    }
}

Output:

p.good
[1] 2  8 21

p.bad
[1] 9 16

t.used
[1] 7  8  9 11    # some values were not eliminated as you expected
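
The two corrections noted in the loop comments can be checked in isolation. This is a small sketch with made-up vectors; the values of x and t.used below are illustrative and not taken from a particular loop iteration.

```r
x <- c(2, 7, 8, 9)
t.used <- c(1, 7, 8, 9)

# Original form: all(x) collapses x to a single TRUE (with a coercion warning),
# and that TRUE is then matched against t.used as the number 1
wrong <- suppressWarnings(all(x)) %in% t.used

# Corrected form: test each element for membership, then combine
right <- all(x %in% t.used)

# Removing used values needs an element-wise test and an assignment
remaining <- t.used[!(t.used %in% x)]

print(wrong)
print(right)
print(remaining)
```

Here wrong is TRUE only because t.used happens to contain 1, while right is FALSE because 2 is missing from t.used. As for the leftover values in the output above, 7, 8, 9 and 11 belong to the rejected IN_FIDs 9 and 16, and the loop only removes values for accepted IN_FIDs.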

---- Random sampling ----

Change for (i in INFIDS)

to for (i in sample(INFIDS)). Use set.seed(1) to make the random order reproducible.
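
Putting the pieces together, a minimal base R sketch of the randomized loop might look like this. The data frame is rebuilt from the question's printout, and base subsetting is swapped in for the dplyr filter() call so the block has no package dependency.

```r
# Rebuild the example data from the question (construction is assumed)
eagle <- data.frame(
  IN_FID   = c(2, 2, 2, 8, 9, 9, 9, 9, 16, 16, 21),
  NEAR_FID = c(1, 2, 3, 4, 2, 7, 8, 9, 2, 11, 12)
)

INFIDS <- unique(eagle$IN_FID)
t.used <- unique(eagle$NEAR_FID)
p.good <- NULL
p.bad  <- NULL

set.seed(1)  # fix the random order so runs are repeatable

# Index-based shuffle: sample(INFIDS) would sample from 1:INFIDS
# if INFIDS ever had length one
for (i in INFIDS[sample(length(INFIDS))]) {
  x <- eagle$NEAR_FID[eagle$IN_FID == i]
  if (all(x %in% t.used)) {
    p.good <- c(p.good, i)                # every NEAR_FID is still available
    t.used <- t.used[!(t.used %in% x)]    # consume them
  } else {
    p.bad <- c(p.bad, i)                  # at least one NEAR_FID already used
  }
}

print(p.good)
print(p.bad)
```

For this data, 8 and 21 always end up in p.good, and exactly one of 2, 9, and 16 does, depending on which is visited first.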