R: add a column whose cell values are based on values in different rows

Time: 2018-03-27 22:10:35

Tags: r

I have a data.frame in which each row indicates whether an animal was found at a particular location.

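I want to create a new column labeled prey in this example data.frame. Its value will be 1 or 0, depending on whether a predator's prey was found at the same location (each location has a unique ID).

The problem is that each animal has its own row, so the information about whether prey is present sits in different rows from the predator. The two predators are the lion and the cheetah.

For this example, the lion's prey are antelope and zebra, so:

  • For ID 1, because an antelope and a lion were both found at that location, the prey column in the lion row should be 1.
  • For ID 2, neither antelope nor zebra was found, so the prey column in the lion row is 0.

The cheetah's prey are antelope, gazelles, and impala.

Below is the example data.frame. The solution I came up with is very inefficient, and I am looking for something faster/tidier.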


2 answers:

Answer 0 (score: 0)

Consider using tidyr::spread to simplify the structure of the data frame first.

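library(dplyr)
library(tidyr)

# preview the wide layout without overwriting df,
# because the full pipeline below starts from the original long df
df %>% spread(species, present)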

#>   ID antelope cheetah gazelles impala lion zebra
#>1   1        1       1        0      1    1     0
#>2   2        0       1        1      1    1     0

Then continue with dplyr:

df %>% 
  spread(species, present) %>%
  mutate(lion_prey = case_when(antelope == 1 | zebra == 1 ~ 1,
                               TRUE ~ 0),
         cheetah_prey = case_when(antelope == 1 | gazelles == 1  | impala == 1 ~ 1,
                               TRUE ~ 0)) %>%
  gather(species, present, -ID, -lion_prey, -cheetah_prey) %>%
  mutate(prey = case_when(species == "lion" ~ lion_prey,
                          species == "cheetah" ~ cheetah_prey,
                          TRUE ~ 0)) %>%
  select(-lion_prey, -cheetah_prey)

#>       ID  species present prey
#>    1   1 antelope       1    0
#>    2   2 antelope       0    0
#>    3   1  cheetah       1    1
#>    4   2  cheetah       1    1
#>    5   1 gazelles       0    0
#>    6   2 gazelles       1    0
#>    7   1   impala       1    0
#>    8   2   impala       1    0
#>    9   1     lion       1    1
#>    10  2     lion       1    0
#>    11  1    zebra       0    0
#>    12  2    zebra       0    0
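
As a side note, recent tidyr releases mark spread() and gather() as superseded in favor of pivot_wider() and pivot_longer(). The following is a minimal sketch of the same pipeline with the newer functions, assuming tidyr 1.0.0 or later and the example df from this question. The name result is only illustrative, and as.numeric() on a logical is swapped in for the two-branch case_when() calls.

```r
library(dplyr)
library(tidyr)

# example data from the question
df <- data.frame(ID = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
                 species = c("lion", "antelope", "zebra", "cheetah", "impala", "gazelles",
                             "lion", "antelope", "zebra", "cheetah", "impala", "gazelles"),
                 present = c(1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1),
                 stringsAsFactors = FALSE)

result <- df %>%
  # one row per ID, one column per species
  pivot_wider(names_from = species, values_from = present) %>%
  # 1 if any of the predator's prey is present at that ID, else 0
  mutate(lion_prey = as.numeric(antelope == 1 | zebra == 1),
         cheetah_prey = as.numeric(antelope == 1 | gazelles == 1 | impala == 1)) %>%
  # back to one row per ID and species
  pivot_longer(cols = -c(ID, lion_prey, cheetah_prey),
               names_to = "species", values_to = "present") %>%
  # copy the per-ID flag onto the matching predator row only
  mutate(prey = case_when(species == "lion" ~ lion_prey,
                          species == "cheetah" ~ cheetah_prey,
                          TRUE ~ 0)) %>%
  select(ID, species, present, prey)
```

Like the spread()/gather() version, this sketch sets prey from the prey columns alone and does not check whether the predator itself is present; in the example data both predators are present at every ID, so the result is the same.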

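Answer 1 (score: 0)

For the reasons you describe, this involves some messy logical expressions, but here is one way to do it. It has the advantage of being generalizable: if you want to add predators, just add them to predators and add their prey to predators_prey accordingly. predators_prey is a list, which accommodates predators that (as happens here) have different numbers of prey: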

# define the predators
predators <- c("lion", "cheetah")

# create a list of their prey from which to programmatically extract
predators_prey <- list(lion = c("antelope", "zebra"), cheetah = c("antelope", "gazelles", "impala"))

# initialize the $prey column
df$prey <- 0

# use for loop because we're assigning a value in global env
for (predator in predators ){
  for (ID in unique(df$ID)){

    # is the predator here? (any() also copes with an ID that has no row for this predator)
    predator_here <- any(df$present[df$ID == ID & df$species == predator] == 1)
    # is that predator's prey here?
    prey_here <- any(df$species[df$ID == ID & df$present == 1] %in% predators_prey[[predator]])

    # if both, then set $prey to 1
    if (predator_here && prey_here) {
      df$prey[df$ID == ID & df$species == predator] <- 1
    }
  }
}
# let's look at the result
df
#    ID  species present prey
# 1   1     lion       1    1
# 2   1 antelope       1    0
# 3   1    zebra       0    0
# 4   1  cheetah       1    1
# 5   1   impala       1    0
# 6   1 gazelles       0    0
# 7   2     lion       1    0
# 8   2 antelope       0    0
# 9   2    zebra       0    0
# 10  2  cheetah       1    1
# 11  2   impala       1    0
# 12  2 gazelles       1    0
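
If the nested loop becomes slow with many locations, the inner loop over IDs can likely be replaced by vectorized indexing. The following is a minimal base R sketch, assuming the same df and predators_prey as in this answer; ids_with_prey and hit are illustrative names, not part of the original code.

```r
# example data from the question
df <- data.frame(ID = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
                 species = c("lion", "antelope", "zebra", "cheetah", "impala", "gazelles",
                             "lion", "antelope", "zebra", "cheetah", "impala", "gazelles"),
                 present = c(1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1),
                 stringsAsFactors = FALSE)

# prey list keyed by predator name
predators_prey <- list(lion = c("antelope", "zebra"),
                       cheetah = c("antelope", "gazelles", "impala"))

# initialize the prey column
df$prey <- 0

# loop over predators only; all IDs are handled at once
for (predator in names(predators_prey)) {
  # IDs where at least one of this predator's prey species is present
  ids_with_prey <- unique(df$ID[df$present == 1 &
                                df$species %in% predators_prey[[predator]]])
  # rows for this predator, where it is present, at one of those IDs
  hit <- df$species == predator & df$present == 1 & df$ID %in% ids_with_prey
  df$prey[hit] <- 1
}
```

The loop now runs once per predator instead of once per predator and ID, so the cost per predator is a few passes over the rows of df.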

Data:

df <- data.frame(ID=c(1,1,1,1,1,1, 2, 2, 2, 2, 2, 2),
                 species=c("lion", "antelope", "zebra", "cheetah", "impala", "gazelles", "lion", "antelope", "zebra", "cheetah", "impala", "gazelles"),
                 present=c(1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1),
                 stringsAsFactors=FALSE)