R: New column based on whether the levels of a factor in another column are the same or different

Time: 2018-04-18 10:37:26

Tags: r

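I'm having trouble creating a new column in my data. Its contents should depend on whether the levels of a factor in one column are the same or different within groups defined by 2 other columns.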

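Basically, I have a bunch of cows with different IDs, each of which can have several parities. The quarter is the udder quarter affected by disease, and I want to create a new column whose value depends on whether the quarters are the same, are different, or occur only once. Any help would be much appreciated. Below is the code for an abbreviated data frame; new_column is what I want to achieve.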

AnimalID <- c(10,10,10,10,12,12,12,12,14)
Parity <- c(8,8,9,9,4,4,4,4,2)
Udder_quarter <- c("LH","LH","RH","RH","LH","RH","LF","RF","RF")
new_column <- c("same quarter","same quarter","different quarter","different quarter","different quarter","different quarter","different quarter","different quarter","one quarter")
quarters <- data.frame(AnimalID, Parity, Udder_quarter, new_column)

structure(list(HerdAnimalID = c(100165, 100165, 100327, 100327, 
100450, 100450), Parity = c(6, 6, 5, 5, 3, 3), no_parities = c(1, 
1, 1, 1, 1, 1), case = c("1pathogen_lact", "1pathogen_lact", 
"1pathogen_lact", "1pathogen_lact", "1pathogen_lact", "1pathogen_lact"
), FARM = c(1, 1, 1, 1, 1, 1), `CASE NO` = c("101", "101", "638", 
"638", "593", "593"), MASTDATE = structure(c(1085529600, 1087689600, 
1097884800, 1101254400, 1106092800, 1106784000), class = c("POSIXct", 
"POSIXt"), tzone = "UTC"), QRT = c("LF", "LF", "RH", "LF", "LH", 
"LH"), MastitisDiagnosis = c("Corynebacterium spp", "Corynebacterium spp", 
"S. uberis", "S. uberis", "Bacillus spp", "Bacillus spp"), PrevCalvDate = 
structure(c(1075334400, 
1075334400, 1096156800, 1096156800, 1091145600, 1091145600), class = 
c("POSIXct", 
"POSIXt"), tzone = "UTC")), .Names = c("HerdAnimalID", "Parity", 
"no_parities", "case", "FARM", "CASE NO", "MASTDATE", "QRT", 
"MastitisDiagnosis", "PrevCalvDate"), row.names = c(NA, -6L), class = 
c("tbl_df", 
"tbl", "data.frame"))

2 answers:

Answer 0 (score: 0)

Hope this helps!

library(dplyr)

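quarters %>%
  group_by(AnimalID) %>%
  # an animal with a single record overall is labelled "one quarter"
  mutate(new_column = ifelse(n() == 1, "one quarter", NA)) %>%
  # regroup by animal and parity (avoids the `add` argument, deprecated in current dplyr)
  group_by(AnimalID, Parity) %>%
  mutate(new_column = ifelse(length(unique(Udder_quarter)) == 1 & is.na(new_column),
                             "same quarter",
                             ifelse(length(unique(Udder_quarter)) > 1,
                                    "different quarter",
                                    new_column))) %>%
  data.frame()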

Output is:

  AnimalID Parity Udder_quarter        new_column
1       10      8            LH      same quarter
2       10      8            LH      same quarter
3       10      9            RH      same quarter
4       10      9            RH      same quarter
5       12      4            LH different quarter
6       12      4            RH different quarter
7       12      4            LF different quarter
8       12      4            RF different quarter
9       14      2            RF       one quarter

Sample data:

quarters <- structure(list(AnimalID = c(10, 10, 10, 10, 12, 12, 12, 12, 14
), Parity = c(8, 8, 9, 9, 4, 4, 4, 4, 2), Udder_quarter = structure(c(2L, 
2L, 4L, 4L, 2L, 4L, 1L, 3L, 3L), .Label = c("LF", "LH", "RF", 
"RH"), class = "factor")), .Names = c("AnimalID", "Parity", "Udder_quarter"
), row.names = c(NA, -9L), class = "data.frame")
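
The two-step grouping above can also be collapsed into a single grouped mutate. The following is only a sketch, not the answerer's code: it assumes that "one quarter" should mean a single record within an AnimalID/Parity group (the answer above checks the animal as a whole; both give the same result on this sample), and it uses dplyr's n_distinct() and case_when(). The name `result` is introduced here for illustration.

```r
library(dplyr)

# Sample data from the question, with the quarter kept as a character column
quarters <- data.frame(
  AnimalID = c(10, 10, 10, 10, 12, 12, 12, 12, 14),
  Parity = c(8, 8, 9, 9, 4, 4, 4, 4, 2),
  Udder_quarter = c("LH", "LH", "RH", "RH", "LH", "RH", "LF", "RF", "RF"),
  stringsAsFactors = FALSE
)

result <- quarters %>%
  group_by(AnimalID, Parity) %>%
  mutate(new_column = case_when(
    # assumption: a group with a single record is "one quarter"
    n() == 1 ~ "one quarter",
    # all records in the group share one quarter
    n_distinct(Udder_quarter) == 1 ~ "same quarter",
    # anything else has at least two distinct quarters
    TRUE ~ "different quarter"
  )) %>%
  ungroup()

print(as.data.frame(result))
```

Each condition is a single value per group, so the label is recycled across every row of that group.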

Answer 1 (score: 0)

I would use ave to do this:

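# Label one group of quarters: a single record, all identical, or mixed
f <- function(x) {
  if (length(x)==1) return("one")
  else if (all(x == x[1])) return("same")
  else return("different")
}

# Apply f within each AnimalID/Parity combination; Udder_quarter must be a
# character vector here, as in the question's code
ave(Udder_quarter, interaction(AnimalID, Parity), FUN=f)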

# [1] "same"      "same"      "same"      "same"      "different"
# [6] "different" "different" "different" "one"
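
The same idea might be applied to the second data structure in the question, storing the labels as a column. This is a sketch under assumptions: only the HerdAnimalID, Parity and QRT columns are reproduced (values copied from the question), the names `mastitis` and `classify_quarters` are made up for illustration, and QRT is passed through as.character() because ave would otherwise try to write the new labels into a factor's existing levels.

```r
# Three columns copied from the question's second data structure
mastitis <- data.frame(
  HerdAnimalID = c(100165, 100165, 100327, 100327, 100450, 100450),
  Parity = c(6, 6, 5, 5, 3, 3),
  QRT = c("LF", "LF", "RH", "LF", "LH", "LH"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: same logic as f above, with the question's wording
classify_quarters <- function(x) {
  if (length(x) == 1) {
    "one quarter"
  } else if (all(x == x[1])) {
    "same quarter"
  } else {
    "different quarter"
  }
}

# ave() groups by every combination of the grouping vectors it is given
mastitis$new_column <- ave(
  as.character(mastitis$QRT),
  mastitis$HerdAnimalID, mastitis$Parity,
  FUN = classify_quarters
)

print(mastitis)
```

ave builds the interaction of its grouping arguments itself, so passing the two columns directly should be equivalent to wrapping them in interaction().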