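I need to write a function in R that returns the first date in a series on which a column's value is greater than 0. I want to identify that date for each year in a data frame.
For example, given this sample data...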
Date Year Catch
3/12/2001 2001 0
3/19/2001 2001 7
3/24/2001 2001 9
4/6/2002 2002 12
4/9/2002 2002 0
4/15/2002 2002 5
4/27/2002 2002 0
3/18/2003 2003 0
3/22/2003 2003 0
3/27/2003 2003 15
I would like R to return the first date in each year with Catch > 0:
Year Date
2001 3/19/2001
2002 4/6/2002
2003 3/27/2003
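I have been using the min function as shown here, but it only returns a single row number, and I can't get it to return a value for each year in the data frame. min(which(data$Catch > 0))
I have only just started writing my own functions in R. Any help would be greatly appreciated. Thanks.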
Answer 0 (score: 4)
library(dplyr)
df1 %>%
group_by(Year) %>%
slice(which.max(Catch > 0))
# # A tibble: 3 x 3
# # Groups: Year [3]
# Date Year Catch
# <date> <int> <int>
# 1 2001-03-19 2001 7
# 2 2002-04-06 2002 12
# 3 2003-03-27 2003 15
Data:
df1 <-
structure(list(Date = structure(c(11393, 11400, 11405, 11783,
11786, 11792, 11804, 12129, 12133, 12138), class = "Date"), Year = c(2001L,
2001L, 2001L, 2002L, 2002L, 2002L, 2002L, 2003L, 2003L, 2003L
), Catch = c(0L, 7L, 9L, 12L, 0L, 5L, 0L, 0L, 0L, 15L)), .Names = c("Date",
"Year", "Catch"), row.names = c(NA, -10L), class = "data.frame")
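One caveat with which.max(Catch > 0): if a year has no positive catch at all, the comparison is all FALSE and which.max() still returns 1, so the first row of that year would be returned even though its Catch is 0. A minimal base R sketch of that behaviour, using a hypothetical vector catch_2004 that is not part of the original data:

```r
# Hypothetical year with no positive catch (an assumption, not in the original data)
catch_2004 <- c(0L, 0L, 0L)

# which.max() on an all-FALSE vector returns the position of the first element
idx_max <- which.max(catch_2004 > 0)

# which() returns integer(0) here, so taking [1] yields NA instead of a row index
idx_which <- which(catch_2004 > 0)[1]

print(idx_max)
print(idx_which)
```

If such years can occur in your data, one option is to call filter(Catch > 0) before slice(1), which drops those groups instead of returning a zero-catch row.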
Answer 1 (score: 3)
Here is a data.table option:
library(data.table)
setDT(df1)[, .SD[which.max(Catch > 0)], Year]
# Year Date Catch
#1: 2001 2001-03-19 7
#2: 2002 2002-04-06 12
#3: 2003 2003-03-27 15
df1 <- structure(list(Date = structure(c(11393, 11400, 11405, 11783,
11786, 11792, 11804, 12129, 12133, 12138), class = "Date"), Year = c(2001L,
2001L, 2001L, 2002L, 2002L, 2002L, 2002L, 2003L, 2003L, 2003L
), Catch = c(0L, 7L, 9L, 12L, 0L, 5L, 0L, 0L, 0L, 15L)), row.names = c(NA,
-10L), class = "data.frame")
Answer 2 (score: 1)
Here is a dplyr solution.
df1 %>%
group_by(Year) %>%
mutate(Inx = first(which(Catch > 0))) %>%
filter(Inx == row_number()) %>%
select(-Inx)
## A tibble: 3 x 3
## Groups: Year [3]
# Date Year Catch
# <date> <int> <int>
#1 2001-03-19 2001 7
#2 2002-04-06 2002 12
#3 2003-03-27 2003 15
Data.
df1 <- read.table(text = "
Date Year Catch
3/12/2001 2001 0
3/19/2001 2001 7
3/24/2001 2001 9
4/6/2002 2002 12
4/9/2002 2002 0
4/15/2002 2002 5
4/27/2002 2002 0
3/18/2003 2003 0
3/22/2003 2003 0
3/27/2003 2003 15
", header = TRUE)
df1$Date <- as.Date(df1$Date, "%m/%d/%Y")
Answer 3 (score: 1)
df <- data.frame(Date = as.Date(c("3/12/2001", "3/19/2001", "3/24/2001",
"4/6/2002", "4/9/2002", "4/15/2002", "4/27/2002",
"3/18/2003", "3/22/2003", "3/27/2003"), "%m/%d/%Y"),
Year = c(2001, 2001, 2001, 2002, 2002, 2002, 2002, 2003, 2003, 2003),
Catch = c(0, 7, 9, 12, 0, 5, 0, 0, 0, 15))
If you don't need a function, you can try
library(dplyr)
# Keep only rows with a positive catch, then take the earliest date per year
df %>% filter(Catch > 0) %>% group_by(Year) %>% summarize(Date = min(Date))
If you really do want to write a function, perhaps
firstcatch <- function(yr) {
dd <- subset(df, yr == Year)
withcatches <- dd[which(dd$Catch > 0), ]
# Take the earliest date first, then convert to character so unlist() below keeps it readable
as.character(min(withcatches$Date))
}
yrs <- c(2001, 2002, 2003)
dates <- unlist(lapply(yrs, firstcatch))
ndt <- data.frame(Year = yrs, Date = dates)
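The same result can also be obtained in base R without looping over years, and with the Date column kept as a Date rather than character. This is only a sketch of one alternative, not part of the answer above; the names positive and first_catch are made up for illustration, and the data frame is rebuilt here so the block runs on its own:

```r
# Rebuild the sample data so the example is self-contained
df <- data.frame(
  Date = as.Date(c("3/12/2001", "3/19/2001", "3/24/2001",
                   "4/6/2002", "4/9/2002", "4/15/2002", "4/27/2002",
                   "3/18/2003", "3/22/2003", "3/27/2003"), "%m/%d/%Y"),
  Year = c(2001, 2001, 2001, 2002, 2002, 2002, 2002, 2003, 2003, 2003),
  Catch = c(0, 7, 9, 12, 0, 5, 0, 0, 0, 15)
)

# Keep only rows with a positive catch
positive <- df[df$Catch > 0, ]

# Sort by year and date so the earliest date comes first within each year
positive <- positive[order(positive$Year, positive$Date), ]

# duplicated() flags every occurrence of a Year after the first one,
# so negating it keeps exactly the first positive-catch row per year
first_catch <- positive[!duplicated(positive$Year), c("Year", "Date")]
rownames(first_catch) <- NULL

print(first_catch)
```

Years with no positive catch simply do not appear in first_catch, since they are removed by the initial subset.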
Answer 4 (score: 0)
You can try the following:
df <- data %>%
group_by(Year) %>%
mutate(newCol=Date[Catch>0][1]) %>%
distinct(Year, newCol)