How to use a date as a filter

Time: 2018-11-19 17:21:59

Tags: r date dataframe matrix

I know very little about R and scripting, so I hope you will be patient with this basic question.

library(lubridate)
date.depature <- c("2016.06.16", "2016.11.16", "2017.01.05", "2017.01.12", "2017.02.25")
airport.departure <- c("CDG", "QNY", "QXO", "CDG", "QNY")
airport.arrival <- c("SYD", "CDG", "QNY", "SYD", "QXO")
amount <- c("1", "3", "1", "10", "5")
date.depature <- as_date(date.depature)
df <- data.frame(date.depature, airport.departure, airport.arrival, amount)

xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df)

With this code we get the sum of the amounts as a matrix, with the airports as rows and columns. Now I need the results only for:

  1. 2017
  2. 2017.01
  3. until 2017.01

3 Answers:

Answer 0: (score: 0)

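Why not coerce amount to class "integer" when creating df? Just remove the double quotes in

amount <- c("1", "3", "1", "10", "5")

or use

amount <- as.integer(c("1", "3", "1", "10", "5"))

This matters because as.integer(df$amount) does not return

c(1, 3, 1, 10, 5)

When you create the data frame df, the character vector is coerced to class "factor" (the default in R versions before 4.0.0, where stringsAsFactors = TRUE), so what you actually get is the integer level codes: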

as.integer(df$amount)
#[1] 1 3 1 2 4

The right way is

as.integer(as.character(df$amount))
#[1]  1  3  1 10  5

Or, more simply, create amount as numeric from the start:

date.depature <- c("2016.06.16", "2016.11.16", "2017.01.05", "2017.01.12", "2017.02.25")
airport.departure <- c("CDG", "QNY", "QXO", "CDG", "QNY")
airport.arrival <- c("SYD", "CDG", "QNY", "SYD", "QXO")
amount <- c(1, 3, 1, 10, 5)
date.depature <- as_date(date.depature)
df <- data.frame(date.depature, airport.departure, airport.arrival, amount)
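The factor coercion described above can be reproduced on any R version by requesting it explicitly. This is a minimal sketch: the names amount_chr, df_demo, codes and values are made up for illustration, and stringsAsFactors = TRUE is passed by hand to mimic the pre-4.0.0 default.

```r
# Illustrative names only; stringsAsFactors = TRUE mimics the old default.
amount_chr <- c("1", "3", "1", "10", "5")
df_demo <- data.frame(amount = amount_chr, stringsAsFactors = TRUE)

# Converting a factor directly yields its internal level codes.
# The levels sort as "1", "10", "3", "5", so "10" maps to code 2.
codes <- as.integer(df_demo$amount)

# Going through character first recovers the original numbers.
values <- as.integer(as.character(df_demo$amount))

print(codes)   # 1 3 1 2 4
print(values)  # 1 3 1 10 5
```

Summing codes instead of values inside xtabs would silently produce wrong totals, which is why amount is best stored as a number before tabulating.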

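Now for the question.

This is basically a subsetting problem.
Subset the data to extract the required years and months, then run the same xtabs command.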

df1 <- df[year(df$date.depature) == 2017, ]
df2 <- df1[month(df1$date.depature) == 1, ]
df3 <- rbind(df[year(df$date.depature) < 2017, ], df2)

Now run xtabs with the sub-data frames above.

xtabs(amount ~ airport.arrival + airport.departure, df1)
xtabs(amount ~ airport.arrival + airport.departure, df2)
xtabs(amount ~ airport.arrival + airport.departure, df3)
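The third case can also be written as a single date comparison instead of combining two subsets. This is a minimal sketch, assuming "until 2017.01" means everything up to and including January 2017. It uses base R's as.Date in place of lubridate, and the names until_jan and tab are illustrative.

```r
# Self-contained sketch using base R only (no lubridate needed).
date.depature <- as.Date(
  c("2016.06.16", "2016.11.16", "2017.01.05", "2017.01.12", "2017.02.25"),
  format = "%Y.%m.%d"
)
airport.departure <- c("CDG", "QNY", "QXO", "CDG", "QNY")
airport.arrival <- c("SYD", "CDG", "QNY", "SYD", "QXO")
amount <- c(1, 3, 1, 10, 5)
df <- data.frame(date.depature, airport.departure, airport.arrival, amount)

# Assumption: "until 2017.01" includes all of January 2017,
# so keep every row strictly before 1 February 2017.
until_jan <- df[df$date.depature < as.Date("2017-02-01"), ]

tab <- xtabs(amount ~ airport.arrival + airport.departure, until_jan)
print(tab)
```

Comparing against the first day of the following month avoids having to know how many days the month has.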

Answer 1: (score: 0)

You need to subset df on date.depature inside the xtabs call. For year == 2017:

xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df[year(df$date.depature)==2017,])

For year == 2017 and month == 1:

xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df[year(df$date.depature)==2017 & month(df$date.depature)==1,])

For anything before January 2017:

xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df[df$date.depature<as_date("2017-01-01"),])

Answer 2: (score: 0)

Since you are already using lubridate, I will show you one way to do this with dplyr (part of the tidyverse, like lubridate).

All the solutions apply. I use filter from dplyr together with the month, year and as_date functions from lubridate to build the conditions that filter the data, then use the pipe %>% to pass the result to the xtabs function.

library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#>     date
date.depature <- c("2016.06.16", "2016.11.16", "2017.01.05", "2017.01.12", "2017.02.25")
airport.departure <- c("CDG", "QNY", "QXO", "CDG", "QNY")
airport.arrival <- c("SYD", "CDG", "QNY", "SYD", "QXO")
amount <- c("1", "3", "1", "10", "5")
date.depature <- as_date(date.depature)
df <- data.frame(date.depature, airport.departure, airport.arrival, amount)

# For 2017
df %>%
  filter(year(date.depature) == 2017) %>%
  xtabs(as.integer(amount) ~ airport.arrival + airport.departure, .)
#>                airport.departure
#> airport.arrival CDG QNY QXO
#>             CDG   0   0   0
#>             QNY   0   0   1
#>             QXO   0   4   0
#>             SYD   2   0   0

# 2017.01
df %>%
  filter(year(date.depature) == 2017, month(date.depature) == 1) %>%
  xtabs(as.integer(amount) ~ airport.arrival + airport.departure, .)
#>                airport.departure
#> airport.arrival CDG QNY QXO
#>             CDG   0   0   0
#>             QNY   0   0   1
#>             QXO   0   0   0
#>             SYD   2   0   0

# until 2017.01
df %>%
  filter(date.depature <= as_date("2017.01.01")) %>%
  xtabs(as.integer(amount) ~ airport.arrival + airport.departure, .)
#>                airport.departure
#> airport.arrival CDG QNY QXO
#>             CDG   0   3   0
#>             QNY   0   0   0
#>             QXO   0   0   0
#>             SYD   1   0   0


Created on 2018-11-19 by the reprex package (v0.2.1)