I know very little about R and scripting, so please be patient with this basic question.
library(lubridate)
date.depature <- c("2016.06.16", "2016.11.16", "2017.01.05", "2017.01.12", "2017.02.25")
airport.departure <- c("CDG", "QNY", "QXO", "CDG", "QNY")
airport.arrival <- c("SYD", "CDG", "QNY", "SYD", "QXO")
amount <- c("1", "3", "1", "10", "5")
date.depature <- as_date(date.depature)
df <- data.frame(date.depature, airport.departure, airport.arrival, amount)
xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df)
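With this code we get the sums of the amounts as a matrix, with the airports as rows and columns. Now I only need the results for [...]

Answer 0 (score: 0)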
Why not coerce amount to class "integer" when creating df? Just drop the quotes:
amount <- c(1, 3, 1, 10, 5)
or
amount <- as.integer(c("1", "3", "1", "10", "5"))
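This matters because as.integer(df$amount) does not return
c(1, 3, 1, 10, 5)
When you create the data frame df, the character vector is coerced to class "factor" (the default before R 4.0.0, where stringsAsFactors = TRUE), so what you actually get is the factor's level codes:
as.integer(df$amount)
#[1] 1 3 1 2 4
The correct way is
as.integer(as.character(df$amount))
#[1] 1 3 1 10 5
or, more simply, create amount as numeric in the first place: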
date.depature <- c("2016.06.16", "2016.11.16", "2017.01.05", "2017.01.12", "2017.02.25")
airport.departure <- c("CDG", "QNY", "QXO", "CDG", "QNY")
airport.arrival <- c("SYD", "CDG", "QNY", "SYD", "QXO")
amount <- c(1, 3, 1, 10, 5)
date.depature <- as_date(date.depature)
df <- data.frame(date.depature, airport.departure, airport.arrival, amount)
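The factor pitfall described above can be reproduced in isolation. The following is a minimal sketch that builds the factor explicitly with factor(), so it behaves the same regardless of the R version; the variable names codes and values are only illustrative.

```r
# Build the factor explicitly, mimicking what data.frame() did by default
# before R 4.0.0 (stringsAsFactors = TRUE)
amount <- factor(c("1", "3", "1", "10", "5"))

# Levels are sorted as text, so "10" comes before "3"
print(levels(amount))

# as.integer() on a factor returns the level codes, not the amounts
codes <- as.integer(amount)

# Going through character first recovers the original numbers
values <- as.integer(as.character(amount))

print(codes)
print(values)
```

The codes come out as 1 3 1 2 4 because "10" is the second level and "5" the fourth, which is why the conversion via as.character() is needed.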
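Now for the question itself.
This is basically a subsetting problem.
Subset the data for the years and months you need, then run the same xtabs command.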
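# all of 2017
df1 <- df[year(df$date.depature) == 2017, ]
# January 2017 only
df2 <- df1[month(df1$date.depature) == 1, ]
# everything up to and including January 2017; rbind stacks the rows
# (cbind would paste the two data frames side by side)
df3 <- rbind(df[year(df$date.depature) < 2017, ], df2)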
Now run xtabs on each of the subsets above.
xtabs(amount ~ airport.arrival + airport.departure, df1)
xtabs(amount ~ airport.arrival + airport.departure, df2)
xtabs(amount ~ airport.arrival + airport.departure, df3)
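As a possible alternative to assembling the last subset from two pieces, the "up to and including January 2017" case can be written as a single date comparison. This is only a sketch under two assumptions: that "until 2017.01" is meant to include January 2017, and that the dates are parsed with base R's as.Date() and an explicit format instead of lubridate. The names until_jan_2017 and tab are illustrative.

```r
# Same sample data as in the question, with numeric amounts
df <- data.frame(
  date.depature     = as.Date(c("2016.06.16", "2016.11.16", "2017.01.05",
                                "2017.01.12", "2017.02.25"),
                              format = "%Y.%m.%d"),
  airport.departure = c("CDG", "QNY", "QXO", "CDG", "QNY"),
  airport.arrival   = c("SYD", "CDG", "QNY", "SYD", "QXO"),
  amount            = c(1, 3, 1, 10, 5)
)

# One logical condition: every departure before 1 February 2017
until_jan_2017 <- df$date.depature < as.Date("2017-02-01")

# Cross-tabulate the amounts for the selected rows only
tab <- xtabs(amount ~ airport.arrival + airport.departure,
             df[until_jan_2017, ])
print(tab)
```

Comparing against the first day of the following month avoids having to combine a year condition with a month condition.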
Answer 1 (score: 0)
You need to subset on date.depature inside the xtabs call. For year == 2017:
xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df[year(df$date.depature)==2017,])
For year == 2017 and month == 1:
xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df[year(df$date.depature)==2017 & month(df$date.depature)==1,])
For anything before January 2017:
xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df[df$date.depature<as_date("2017-01-01"),])
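If several periods are needed at once, one option is to add the period as a third dimension of the table instead of subsetting repeatedly. The sketch below is an assumption-laden illustration, not part of the original answer: it uses base R only, and the helper column ym and the object by_month are hypothetical names.

```r
# Same sample data as in the question, with numeric amounts
df <- data.frame(
  date.depature     = as.Date(c("2016.06.16", "2016.11.16", "2017.01.05",
                                "2017.01.12", "2017.02.25"),
                              format = "%Y.%m.%d"),
  airport.departure = c("CDG", "QNY", "QXO", "CDG", "QNY"),
  airport.arrival   = c("SYD", "CDG", "QNY", "SYD", "QXO"),
  amount            = c(1, 3, 1, 10, 5)
)

# Year-month label for each row, e.g. "2017-01"
df$ym <- format(df$date.depature, "%Y-%m")

# Three-way table: arrival x departure x year-month
by_month <- xtabs(amount ~ airport.arrival + airport.departure + ym, df)

# Slice out a single month as an ordinary arrival x departure matrix
jan_2017 <- by_month[, , "2017-01"]
print(jan_2017)

# Total amount per month
monthly_totals <- apply(by_month, 3, sum)
print(monthly_totals)
```

Each slice along the third dimension is the same arrival-by-departure matrix that a subset-then-xtabs call would give for that month, with all airports kept as rows and columns.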
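Answer 2 (score: 0)
Since you are already using lubridate, I'll show you one way using dplyr (part of the tidyverse, like lubridate).
All of the solutions above work. Here dplyr's filter function is combined with the month, year and as_date functions from lubridate to build the filter conditions, and the filtered data is then passed on to xtabs with the pipe %>%.
Note that amount is still created as character in the code below. The output shown was evidently produced with R before 4.0.0, where data.frame() turned it into a factor, so as.integer(amount) returns level codes: the 4 and 2 in the 2017 table stand for the amounts 5 and 10. Creating amount as numeric, as in the first answer, avoids this.
library(dplyr)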
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
date.depature <- c("2016.06.16", "2016.11.16", "2017.01.05", "2017.01.12", "2017.02.25")
airport.departure <- c("CDG", "QNY", "QXO", "CDG", "QNY")
airport.arrival <- c("SYD", "CDG", "QNY", "SYD", "QXO")
amount <- c("1", "3", "1", "10", "5")
date.depature <- as_date(date.depature)
df <- data.frame(date.depature, airport.departure, airport.arrival, amount)
# For 2017
df %>%
  filter(year(date.depature) == 2017) %>%
  xtabs(as.integer(amount) ~ airport.arrival + airport.departure, .)
#>                airport.departure
#> airport.arrival CDG QNY QXO
#>             CDG   0   0   0
#>             QNY   0   0   1
#>             QXO   0   4   0
#>             SYD   2   0   0
# 2017.01
df %>%
  filter(year(date.depature) == 2017, month(date.depature) == 1) %>%
  xtabs(as.integer(amount) ~ airport.arrival + airport.departure, .)
#>                airport.departure
#> airport.arrival CDG QNY QXO
#>             CDG   0   0   0
#>             QNY   0   0   1
#>             QXO   0   0   0
#>             SYD   2   0   0
# until 2017.01
df %>%
  filter(date.depature <= as_date("2017.01.01")) %>%
  xtabs(as.integer(amount) ~ airport.arrival + airport.departure, .)
#>                airport.departure
#> airport.arrival CDG QNY QXO
#>             CDG   0   3   0
#>             QNY   0   0   0
#>             QXO   0   0   0
#>             SYD   1   0   0
Created on 2018-11-19 by the reprex package (v0.2.1)