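I want to calculate the average weekly load of two container ships. One ship sails on Sunday, the other on Wednesday. I have a large Excel file with the bookings. I have put a small part of that file at the following link: https://docs.google.com/spreadsheets/d/1BxHTClTkrQzIzZzG5vXXnvKtV0_az83PGJ2ghBaAQr0/edit?usp=sharing
The first ship takes the containers that should be delivered on Monday (Mo), Tuesday (Di) and Wednesday (Mi). The second ship takes the containers required at the other port on Thursday (Do), Friday (Fr), Saturday (Sa) and Sunday (So). The data covers containers from 1 January 2017 to 31 July 2018, which is 82 whole weeks. I want to build a vector of length 82 in which each number equals the container demand on those days of that week. For example, the first number of the vector should be the container demand for Monday, Tuesday and Wednesday of the first week. So I want to create one vector per ship, containing the number of containers that ship should load: a vector of 82 weeks, so that I can see the weeks in which demand is low, compute the mean, and so on.
Can anybody help me?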
Here is the beginning of my code:
containers <- "https://docs.google.com/spreadsheets/d/1BxHTClTkrQzIzZzG5vXXnvKtV0_az83PGJ2ghBaAQr0/edit?usp=sharing"
#Containers between Rotterdam and Duisburg
containersRTMDUI <- subset(containers, containers$Laadhaven == "Rotterdam" & containers$Loshaven == "Duisburg")
# So far I did this with subsets, because I could not get a loop to work
# (comparison needs == or %in%, not =; a day can only match one of the codes)
Week1 <- subset(containersRTMDUI, containersRTMDUI$Datum1 >= "2017-01-02" &
containersRTMDUI$Datum1 < "2017-01-09" &
containersRTMDUI$Dag1 %in% c("Mo", "Di", "Mi"))
Week2 <- subset(etc..)
The difficulty, of course, is that on some days there is no demand.
Answer 0 (score: 1)
I think I've got it. One way to do this with data.table:
# read in data as a data.table
library(data.table)
dt <- data.table(read.csv("path/to/file", stringsAsFactors = FALSE))
# rename variables to English
# (there are shorter ways to do this, but I like to keep track)
setnames(dt, old = "ISO", new = "container_type")
setnames(dt, old = "F.E", new = "full_empty")
setnames(dt, old = "Gewicht", new = "weight")
setnames(dt, old = "Laadhaven", new = "pickup_port")
setnames(dt, old = "Laadterminal", new = "pickup_terminal")
setnames(dt, old = "Loshaven", new = "dropoff_port")
setnames(dt, old = "Losterminal", new = "dropoff_terminal")
setnames(dt, old = "Datum1", new = "pickup_date")
setnames(dt, old = "Dag1", new = "pickup_dow")
setnames(dt, old = "Datum2", new = "dropoff_date")
setnames(dt, old = "Dag2", new = "dropoff_dow")
# convert date variable to date-type (instead of factor/string)
dt[ , pickup_date := as.Date(pickup_date, "%d.%m.%Y")]
dt[ , dropoff_date := as.Date(dropoff_date, "%d.%m.%Y")]
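# create a running week index: week 1 starts on Monday 2017-01-02
# (lubridate::week() restarts at 1 every year and counts from 1 January,
# so it would merge 2017 and 2018 weeks and split Monday-Sunday weeks)
dt[ , week := as.integer(pickup_date - as.Date("2017-01-02")) %/% 7L + 1L]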
# create a logical variable (MTW) from the day of week
# MTW = TRUE for Mon, Tue, Wed; MTW = FALSE for Thu, Fri, Sat, Sun
dt[ , MTW := pickup_dow %in% c("Mo", "Di", "Mi")]
# count the number of rows by week and MTW
result <- dt[ , .(nrows = .N), by=.(week, MTW)]
# print result
result
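# fill in weeks with zero demand: build every (week, MTW) combination once
# (1:7 as in the original answer; use 1:82 for the full data set)
dt2 <- CJ(week = 1:7, MTW = c(TRUE, FALSE))
result <- merge(result, dt2, by = c("week", "MTW"), all = TRUE)
result[is.na(nrows), nrows := 0L]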
# print updated result
result
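The question asks for one vector per ship rather than a long table. Below is a minimal sketch of that last step on invented toy data. The names `toy`, `first_monday`, `n_weeks`, `ship1` and `ship2` are assumptions made for illustration, and the week index assumes that week 1 starts on Monday 2017-01-02, as in the question's `Week1` subset. For the full data set, `n_weeks` would be 82.

```r
library(data.table)

# Toy bookings (invented for illustration): one row per container
toy <- data.table(
  pickup_date = as.Date(c("2017-01-02", "2017-01-03", "2017-01-05",
                          "2017-01-08", "2017-01-13", "2017-01-18",
                          "2017-01-20")),
  pickup_dow  = c("Mo", "Di", "Do", "So", "Fr", "Mi", "Fr")
)

first_monday <- as.Date("2017-01-02")  # assumed start of week 1
n_weeks <- 3L                          # would be 82 for the full data set

# running Monday-based week index and ship assignment
toy[, week := as.integer(pickup_date - first_monday) %/% 7L + 1L]
toy[, ship := ifelse(pickup_dow %in% c("Mo", "Di", "Mi"), "ship1", "ship2")]

# count containers per week and ship
counts <- toy[, .(n = .N), by = .(week, ship)]

# every (week, ship) combination, so that weeks without demand show up as 0;
# rows outside 1..n_weeks (e.g. dates before first_monday) are dropped here
grid   <- CJ(week = seq_len(n_weeks), ship = c("ship1", "ship2"))
counts <- merge(grid, counts, by = c("week", "ship"), all.x = TRUE)
counts[is.na(n), n := 0L]

# one vector per ship, ordered by week
ship1 <- counts[ship == "ship1"][order(week), n]
ship2 <- counts[ship == "ship2"][order(week), n]

# weekly averages per ship
mean(ship1)
mean(ship2)
```

With the vectors in hand, something like `which(ship1 == min(ship1))` would point to the weeks with the lowest demand for the first ship.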