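Good evening,
I am trying to build the right loop (whether for, while or if) that, given a particular date and variable, sums the values in one column and places the result in a specific position in a data frame. All of my attempts at the loop have been badly wrong, and I clearly have not found the right starting point!
In the Sample.Frame below, I would like to populate the first row of the LON.NEW6 column with the sum of No.of.Rooms from Sample.Bookings where Stay.Date matches Start.Date and Legacy.Hotel.Code matches the column header. This needs to be repeated across the whole frame.
The partially populated Sample.Frame data frame: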
> Sample.Frame
Start.Date LON.NEW6 LON.CAP LON.CEN2 LON.MYH1 LON.SOS LON.MOW LON.HOLW LON.94V LON.FOU6 LON.949
E0-001-085571068-9 30/09/2015 NA NA NA NA NA NA NA NA NA NA
E0-001-086838711-7 07/11/2015 NA NA NA NA NA NA NA NA NA NA
E0-001-085536178-4@2015102019 20/10/2015 NA NA NA NA NA NA NA NA NA NA
E0-001-085466318-0 01/07/2016 NA NA NA NA NA NA NA NA NA NA
E0-001-085591039-5 30/01/2016 NA NA NA NA NA NA NA NA NA NA
E0-001-087500856-4 29/04/2016 NA NA NA NA NA NA NA NA NA NA
E0-001-079398784-2@2015092909 29/09/2015 NA NA NA NA NA NA NA NA NA NA
E0-001-086021337-5 14/10/2015 NA NA NA NA NA NA NA NA NA NA
E0-001-086639435-3 20/12/2015 NA NA NA NA NA NA NA NA NA NA
E0-001-087220018-9 27/10/2015 NA NA NA NA NA NA NA NA NA NA
The Sample.Bookings data frame:
> Sample.Bookings
City Booking.ID Legacy.Hotel.Code Star.Rating.ID Stay.Date No.of.Rooms
1146767 London 17480238 LON NEW6 2 30/09/2015 3
220037 London 18381583 LON CEN2 3 29/09/2015 1
668476 London 15184820 LON NEW6 2 07/11/2015 1
1073551 London 16414241 LON CEN2 3 01/07/2016 1
138695 London 554331 LON CAP 5 29/04/2016 1
301805 London 17134981 LON NEW6 2 30/09/2015 1
181300 London 193930 LON CEN2 3 01/07/2016 1
1204682 London 15154547 LON CAP 5 23/07/2015 1
1549067 London 14436933 LON NEW6 2 20/10/2015 1
832903 London 13796464 LON NEW6 2 20/10/2015 1
301778 London 16304861 LON NEW6 2 22/11/2015 1
399343 London 16855128 LON NEW6 2 07/11/2015 1
399337 London 14855974 LON NEW6 2 03/04/2015 1
1472157 London 18320357 LON NEW6 2 17/01/2016 1
1184525 London 18360304 LON CEN2 3 05/02/2016 1
1342678 London 17623052 LON CAP 5 01/02/2016 1
420443 London 18381583 LON CEN2 3 20/02/2016 1
1435511 London 15230186 LON NEW6 2 22/08/2015 3
1201521 London 16319154 LON NEW6 2 05/09/2015 1
1233528 London 15460211 LON NEW6 2 28/07/2015 1
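In the result, every element should either be populated with the sum or left as NA if there are no bookings corresponding to that date and variable. This is a small sample, but the set I am working with may have thousands of rows and hundreds of columns.
Thank you in advance for your help!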
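Answer 0 (score: 0)
You can get what you want with dplyr / tidyr:
library(dplyr)
library(tidyr)
# n() counts the bookings per date and hotel;
# use sum(No.of.Rooms) instead to total the rooms
Counts <- group_by(Sample.Bookings, Stay.Date, Legacy.Hotel.Code) %>%
  summarise(n = n()) %>%
  spread(Legacy.Hotel.Code, n)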
You can then bring in the row names from Sample.Frame (they look like the only information you need from it) using full_join:
Sample.Frame$Row.Name <- row.names(Sample.Frame)
full_join(Sample.Frame[, c("Row.Name", "Start.Date")], Counts, by = c("Start.Date" = "Stay.Date"))
Output:
Row.Name Start.Date LON.CAP LON.CEN2 LON.NEW6
1 E0-001-085571068-9 30/09/2015 NA NA 2
2 E0-001-086838711-7 07/11/2015 NA NA 2
3 E0-001-085536178-4@2015102019 20/10/2015 NA NA 2
4 E0-001-085466318-0 01/07/2016 NA 2 NA
5 E0-001-085591039-5 30/01/2016 NA NA NA
6 E0-001-087500856-4 29/04/2016 1 NA NA
7 E0-001-079398784-2@2015092909 29/09/2015 NA 1 NA
8 E0-001-086021337-5 14/10/2015 NA NA NA
9 E0-001-086639435-3 20/12/2015 NA NA NA
10 E0-001-087220018-9 27/10/2015 NA NA NA
11 <NA> 01/02/2016 1 NA NA
12 <NA> 03/04/2015 NA NA 1
13 <NA> 05/02/2016 NA 1 NA
14 <NA> 05/09/2015 NA NA 1
15 <NA> 17/01/2016 NA NA 1
16 <NA> 20/02/2016 NA 1 NA
17 <NA> 22/08/2015 NA NA 1
18 <NA> 22/11/2015 NA NA 1
19 <NA> 23/07/2015 1 NA NA
20 <NA> 28/07/2015 NA NA 1
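The output above holds booking counts and also includes dates that are not in Sample.Frame. If room totals restricted to the rows of Sample.Frame are wanted, a minimal sketch along the same lines could look like the following. It assumes dplyr 1.0 or later and tidyr 1.0 or later (for `.groups` and `pivot_wider`), and the two small data frames are hypothetical stand-ins for the real ones, not the data from the question.

```r
library(dplyr)
library(tidyr)

# Hypothetical miniature versions of the two data frames from the question
Sample.Frame <- data.frame(
  Row.Name   = c("E0-001-085571068-9", "E0-001-086838711-7", "E0-001-086021337-5"),
  Start.Date = c("30/09/2015", "07/11/2015", "14/10/2015"),
  stringsAsFactors = FALSE
)
Sample.Bookings <- data.frame(
  Legacy.Hotel.Code = c("LON.NEW6", "LON.NEW6", "LON.NEW6", "LON.CEN2"),
  Stay.Date         = c("30/09/2015", "30/09/2015", "07/11/2015", "29/09/2015"),
  No.of.Rooms       = c(3, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Sum the rooms per date and hotel, then spread the hotels into columns
Room.Sums <- Sample.Bookings %>%
  group_by(Stay.Date, Legacy.Hotel.Code) %>%
  summarise(Rooms = sum(No.of.Rooms), .groups = "drop") %>%
  pivot_wider(names_from = Legacy.Hotel.Code, values_from = Rooms)

# left_join keeps exactly the rows of Sample.Frame; unmatched cells stay NA
Result <- left_join(Sample.Frame, Room.Sums, by = c("Start.Date" = "Stay.Date"))
Result
```

Using left_join rather than full_join is a design choice here: it drops booking dates that have no row in Sample.Frame, which seems closer to what the question asks for.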
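Answer 1 (score: 0)
I did this using dcast from reshape2 together with the merge function. From the Sample.Frame data.frame, all you really need to keep is Start.Date and possibly the row names (if they carry relevant information).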
library(reshape2)
# all you need is the Start.Date and the row names
Sample.Frame <- data.frame(Start.Date = c("30/09/2015", "07/11/2015", "27/10/2015"))
Sample.Frame$row.names <- row.names(Sample.Frame)
# test data.frame
Sample.Bookings <- data.frame(Legacy.Hotel.Code = c("LON.NEW6", "LON.CAP"),
Stay.Date = c("30/09/2015", "07/11/2015"),
No.of.Rooms = c(3, 1))
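# use dcast to go back to wide format
# fun.aggregate = sum totals the rooms when a date/hotel pair has several bookings;
# fill = NA_real_ keeps pairs with no bookings as NA instead of 0
Sample.Bookings.cast <- dcast(Sample.Bookings, Stay.Date ~ Legacy.Hotel.Code,
                              value.var = "No.of.Rooms",
                              fun.aggregate = sum, fill = NA_real_)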
#Change Stay.Date to Start.Date so we can merge on that column
colnames(Sample.Bookings.cast)[which(colnames(Sample.Bookings.cast) == "Stay.Date")] <- "Start.Date"
#merge on Start.Date
Final.Date <- merge(Sample.Frame, Sample.Bookings.cast, by = "Start.Date", all.x = TRUE)
Final.Date
Start.Date row.names LON.CAP LON.NEW6
1 07/11/2015 2 1 NA
2 27/10/2015 3 NA NA
3 30/09/2015 1 NA 3
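If you would rather fill the existing Sample.Frame in place without extra packages, a rough base R sketch is shown below. It swaps in tapply and match for dcast and merge, so it is not the method used above. The row names, the duplicated booking, and the helper names Totals, hotel.cols and row.idx are made up for illustration.

```r
# Hypothetical small frame with the hotel columns already present as NA
Sample.Frame <- data.frame(
  Start.Date = c("30/09/2015", "07/11/2015", "27/10/2015"),
  LON.NEW6 = NA_real_, LON.CAP = NA_real_,
  row.names = c("E0-001-085571068-9", "E0-001-086838711-7", "E0-001-087220018-9"),
  stringsAsFactors = FALSE
)
# Hypothetical bookings, including two bookings for the same date and hotel
Sample.Bookings <- data.frame(
  Legacy.Hotel.Code = c("LON.NEW6", "LON.NEW6", "LON.CAP"),
  Stay.Date         = c("30/09/2015", "30/09/2015", "07/11/2015"),
  No.of.Rooms       = c(3, 1, 1),
  stringsAsFactors = FALSE
)

# Date x hotel matrix of room totals; combinations with no bookings are NA
Totals <- with(Sample.Bookings,
               tapply(No.of.Rooms, list(Stay.Date, Legacy.Hotel.Code), sum))

# For each row of Sample.Frame, find the matching date row in Totals (NA if none)
hotel.cols <- setdiff(names(Sample.Frame), "Start.Date")
row.idx <- match(Sample.Frame$Start.Date, rownames(Totals))

# Loop over hotel columns only; hotels without any booking keep their NA column
for (code in hotel.cols) {
  if (code %in% colnames(Totals)) {
    Sample.Frame[[code]] <- unname(Totals[row.idx, code])
  }
}
Sample.Frame
```

The loop runs once per hotel column rather than once per cell, so it should stay manageable with thousands of rows and hundreds of columns, and it preserves the row order and row names of Sample.Frame.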