我有一个称为“数据”的数据框,其中具有“日期,月份,排放量和排放站”列。另一个具有“月,W1_Percentile和B1_Percentile”列的数据框称为“ perc”。 W1_Percentile和B1_Percentile是每个计量站的每月百分比值。我希望我的最终输出具有与df(data)相同的列,并带有“ Percentile”的附加列,该列将具有相应月份和计量站的百分位数值(相应月份的每个计量站的百分数值存储在df(perc))。我应该遵循什么步骤?
以下是输入数据的示例:
date <- as.Date(c('1950-03-12','1954-03-23','1991-06-27','1997-09-04','1991-06-27','1987-05-06','1987-05-29','1856-07-08','1993-06-04', '2001-09-19','2001-05-06','2001-05-27'))
month <- c('Mar','Mar','Jun','Sep','Jun','May','May','Jul','Jun','Sep','May','May')
disch <- c(125,1535,1654,154,4654,453,1654,145,423,433,438,6426)
station <- c('W1','W1','W1','W1','W1','W1','B1','B1','B1','B1','B1','B1')
data <- data.frame("Date"= date, "Month" = month,"Discharge"=disch,"station"=station)
Date Month Discharge station
1 1950-03-12 Mar 125 W1
2 1954-03-23 Mar 1535 W1
3 1991-06-27 Jun 1654 W1
4 1997-09-04 Sep 154 W1
5 1991-06-27 Jun 4654 W1
6 1987-05-06 May 453 W1
7 1987-05-29 May 1654 B1
8 1856-07-08 Jul 145 B1
9 1993-06-04 Jun 423 B1
10 2001-09-19 Sep 433 B1
11 2001-05-06 May 438 B1
12 2001-05-27 May 6426 B1
Month <- c('Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec')
W1 <- c(106,313,531.40,164.10,40,23.39,18.30,24,16,16,12,34)
B1 <- c(1330,1550,1948,1880,1260,853.15,680.15,486.10,503,625,738,1070)
perc <- data.frame("Month"=Month,"W1_Percentile"=W1,"B1_Percentile"=B1)
Month W1_Percentile B1_Percentile
1 Jan 106.00 1330.00
2 Feb 313.00 1550.00
3 Mar 531.40 1948.00
4 Apr 164.10 1880.00
5 May 40.00 1260.00
6 Jun 23.39 853.15
7 Jul 18.30 680.15
8 Aug 24.00 486.10
9 Sep 16.00 503.00
10 Oct 16.00 625.00
11 Nov 12.00 738.00
12 Dec 34.00 1070.00
这是我希望最终输出看起来像的样子:
Date Month Discharge station Percentile
1 1950-03-12 Mar 125 W1 531.40
2 1954-03-23 Mar 1535 W1 531.40
3 1991-06-27 Jun 1654 W1 23.39
4 1997-09-04 Sep 154 W1 16.00
5 1991-06-27 Jun 4654 W1 23.39
6 1987-05-06 May 453 W1 40.00
7 1987-05-29 May 1654 B1 1260.00
8 1856-07-08 Jul 145 B1 680.15
9 1993-06-04 Jun 423 B1 853.15
10 2001-09-19 Sep 433 B1 503.00
11 2001-05-06 May 438 B1 1260.00
12 2001-05-27 May 6426 B1 1260.00
答案 0 :(得分:0)
我们需要首先将您的perc
数据转换为长格式,以便我们拥有要添加到data
的列,然后这是一个简单的联接:
library(tidyr)
library(dplyr)
# make the column names the same as the values in data
names(perc)[2:3] = c("W1", "B1")
# convert to long format
perc_long = gather(perc, key = "station", value = "percentile", W1, B1)
# join
left_join(data, perc_long)
# Joining, by = c("Month", "station")
# Date Month Discharge station percentile
# 1 1950-03-12 Mar 125 W1 531.40
# 2 1954-03-23 Mar 1535 W1 531.40
# 3 1991-06-27 Jun 1654 W1 23.39
# 4 1997-09-04 Sep 154 W1 16.00
# 5 1991-06-27 Jun 4654 W1 23.39
# 6 1987-05-06 May 453 W1 40.00
# 7 1987-05-29 May 1654 B1 1260.00
# 8 1856-07-08 Jul 145 B1 680.15
# 9 1993-06-04 Jun 423 B1 853.15
# 10 2001-09-19 Sep 433 B1 503.00
# 11 2001-05-06 May 438 B1 1260.00
# 12 2001-05-27 May 6426 B1 1260.00
执行这些操作的方法有很多,实质上是两个R-FAQ的组合。有关其他参考,请参见