Data sample:
Date Value GeographyName Newdate
<chr> <dbl> <chr> <int>
1 2011/12 0.698 NHS Wigan Borough CCG 2012
2 2011/12 0.674 NHS Gateshead CCG 2012
3 2012/13 0.775 NHS North Hampshire CCG 2013
4 2012/13 0.686 NHS St Helens CCG 2013
5 2012/13 0.716 NHS Wakefield CCG 2013
6 2012/13 0.750 NHS West Lancashire CCG 2013
7 2012/13 0.722 NHS Hull CCG 2013
8 2013/14 0.746 NHS Brent CCG 2014
9 2013/14 0.776 NHS Hambleton, Richmondshire and Whitby CCG 2014
10 2013/14 0.686 NHS Barnsley CCG 2014
I would like to duplicate the rows whose Newdate is 2012 three times, giving six new rows in total. However, two of the new rows should have a Newdate value of 2011, another two a value of 2010, and the last two a value of 2009. Is there a way to achieve this as part of the duplication?
Answer 0 (score: 1)
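dplyr::bind_rows offers a flexible way to bind the rows of multiple data frames. First filter df down to the rows where Newdate == 2012, then use bind_rows to stack that subset three times. Modify Newdate as the OP describes, and finally bind the result to the original df.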
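library(dplyr)
df %>% filter(Newdate == 2012) %>%   # keep only the two 2012 rows
  bind_rows(., ., .) %>%             # stack the subset three times (6 rows)
  mutate(Newdate = Newdate - (row_number() + 1) %/% 2) %>%   # subtract 1, 1, 2, 2, 3, 3
  bind_rows(df, .)                   # append the new rows to the original df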
# Date Value GeographyName Newdate
# 1 2011/12 0.698 NHS Wigan Borough CCG 2012
# 2 2011/12 0.674 NHS Gateshead CCG 2012
# 3 2012/13 0.775 NHS North Hampshire CCG 2013
# 4 2012/13 0.686 NHS St Helens CCG 2013
# 5 2012/13 0.716 NHS Wakefield CCG 2013
# 6 2012/13 0.750 NHS West Lancashire CCG 2013
# 7 2012/13 0.722 NHS Hull CCG 2013
# 8 2013/14 0.746 NHS Brent CCG 2014
# 9 2013/14 0.776 NHS Hambleton, Richmondshire and Whitby CCG 2014
# 10 2013/14 0.686 NHS Barnsley CCG 2014
# 11 2011/12 0.698 NHS Wigan Borough CCG 2011
# 12 2011/12 0.674 NHS Gateshead CCG 2011
# 13 2011/12 0.698 NHS Wigan Borough CCG 2010
# 14 2011/12 0.674 NHS Gateshead CCG 2010
# 15 2011/12 0.698 NHS Wigan Borough CCG 2009
# 16 2011/12 0.674 NHS Gateshead CCG 2009
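The offset inside mutate() can be hard to read at a glance, so here is a small base R sketch of the arithmetic on its own. The names i, offset, n and general_offset are illustrative and not part of the answer. The original formula relies on the filtered subset having exactly two rows; the general form shown for n source rows is an assumption about how one might extend it, not something the answer states.

```r
# Row positions of the six stacked rows (two source rows, stacked three times).
i <- 1:6

# Offset used in the answer: integer division pairs the rows up.
offset <- (i + 1) %/% 2
print(offset)          # 1 1 2 2 3 3

# Subtracting the offset from 2012 gives the requested years.
print(2012 - offset)   # 2011 2011 2010 2010 2009 2009

# Hypothetical generalisation for n source rows per copy (here n = 2).
n <- 2
general_offset <- (i - 1) %/% n + 1
print(general_offset)  # 1 1 2 2 3 3
```

With a different number of matching rows, the pairing by 2 would no longer line up with the copies, which is why the general form divides by n instead.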
Data:
df <- read.table(text =
"Date Value GeographyName Newdate
1 2011/12 0.698 'NHS Wigan Borough CCG' 2012
2 2011/12 0.674 'NHS Gateshead CCG' 2012
3 2012/13 0.775 'NHS North Hampshire CCG' 2013
4 2012/13 0.686 'NHS St Helens CCG' 2013
5 2012/13 0.716 'NHS Wakefield CCG' 2013
6 2012/13 0.750 'NHS West Lancashire CCG' 2013
7 2012/13 0.722 'NHS Hull CCG' 2013
8 2013/14 0.746 'NHS Brent CCG' 2014
9 2013/14 0.776 'NHS Hambleton, Richmondshire and Whitby CCG' 2014
10 2013/14 0.686 'NHS Barnsley CCG' 2014",
stringsAsFactors = FALSE, header = TRUE)
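For comparison, the same result can be sketched in base R without dplyr. This is an alternative, not the answer's method: it builds one copy of the 2012 rows per target year and sets Newdate directly instead of deriving it from row positions. The small data frame below is a three-row stand-in for the data above, and target_years, src, copies and result are illustrative names.

```r
# Three-row stand-in for the data above, so the sketch runs on its own.
df <- data.frame(
  Date = c("2011/12", "2011/12", "2012/13"),
  Value = c(0.698, 0.674, 0.775),
  GeographyName = c("NHS Wigan Borough CCG", "NHS Gateshead CCG",
                    "NHS North Hampshire CCG"),
  Newdate = c(2012L, 2012L, 2013L),
  stringsAsFactors = FALSE
)

target_years <- c(2011L, 2010L, 2009L)  # one copy of the 2012 rows per year
src <- df[df$Newdate == 2012, ]         # rows to duplicate

# Build one copy per target year, overwriting Newdate in each copy.
copies <- lapply(target_years, function(y) {
  out <- src
  out$Newdate <- y
  out
})

# Append all copies below the original rows and reset the row names.
result <- do.call(rbind, c(list(df), copies))
rownames(result) <- NULL
print(result)
```

Because each copy gets its year explicitly, this version does not depend on how many rows match 2012.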