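I currently have a data frame named liquidation. I want to draw 30 random samples of 1000 rows each from it, record which sample each account came from, and then combine all 30 samples into one new data frame.
Here is how I did it manually, using the dplyr package for the random sampling, but I would like to simplify it so it is easy to repeat: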
Sample_1 <- liquidation %>%
sample_n(1000)
Sample_1$Obs <- 1
Sample_2 <- liquidation %>%
sample_n(1000)
Sample_2$Obs <- 2
Sample_3 <- liquidation %>%
sample_n(1000)
Sample_3$Obs <- 3
....
Sample_30 <- liquidation %>%
sample_n(1000)
Sample_30$Obs <- 30
Then I combine them into a single data frame:
Combined <- rbind(Sample_1, Sample_2, Sample_3, Sample_4, Sample_5, Sample_6, Sample_7, Sample_8, Sample_9, Sample_10,
Sample_11, Sample_12, Sample_13, Sample_14, Sample_15, Sample_16, Sample_17, Sample_18, Sample_19,
Sample_20, Sample_21, Sample_22, Sample_23, Sample_24, Sample_25, Sample_26, Sample_27, Sample_28,
Sample_29, Sample_30)
str(Combined)
'data.frame': 30000 obs. of 31 variables:
Answer 0 (score: 3)
Here is an example using mtcars (randomly selecting 5 rows, 10 times):
Combined <- bind_rows(replicate(10, mtcars %>% sample_n(5), simplify = FALSE), .id = "Obs")
We use the base function replicate() to repeat the sampling multiple times. Then we use dplyr's bind_rows() to combine the samples and keep track of which sample each row came from.
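Applied to the sizes in the question (30 samples of 1000 rows), the replicate() and bind_rows() approach might look like the sketch below. The liquidation data frame built here is a hypothetical stand-in with made-up columns, and the seed value is arbitrary; it is only there so the draws can be repeated.

```r
library(dplyr)

# Hypothetical stand-in for the real liquidation data frame
liquidation <- data.frame(
  account = seq_len(5000),
  balance = runif(5000)
)

set.seed(42)  # arbitrary seed so the same samples can be drawn again

# replicate() returns a list of 30 data frames (simplify = FALSE keeps it a list);
# bind_rows() stacks them and writes each list position into the Obs column
Combined <- bind_rows(
  replicate(30, liquidation %>% sample_n(1000), simplify = FALSE),
  .id = "Obs"
)

nrow(Combined)  # 30 * 1000 = 30000 rows
```

Each call to sample_n() draws without replacement within a sample, but the same account can appear in several of the 30 samples.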
Answer 1 (score: 1)
You should be able to wrap this in a function (assuming Sample_20 and the like are temporary and you won't need them later):
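library(dplyr)
sampling <- function(x, nSamples = 30, nRows = 1000) {
  # Draw nSamples samples of nRows rows each, tag each with its sample number, and stack them
  do.call('rbind', lapply(seq_len(nSamples), function(n) {
    x %>% sample_n(nRows) %>% mutate(Obs = n)
  }))
}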
You can then run:
combined <- sampling(liquidation)
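If the goal is to get the same samples on every run, one option is to set a seed before calling the function. The sketch below repeats the sampling() function so that it runs on its own; the liquidation data frame is a hypothetical stand-in and the seed value is arbitrary.

```r
library(dplyr)

# Same function as above, repeated so the sketch is self-contained
sampling <- function(x, nSamples = 30, nRows = 1000) {
  do.call('rbind', lapply(seq_len(nSamples), function(n) {
    x %>% sample_n(nRows) %>% mutate(Obs = n)
  }))
}

# Hypothetical stand-in for the real liquidation data frame
liquidation <- data.frame(
  account = seq_len(5000),
  balance = rnorm(5000)
)

set.seed(123)  # arbitrary seed
first_run <- sampling(liquidation)

set.seed(123)  # resetting the seed repeats the same draws
second_run <- sampling(liquidation)

identical(first_run, second_run)  # compares the two runs
table(first_run$Obs)              # row count per sample
```

Without the set.seed() call, each run of sampling() would return a different set of rows.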