Good afternoon. My question is as follows:
I have a dataset named friends:
# data.frame() is used here because tibble's data_frame() is deprecated
friends <- data.frame(
  name = c("Nicolas", "Thierry", "Bernard", "Jerome", "peter", "yassine", "karim"),
  age = c(27, 26, 30, 31, 31, 38, 39),
  height = c(180, 178, 190, 185, 187, 160, 158),
  married = c("M", "M", "N", "N", "N", "M", "M")
)
library(intervals)  # assumed source of Intervals(); not stated in the original post
i <- Intervals(
  matrix(
    c(0, 5000,
      0, 5000,
      7000, 10000,
      7000, 10000,
      7000, 10000,
      10000, 15000,
      10000, 15000),
    byrow = TRUE,
    ncol = 2
  ),
  closed = c(TRUE, TRUE),
  type = "R"
)
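I need to write a function that takes this dataset as its argument.
On each call, the function should sample one row that has not been sampled before (for example, once the fourth row has been drawn, later calls must not select it again) and then perform some processing on it.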
sampling_fct <- function(data) {
  # draws one random row; the goal is to sample a given row only one time
  data[sample(nrow(data), 1), ]
}
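With 5 rows, the successive selections should look something like:
data[3]
data[2]
data[5]
data[4]
data[1]
where data = friends.
No row should appear more than once.
I hope my question is clear.
Thank you!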
Answer 0: (score: 1)
I think you are looking for something like this:
# Input data
friends <- data.frame(
  name = c("Nicolas", "Thierry", "Bernard", "Jerome", "peter", "yassine", "karim"),
  age = c(27, 26, 30, 31, 31, 38, 39),
  height = c(180, 178, 190, 185, 187, 160, 158),
  married = c("M", "M", "N", "N", "N", "M", "M")
)
# Random row draw function
# Takes the data frame and a vector of forbidden row indices as input
tst_func <- function(data, verbot_list) {
  if (length(verbot_list) == nrow(data)) {
    stop("ERROR: no possible rows left to be sampled.")
  } else {
    # Redraw until we hit a row index that has not been used yet
    repeat {
      curnum <- sample(seq_len(nrow(data)), 1)
      if (!(curnum %in% verbot_list)) {
        break
      }
    }
    verbot_list <- c(verbot_list, curnum)
    # Return the drawn row together with the updated exclusion vector
    return(list(data[curnum, ], verbot_list))
  }
}
# Initialize the empty vector (in the parent environment) of rows that can no longer be drawn
rm_list <- c()
# Example run
tstval <- tst_func(friends, rm_list)
tstrow <- tstval[[1]]
tstrow
#      name age height married
# 1 Nicolas  27    180       M
rm_list <- tstval[[2]]
rm_list
# [1] 1
Once all possible rows have been (randomly) drawn:
rm_list
# [1] 1 5 3 4 6 2 7
the function exits with an error:
tstval <- tst_func(friends, rm_list)
# Error in tst_func(friends, rm_list) :
# ERROR: no possible rows left to be sampled.
(To draw random rows repeatedly, simply call the function inside a loop.)
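As a rough sketch of what such a loop could look like: the helper `draw_row()` below is a hypothetical stand-in with the same contract as `tst_func` (it returns the drawn row plus the updated exclusion vector), but it swaps the redraw-until-unused loop for `setdiff()` over the remaining indices. The three-row data frame is toy data for illustration only.

```r
# Toy data standing in for `friends` (illustration only)
friends <- data.frame(
  name = c("Nicolas", "Thierry", "Bernard"),
  age  = c(27, 26, 30)
)

# Hypothetical helper with the same contract as tst_func:
# returns list(drawn_row, updated_exclusion_vector)
draw_row <- function(data, excluded) {
  remaining <- setdiff(seq_len(nrow(data)), excluded)
  if (length(remaining) == 0) {
    stop("no possible rows left to be sampled.")
  }
  # Pick a position within `remaining`; this avoids sample()'s special
  # handling of a single number when only one index is left
  curnum <- remaining[sample.int(length(remaining), 1)]
  list(data[curnum, ], c(excluded, curnum))
}

rm_list <- integer(0)
drawn <- vector("list", nrow(friends))
for (k in seq_len(nrow(friends))) {
  res <- draw_row(friends, rm_list)
  drawn[[k]] <- res[[1]]   # the sampled row
  rm_list <- res[[2]]      # carry the exclusions into the next call
}

# All rows, in the random order in which they were drawn
result <- do.call(rbind, drawn)
```

The key point is that the exclusion vector has to be reassigned after every call; otherwise each call starts from the same state and repeats become possible again.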
Answer 1: (score: 0)
To ensure that a given row is sampled only once, you can use sample() with replace = FALSE
(see: R Examples of sample()).
Given the dataset, consider using:
friends <- data.frame(
  name = c("Nicolas", "Thierry", "Bernard", "Jerome", "peter", "yassine", "karim"),
  age = c(27, 26, 30, 31, 31, 38, 39),
  height = c(180, 178, 190, 185, 187, 160, 158),
  married = c("M", "M", "N", "N", "N", "M", "M")
)
sampling_fct <- function(data) {
  # replace = FALSE guarantees that no row is drawn twice within one call
  data[sample(nrow(data), size = 6, replace = FALSE), ]
}
# Each list element is sampled independently of the others
mylist <- list(friends, friends, friends)
mylist_sampled <- lapply(mylist, sampling_fct)
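If the goal is one new row per call, as in the question, one possible approach is to shuffle the row indices once with `sample.int()` and hand them out one at a time. The sketch below is only an illustration: `make_sampler()` and `next_friend()` are hypothetical names that do not appear in the question or in either answer, and the data frame is shortened.

```r
friends <- data.frame(
  name = c("Nicolas", "Thierry", "Bernard", "Jerome", "peter"),
  age  = c(27, 26, 30, 31, 31)
)

# Hypothetical helper: returns a function that yields one previously
# unseen row per call and fails once every row has been returned
make_sampler <- function(data) {
  row_order <- sample.int(nrow(data))  # random permutation, so no repeats
  pos <- 0
  function() {
    if (pos >= length(row_order)) {
      stop("no rows left to sample")
    }
    pos <<- pos + 1                    # remember progress between calls
    data[row_order[pos], , drop = FALSE]
  }
}

next_friend <- make_sampler(friends)
first  <- next_friend()
second <- next_friend()  # guaranteed to be a different row than `first`
```

Because the permutation is fixed when the sampler is created, no exclusion list has to be passed around by the caller. If all rows are needed at once, `friends[sample(nrow(friends)), ]` returns the whole data frame in random order.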