Bootstrapping a 4x4 matrix with fixed row and column sums

Asked: 2014-05-11 03:08:20

Tags: r

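I would like to know whether it is possible to shuffle a 4x4 data set while keeping the row and column sums constant. Admittedly, I am a beginner at programming, so the code included below may not be easy to follow.

Any help would be greatly appreciated, thanks.

PS: If you must know, the data set is a survey of car preference by race.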

CarPreference <- read.table ( text = "
African 3 0 1 1
Asian 2 1 0 1
Hispanic 0 1 3 1
White 0 1 4 1
" )

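# use the first column (the group label) as row names, then drop it
row.names(CarPreference) <- CarPreference[, 1]
colnames(CarPreference) <- c("Car Type", "Car", "Truck", "SUV", "Motorcycle")

CarPreference <- CarPreference[, -1]
# as.matrix() does not modify its argument, so the result is assigned back
CarPreference <- as.matrix(CarPreference)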

observed <- rbind(c(3,0,1,1),c(2,1,0,1),c(0,1,3,1),c(0,1,4,1))
deals=10000
observed.boot = array(NA,c(4,4,deals))
H0 <- c(rep(1,colSums(observed)[1]),rep(0,colSums(observed)[2]),rep(1,colSums(observed)[3]),rep(0,colSums(observed)[4]))
for (i in 1:deals)
{
  data.boot <- sample(H0,sum(observed),replace=FALSE)

  row1.boot <- data.boot[1:rowSums(observed)[1]]
  row2.boot <- data.boot[(rowSums(observed)[1]+1):(rowSums(observed)[1]+rowSums(observed)[2])]
  row3.boot <- data.boot[(rowSums(observed)[1]+rowSums(observed)[2]+1):(rowSums(observed)[1]+rowSums(observed)[2]+rowSums(observed)[3])]
  row4.boot <- data.boot[(rowSums(observed)[1]+rowSums(observed)[2]+rowSums(observed)[3]+1):sum(observed)]

  col1.boot <- data.boot[1:colSums(observed)[1]]
  col2.boot <- data.boot[(colSums(observed)[1]+1):(colSums(observed)[1]+colSums(observed)[2])]
  col3.boot <- data.boot[(colSums(observed)[1]+colSums(observed)[2]+1):(colSums(observed)[1]+colSums(observed)[2]+colSums(observed)[3])]
  col4.boot <- data.boot[(colSums(observed)[1]+colSums(observed)[2]+colSums(observed)[3]+1):sum(observed)]

  # the third and fourth entries of each row are still empty here
  observed.boot[,,i] <- rbind(
    c(sum(row1.boot),length(row1.boot)-sum(row1.boot), , ),
    c(sum(row2.boot),length(row2.boot)-sum(row2.boot), , ),
    c(sum(row3.boot),length(row3.boot)-sum(row3.boot), , ),
    c(sum(row4.boot),length(row4.boot)-sum(row4.boot), , ))
}

1 answer:

Answer 0 (score: 1)

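Boiling it down, you want to randomly re-pair the row labels of the observations with their column labels, without changing how often each label occurs. You can do this by building a vector x of row indices and a vector y of column indices, with one entry per observation, and repeatedly shuffling y while x stays fixed: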

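set.seed(144)
observed <- rbind(c(3,0,1,1),c(2,1,0,1),c(0,1,3,1),c(0,1,4,1))
# one row index per observation: row i is repeated rowSums(observed)[i] times
x <- rep(1:nrow(observed), rowSums(observed))
# one column index per observation: column j is repeated colSums(observed)[j] times
y <- rep(1:ncol(observed), colSums(observed))
# shuffle y against the fixed x and cross-tabulate, 10000 times
samples <- lapply(1:10000, function(a) table(x, sample(y)))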

Now samples contains a list of bootstrapped tables whose row and column sums match those of observed.

samples[[1]] 
# x   1 2 3 4
#   1 1 1 2 1
#   2 0 0 2 2
#   3 2 0 2 1
#   4 2 2 2 0
samples[[10000]]
# x   1 2 3 4
#   1 1 1 2 1
#   2 2 1 1 0
#   3 1 1 2 1
#   4 1 0 3 2
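
If you want to convince yourself that every table keeps the margins, a quick check along these lines should do. This is only a sketch: it rebuilds the objects so it runs on its own, uses 1000 tables to keep it fast, and margins_ok is just a name picked for illustration.

```r
set.seed(144)
observed <- rbind(c(3,0,1,1), c(2,1,0,1), c(0,1,3,1), c(0,1,4,1))
x <- rep(seq_len(nrow(observed)), rowSums(observed))
y <- rep(seq_len(ncol(observed)), colSums(observed))
samples <- lapply(1:1000, function(a) table(x, sample(y)))

# for each shuffled table, compare its row and column sums with the original
margins_ok <- vapply(samples, function(tab) {
  all(rowSums(tab) == rowSums(observed)) &&
    all(colSums(tab) == colSums(observed))
}, logical(1))

all(margins_ok)  # TRUE when every table preserves both sets of sums
```

Because sample(y) only reorders y, each column index still occurs the same number of times, and x never changes, so both sets of sums are preserved by construction.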

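This is the same as randomly sampling from the contingency tables that have the same row and column sums as the original table. The draw is not uniform over those tables: each table comes up with the probability it has when rows and columns are independent.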
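
As an aside, base R's stats package also has r2dtable(), which generates random two-way tables with given marginals (its documentation says it uses Patefield's algorithm). A minimal sketch, assuming the same observed matrix as in the question; the name tables is only for illustration:

```r
observed <- rbind(c(3,0,1,1), c(2,1,0,1), c(0,1,3,1), c(0,1,4,1))

set.seed(144)
# r2dtable(n, r, c): n random tables with row totals r and column totals c
tables <- r2dtable(10000, rowSums(observed), colSums(observed))

tables[[1]]           # a 4x4 integer matrix
rowSums(tables[[1]])  # matches rowSums(observed)
colSums(tables[[1]])  # matches colSums(observed)
```

The result is a list of plain matrices rather than table objects, which can be more convenient if you want to stack them into an array afterwards.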