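I'm new to R, so please keep it simple! I have two datasets in which two different samples (men and women) were asked the same questions, so the column names are identical. I want to run a t-test comparing the means of any given column across the two datasets, but I don't know how to combine them into a single dataset in a useful way. I tried merge and rbind, but neither did what I wanted.
Here is one column from dataset 1. I want to compare it with...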
structure(list(UVRATE1 = c(6, 6, 3, 7, 7, 7, 4, 6, 6, 6, 6, 4,
7, 4, 1, 5, 6)), class = "data.frame", row.names = c(NA, -17L
))
... this column from dataset 2 (as you can see, the column names are the same).
structure(list(UVRATE2 = c(4, 1, 3, 5, 6, 7, 7, 4, 7, 4, 7, 7,
4, 4, 5, 1, 4)), class = "data.frame", row.names = c(NA, -17L
))
Answer 0: (score: 4)
You can build a single combined data frame and pass the two groups directly to t.test for an unpaired two-sample t-test:
dataset1 <- data.frame (UVRATE1 = c(38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8, 48.5))
# dataset1$UVRATE1
# [1] 38.9 61.2 73.3 21.8 63.4 64.6 48.4 48.8 48.5
dataset2 <- data.frame (UVRATE1 = c(67.8, 60, 63.4, 76, 89.4, 73.3, 67.3, 61.3, 62.4))
# dataset2$UVRATE1
# [1] 67.8 60.0 63.4 76.0 89.4 73.3 67.3 61.3 62.4
# Create a merged data frame
my_data <- data.frame(
group = rep(c("Woman", "Man"), each = 9),
weight = c(dataset1$UVRATE1, dataset2$UVRATE1)
)
# my_data
# group weight
# 1 Woman 38.9
# 2 Woman 61.2
# 3 Woman 73.3
# 4 Woman 21.8
# 5 Woman 63.4
# 6 Woman 64.6
# 7 Woman 48.4
# 8 Woman 48.8
# 9 Woman 48.5
# 10 Man 67.8
# 11 Man 60.0
# 12 Man 63.4
# 13 Man 76.0
# 14 Man 89.4
# 15 Man 73.3
# 16 Man 67.3
# 17 Man 61.3
# 18 Man 62.4
# Compute t-test
res <- t.test(my_data[my_data$group == "Woman", ]$weight,
              my_data[my_data$group == "Man", ]$weight,
              var.equal = TRUE)
res
# Two Sample t-test
#
# data: my_data[my_data$group == "Woman", ]$weight and my_data[my_data$group == "Man", ]$weight
# t = -2.7842, df = 16, p-value = 0.01327
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
# -29.748019 -4.029759
# sample estimates:
# mean of x mean of y
# 52.10000 68.98889
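Don't forget to check the assumptions: with var.equal = TRUE, the test assumes that both groups are approximately normally distributed and have equal variances.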
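As a rough sketch of what checking those assumptions could look like, the snippet below runs a Shapiro-Wilk test on each group and an F test for equal variances, using the same illustrative values as above. The variable names (women, men, sw_women, sw_men, f_res, welch) are made up for this example, and these particular tests are only one common choice, not the only way to check.

```r
# Same illustrative values as in the answer above
women <- c(38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8, 48.5)
men   <- c(67.8, 60, 63.4, 76, 89.4, 73.3, 67.3, 61.3, 62.4)

# Normality within each group (Shapiro-Wilk test)
sw_women <- shapiro.test(women)
sw_men   <- shapiro.test(men)

# Equality of variances between the two groups (F test)
f_res <- var.test(women, men)

sw_women$p.value
sw_men$p.value
f_res$p.value

# If equal variances look doubtful, Welch's t-test (the t.test default,
# i.e. var.equal = FALSE) does not rely on that assumption
welch <- t.test(women, men)
welch
```

Small p-values from the Shapiro-Wilk or F tests would suggest the corresponding assumption is questionable; with samples this small, a quick look at a boxplot or histogram is usually worth doing as well.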
Answer 1: (score: 0)
Code:
# dataframe 1
dataset_1 <- data.frame(UVRATE1= c(6, 6, 3, 7, 7, 7, 4, 6, 6, 6, 6, 4, 7, 4, 1, 5, 6))
# dataframe 2
dataset_2 <- data.frame(UVRATE1= c(4, 1, 3, 5, 6, 7, 7, 4, 7, 4, 7, 7, 4, 4, 5, 1, 4))
# change name of column in dataset2
colnames(dataset_2)[1] <- "UVRATE2"
# combine to one dataframe
df <- cbind(dataset_1, dataset_2)
# t-test
t.test(df$UVRATE1,df$UVRATE2)
Output:
Welch Two Sample t-test
data: df$UVRATE1 and df$UVRATE2
t = 1.0394, df = 31.128, p-value = 0.3066
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.622388 1.916506
sample estimates:
mean of x mean of y
5.352941 4.705882
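Since the question mentions that rbind did not behave as hoped, here is a minimal sketch of a stacked (long-format) alternative to cbind: tag each data set with a group label, stack the rows, and use the formula interface of t.test. The column name group and its labels are hypothetical choices for this example. Unlike cbind, this layout also works when the two samples have different numbers of rows.

```r
# The two data sets from the question, both with the same column name
dataset_1 <- data.frame(UVRATE1 = c(6, 6, 3, 7, 7, 7, 4, 6, 6, 6, 6, 4, 7, 4, 1, 5, 6))
dataset_2 <- data.frame(UVRATE1 = c(4, 1, 3, 5, 6, 7, 7, 4, 7, 4, 7, 7, 4, 4, 5, 1, 4))

# Tag each data set before stacking ("group" is a hypothetical column name)
dataset_1$group <- "dataset_1"
dataset_2$group <- "dataset_2"

# rbind works here because both data frames now share the same columns
combined <- rbind(dataset_1, dataset_2)

# Formula interface: response ~ grouping variable (Welch test by default)
res <- t.test(UVRATE1 ~ group, data = combined)
res
```

This should report the same Welch statistic as the cbind version above, since only the layout of the data changes, not the test.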