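For the data frame below, I want to run Kolmogorov-Smirnov tests on multiple columns. Column ID is the record ID, and A-D are factors with 2 levels each ('other', coded as O, and A, B, C, or D respectively). My test variable is in column E.
Now I want to run 4 KS tests, one for each of the columns A-D.
In reality I have 80 such columns, so I am looking for a way to run these 80 tests "at once".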
  ID A B C D  E
1  1 O B C O  1
2  2 O O O O  3
3  3 O O O D  2
4  4 A O C D  7
5  5 A B O O 12
6  6 O O O O  4
7  7 O B O O  8
Answer 0 (score: 3)
I hope this solves your problem:
dat <- read.table("path/data.txt", header = TRUE)  # import the data shown above
cols <- c("A", "B", "C", "D")  # the factor columns to test; ID and E are left out
E <- dat$E  # the test variable
results <- lapply(cols, function(i) {  # compare E between the two levels of each column
  x <- factor(dat[, i])
  a <- E[x == levels(x)[1]]  # E values at the first level
  b <- E[x == levels(x)[2]]  # E values at the second level
  ks.test(a, b)
})
names(results) <- cols  # a named list with one test result per column
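With 80 columns, typing the column names by hand and reading 80 printed test objects gets tedious. The following is a minimal sketch of one way to scale the same idea, assuming every column other than ID and E is a two-level factor. The names `ks_summary` and `groups` are illustrative and not part of the original answer, and the data frame is rebuilt inline only so the example runs on its own.

```r
# Example data from the question, built inline so the sketch is self-contained
dat <- data.frame(
  ID = 1:7,
  A = c("O", "O", "O", "A", "A", "O", "O"),
  B = c("B", "O", "O", "O", "B", "O", "B"),
  C = c("C", "O", "O", "C", "O", "O", "O"),
  D = c("O", "O", "D", "D", "O", "O", "O"),
  E = c(1, 3, 2, 7, 12, 4, 8)
)

# Assumption: every column except ID and E is a two-level factor column
cols <- setdiff(names(dat), c("ID", "E"))

# Run one two-sample KS test per column and keep only the headline numbers
ks_summary <- do.call(rbind, lapply(cols, function(i) {
  groups <- split(dat$E, factor(dat[[i]]))  # E values for each of the two levels
  res <- ks.test(groups[[1]], groups[[2]])
  data.frame(column = i,
             statistic = unname(res$statistic),
             p.value = res$p.value)
}))

print(ks_summary)
```

Selecting the columns with `setdiff` means the same code should work unchanged whether there are 4 or 80 factor columns, and collecting the results in one data frame makes them easier to sort or filter than a list of test objects.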