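I'm still getting my head around some R syntax and wanted to ask how to run the following analysis efficiently, without having to reshape the whole data frame from long to wide and so on.
Here is my data frame: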
> data.frame(SOExample)
StudyID TimePoint Group Conc
1 N0920235 BL Control 0.7998743
2 N1020555 BL Control 0.3839061
3 N1020621 BL Control 0.5446354
4 N1121951 BL Control 0.5146689
5 N1122107 BL Control 0.5431685
6 N1122225 BL Control 0.5775356
7 N1122221 BL Control 0.9474015
8 N1222611 BL Control 0.6194468
9 N1222745 BL Control 0.7110226
10 N1222781 BL Control 0.5347863
11 N1223363 BL Control 0.5079631
12 N1223541 BL Control 0.5054484
13 N1223579 BL Control 0.8162196
14 N1122171 BL Control 0.4997904
15 N0920198 BL Control 0.5924141
16 N0920367 BL Control 0.6244761
17 N1021085 BL Control 0.7759849
18 N1121329 BL Control 0.3845348
19 N1121389 BL Control 1.1695306
20 N1121475 BL Control 1.7254820
21 N1121871 BL Control 0.7080889
22 N1121875 BL Control 0.8214585
23 N1122021 BL Control 0.7384744
24 N1122103 BL Control 0.6026823
25 N1122283 BL Control 0.7581727
26 N1122321 BL Control 0.5282900
27 N1222493 BL Control 0.4258173
28 N1222529 BL Control 0.1538139
29 N1222587 BL Control 0.7663453
30 N1222705 BL Control 0.5873847
31 N1222693 BL Control 0.6584241
32 N1222761 BL Control 0.3321459
33 MP0001 BL Patient 0.8216681
34 MP0002 BL Patient 0.4800922
35 MP0007 BL Patient 0.8822297
36 MP0008 BL Patient 0.8975272
37 MP0010 BL Patient 0.7567058
38 MP0011 BL Patient 0.4893127
39 MP0017 BL Patient 0.5840319
40 MP0022 BL Patient 0.8053227
41 MP0023 BL Patient 0.7837370
42 MP0024 BL Patient 0.3938870
43 MP0027 BL Patient 0.6345636
44 MP0028 BL Patient 0.6234141
45 MP0029 BL Patient 0.7101115
46 MP0001 3M Patient 0.5415225
47 MP0002 3M Patient 0.3986928
48 MP0007 3M Patient 0.5722799
49 MP0008 3M Patient 0.5140331
50 MP0010 3M Patient 0.4913495
51 MP0011 3M Patient 0.5288351
52 MP0017 3M Patient 0.2931565
53 MP0023 3M Patient 0.2149173
54 MP0024 3M Patient 0.3794694
55 MP0028 3M Patient 0.6322568
56 MP0029 3M Patient 0.5297962
What I want to do is really simple: compare patients with controls at TimePoint "BL". But for some reason R won't accept my code:
t.test(Conc~Group[TimePoint=="BL"], data=SOExample)
This is the error message I get:
Error in model.frame.default(formula = Conc ~ Group[TimePoint == "BL"], :
variable lengths differ (found for 'Group[TimePoint == "BL"]')
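Further down the line, I'd like to run a pairwise.t.test comparing patients at BL with controls and patients at 3M with controls. I thought something like the following would work, but as you can see R doesn't like it: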
> pairwise.t.test(SOExample$Conc~Group|TimePoint, data=SOExample)
Error in factor(g) : argument "g" is missing, with no default
So I also tried the following:
> t.test(Conc~Group, data=SOExample[SOExample$TimePoint=="BL",])
Welch Two Sample t-test
data: Conc by Group
t = -0.452, df = 36.94, p-value = 0.6539
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.1638470 0.1040813
sample estimates:
mean in group Control mean in group Patient
0.6518559 0.6817387
But now, when I want to compare patients at 3M with controls, I get this message:
> t.test(Conc~Group, data=SOExample[SOExample$TimePoint=="3M",])
Error in t.test.formula(Conc ~ Group, data = SOExample[SOExample$TimePoint == :
grouping factor must have exactly 2 levels
Any ideas? Of course I could change my whole data format, but that is just a pain. I'd rather not keep multiple text files for the same data set.
Answer 0 (score: 1)
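I'm not entirely sure what you're asking, because the wording of the question is a bit confusing to me, but here are some options:
All patients vs. all controls at TimePoint BL: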
t.test(Conc~Group, data=SOExample[SOExample$TimePoint=="BL",])
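The original call failed because `Group[TimePoint=="BL"]` is shorter than `Conc`: the formula still sees every row of `Conc` but only the BL rows of `Group`, hence "variable lengths differ". Besides indexing the data frame, the formula method of `t.test` also takes a `subset` argument. The sketch below compares the two on a small excerpt of the question's data (the first four rows of each block, chosen only for illustration), so the numbers will not match the full-data output.

```r
# Small excerpt of the question's data (illustration only, not the full data set)
SOExample <- data.frame(
  StudyID   = c("N0920235", "N1020555", "N1020621", "N1121951",
                "MP0001", "MP0002", "MP0007", "MP0008",
                "MP0001", "MP0002", "MP0007", "MP0008"),
  TimePoint = rep(c("BL", "BL", "3M"), each = 4),
  Group     = rep(c("Control", "Patient", "Patient"), each = 4),
  Conc      = c(0.7998743, 0.3839061, 0.5446354, 0.5146689,
                0.8216681, 0.4800922, 0.8822297, 0.8975272,
                0.5415225, 0.3986928, 0.5722799, 0.5140331)
)

# Option 1: index the rows of the data frame, as in the call above
res_index <- t.test(Conc ~ Group, data = SOExample[SOExample$TimePoint == "BL", ])

# Option 2: let the formula method filter the rows via `subset`
res_subset <- t.test(Conc ~ Group, data = SOExample, subset = TimePoint == "BL")

# Both calls test the same rows, so the statistics should agree
print(all.equal(res_index$statistic, res_subset$statistic))
```

Either form filters whole rows, so `Conc` and `Group` stay the same length.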
All patients at 3M vs. all controls at BL:
with(SOExample,t.test(Conc[TimePoint=="BL" & Group=="Control"],
Conc[TimePoint=="3M" & Group=="Patient"]))
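The "grouping factor must have exactly 2 levels" error in the question follows from the data as posted: every 3M row belongs to a patient, so subsetting to 3M leaves only one group. That is why the comparison above passes two vectors instead of a formula. A quick cross-tabulation makes the gap visible. The sketch below again uses a small excerpt of the question's data, chosen only for illustration.

```r
# Small excerpt of the question's data (illustration only, not the full data set)
SOExample <- data.frame(
  StudyID   = c("N0920235", "N1020555", "N1020621", "N1121951",
                "MP0001", "MP0002", "MP0007", "MP0008",
                "MP0001", "MP0002", "MP0007", "MP0008"),
  TimePoint = rep(c("BL", "BL", "3M"), each = 4),
  Group     = rep(c("Control", "Patient", "Patient"), each = 4),
  Conc      = c(0.7998743, 0.3839061, 0.5446354, 0.5146689,
                0.8216681, 0.4800922, 0.8822297, 0.8975272,
                0.5415225, 0.3986928, 0.5722799, 0.5140331)
)

# Count rows per Group x TimePoint cell; a zero means that cell has no data
counts <- table(SOExample$Group, SOExample$TimePoint)
print(counts)

# No controls at 3M, so Conc ~ Group on the 3M rows has only one group
print(counts["Control", "3M"])
```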
Paired comparison of patients at 3M vs. patients at BL (paired on StudyID):
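# IDs of the patients that have a 3M measurement
ID.3M <- SOExample[SOExample$TimePoint=="3M",]$StudyID
# keep only those patients (BL and 3M rows); pairing relies on both time points listing them in the same order
df <- SOExample[SOExample$StudyID %in% ID.3M,]
t.test(Conc~TimePoint, data=df, paired=TRUE)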
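The formula call above pairs observations by row order within each time point, which only works if the IDs appear in the same order at BL and 3M. As far as I know, recent R releases (4.4.0 and later) also reject `paired` in the formula interface. A more explicit alternative is to align the two time points on StudyID first and then pass two vectors. The sketch below uses a small excerpt of the question's data; the names `bl`, `m3` and `pairs_df` are only illustrative.

```r
# Small excerpt of the question's patient data (illustration only);
# MP0022 has a BL value but no 3M value, as in the question
SOExample <- data.frame(
  StudyID   = c("MP0001", "MP0002", "MP0007", "MP0022",
                "MP0001", "MP0002", "MP0007"),
  TimePoint = c(rep("BL", 4), rep("3M", 3)),
  Group     = "Patient",
  Conc      = c(0.8216681, 0.4800922, 0.8822297, 0.8053227,
                0.5415225, 0.3986928, 0.5722799)
)

# One row per patient for each time point
bl <- subset(SOExample, TimePoint == "BL" & Group == "Patient",
             select = c(StudyID, Conc))
m3 <- subset(SOExample, TimePoint == "3M" & Group == "Patient",
             select = c(StudyID, Conc))

# Inner join on StudyID: patients without a 3M value drop out
pairs_df <- merge(bl, m3, by = "StudyID", suffixes = c(".BL", ".3M"))

# Paired t-test on the aligned vectors
res <- t.test(pairs_df$Conc.BL, pairs_df$Conc.3M, paired = TRUE)
print(res)
```

Because `merge` matches on the ID, the result does not depend on how the rows happen to be sorted.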