Comparing two data frames

Time: 2017-09-11 07:46:34

Tags: r excel dataframe comparison find-occurrences

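I have an Excel sheet with 15200 rows, each corresponding to a tree whose structures were analyzed. All the structures (48 structures) are in columns, and they are counted for each tree. For example, tree 12607 has 3 structures CV11, 1 structure IN12, and none (0) of the remaining structures. So the table is a huge table full of 0s with a few counts of the structures found on each tree. The last column is the value assigned to the tree, based on the structures it carries (each structure, by its presence, contributes a certain number of points to the tree).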

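The question is: are there structures, or combinations of structures, that give a tree a high value? Of course, from the value of each structure we can see which ones are worth more than others (for example, structure CV11 has value 15 and structure IN12 has value 4). What I want to know is this: if we take all trees with a final value above 100 (creating a new data frame "data100") and compare them with the trees whose final value is below 100 (creating another data frame "data0"), can we find a significant difference in the number and occurrence of the structures found on these trees? A high-value structure might turn up only on trees valued below 100, for example because it prevents other structures from being found on the same tree.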

Voilà, I hope I have given enough detail... If you have any idea or suggestion for solving this problem, that would be great!

Here is my script.

    > data100
      CV11 CV12 CV13 CV14 CV15 CV21 CV22 CV23 CV24 CV25 CV26 CV31 CV32 CV33 CV41 CV42 CV43 CV44 CV51 CV52 IN11 IN12 IN13
1        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
2        0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
3        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0
4        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0
5        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0
6        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1
7        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
8        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
9        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
10       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
11       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
12       0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0
13       0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
14       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
15       0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
      IN14 IN21 IN22 IN23 IN31 IN32 IN33 IN34 BA11 BA12 BA21 DE11 DE12 DE13 DE14 DE15 GR11 GR12 GR13 GR21 GR22 GR31 GR32
1        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
2        0    0    0    0    0    0    0    0    0    0    0    0    0    1    1    0    0    0    0    0    0    0    0
3        0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0
4        0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0
5        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
6        0    0    0    0    0    0    0    0    0    0    0    0    0    2    0    0    0    0    0    0    0    0    0
7        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
8        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
9        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
10       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
11       0    0    0    0    0    0    0    0    0    0    0    0    0    2    0    0    0    2    0    0    0    0    0
12       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    3    0    0
13       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    3    0    0
14       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    3    0    0
15       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
      EP11 EP12 EP13 EP14 EP21 EP31 EP32 EP33 EP34 EP35 NE11 NE12 NE21 OT11 OT12 OT21 OT22 ecoval
1        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0
2        1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     56
3        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     10
4        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     10
5        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      4
6        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     24
7        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0
8        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0
9        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0
10       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0
11       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     18
12       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     63
13       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     77
14       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     54
15       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     20
 [ reached getOption("max.print") -- omitted 60749 rows ]
> sortdata100<-data100[order(data100[,64],decreasing=TRUE),]

> rsortdata100<-sortdata100[sortdata100$ecoval>100,]
> rsortdata100<-na.omit(rsortdata100) # 181 rows
> rsortdata100
      CV11 CV12 CV13 CV14 CV15 CV21 CV22 CV23 CV24 CV25 CV26 CV31 CV32 CV33 CV41 CV42 CV43 CV44 CV51 CV52 IN11 IN12 IN13
1291     0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
1083     0    4    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
3919     0    0    0    0    0    0    0    0    0    0    0    2    0    0    0    0    0    0    0    0    0    0    0
14685    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0
4021     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0
5452     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
14686    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    2    0    0
4022     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    2    0    0
1013     0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
2895     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
4719     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    1    0    0    0
682      0    3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0
3444     0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
1299     0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0
2713     0    0    0    4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    1    0    1    0
      IN14 IN21 IN22 IN23 IN31 IN32 IN33 IN34 BA11 BA12 BA21 DE11 DE12 DE13 DE14 DE15 GR11 GR12 GR13 GR21 GR22 GR31 GR32
1291     0    0    0    0    0    0    0    0   30    0    0    0    0    0    0    0    0    0    0    0    0    0    0
1083     3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
3919     0    0    1    0    2    0    0    0    2    0    0    0    3    0    0    0    0    0    0   11    0    0    0
14685    0    0    0    0    0    0    0    0   11    0    0    0    0    0    0    0    0    0    0    0    0    0    0
4021     0    0    0    0    0    0    0    0   11    0    0    0    0    0    0    0    0    0    0    0    0    0    0
5452     0    0    1    0    0    0    0    0    0    0    0    2    0    0    0    0    0    0    0    0    0    0    0
14686    0    0    0    0    0    0    0    0   11    0    0    0    0    0    0    0    0    0    0    0    0    0    2
4022     0    0    0    0    0    0    0    0   11    0    0    0    0    0    0    0    0    0    0    0    0    0    0
1013     0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
2895     0    0    0    1    0    0    0    0    4    0    0    3    0    4    3    0    0    0    0    0    0    0    0
4719     0    0    0    0    0    0    0    0   10    0    0    0    0    0    0    0    0    0    0    0    0    0    0
682      0    0    0    0    0    0    0    0    0    0    0    0    0    2    1    0    0    0    0    0    0    0    0
3444     0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
1299     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    2    0    0    0    0    0    0
2713     0    0    0    2    0    3    0    0    2    0    0    0    1    5    1    0    0    0    0    0    0    0    0
      EP11 EP12 EP13 EP14 EP21 EP31 EP32 EP33 EP34 EP35 NE11 NE12 NE21 OT11 OT12 OT21 OT22 ecoval
1291     0    8    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0   1192
1083     0    8    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    424
3919     1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    380
14685    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    370
4021     0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    358
5452     0    0    0    0    0    0    1    0    0   11    0    0    0    0    1    0    0    356
14686    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    354
4022     0    0    0    0    0    2    0    0    0    0    0    0    0    0    0    0    0    346
1013     0    8    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    326
2895     0    1    0    0    0    1    0    1    0    0    0    0    0    0    0    1    0    325
4719     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    324
682      0    0    0    6    0    0    0    0    0    0    0    0    0    0    0    0    0    311
3444     0    8    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    306
1299     0    8    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    302
2713     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    302
 [ reached getOption("max.print") -- omitted 166 rows ]
> data0<-sortdata100[sortdata100$ecoval<100,]
> data0<-na.omit(data0)
> data0
      CV11 CV12 CV13 CV14 CV15 CV21 CV22 CV23 CV24 CV25 CV26 CV31 CV32 CV33 CV41 CV42 CV43 CV44 CV51 CV52 IN11 IN12 IN13
4728     0    0    0    1    0    0    0    3    0    0    0    0    0    0    0    0    0    0    0    1    1    0    0
5339     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0
11766    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
796      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
3561     0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0
10581    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0
10618    0    0    0    0    0    0    0    0    0    0    0    1    0    1    0    1    0    1    0    0    0    0    0
14376    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0
14389    0    0    0    0    0    0    0    0    0    0    0    0    2    0    0    0    0    0    0    0    0    0    0
790      0    0    0    1    0    0    0    0    1    0    0    2    0    0    0    0    0    0    0    0    1    0    0
3974     0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0
4739     0    0    0    0    0    0    0    0    0    0    0    2    0    0    0    0    1    0    0    0    0    0    0
156      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
2740     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
2950     0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    1    1    0    1    0
      IN14 IN21 IN22 IN23 IN31 IN32 IN33 IN34 BA11 BA12 BA21 DE11 DE12 DE13 DE14 DE15 GR11 GR12 GR13 GR21 GR22 GR31 GR32
4728     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0
5339     1    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0
11766    0    0    0    0    0    0    0    0    0    0    1    1    0    0    0    0    0    0    0    0    0    0    0
796      1    1    0    0    1    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0
3561     0    0    0    0    0    0    0    0    3    0    0    0    0    0    0    0    0    0    0    0    0    0    0
10581    0    0    0    1    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0
10618    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0
14376    1    0    0    0    0    0    0    0    1    0    0    0    0    2    0    0    0    0    0    0    0    0    0
14389    0    0    0    0    0    0    0    0    0    0    0    0    2    0    0    1    0    0    0    0    0    0    0
790      0    0    0    0    0    0    0    0    0    0    0    0    2    0    0    0    0    0    0    0    0    0    0
3974     0    0    0    0    0    0    0    0    1    0    0    0    4    0    0    0    1    0    0    0    0    0    0
4739     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
156      0    0    0    0    0    3    0    0    0    0    0    0    0    0    0    0    0    2    0    0    0    0    0
2740     0    0    0    0    0    0    0    0    0    0    0    0    0    6    2    0    0    0    0    0    0    0    0
2950     0    1    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
      EP11 EP12 EP13 EP14 EP21 EP31 EP32 EP33 EP34 EP35 NE11 NE12 NE21 OT11 OT12 OT21 OT22 ecoval
4728     0    0    1    0    0    1    0    0    0    0    0    0    0    0    0    0    0     99
5339     0    1    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0     99
11766    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    1     99
796      1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     98
3561     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     98
10581    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    1    0     98
10618    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0     98
14376    2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     98
14389    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     98
790      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     97
3974     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     97
4739     0    0    0    0    0    0    2    0    0    0    0    0    0    0    0    1    0     97
156      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     96
2740     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0     96
2950     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     96
 [ reached getOption("max.print") -- omitted 14984 rows ]

1 answer:

Answer 0 (score: 0)

Maybe something like this?

    library(dplyr)
    # data100 is the full table from the question; group trees by whether ecoval exceeds 100
    data100 %>% group_by(ecoval > 100) %>% summarize_all(mean)

That should give you the mean of each column for the trees with ecoval > 100 and for those with ecoval <= 100.
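Group means alone do not say whether a difference is significant. As a rough sketch of one way to check that, the code below compares, for each structure, how often it is present on high-value trees versus the rest, and runs Fisher's exact test on the resulting 2x2 table. Fisher's test and the Benjamini-Hochberg correction are my own choices here, not something taken from the question. The data frame `trees` and the helper `compare_structures` are made up for illustration; the values only loosely resemble the printed rows. Note that the split is `> 100` versus `<= 100`, so trees with ecoval of exactly 100 stay in the low group, whereas the question's script drops them from both groups.

```r
# Hypothetical toy table: counts of three structures per tree plus ecoval.
trees <- data.frame(
  CV14   = c(1, 1, 0, 1, 0, 0, 0, 0, 1, 0),
  BA11   = c(30, 11, 11, 0, 0, 0, 3, 0, 0, 0),
  DE13   = c(0, 0, 0, 0, 1, 2, 0, 0, 1, 0),
  ecoval = c(1192, 370, 358, 326, 10, 24, 98, 0, 99, 0)
)

# For every structure column, compare presence rates and mean counts
# between trees above the threshold and all remaining trees.
compare_structures <- function(df, value_col = "ecoval", threshold = 100) {
  df <- na.omit(df)
  high <- df[[value_col]] > threshold            # TRUE for high-value trees
  structures <- setdiff(names(df), value_col)
  res <- lapply(structures, function(s) {
    present <- df[[s]] > 0                       # structure occurs at least once
    # Fix the factor levels so the table is always 2x2
    tab <- table(factor(present, levels = c(FALSE, TRUE)),
                 factor(high, levels = c(FALSE, TRUE)))
    data.frame(
      structure = s,
      rate_high = mean(present[high]),           # share of high trees with the structure
      rate_low  = mean(present[!high]),          # share of other trees with the structure
      mean_high = mean(df[[s]][high]),           # mean count on high trees
      mean_low  = mean(df[[s]][!high]),          # mean count on other trees
      p_value   = fisher.test(tab)$p.value,
      stringsAsFactors = FALSE
    )
  })
  out <- do.call(rbind, res)
  # Many structures are tested at once, so adjust the p-values
  out$p_adj <- p.adjust(out$p_value, method = "BH")
  out[order(out$p_adj), ]
}

result <- compare_structures(trees)
print(result)
```

On the real table this would be called as `compare_structures(data100)`. It looks at one structure at a time, so it says nothing about combinations of structures, and it assumes both groups are non-empty.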