Question

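I have combined numeric data from several trials, and in ggplot I want to show only the groups that have complete data across all trials. My numeric data is cd and the groups are did, like this: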

+-------+-----+-------+
| trial | did |  cd   |
+-------+-----+-------+
|     1 |   1 | 12.07 |
|     2 |   1 | 16.8  |
|     3 |   1 | 11.83 |
|     1 |   2 | 13.43 |
|     2 |   2 | 12.54 |
|     1 |   3 | 8.3   |
|     1 |   4 | 6.39  |
|     2 |   4 | 5.6   |
|     3 |   4 | 6.46  |
|     1 |   5 | 11.63 |
|     1 |   6 | 7.77  |
|     2 |   6 | 10.87 |
|     3 |   6 | 12.15 |
|     2 |   7 | 24.23 |
|     3 |   7 | 7.72  |
|     1 |   8 | 8.71  |
+-------+-----+-------+

This is the code I use to make the plot:

library(ggplot2)

f <- ggplot(data, aes(x = trial, y = cd, group = did))

f + geom_line(aes(color = did, group = did), show.legend = FALSE)

I would like to exclude the dids that are missing data for any trial.


Answer 1

As @Mike H pointed out, the better approach is to subset the groups first, before plotting. One way is to use filter from dplyr:

library(dplyr)
library(ggplot2)

data %>%
  group_by(did) %>%
  # keep only the groups that have one row for each of the 3 trials
  filter(n() == 3) %>%
  ggplot(aes(x = trial, y = cd, group = did)) +
  geom_line(aes(color = did, group = did), show.legend = FALSE)

Data:

data = structure(list(trial = c(1, 2, 3, 1, 2, 1, 1, 2, 3, 1, 1, 2, 3, 2, 3, 1), did = c(1, 1, 1, 2, 2, 3, 4, 4, 4, 5, 6, 6, 6, 7, 7, 8), cd = c(12.07, 16.8, 11.83, 13.43, 12.54, 8.3, 6.39, 5.6, 6.46, 11.63, 7.77, 10.87, 12.15, 24.23, 7.72, 8.71)), .Names = c("trial", "did", "cd"), class = "data.frame", row.names = c(NA, -16L))
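The filter in the answer hard-codes the number of trials as 3. If the number of trials might change, one possible variation is to compare each group's distinct trial count against the total number of distinct trials in the data. The following is only a sketch under that assumption; the names n_trials and complete are introduced here for illustration and do not come from the original answer, and the data frame is rebuilt from the question's table so the snippet runs on its own.

```r
library(dplyr)

# Same values as the table in the question, rebuilt so the sketch is self-contained.
data <- data.frame(
  trial = c(1, 2, 3, 1, 2, 1, 1, 2, 3, 1, 1, 2, 3, 2, 3, 1),
  did   = c(1, 1, 1, 2, 2, 3, 4, 4, 4, 5, 6, 6, 6, 7, 7, 8),
  cd    = c(12.07, 16.8, 11.83, 13.43, 12.54, 8.3, 6.39, 5.6, 6.46,
            11.63, 7.77, 10.87, 12.15, 24.23, 7.72, 8.71)
)

# Total number of distinct trials in the data (hypothetical helper name).
n_trials <- n_distinct(data$trial)

# Keep only the groups that appear in every trial.
complete <- data %>%
  group_by(did) %>%
  filter(n_distinct(trial) == n_trials) %>%
  ungroup()

unique(complete$did)
# → 1 4 6
```

The resulting complete data frame can be passed to the same ggplot call as in the answer. Counting distinct trials rather than rows also guards against a group that has duplicated rows for one trial and none for another.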

How to exclude groups with incomplete data in ggplot?
