ggplot: plot only the x, y values that satisfy a condition z = value

Time: 2016-04-19 12:17:26

Tags: r ggplot2

I have a data frame like this:

> str(kk_max100)
'data.frame':   134750 obs. of  15 variables:
 $ TP     : num  1850 1850 1850 1850 1850 1850 1850 1850 1850 1850 ...
 $ TN     : int  26 26 26 26 26 26 26 26 26 26 ...
 $ FP     : int  74 74 74 74 74 74 74 74 74 74 ...
 $ FN     : int  0 0 0 0 0 0 0 0 0 0 ...
 $ TotP   : num  1850 1850 1850 1850 1850 1850 1850 1850 1850 1850 ...
 $ TotN   : int  100 100 100 100 100 100 100 100 100 100 ...
 $ TOTAL  : num  1950 1950 1950 1950 1950 1950 1950 1950 1950 1950 ...
 $ MCC    : Factor w/ 5 levels "0.5","0.6","0.7",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ SEN    : num  1 1 1 1 1 1 1 1 1 1 ...
 $ SPC    : num  0.26 0.26 0.26 0.26 0.26 0.26 0.26 0.26 0.26 0.26 ...
 $ ACC    : num  0.962 0.962 0.962 0.962 0.962 0.962 0.962 0.962 0.962 0.962 ...
 $ PPV    : num  0.962 0.962 0.962 0.962 0.962 0.962 0.962 0.962 0.962 0.962 ...
 $ NPV    : num  1 1 1 1 1 1 1 1 1 1 ...
 $ B      : num  0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 ...
 $ VARCOST: num  -1 -0.826 -0.652 -0.478 -0.304 -0.13 0.044 0.218 0.392 0.566 ...

I want to plot MCC against VARCOST using only the rows where B = 0.5. How do I specify that condition?

This is the ggplot code I am using:

p <- ggplot(kk_max100, aes(x=MCC, y=VARCOST)) + geom_violin(trim=FALSE)
p + geom_boxplot(width=0.1)

Thanks

2 Answers:

Answer 0: (score: 2)

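You can subset the data as Jimbou recommends, which would be the... recommended way. You can also get fancy, e.g. by using the dplyr package. The example is from a neighboring thread.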

library(ggplot2)
library(tidyr)
library(dplyr)

set.seed(9)
d1 = data.frame(d1 = rnorm(100, mean=5))
d2 = data.frame(d2 = rnorm(50, mean=7))
xy <- data.frame(d1 = d1, d2 = d2)

xy <- gather(xy)

# recommended way, you can specify a vector of levels, d1, d2...
ggplot(xy[xy$key %in% c("d1", "d2"), ], aes(x = value)) +  # note the comma: subset rows, keep all columns
  geom_density()

# or for only one level
ggplot(xy[xy$key == "d1", ], aes(x = value)) +  # note the comma: subset rows, keep all columns
  geom_density()

# using dplyr
xy %>%
  filter(key == "d1") %>%
  ggplot(., aes(x = value)) +
  geom_density()
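Applied to the question's plot, the same filter-then-plot idea might look like the sketch below. The data frame `toy` is a hypothetical stand-in for kk_max100 that contains only the three columns the plot needs (MCC, B, VARCOST) filled with made-up values; it is not the asker's data.

```r
library(ggplot2)
library(dplyr)

# Hypothetical stand-in for kk_max100 (made-up values, three columns only).
set.seed(1)
toy <- expand.grid(
  MCC = factor(c("0.5", "0.6", "0.7")),
  B   = c(0, 0.5, 1),
  rep = 1:30
)
toy$VARCOST <- rnorm(nrow(toy), mean = toy$B)

# Keep only the rows where B is 0.5, then build the violin + box plot.
toy_b05 <- toy %>%
  filter(B == 0.5)

p <- ggplot(toy_b05, aes(x = MCC, y = VARCOST)) +
  geom_violin(trim = FALSE) +
  geom_boxplot(width = 0.1)
p
```

With the real data, replacing `toy` with `kk_max100` should be the only change needed, assuming B holds the value 0.5 exactly.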

Answer 1: (score: 2)

Simply subset the data:

kk_max100[ kk_max100$B == 0.5, ]
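One caveat: B is numeric, and exact comparison with `==` can miss rows when the values were produced by arithmetic (in R, `0.1 + 0.2 == 0.3` is `FALSE`). Whether that affects kk_max100 depends on how B was generated, which the question does not say. A minimal sketch of a tolerance-based subset follows; `toy` and the tolerance `1e-8` are assumptions chosen for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for kk_max100; B is built by repeated addition,
# so some of its values carry floating-point rounding error.
set.seed(2)
b_values <- cumsum(c(0, rep(0.1, 10)))  # roughly 0, 0.1, ..., 1
toy <- expand.grid(MCC = factor(c("0.5", "0.6")), B = b_values, rep = 1:20)
toy$VARCOST <- rnorm(nrow(toy))

# Compare within a small tolerance instead of testing exact equality.
target <- 0.5
tol    <- 1e-8  # assumed tolerance; adjust to the scale of the data
toy_sub <- toy[abs(toy$B - target) < tol, ]

# Same plot as in the question, drawn from the subset only.
p <- ggplot(toy_sub, aes(x = MCC, y = VARCOST)) +
  geom_violin(trim = FALSE) +
  geom_boxplot(width = 0.1)
p
```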

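Or, inspired by Roman Luštrik's post, you can create a grouping factor and use facet_grid to show the different B values side by side. I used the mtcars data set, which ships with R in the datasets package: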

# Set factors
mtcars$grouping_factor <- ifelse(mtcars$gear>3,1,2)
# or a binning approach
mtcars$grouping_factor <- .bincode(mtcars$disp,c(0,100,300,max(mtcars$disp)))
# the plot
p <- ggplot(mtcars, aes(x=factor(cyl), y=mpg))
p + geom_violin() + facet_grid(~grouping_factor)
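For the data in the question, the faceting idea could be sketched as follows, with one panel per B value. Again, `toy` is a hypothetical stand-in for kk_max100 with made-up values and only three B levels; the real data has many more, so it may be worth subsetting to a few B values of interest before faceting.

```r
library(ggplot2)

# Hypothetical stand-in for kk_max100 (made-up values, three B levels).
set.seed(3)
toy <- expand.grid(
  MCC = factor(c("0.5", "0.6", "0.7")),
  B   = c(0, 0.5, 1),
  rep = 1:30
)
toy$VARCOST <- rnorm(nrow(toy), mean = toy$B)

# One panel per value of B; each panel shows VARCOST by MCC.
p <- ggplot(toy, aes(x = MCC, y = VARCOST)) +
  geom_violin(trim = FALSE) +
  geom_boxplot(width = 0.1) +
  facet_grid(~ B)
p
```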