I have some data that looks like this:
# A tibble: 8 x 2
  name                 value
  <chr>                <dbl>
1 age                 -1.14
2 daysInHospital       0.371
3 X...lymphocyte       0.469
4 neutrophils...       0.829
5 rfv_age             41
6 rfv_daysInHospital   5
7 rfv_X...lymphocyte   6.2
8 rfv_neutrophils...  91
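I want to use ggplot to draw a single column whose y-axis is value, stacked from age, daysInHospital, X...lymphocyte and neutrophils.... The rfv_... values (or something similar) should then appear as labels inside the geom_col.
I can't seem to plot only the four observations I want. The following is not what I want:
d %>%
  ggplot(aes(x = name, y = value)) +
  geom_col()
The expected output is a single stacked column built from the values of the four names above, annotated with the numbers from the value column that belong to the names containing rfv.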
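That is, the values for age, daysInHospital, X...lymphocyte and neutrophils... come from a model, while the values whose names contain rfv (raw feature value) are the actual values for that observation.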
Data: the eight rows printed in the tibble above.
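For anyone who wants to run the code in this thread, the printed rows can be rebuilt as a tibble. This is a minimal sketch that assumes the rounded values shown in the printout are the actual data; the object name d is taken from the code in the question.

```r
library(tibble)

# Rebuild the example data from the printed tibble.
# Assumption: the values are exactly as printed (they may be rounded).
d <- tibble(
  name = c(
    "age", "daysInHospital", "X...lymphocyte", "neutrophils...",
    "rfv_age", "rfv_daysInHospital", "rfv_X...lymphocyte", "rfv_neutrophils..."
  ),
  value = c(-1.14, 0.371, 0.469, 0.829, 41, 5, 6.2, 91)
)
```

The first four rows are the model values and the last four are the raw feature values for the same variables.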
Answer 0 (score: 2)
Here is one approach using tidyr::extract, since your data is a bit messy.
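library(tidyr)
library(dplyr)
library(ggplot2)

d %>%
  # Split name into an optional "rfv" prefix (type) and the remaining variable name
  tidyr::extract(col = name, into = c("type", "variable"),
                 regex = "(rfv)?_?(.*)") %>%
  # Rows without the prefix hold the model values
  mutate(type = replace_na(type, "value")) %>%
  # One row per variable, with value and rfv as separate columns
  pivot_wider(id_cols = variable, values_from = value, names_from = type)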
# A tibble: 4 x 3
  variable        value   rfv
  <chr>           <dbl> <dbl>
1 age            -1.14   41
2 daysInHospital  0.371   5
3 X...lymphocyte  0.469   6.2
4 neutrophils...  0.829  91
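The split that the regular expression performs can also be sketched in base R, which may make the reshaping step easier to follow. This is an illustrative equivalent rather than the code from the answer; the vector nm is a hypothetical stand-in for the name column.

```r
# Hypothetical stand-in for the name column of the example data
nm <- c(
  "age", "daysInHospital", "X...lymphocyte", "neutrophils...",
  "rfv_age", "rfv_daysInHospital", "rfv_X...lymphocyte", "rfv_neutrophils..."
)

# Names starting with "rfv_" are raw feature values; the rest are model values
type <- ifelse(startsWith(nm, "rfv_"), "rfv", "value")

# Drop the prefix so both kinds of row share the same variable name
variable <- sub("^rfv_", "", nm)

split_names <- data.frame(name = nm, type = type, variable = variable)
```

Once every row carries a type and a shared variable name, pivoting on type gives one row per variable with a value column and an rfv column.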
We can then plot the data with geom_bar:
d %>%
  tidyr::extract(col = name, into = c("type", "variable"), regex = "(rfv)?_?(.*)") %>%
  mutate(type = replace_na(type, "value")) %>%
  pivot_wider(id_cols = variable, values_from = value, names_from = type) %>%
  ggplot(aes(x = as.factor(1), y = value, fill = variable)) +
  geom_bar(stat = "identity") +
  # Labels sit beside the bar (x = 1.5), centred on each stacked segment
  geom_text(aes(label = rfv, x = 1.5), position = position_stack(vjust = 0.5)) +
  labs(x = "")
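The question asks for the labels inside the column rather than beside it. A possible variant, sketched here under the assumption that the reshaped data matches the four-row table printed earlier, lets geom_text inherit the bar's x position. The data frame wide is a hypothetical name for that reshaped result, typed in by hand so the example runs on its own.

```r
library(ggplot2)

# Hypothetical copy of the reshaped data, taken from the printed 4 x 3 tibble
wide <- data.frame(
  variable = c("age", "daysInHospital", "X...lymphocyte", "neutrophils..."),
  value = c(-1.14, 0.371, 0.469, 0.829),
  rfv = c(41, 5, 6.2, 91)
)

p <- ggplot(wide, aes(x = "", y = value, fill = variable)) +
  # One stacked column; the negative value stacks below zero
  geom_col() +
  # Same stacking as the bars, with each label centred in its segment
  geom_text(aes(label = rfv), position = position_stack(vjust = 0.5)) +
  labs(x = NULL)
```

Printing p draws the plot. Because the text layer inherits fill = variable for grouping, each label follows its own segment of the stack.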
Answer 1 (score: 2)