I am using ggplot to plot gas well production over time.
GAS_PRODUCTION_CURVE <- RawdataTest %>% ggplot(mapping=aes(x=DaysOn, y=GasProd_MCF, color=WellID)) +
geom_line(size=0.5) + theme_bw() +
scale_color_manual(values = cols) + scale_y_continuous(label=comma) +
coord_cartesian(xlim = c(0, max(RawdataTest$DaysOn)), ylim = c(0,max(RawdataTest$GasProd_MCF))) +
theme(legend.position="none") + xlab("Days On") +
ylab("Gas Rate [MCF]")
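This gives me the plot I want (note: this is only a subset of the data). However, I would like to plot the well data colored by the variable "RSOperator" instead. In other words, I want all wells with the same RSOperator to have the same color, so that the user can distinguish performance differences between wells. Is there a way to adjust my code to achieve this?
Answer 0 (score: 0)
I simulated some data that hopefully looks like yours, and you can see how to get the same color for wells that share an RSOperator.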
# packages used in this answer
library(dplyr)
library(ggplot2)
library(RColorBrewer)

RawdataTest = data.frame(
DaysOn = rep(1:10,6),
GasProd_MCF = c(rep(1:10,3),rep(2*(1:10),3))+rnorm(60,3,1),
WellID = rep(1:3,each=10,times=2),
RSOperator = rep(letters[1:2],each=30)
)
# create a uniq identifier for observation
RawdataTest <- RawdataTest %>%
mutate(uniq_id=paste(RSOperator,WellID,sep=""))
# create mapping for uniq id to color, depends on RSOperator
MAPPING <- RawdataTest %>% distinct(RSOperator,uniq_id)
RS_COLS = brewer.pal(9,"Set1")
RS_COLS = RS_COLS[1:n_distinct(MAPPING$RSOperator)]
names(RS_COLS) = unique(MAPPING$RSOperator)
PLOT_COLS = RS_COLS[MAPPING$RSOperator]
names(PLOT_COLS) = MAPPING$uniq_id
ggplot(RawdataTest,mapping=aes(x=DaysOn, y=GasProd_MCF,col=uniq_id)) +
geom_line(size=0.5) + theme_bw() +
scale_color_manual(values = PLOT_COLS)
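However, as you can see, it is hard to tell a1 from a2, and so on. You might consider combining color with line type, but that gets out of hand once you have many lines: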
LINETYPE = rep(1:3,2)
names(LINETYPE) = MAPPING$uniq_id
ggplot(RawdataTest,mapping=aes(x=DaysOn, y=GasProd_MCF,linetype=uniq_id,col=uniq_id)) +
geom_line(size=0.5) + theme_bw() +
scale_color_manual(values = PLOT_COLS) +
scale_linetype_manual(values=LINETYPE)
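If per-well legend entries are not needed, a simpler sketch is possible. It is not part of the approach above: it relies on ggplot2's `group` aesthetic, and the `sim` data frame is made up here to mirror the simulated columns. Color is mapped directly to RSOperator, while `group` keeps one line per well:

```r
library(ggplot2)

# Small simulated data set in the same shape as above (values are made up)
set.seed(1)
sim <- data.frame(
  DaysOn      = rep(1:10, 6),
  GasProd_MCF = c(rep(1:10, 3), rep(2 * (1:10), 3)) + rnorm(60, 3, 1),
  WellID      = rep(1:3, each = 10, times = 2),
  RSOperator  = rep(letters[1:2], each = 30)
)
sim$uniq_id <- paste0(sim$RSOperator, sim$WellID)

# group keeps one line per well; color is shared by all wells of an operator
p <- ggplot(sim, aes(x = DaysOn, y = GasProd_MCF,
                     group = uniq_id, color = RSOperator)) +
  geom_line() +
  theme_bw()

print(p)
```

The trade-off is that the legend then lists operators rather than individual wells, which may or may not be what you want.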
Answer 1 (score: 0)
From what I learned from StupidWolf's answer, all we need to do is pass a named vector to scale_color_manual:
ggplot(RawdataTest,mapping=aes(x=DaysOn, y=GasProd_MCF,col=uniq_id)) +
geom_line(size=0.5) + theme_bw() +
scale_color_manual(values = NAMED_VECTOR)
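where each element is a color and its name is the value of the column that identifies each line in the plot. The same applies to line types.
I took the liberty of writing a function heavily inspired by what StupidWolf did in his answer, but it uses the %>% operator more for clarity and has some additional features (it allows a style to be specified for each value). I must admit that this function started out shorter.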
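As a small sketch of that idea in base R only (the `wells` lookup table and the color choices are made up for illustration and are not taken from the answer above), a named vector keyed by line id can be built from one color per operator:

```r
# Hypothetical lookup table: one row per line in the plot
wells <- data.frame(
  uniq_id    = c("a1", "a2", "a3", "b1", "b2", "b3"),
  RSOperator = c("a", "a", "a", "b", "b", "b"),
  stringsAsFactors = FALSE
)

# One color per operator (arbitrary choices for this sketch)
operator_cols <- c(a = "red", b = "blue")

# Look up each well's operator color, then name the result by the line id
named_vector <- setNames(unname(operator_cols[wells$RSOperator]), wells$uniq_id)

print(named_vector)
```

Passing `named_vector` as `values` to `scale_color_manual()` then colors each line according to the name that matches its `uniq_id`.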
#Plot style mapper - Generates a named vector that can be used for ggplot styles and colors
#df: data frame containing the id and variable columns
#id: character string with the name of the id column
#variable: character string with the name of the column we are going to use for styling
#styles: style options (preferably at least as long as the number of unique variable values; otherwise a warning is thrown)
#values: values of the variable for which styles are specified. Must be in the same order as their respective styles. Other values will be styled according to other_styles.
#other_styles: applies when values is specified. Specifies styles for the other (non-specified) values.
ggplot_style_mapper <- function(df, id, variable, styles, values = NULL, other_styles = NULL) {
  variablequo <- enquo(variable)
  #Style_by_variable: TRUE when specific values were supplied
  style_by_variable <- if(is.null(values)) {FALSE} else {TRUE}
  #Warning
  if((n_distinct(df[[variable]]) > length(styles))&style_by_variable == FALSE) {warning("style vector is shorter than unique id-variables")}
  styles <- if(style_by_variable == TRUE&length(styles) > length(values)) {styles[1:length(values)]} else (styles)
  #Other styles
  other_styles <- if(!is.null(other_styles)) {other_styles[!other_styles %in% styles]} else {NULL}
  if((length(other_styles) == 0|is.null(other_styles))&style_by_variable == TRUE&(length(values) < length(unique(df[[variable]])))) {warning("Either other_styles necessary but not specified, or other_styles %in% styles")}
  #Styles specified per value (values supplied)
  named_vector <- if(style_by_variable == TRUE) {
    mapped <- df %>%
      distinct_at(.vars = c(variable)) %>%
      filter(., .data[[!!variablequo]] %in% values) %>%
      {if(nrow(.) > length(styles)) add_column(., style = c(rep(styles, length.out = nrow(.)))) else
        add_column(., style = styles[1:nrow(.)])}
    dataframe <- df %>%
      distinct_at(., .vars = c(id, variable)) %>%
      select(., all_of(c(id, variable))) %>%
      left_join(., mapped, by = variable)
    NAs <- dataframe %>%
      filter(., is.na(style)) %>%
      select(., all_of(c(id, variable))) %>%
      {if(nrow(.) == 0) . else if (nrow(.) > length(other_styles)) add_column(., style = c(rep(other_styles, length.out = nrow(.)))) else
        add_column(., style = other_styles[1:nrow(.)])}
    dataframe %>%
      filter(., !is.na(style)) %>%
      bind_rows(., NAs) %>%
      pull(., .data[["style"]], name = .data[[id]])
  } else {
    #No values supplied: assign styles to the unique values of variable in order
    mapped <- df %>%
      distinct_at(.vars = c(variable)) %>%
      {if(nrow(.) > length(styles)) add_column(., style = c(rep(styles, length.out = nrow(.)))) else
        add_column(., style = styles[1:nrow(.)])}
    dataframe <- df %>%
      distinct_at(., .vars = c(id, variable)) %>%
      select(., all_of(c(id, variable))) %>%
      left_join(., mapped, by = variable) %>%
      pull(., .data[["style"]], name = .data[[id]])
  }
  named_vector
}
This function makes it easier to define the named style vectors before creating the plot:
#Create named vectors and graph
#Create named vectors
named_colors <- ggplot_style_mapper(RawdataTest, id = "uniq_id", variable = "RSOperator",
styles = RColorBrewer::brewer.pal(9,"Set1"))
named_linetype <- ggplot_style_mapper(RawdataTest, id = "uniq_id", variable = "WellID",
styles = c(1,2,3))
#Graph
ggplot(RawdataTest,mapping=aes(x=DaysOn, y=GasProd_MCF,linetype=uniq_id,col=uniq_id)) +
geom_line(size=0.5) + theme_bw() +
scale_color_manual(values = named_colors) +
scale_linetype_manual(values= named_linetype)
We can also do it in one step:
#Alternatively, create the graph in one step (shorter but messier)
ggplot(RawdataTest,mapping=aes(x=DaysOn, y=GasProd_MCF,col=uniq_id, linetype=uniq_id)) +
geom_line(size=0.5) + theme_bw() +
scale_color_manual(values = ggplot_style_mapper(RawdataTest, id = "uniq_id", variable = "RSOperator",
styles = RColorBrewer::brewer.pal(9,"Set1"))) +
scale_linetype_manual(values= ggplot_style_mapper(RawdataTest, id = "uniq_id", variable = "WellID",
styles = c(1,2,3)))
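The extra feature I added is that the function lets us specify a style for each specific value of the variable in question. It even lets us specify styles for a few values and lump the rest under a single style. Finally, we can specify styles for one or a few values and a set of styles for the remaining ones.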
#Style for each specific value
named_colors <- ggplot_style_mapper(df = RawdataTest, id = "uniq_id", variable = "RSOperator",
styles = c("blue", "red"), values = c("a", "b"))
named_linetype <- ggplot_style_mapper(df = RawdataTest, id = "uniq_id", variable = "WellID",
styles = c(3,2,1), values = c(1,2,3))
#Style for a few values and other style for the rest
named_colors <- ggplot_style_mapper(df = RawdataTest, id = "uniq_id", variable = "RSOperator",
styles = c("blue"), values = c("a"), other_styles = "black")
named_linetype <- ggplot_style_mapper(df = RawdataTest, id = "uniq_id", variable = "WellID",
styles = c(3), values = c(1,2), other_styles = 6)
#Style for a few values and style palette for the rest
named_colors <- ggplot_style_mapper(df = RawdataTest, id = "uniq_id", variable = "RSOperator",
styles = c("blue"), values = c("a"), other_styles = c("black"))
named_linetype <- ggplot_style_mapper(df = RawdataTest, id = "uniq_id", variable = "WellID",
styles = c(3), values = c(1), other_styles = c(6,7))
Hope this helps someone!
P.S.: Data
pacman::p_load(RColorBrewer, tidyverse)
#Create the same data frame as in StupidWolf's answer
#Create data
RawdataTest = data.frame(
DaysOn = rep(1:10,6),
GasProd_MCF = c(rep(1:10,3),rep(2*(1:10),3))+rnorm(60,3,1),
WellID = rep(1:3,each=10,times=2),
RSOperator = rep(letters[1:2],each=30)
)
# create a uniq identifier for observation
RawdataTest <- RawdataTest %>%
mutate(uniq_id=paste(RSOperator,WellID,sep=""))