我有2个不同大小的数据帧 - 一个约300行,另一个约30行。尺寸将根据所选输入而变化。我已经成功构建了R代码,它将为每个输入绘制一个循环的输出,但我无法弄清楚如何将所有迭代放到一个图表上。有很多关于多个图的文章,但到目前为止我没有取得任何成功,即不同大小的数据框,在一个图表上绘制所有迭代(不同大小)(一页上不是多个图表 - 一页上有一个图表) 。下面是用于生成单个图表的代码 - 我无法弄清楚如何将它们全部放在同一个图表上....
WellS <- rep(WellSelect[i], length(EW))
WellC <- rep(WellSelect[i], length(X))
dfSurvey <- data.frame(Well = WellS, MD = MD, EW = EW, NS = NS, TVD = TVD)
dfCalc <- data.frame(Well = WellC, Perf = P, X = X, Y = Y, TVDp = TVDp)
上面的代码将此处未显示的计算编译为数据帧dfSurvey和dfCalc。请注意,“WellSelect”是驱动计算原始数据输入的主要变量。可能有2到4000多个独特的“WellSelect”可能性,每个都有2个第一句中提到的大小的数据帧 - 所有这些都是“WellSelect”独有的。除了提到的图表外,一切都有效。我试图绑定数据帧,但不知道如何在不同大小的df上进行。
pname <- paste0(dfSurvey$Well[i])
p <- ggplot() + geom_point(data = dfSurvey, aes(x=EW, y=NS), shape = 2, size = 2, color = "blue1") +
geom_point(data = dfCalc, aes(x=X, y=Y), shape = 17, size = 5, color = "Chartreuse3") +
ggtitle(pname)
ggsave(paste0(pname, ".png"), p)
print(p)
请注意,“dfSurvey”是较大的数据帧,“dfCalc”较小。我很感激一些指导。
已编辑包含数据集和示例图
以下是我目前通过编码获得的情节:
Each "WellSelect" currently generates its own plot
这就是我想要实现的目标:
Combined plot, note red triangles represent "dfCalc" and solid lines are "dfSurvey"
这些链接上有“DS”和“Perf”的缩写示例数据集:
DS(请注意,绘制的变量是EW与NS): https://drive.google.com/open?id=0B5pFHCTpv6BWTUh3MWJoaVhaT0kxZzJFVWJ4QTFaM0Q5S29j
逆足: https://drive.google.com/open?id=0B5pFHCTpv6BWMjhLZnF3Zk9mM0hZaXYxLWVKUlBnWXlPQ0xB
我已经包含了下面代码的全部广度,这些代码应该使用上面的文件运行,其高潮是如前所示的各个图。我知道,代码效率不高,但我是新手,所以我只需要现在有用的东西。
library(ggplot2)
DS <- read.csv(file = "DirectionalSurveys.csv")
Perf <- read.csv(file = "Perforation.csv")
colnames(DS) <- c "IDWELL", "API", "WellName", "Division", "MD", "INCL", "AZIM", "NS", "EW", "TVD", "DLS")
colnames(Perf) <- c("IDWELL", "API", "WellName", "County", "MidPerfMD", "MidPerfTVD")
WellSelect <- c("LINDA GREATHOUSE BRK 1", "LINDA GREATHOUSE BRK 3", "LINDA GREATHOUSE BRK 5", "LINDA GREATHOUSE BRK 205",
"BARRY GREATHOUSE A 5", "BARRY GREATHOUSE A 10", "BARRY GREATHOUSE B 3")
for(i in seq_along(WellSelect)) {
S <- DS$MD[DS$WellName == WellSelect[i]]
P <- Perf$MidPerfMD[Perf$WellName == WellSelect[i]]
INCL <- DS$INCL[DS$WellName == WellSelect[i]]
AZIM <- DS$AZIM[DS$WellName == WellSelect[i]]
NS <- DS$NS[DS$WellName == WellSelect[i]]
EW <- DS$EW[DS$WellName == WellSelect[i]]
TVD <- DS$TVD[DS$WellName == WellSelect[i]]
#Subset to get the survey depths deeper than "P"
resultGT <- outer(S, P, '>=')
resultGT[resultGT == FALSE] <- 50
rownames(resultGT) <- paste0(S)
colnames(resultGT) <- paste0("P=", P)
minGT <- as.numeric(rownames(resultGT)[apply(resultGT , 2, which.min)])
#P is mid-perf MD for each stage, Deep is Survey depth below P, Shallow is Survey depth above P
deep <- S[match(minGT, S)]
shallow <- S[match(minGT, S) - 1]
#Subset "DS" to WellSelect
Sub1 <- DS[DS$WellName == WellSelect[i], ]
#Subset Sub1 to get the Survey data
Sub2 <- Sub1[ , 5]
#Match deep and shallow to the Survey depths to get location in DS
deepRow <- match(deep, Sub2)
shallowRow <- match (shallow, Sub2)
#Pull the other data for deep and shallow from DS
deepData <- Sub1[deepRow, ]
shallowData <- Sub1[shallowRow, ]
#Calculate Survey Variables
AA29 <- 2*3.1416/360
AY <- shallowData[ , "INCL"] + ((P - shallowData[ , "MD"]) / (shallowData[ , "MD"] - deepData[ , "MD"]) * (shallowData[ , "INCL"] - deepData[ , "INCL"] ))
AZ <- shallowData[ , "AZIM"] + ((P - shallowData[ , "MD"]) / (shallowData[ , "MD"] - deepData[ , "MD"]) * (shallowData[ , "AZIM"] - deepData[ , "AZIM"] ))
BA <- 0.000001 + acos(cos(AY * AA29 - shallowData[ , "INCL"] * AA29) - sin(shallowData[ , "INCL"] * AA29) * sin(AY * AA29) * (1 - cos(shallowData[ , "AZIM"] * AA29 - AZ * AA29)))
BB <- 2 / BA * (tan(BA / 2))
##NOTE: "X" and "Y" below are the plotted variables for the red triangles shown on the plots previously##
Y <- (P - shallowData[ , "MD"]) * ((sin(AY * AA29) * cos(AZ * AA29)) + (sin(shallowData[ , "INCL"] * AA29) * cos(shallowData[ , "AZIM"] * AA29))) / 2 * BB + shallowData[ , "NS"]
X <- (P - shallowData[ , "MD"]) * ((sin(AY * AA29) * sin(AZ * AA29)) + (sin(shallowData[ , "INCL"] * AA29) * sin(shallowData[ , "AZIM"] * AA29))) / 2 * BB + shallowData[ , "EW"]
TVDp <- (P - shallowData[ , "MD"]) * (cos(AY * AA29) + cos(shallowData[ , "INCL"] * AA29)) / 2 * BB + shallowData[ , "TVD"]
#***********************************************************#
#Calculations all done, now on to the graphing process......#
#***********************************************************#
#fill in "WellSelect to match length of dataframe
WellS <- rep(WellSelect[i], length(EW))
WellC <- rep(WellSelect[i], length(X))
#build dataframes for plots
dfSurvey <- data.frame(Well = WellS, MD = S, EW = EW, NS = NS, TVD = TVD)
dfCalc <- data.frame(Well = WellC, Perf = P, X = X, Y = Y, TVDp = TVDp)
dfSurvey <- dfSurvey[order(dfSurvey$Well, dfSurvey$MD), ]
dfCalc <- dfCalc[order(dfCalc$Well, dfCalc$Perf), ]
###WORKS!!!! but just coded to save each plot and not combine
pname <- paste0(dfSurvey$Well[i])
p <- ggplot() + geom_point(data = dfSurvey, aes(x=EW, y=NS), shape = 2, size = 2, color = "blue1") +
geom_point(data = dfCalc, aes(x=X, y=Y), shape = 17, size = 5, color = "Chartreuse3") + ggtitle(pname)
ggsave(paste0(pname, ".png"), p)
print(p)
}
希望这很有用。如果您还有其他需要,请告诉我。谢谢你的帮助!
答案 0 :(得分:0)
考虑在已编译的单数据框架中绑定所有数据框,并使用ggplot
的 group 和 color 参数:
专门替换for
循环:
for(i in seq_along(WellSelect)) {
...
}
使用lapply
构建数据框列表并删除所有绘图线(稍后完成):
df_lists <- lapply(seq_along(WellSelect), function(i) {
# ... same code
# build dataframes for plots
dfSurvey <- data.frame(Well = WellS, MD = S, EW = EW, NS = NS, TVD = TVD)
dfCalc <- data.frame(Well = WellC, Perf = P, X = X, Y = Y, TVDp = TVDp)
dfSurvey <- dfSurvey[order(dfSurvey$Well, dfSurvey$MD), ]
dfCalc <- dfCalc[order(dfCalc$Well, dfCalc$Perf), ]
return(list(dfSurvey, dfCalc))
}
# COMPILED DATAFRAMES
dfSurveyAll <- do.call(rbind, lapply(df_lists, "[[", 1))
dfCalcAll <- do.call(rbind, lapply(df_lists, "[[", 2))
然后使用 group 和 color 参数运行一个单一的图
p <- ggplot() +
geom_point(data = dfSurveyAll, aes(x=EW, y=NS, group="Well", colour="Well"),
shape = 2, size = 2) +
geom_point(data = dfCalcAll, aes(x=X, y=Y, group="Well", colour="Well"),
shape = 17, size = 5) + ggtitle(pname)
p
当你通过 WellName 因子对 DS 数据框进行子集化时,甚至可以使用by
。所以在for
循环中的块下面:
for(i in seq_along(WellSelect)) {
S <- DS$MD[DS$WellName == WellSelect[i]]
P <- Perf$MidPerfMD[Perf$WellName == WellSelect[i]]
INCL <- DS$INCL[DS$WellName == WellSelect[i]]
AZIM <- DS$AZIM[DS$WellName == WellSelect[i]]
NS <- DS$NS[DS$WellName == WellSelect[i]]
EW <- DS$EW[DS$WellName == WellSelect[i]]
TVD <- DS$TVD[DS$WellName == WellSelect[i]]
...
Sub1 <- DS[DS$WellName == WellSelect[i], ]
...
WellS <- rep(WellSelect[i], length(EW))
WellC <- rep(WellSelect[i], length(X)
...
}
可以替换为by
,其参数 sub 是一个子集化的数据帧,但 Perf (一个单独的数据帧)除外。在这里,by
返回两个数据文件的内部列表的命名列表或上面的lapply
等效结构。
df_lists <- by(DS, DS$WellName, FUN=function(sub) {
S <- sub$MD
P <- Perf$MidPerfMD[Perf$WellName == sub$WellName[1]]
INCL <- sub$INCL
AZIM <- sub$AZIM
NS <- sub$NS
EW <- sub$EW
TVD <- sub$TVD
...
Sub1 <- sub
...
WellS <- rep(sub$WellName[1], length(EW))
WellC <- rep(sub$WellName[1], length(X)
# build dataframes for plots
# ... same as lapply above
})