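Hi R users and programmers, I have a dataset for a protein made up of 4563 amino acids. The amino acids in this protein were oxidized using three different treatments and two different oxidants. I would like to plot the positions of these oxidations in a chart, grouped by treatment. Different line widths would represent different oxidant concentrations, and line type (dashed or solid) would represent the type of oxidant. I would like to break the axis at every 1000 amino acids. I created a similar template using Excel and GIMP (which was time-consuming and probably not appropriate!). The 0.33 in the template is the line height. http://dl.dropbox.com/u/58221687/Chakraborty_Figure1.png Here is the dataset: http://dl.dropbox.com/u/58221687/AA-Position-template.xls
Thanks in advance. Sourav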
Answer 0 (score: 7)
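I would do this in base graphics, though I'm sure others could do the same or better in lattice or ggplot2. I think the main thing you need to do to make this kind of plot easily is to reshape your data and rethink what format it needs to be in for plotting. I would do this with your data if 1) it were in long format and 2) the variables that color, line type, width and so on are based on were available as extra columns. Once you have data like that, you can reduce it to only the amino acids that need a line segment drawn. I've simulated a dataset similar to yours. You should be able to modify this code to fit your case:
First the dataset: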
set.seed(1)
# make data.frame just with info for the lines you'll actually draw
# your data was mostly zeros, no need for those lines
position <- sort(sample(1:4563,45,replace = FALSE))
# but the x position needs to be shaved down!
# position modulo 600 gives the real x positions on the plot:
xpos <- position%%600
# line direction appeared in your example but not in your text
posorneg <- sample(c(-1,1),45,replace = TRUE,prob=c(.05,.95))
# oxidant concentration for line width- just rescale the oxidant concentration
# values you have to fall between say .5 and 3, or whatever is nice and visible
oxconc <- (.5+runif(45))^2
# oxidant type determines line type- you mention 2
# just assign these types to lines types (integers in R)
oxitype <- sample(c(1,2),45,replace = TRUE)
# let's say there's another dimension you want to map color to
# in your example png, but not in your description.
color <- sample(c("green","black","blue"),45,replace=TRUE)
# and finally, which level does each segment need to belong to?
# you have 8 line levels in your example png. Integer division gives the
# 600-residue block (0 to 7); block 0 is drawn on the top level, 8:
level <- 8 - position %/% 600
# now stick into data.frame:
AminoData <-data.frame(position = position, xpos = xpos, posorneg = posorneg,
oxconc = oxconc, oxitype = oxitype, level = level, color = color)
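If your spreadsheet is in a wide layout (one row per residue, one column per treatment/oxidant combination, zeros where nothing was oxidized), getting to long format might look roughly like the sketch below. The column names (`T1_oxA` and so on) and the toy values are made up for illustration and are not taken from your file, so treat this as a starting point rather than a drop-in solution.

```r
# Hypothetical wide table: one row per residue, one column per
# treatment/oxidant combination; 0 means "no oxidation at this position".
wide <- data.frame(
  position = 1:6,
  T1_oxA = c(0, 0.5, 0, 0, 0, 1),
  T1_oxB = c(0, 0, 0, 2, 0, 0),
  T2_oxA = c(0, 0, 0, 0, 0, 0.25)
)

value_cols <- setdiff(names(wide), "position")

# stack() piles the value columns on top of each other and records
# which column each value came from
long <- cbind(position = rep(wide$position, times = length(value_cols)),
              stack(wide[value_cols]))
names(long) <- c("position", "conc", "condition")

# keep only the residues that actually need a line segment
long <- long[long$conc > 0, ]

# split e.g. "T1_oxA" into a treatment column and an oxidant column
parts <- do.call(rbind, strsplit(as.character(long$condition), "_", fixed = TRUE))
long$treatment <- parts[, 1]
long$oxidant   <- parts[, 2]
long$condition <- NULL
rownames(long) <- NULL
```

From there, `conc` could be rescaled for line width and `oxidant` mapped to a line type, in the same way as the simulated columns above.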
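OK, imagine you can get your data down to something this simple. The main tool for the plot (in base) will be segments(). It is vectorized, so there's no need for loops or anything fancy: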
# now we draw the base plot:
par(mar=c(3,3,3,3))
plot(NULL, type = "n", axes = FALSE, xlab = "", ylab = "",
ylim = c(0,9), xlim = c(-10,609))
# horizontal segments:
segments(0, 1:8, 599, 1:8, col = gray(.5))
# some ticks: (also not pretty)
segments(rep(c((0:5)*100,599),8), rep(1:8,each=7)-.05, rep(c((0:5)*100,599),8),
rep(1:8,each=7)+.05, col=gray(.5))
# label endpoints:
text(rep(10,8)+.2,1:8-.2,(7:0)*600,pos=2,cex=.8)
text(rep(589,8)+.2,1:8-.2,(7:0)*600+599,pos=4,cex=.8)
# now the amino line segments, remember segments() is vectorized
segments(AminoData$xpos, AminoData$level, AminoData$xpos,
AminoData$level + .5 * AminoData$posorneg, lty = AminoData$oxitype,
lwd = AminoData$oxconc, col = as.character(AminoData$color))
title("mostly you just need to reshape and prepare\nyour data to do this easily in base")
This may be too hand-crafted for some people's taste, but it's how I go about special-purpose plotting.
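The code above breaks the axis every 600 residues to match the 8 levels in the example png. If you want the break every 1000 residues, as the question asks, the same arithmetic can be driven by one variable. This is only a sketch: `row_width` and `n_rows` are names I'm introducing here, and the positions are made-up values.

```r
# row_width is a hypothetical parameter; 1000 matches the break asked for
row_width <- 1000
position  <- c(1, 999, 1000, 2500, 4563)    # made-up example positions

n_rows <- max(position) %/% row_width + 1   # rows needed to reach the last residue
xpos   <- position %% row_width             # x coordinate within a row
level  <- n_rows - position %/% row_width   # first block is drawn on the top row

data.frame(position, xpos, level)
```

The hard-coded 8s, 599s and 600s in the plotting calls (the `xlim`/`ylim` values, the horizontal segments, the ticks and the end labels) would need to be swapped for `n_rows` and `row_width` in the same way.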