PCA biplots for subsets of data

Date: 2014-06-17 09:02:56

Tags: r plot pca vegan

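I am trying to produce PCA biplots for subsets of my data. Within the same principal component space, I would like to plot only the subsets defined by moisture level.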

# Packages
library(vegan)

# Sample data
data(dune, dune.env)
dd <- cbind(dune.env, dune)

# Running PCA (drop the five dune.env columns, keep the species data)
pca1 <- rda(dd[, -c(1:5)], scale = TRUE)

# Plot
plot(pca1, scaling=3)

# Now, instead of the plot above, I'd like to get separate plots (one per
# dd$Moisture level) showing the principal component scores, but only for
# the subsets based on:
levels(dd$Moisture)

Any input is much appreciated!

2 answers:

Answer 0 (score: 1)

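These faceted plots are slightly easier to produce with ordixyplot() in vegan, or with the ggvegan package (which is currently in alpha, so some things as simple as this still have to be done by hand).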

Example

# Packages
library("vegan")
library("ggvegan")

# Sample data
data(dune, dune.env)

# PCA
pca1 <- rda(dune, scale = TRUE)

ggvegan version

As I said, you need to do a few things by hand, but you get the legend for free from ggplot.

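## use fortify method to extract scores in ggplot-friendly format
## (the Score, Dim1 and Dim2 column names used below are those of the ggvegan
##  version in this answer; check names(scrs) if your version differs)
scrs <- fortify(pca1, scaling = 3)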
## take only site scores for this
scrs <- with(scrs, scrs[Score == "sites", ])
## add on Moisture variable
scrs <- cbind(scrs, Moisture = dune.env$Moisture)

## for all points on one plot, Moisture coded by colour
plt <- ggplot(scrs, aes(x = Dim1, y = Dim2, colour = Moisture)) + 
         geom_point() + coord_fixed()
plt

## to facet that plot into multiple panels
plt + facet_wrap( ~ Moisture)
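
If the column names returned by fortify() do not match the ones above, the same faceted figure can be sketched with vegan's scores() and plain ggplot2. This is only an illustration under stated assumptions: the names site_df, bg_df and plt_facets are made up here, and the grey background layer (all sites repeated in every panel) is an addition that is not part of the answer.

```r
library("vegan")
library("ggplot2")

data(dune, dune.env)
pca1 <- rda(dune, scale = TRUE)

## site scores on the first two axes as a plain data frame (columns PC1, PC2)
site_df <- as.data.frame(scores(pca1, display = "sites",
                                choices = 1:2, scaling = 3))
site_df$Moisture <- dune.env$Moisture

## copy without the Moisture column: a layer that lacks the facetting
## variable is drawn in every panel, giving a common background
bg_df <- site_df[, c("PC1", "PC2")]

plt_facets <- ggplot(site_df, aes(x = PC1, y = PC2)) +
  geom_point(data = bg_df, colour = "grey80") +   # all sites, faded
  geom_point(aes(colour = Moisture)) +            # sites of this level only
  coord_fixed() +
  facet_wrap(~ Moisture)                          # fixed scales by default
plt_facets
```

Because facet_wrap() keeps the axis scales fixed across panels by default, every subset is shown in the same principal component space.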

ordixyplot() version

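More of the work is done inside ordixyplot() than in the ggvegan version, but you have to work harder to add the key (legend), and I can never remember how to do that with Lattice.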

## Single plot (groups is the full name of the Lattice argument)
ordixyplot(pca1, data = dune.env, formula = PC1 ~ PC2, groups = Moisture)

## Facet plot
ordixyplot(pca1, data = dune.env, formula = PC1 ~ PC2 | Moisture)

Base graphics

With base graphics there is a simpler approach that colours the points on a single plot. One version is

scrs <- scores(pca1, display = "sites")
cols <- c("red","green","blue","orange","purple")
plot(scrs[,1], scrs[,2], pch = 19, 
     col = cols[as.numeric(as.character(dune.env$Moisture))])
legend("topright", legend = 1:5, title = "Moisture", pch = 19, 
       col = cols, bty = "n")

You can find more on building this kind of plot with base graphics in a blog post I wrote a few years ago.
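
Note that Moisture in dune.env has only the levels 1, 2, 4 and 5, so a legend built from 1:5 also lists a level that has no points. A minimal variation, assuming you would rather key the colours by level label, is sketched below; the names moist and cols and the use of rainbow() are illustrative choices, not part of the answer.

```r
library("vegan")

data(dune, dune.env)
pca1 <- rda(dune, scale = TRUE)

scrs  <- scores(pca1, display = "sites")
moist <- dune.env$Moisture

## one colour per level that exists, named by the level label
cols <- setNames(rainbow(nlevels(moist)), levels(moist))

plot(scrs[, 1], scrs[, 2], pch = 19, asp = 1,
     xlab = "PC1", ylab = "PC2",
     col = cols[as.character(moist)])   # look colours up by label
legend("topright", legend = names(cols), title = "Moisture",
       pch = 19, col = cols, bty = "n")
```

Looking the colours up by label keeps the mapping correct even if the level labels are not consecutive integers.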

Answer 1 (score: 0)

# Packages
library("vegan")

# Preparing data
data(dune, dune.env)
dd <- cbind(dune.env, dune)
# Order data frame based on dd$Moisture
dd <- dd[order(dd$Moisture),]
str(dd)

# Running PCA (drop the five dune.env columns, keep the species data)
pca1 <- rda(dd[, -c(1:5)], scale = TRUE)

# Plot
biplot(pca1, scaling=3, type = c("text", "points"))

# I need to get 5 different plots (one per dd$Moisture level)
# grab the scores on axes required
site.scr <- scores(pca1, display = "sites", scaling = 3) # x
spp.scr <- scores(pca1, display = "species", scaling = 3) # y

## generate x,y lims
xlims <- range(site.scr[,1], spp.scr[,1])
ylims <- range(site.scr[,2], spp.scr[,2])

## plot what you want, but leave out sites
biplot(pca1, display = "species", 
       xlim=xlims, ylim=ylims, 
       scaling = 3)

## now add in the site scores as you want; to illustrate, we plot:
# Moisture - all together here, but each level can go in an independent plot
points(site.scr[1:7,], col = "blue", pch = 2)
points(site.scr[8:11,], col = "green", pch = 2)
points(site.scr[0:0,], col = "orange", pch = 2) # Level 3 not present
points(site.scr[12:13,], col = "grey", pch = 2)
points(site.scr[14:20,], col = "black", pch = 2)
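
The hard-coded row ranges above depend on the data being sorted by Moisture. As a rough sketch of the "independent plots" variant, the row indices can be derived with split() and each level drawn in its own panel with shared axis limits. The name idx, the panel layout and the single point colour are assumptions made for this illustration.

```r
library("vegan")

data(dune, dune.env)
pca1 <- rda(dune, scale = TRUE)

site.scr <- scores(pca1, display = "sites", scaling = 3)
spp.scr  <- scores(pca1, display = "species", scaling = 3)

## common limits so every panel shows the same ordination space
xlims <- range(site.scr[, 1], spp.scr[, 1])
ylims <- range(site.scr[, 2], spp.scr[, 2])

## row indices of the sites in each Moisture level (no sorting needed)
idx <- split(seq_len(nrow(site.scr)), dune.env$Moisture)

op <- par(mfrow = c(ceiling(length(idx) / 2), 2))
for (lev in names(idx)) {
  ## species only, then overlay the sites of this level
  biplot(pca1, display = "species", scaling = 3,
         xlim = xlims, ylim = ylims)
  title(main = paste("Moisture", lev))
  points(site.scr[idx[[lev]], , drop = FALSE], pch = 2, col = "blue")
}
par(op)
```

Since the PCA is fitted once on all sites and only the overlay changes, each panel is a subset view of the same ordination rather than a separate PCA per level.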