Is there a way to add the cumulative sum to a fviz_eig plot?

Asked: 2020-05-31 14:08:57

Tags: r ggplot2 pca

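I am trying to produce a nice plot of the principal components together with the cumulative variance explained. The data frame I am working with is at https://www.kaggle.com/miroslavsabo/young-people-survey?select=responses.csv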

df.responses <- read.csv("Data/responses.csv")
pref <- colnames(df.responses[1:63]) # columns for Music, Movies and Hobbies preferences
# Replace missing values in each preference column with that column's median
for (i in seq_along(pref)) {
  df.responses[is.na(df.responses[, i]), i] <- median(df.responses[, i], na.rm = TRUE)
}
df.movies <- data.frame(df.responses[20:31])

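Above, I load the data frame, replace the NAs in the columns I am interested in with the column median, and then select the subset on which I want to run the PCA.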
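For readers without the Kaggle file, the imputation step can be tried on a toy data frame. This is only a sketch: the column names and values below are invented for illustration and are not taken from the survey.

```r
# Toy data frame standing in for the survey responses (values are invented)
toy <- data.frame(
  Comedy = c(5, NA, 3, 4),
  Horror = c(1, 2, NA, NA),
  Age    = c(20, NA, 19, 22)
)

# Impute only the preference columns, leaving Age untouched
pref_cols <- c("Comedy", "Horror")
for (col in pref_cols) {
  missing <- is.na(toy[[col]])
  toy[missing, col] <- median(toy[[col]], na.rm = TRUE)
}

toy
```

Selecting the columns by name rather than by position makes it harder to impute a column by accident if the file layout changes.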

library(ggplot2)
library(factoextra)

pca.movies <- prcomp(df.movies, scale. = TRUE)
pca.movies$rotation <- -pca.movies$rotation
pca.movies$x <- -pca.movies$x

fviz_pca_var(pca.movies,
             col.var = "contrib",
             gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
             repel = TRUE   
)

pv.movies <- pca.movies$sdev^2 
pvp.movies <- pv.movies/sum(pv.movies)

pvp.movies

fviz_eig(pca.movies,
         addlabels = TRUE,
         barcolor = "#E7B800", 
         barfill = "#E7B800", 
         linecolor = "#00AFBB", 
         choice = "variance", 
         ylim=c(0,25))

plot(cumsum(pvp.movies), xlab = "Principal component", ylab = "Cumulative proportion of Variance Explained", ylim = c(0, 1), type = "b")

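With the above I managed to get two nice PCA plots. I would like to add the cumulative sum line (shown in the third, ugly plot) to the second one. Is there a way to add such a line to the fviz_eig plot? I know this PCA is not really valid; I am just challenging myself with some dataviz.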
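As a sanity check on the quantities being plotted, the proportion and cumulative proportion of variance can be compared against what summary() reports for a prcomp object. The sketch below uses the built-in USArrests data set purely as a stand-in, since the survey file is not bundled with R.

```r
# USArrests ships with R; it is used here only to keep the sketch self-contained
pca <- prcomp(USArrests, scale. = TRUE)

pv  <- pca$sdev^2     # variance (eigenvalue) of each component
pvp <- pv / sum(pv)   # proportion of variance explained
cum <- cumsum(pvp)    # cumulative proportion, ends at 1

# summary() reports the same quantities, rounded to 5 decimals
imp <- summary(pca)$importance
all.equal(unname(imp["Cumulative Proportion", ]), cum, tolerance = 1e-4)
```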

1 Answer:

Answer 0: (score: 1)

The object returned by fviz_eig is a ggplot object, so you can overlay the two plots as follows:

p <- fviz_eig(pca.movies,
         addlabels = TRUE,
         barcolor = "#E7B800", 
         barfill = "#E7B800", 
         linecolor = "#00AFBB", 
         choice = "variance", 
         ylim=c(0,25))

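# Cumulative variance in percent, divided by 4 so that 0-100 fits the 0-25 bar axis
df <- data.frame(x = seq_along(pvp.movies),
                 y = cumsum(pvp.movies) * 100 / 4)
p <- p +
     geom_point(data = df, aes(x, y), size = 2, color = "#00AFBB") +
     geom_line(data = df, aes(x, y), color = "#00AFBB") +
     # The secondary axis undoes the division by 4 and reads in percent
     scale_y_continuous(sec.axis = sec_axis(~ . * 4,
                                   name = "Cumulative proportion of Variance Explained") )
print(p)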
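The factor 4 is simply 100 divided by the upper limit of the bar axis (25). A possible generalisation, sketched here on the built-in USArrests data rather than the survey, derives that factor from the chosen limit and reads the cumulative percentages from factoextra's get_eigenvalue(). The names y_max, k and cum_df are illustrative choices, not part of the original answer.

```r
library(ggplot2)
library(factoextra)

# Built-in data set, used only so the sketch runs on its own
pca <- prcomp(USArrests, scale. = TRUE)

# Percent variance and cumulative percent for each component
eig <- get_eigenvalue(pca)

y_max <- 70            # upper limit of the primary (bar) axis, in percent
k     <- 100 / y_max   # secondary-axis units per primary-axis unit

cum_df <- data.frame(
  x = seq_len(nrow(eig)),
  y = eig$cumulative.variance.percent / k   # squeeze 0-100 onto 0-y_max
)

p2 <- fviz_eig(pca, addlabels = TRUE, ylim = c(0, y_max)) +
  geom_point(data = cum_df, aes(x, y), colour = "#00AFBB", size = 2) +
  geom_line(data = cum_df, aes(x, y), colour = "#00AFBB") +
  scale_y_continuous(
    sec.axis = sec_axis(~ . * k, name = "Cumulative variance explained (%)")
  )
print(p2)
```

Note that fviz_eig draws at most ncp components (10 by default), so with more components than that it may be worth either raising ncp or trimming the cumulative series to the same length as the bars.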
