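I have a CSV, "celebrity deaths", with a column "cause of death". I want to make a ggplot2 chart showing the top 10 causes of death, but I don't know how to count them in R when using the data from the CSV.
My CSV looks like this: https://i.imgur.com/WFTpzDE.png
I think I will need a vector holding all the causes, but I don't know how to group them into a top 10.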
Answer 0 (score: 0)
I'll do the best I can with base R and ggplot2, given what you've provided. First, some made-up data to stand in for your CSV:
library(ggplot2)
celeb <- c("Kim Kardashian", "The chubby kid from stand by me", "The bassist from the local Clash cover band", "One of L. Ron Hubbard's polyps", "Frank Zappa", "Dweezil Zappa", "Moonunit Zappa", "Scott Evil")
death <- c("Gored by rhino", "Eaten by Compies", "Choked on funyun", "Gored by rhino", "Gored by rhino", "Eaten by Compies", "Gored by rhino", "Failed to meet dad's expectations")
# build the data frame directly from the two vectors
df <- data.frame(celeb, death)
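My sense is that you just want to rank the causes of death and then plot them. The code below is more drawn out than it strictly needs to be, but I figured I'd show you the process step by step.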
#first get counts of deaths
deathcounts <- as.data.frame(table(df$death))
#next put them in decreasing order and keep the first four rows
topfour <- deathcounts[order(deathcounts$Freq, decreasing=TRUE)[1:4],]
#cool, so rhinos are dangerous mofos. Let's plot these results
deathplot <- ggplot(topfour, aes(x=Var1, y=Freq)) + geom_bar(stat="identity")
#print the plot object to actually draw it
deathplot
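To get from the toy top four to the top 10 you asked about, the same count, sort, and plot steps can be wrapped up roughly as below. This is only a sketch: `top_causes` is a helper name I made up, and the file name `celebrity_deaths.csv` and column name `cause_of_death` in the commented-out lines are guesses, since I can only see a screenshot of your CSV. The toy vector stands in for the real column so the snippet runs on its own.

```r
library(ggplot2)

# Hypothetical helper (not from any package): count the values in x and
# return the n most frequent ones as a data frame.
top_causes <- function(x, n = 10) {
  counts <- sort(table(x), decreasing = TRUE)  # frequency of each cause, largest first
  top <- head(counts, n)                       # keep at most n causes
  data.frame(cause = names(top),
             count = as.vector(top),
             stringsAsFactors = FALSE)
}

# With the real data it would look something like this
# (file and column names are assumptions, adjust them to your CSV):
# celebs <- read.csv("celebrity_deaths.csv", stringsAsFactors = FALSE)
# top10 <- top_causes(celebs$cause_of_death, n = 10)

# Toy data standing in for the CSV column
causes <- c("Gored by rhino", "Eaten by Compies", "Choked on funyun",
            "Gored by rhino", "Gored by rhino", "Eaten by Compies",
            "Gored by rhino", "Failed to meet dad's expectations")
top10 <- top_causes(causes, n = 10)

# reorder() arranges the bars by count instead of alphabetically
p <- ggplot(top10, aes(x = reorder(cause, -count), y = count)) +
  geom_bar(stat = "identity") +
  labs(x = "Cause of death", y = "Count")
print(p)
```

Because `head()` returns fewer rows when there are fewer than `n` distinct causes, the toy data still gives only four bars; a real CSV with more than ten causes would give ten.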