How can I draw a bar chart from 3 columns of a data frame read from a csv file? I tried the code below, but I am having trouble getting the output I want:
setwd("\\path\\to\\csv")
df = read.csv("xxxx.csv")
# All hospitals in AL
AL = df[grep("AL", df$State),]
hos <-subset(AL,Hospital.Name=='COOPER GREEN MERCY HOSPITAL')
# Gives me "Error in -0.01 * height : non-numeric argument to binary operator"
hos <- data.frame (HeartAttack=hos$Heart.Attack.Mortality,HeartFailure=hos$Heart.Failure..Mortality,
Pneumonia=hos$Pneumonia.Mortality)
# Gives me the graph without displaying the x-axis values
# but completely defeats the purpose of reading from a csv file since the values are hard-written
#hos <- data.frame (HeartAttack=c(1),HeartFailure=c(5),Pneumonia=c(10))
barplot(t(as.matrix(hos)),main='Mortality Rate in Cooper Green Mercy Hospital',
xlab='Illness',ylab='Mortality Rate',beside=TRUE)
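The csv file has 10 headers (from left to right): Hospital.Name, City, State, County.Name, Heart.Attack.Mortality, Heart.Attack.Readmission, Heart.Failure..Mortality, Heart.Failure.Readmission, Pneumonia.Mortality and Pneumonia.Readmission. The columns I am interested in are Heart.Attack.Mortality, Heart.Failure..Mortality and Pneumonia.Mortality.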
Note: I have already looked at these two SO questions, but they do not fully solve my problem.
Answer 0 (score: 1)
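Your data has "Not Available" in the numeric columns instead of NA, so those columns are read as class "factor" (if stringsAsFactors = TRUE, the default before R 4.0.0) or "character" (if stringsAsFactors = FALSE). That is why barplot fails with "non-numeric argument to binary operator". So right after reading in the data I did the following.
# Mark every "Not Available" entry as NA
df[] <- lapply(df, function(x) {
  is.na(x) <- x == "Not Available"
  x})
# Find the columns that contain at least one value convertible to a number
i <- sapply(df, function(x) {
  y <- as.numeric(as.character(x))
  !all(is.na(y))
})
# Convert those columns to numeric
df[i] <- lapply(df[i], function(x) as.numeric(as.character(x)))
Another, better possibility is to read in the data with
df = read.csv("xxxx.csv", stringsAsFactors = FALSE, na.strings = "Not Available")
Then comes your data preparation code.
Now the plot. The space argument is needed to make room for the bar labels in the middle.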
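The whole workflow, from reading the file to plotting, can be sketched as follows. This is a minimal, self-contained illustration under stated assumptions, not the plotting code of the answer: the inline CSV text, the hospital names and the rates are made up, and it passes a named numeric vector to barplot so that the x-axis labels come from the vector names rather than from the space argument.

```r
# Hypothetical sample data standing in for the real CSV (all values are made up).
csv_text <- "Hospital.Name,State,Heart.Attack.Mortality,Heart.Failure..Mortality,Pneumonia.Mortality
HOSPITAL A,AL,14.3,11.4,10.9
HOSPITAL B,AL,Not Available,12.1,Not Available"

# na.strings turns "Not Available" into NA, so the rate columns are read as numeric.
df <- read.csv(text = csv_text, stringsAsFactors = FALSE,
               na.strings = "Not Available")

# Keep the single hospital of interest.
hos <- subset(df, Hospital.Name == "HOSPITAL A")

# Build a named numeric vector: one element per illness.
rates <- c(HeartAttack  = hos$Heart.Attack.Mortality,
           HeartFailure = hos$Heart.Failure..Mortality,
           Pneumonia    = hos$Pneumonia.Mortality)

# barplot labels each bar with the corresponding vector name
# and returns the bar midpoints.
bp <- barplot(rates, main = "Mortality Rate in Hospital A",
              xlab = "Illness", ylab = "Mortality Rate")
```

With the real file, replacing the text = csv_text argument by the file name and the hospital name by 'COOPER GREEN MERCY HOSPITAL' should give the same kind of plot, provided that hospital has numeric values in all three columns.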