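I think I am missing something obvious here. I have a multiple-choice question (data here) with 5 answer categories.
I want to melt all 5 variables together to get a single plot with ggplot2. Here is my code: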
mydata <- data.frame(data$Q006_01, data$Q006_02, data$Q006_03, data$Q006_04, data$Q006_05) # multiple choice question
md <- melt(mydata, id=c("data.Q006_01", "data.Q006_02", "data.Q006_03", "data.Q006_04", "data.Q006_05"))
luogo_lavoro <- factor(md[,1]) # error here?
ggplot(data, aes(x=luogo_lavoro)) + geom_histogram() + xlab("") + ylab("Number of participants") +
  ggtitle("If you had to choose now, where would you be willing to accept a job?") +
  theme(axis.text.y = element_text(colour = "black"), axis.text.x = element_text(colour = "black")) +
  scale_x_discrete(labels=str_wrap(c("in the district I live in", "in another district as long as reachable within a dayride", "in the north of Italy", "in the rest of Italy", "abroad", "NA"), width=30)) +
  ggsave((filename="luogo_lavoro.pdf"), scale = 1, width = par("din")[1], height = par("din")[2], units = c("in", "cm", "mm"), dpi = 300, limitsize = TRUE)
What am I doing wrong here?
Answer 0 (score: 3)
Like this?
library(ggplot2)
library(reshape2)
library(stringr)

# Add a row identifier so melt() can tell the respondents apart
data <- data.frame(id = seq_len(nrow(data)), data)
# Long format: one row per respondent and answer option
md <- melt(data, id.vars = "id")

# Keep only the options that were actually selected (TRUE and not NA)
p <- ggplot(subset(md, value & !is.na(value)), aes(x = variable)) +
  # geom_histogram() in older ggplot2; a discrete x now needs geom_bar()
  geom_bar(colour = "grey50", fill = "lightgreen") +
  xlab("") + ylab("Number of participants") +
  ggtitle("If you had to choose now, where would you be willing to accept a job?") +
  theme(axis.text.y = element_text(colour = "black"),
        axis.text.x = element_text(colour = "black")) +
  # One label per answer option; NA rows are already dropped by subset()
  scale_x_discrete(labels = str_wrap(c("in the district I live in",
    "in another district as long as reachable within a dayride",
    "in the north of Italy", "in the rest of Italy", "abroad"), width = 30)) +
  coord_flip()
print(p)

# ggsave() is not a layer: call it separately instead of adding it with `+`
ggsave("luogo_lavoro.pdf", plot = p, scale = 1,
       width = par("din")[1], height = par("din")[2], units = "in", dpi = 300)
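In melt(...), the id=... argument must name a column that distinguishes the different rows (the equivalent of rownames). Listing all five question columns as id, as in the question, leaves no columns to melt. So I added an id column to data and melted on that. Now md has three columns: id, variable and value. variable contains what used to be the column names, so Q006_01 etc., and value contains T or F, depending on the response. value can also contain NA if there was no answer.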
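To make the shape of the melted data concrete, here is a minimal sketch on made-up data. The toy data frame, its values, and the restriction to two answer options are assumptions for illustration only; they are not the asker's data.

```r
library(reshape2)

# Hypothetical toy data: 3 respondents, 2 of the answer options
toy <- data.frame(Q006_01 = c(TRUE, FALSE, NA),
                  Q006_02 = c(FALSE, TRUE, TRUE))

# Add the row identifier, as in the answer
toy <- data.frame(id = seq_len(nrow(toy)), toy)

# id.vars names the identifier column; every other column is melted
toy_md <- melt(toy, id.vars = "id")

# One row per respondent and option, in columns id, variable, value
print(toy_md)
```

Each respondent now appears once per answer option, which is the layout ggplot2 needs to draw one bar per option.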
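So in the call to ggplot(...), we use the subset of md where the response (value) is TRUE and not NA. With that subset, geom_histogram(...) just counts the rows per answer option, i.e. the number of TRUEs (current ggplot2 versions require geom_bar(...) for a discrete x, which counts the same way). I added coord_flip() at the end so that the labels are more readable.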
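The counting step can be checked without plotting. The following sketch uses base R only and a hypothetical long-format data frame shaped like md; the values are invented for illustration.

```r
# Hypothetical long-format data, shaped like md (id, variable, value)
toy_md <- data.frame(
  id       = rep(1:3, times = 2),
  variable = factor(rep(c("Q006_01", "Q006_02"), each = 3)),
  value    = c(TRUE, FALSE, NA, FALSE, TRUE, TRUE)
)

# Keep only rows where the option was selected (TRUE and not NA)
selected <- subset(toy_md, value & !is.na(value))

# Rows per option = number of TRUE answers, i.e. the bar heights
counts <- table(selected$variable)
print(counts)
# Q006_01: 1, Q006_02: 2
```

Comparing such a table with the plot is a quick way to confirm that the bars show selected answers rather than all rows.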
Answer 1 (score: 0)
You probably need to pass md to ggplot instead of the raw data. Also, it is better to make luogo_lavoro a part of md:
md$luogo_lavoro <- factor(md[,1])
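As a rough sketch of that suggestion, the factor can be stored in the data frame that is then handed to ggplot. The small md below is a hypothetical stand-in with a single variable column, and geom_bar() is used on the assumption of a current ggplot2 version.

```r
library(ggplot2)

# Hypothetical stand-in for md: one row per selected answer option
md <- data.frame(variable = c("Q006_01", "Q006_02", "Q006_02"))

# Keep the factor inside the data frame that is passed to ggplot()
md$luogo_lavoro <- factor(md$variable)

# aes() now finds luogo_lavoro in md instead of in the global environment
p <- ggplot(md, aes(x = luogo_lavoro)) + geom_bar()
print(p)
```

Keeping the plotted column in the same data frame avoids mismatches between the data argument and a free-standing vector of a different length.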