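In the zomato folder there are 4 CSV files (Bengali, Chinese, Italian and Punjabi). I am using lexicon-based sentiment analysis to compare these files against text files of positive and negative words. Everything works fine and I get the values of a, c, e and g correctly. But when it comes to plotting, I get this error:
Don't know how to automatically pick scale for object of type list. Defaulting to continuous. Error: geom_bar requires the following missing aesthetics: y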
library(stringr)
library(ggplot2)
library(tm)
src <- DirSource("./Data/foodwise/zomato")
docs <- Corpus(src)
docs <- tm_map(docs, removePunctuation)
docs <- tm_map(docs,content_transformer(tolower))
docs <- tm_map(docs, removeNumbers)
docs <- tm_map(docs, removeWords,stopwords("english"))
docs <- tm_map(docs, stripWhitespace)
docs <- tm_map(docs, stemDocument)
writeCorpus(docs, path="./Data")
texts <- readLines("./Data/zomato bengoli.csv.txt")
opinion.lexicon.pos <- scan('./Data/positive-word.txt', what='character', comment.char = ';')
opinion.lexicon.neg <- scan('./Data/negative-word.txt', what='character', comment.char = ';')
jj <- str_split(texts, pattern="\\s+")
a <- lapply(jj,function(x){sum(!is.na(match(x,opinion.lexicon.pos)))})
texts1 <- readLines("./Data/zomato chinese.csv.txt")
jj <- str_split(texts1, pattern="\\s+")
c <- lapply(jj,function(x){sum(!is.na(match(x,opinion.lexicon.pos)))})
texts2 <- readLines("./Data/zomato Italian.csv.txt")
jj <- str_split(texts2, pattern="\\s+")
e <- lapply(jj,function(x){sum(!is.na(match(x,opinion.lexicon.pos)))})
texts3 <- readLines("./Data/zomato panjabi.csv.txt")
jj <- str_split(texts3, pattern="\\s+")
g <- lapply(jj,function(x){sum(!is.na(match(x,opinion.lexicon.pos)))})
x <-c("Bengoli", "Chinese", "Italian", "Panjabi")
y <- c(a, c ,e, g)
data <- data.frame(x, y)
ggplot(data, aes(x, y)) +
  geom_bar(stat = "identity", aes(fill = x)) +
  xlab("Cuisines") +
  ylab("Total count") +
  ggtitle("") +
  scale_fill_manual("Cuisines",
                    values = c("Italian" = "lightpink",
                               "Panjabi" = "lightblue",
                               "Chinese" = "darkgrey",
                               "Bengoli" = "lightgreen"))
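Answer 0 (score: 1)
As already mentioned, we really need a reproducible example to answer this well, but I'll take a stab based on what I think you are asking.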
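Your main problem is that a, c, e and g are all lists, so y is a list as well. lapply always returns a list, so you need to flatten each result into a numeric vector with unlist and then sum it to get one number per cuisine: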
jj <- list("Bengoli","English", "Bengoli")
g <- lapply(jj,function(i){sum(!is.na(match(i,"Bengoli")))})
# this is a list:
g
g2 <- sum(unlist(g))
# Now it is a number which you can supply to ggplot
g2
x <-c("Bengoli", "Chinese", "Italian", "Panjabi")
y <- c(g2,4,3,6)
data <- data.frame(x, y)
ggplot(data, aes(x, y)) +
  geom_bar(stat = "identity", aes(fill = x)) +
  xlab("Cuisines") +
  ylab("Total count") +
  scale_fill_manual("Cuisines",
                    values = c("Italian" = "lightpink",
                               "Panjabi" = "lightblue",
                               "Chinese" = "darkgrey",
                               "Bengoli" = "lightgreen"))