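I want to combine all of the Trump data into one bar and all of the Clinton data into another bar.
That is, I think I basically need to compute the mean of all the values where the winner is Trump and the mean of all the values where the winner is Clinton, but I'm not sure how to do that, as I'm a newbie.
Here is my current code, in case it helps: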
library(ggplot2)
healthd = read.csv("R/states.csv")
states = healthd[[1]]
uninsured2015 = healthd[[3]]
uninsured2015 = abs(as.numeric(as.character(gsub("%","", uninsured2015))))
insuredChange = healthd[[4]]
insuredChange = abs(as.numeric(as.character(gsub("%","", insuredChange))))
winner = healthd[[15]]
ggplot(data = healthd, aes(x = states, y = insuredChange, fill=winner)) +
  xlab("State") + ylab("Percent Uninsured (2015)") +
  scale_fill_manual(values = c("Trump" = "red4", "Clinton" = "blue4")) +
  geom_bar(stat="identity") +
  theme_bw() +
  theme(panel.border = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        axis.line = element_line(colour = "black"),
        axis.text.x=element_text(angle = 90, hjust = 1))
Also, here is the head of my data:
> head(healthd)
State Uninsured.Rate..2010. Uninsured.Rate..2015. Uninsured.Rate.Change..2010.2015.
1 Alabama 14.60% 10.10% -4.50%
2 Alaska 19.90% 14.90% -5%
3 Arizona 16.90% 10.80% -6.10%
4 Arkansas 17.50% 9.50% -8%
5 California 18.50% 8.60% -9.90%
6 Colorado 15.90% 8.10% -7.80%
Health.Insurance.Coverage.Change..2010.2015. Employer.Health.Insurance.Coverage..2015.
1 215000 2545000
2 36000 390000
3 410000 3288000
4 234000 1365000
5 3826000 19552000
6 419000 2949000
Marketplace.Health.Insurance.Coverage..2016. Marketplace.Tax.Credits..2016.
1 165534 152206
2 17995 16205
3 179445 124346
4 63357 56843
5 1415428 1239893
6 108311 67062
Average.Monthly.Tax.Credit..2016. State.Medicaid.Expansion..2016. Medicaid.Enrollment..2013.
1 $310 FALSE 799176
2 $750 TRUE 122334
3 $230 TRUE 1201770
4 $306 TRUE 556851
5 $309 TRUE 7755381
6 $318 TRUE 783420
Medicaid.Enrollment..2016. Medicaid.Enrollment.Change..2013.2016. Medicare.Enrollment..2016.
1 910775 111599 989855
2 166625 44291 88966
3 1716198 514428 1175624
4 920194 363343 606146
5 11843081 4087700 5829777
6 1375264 591844 820234
X2016.Election.Winner
1 Trump
2 Trump
3 Trump
4 Trump
5 Clinton
6 Clinton
Answer 0 (score: 1)
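You have to aggregate the data into a new data frame first, and then plot that instead. There are many ways to do this in R, but dplyr probably has the best combination of being easy to learn, powerful, and safe to program with, so I will use it.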
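For comparison, here is a minimal base R sketch of the same "mean per winner" aggregation using `aggregate()`. The data frame `toy` and its values are made up purely for illustration; only the column names mirror the question.

```r
# Toy data: column names follow the question, the values are invented.
toy <- data.frame(
  winner        = c("Trump", "Trump", "Clinton", "Clinton"),
  insuredChange = c(4.5, 5.0, 9.9, 7.8)
)

# One row per winner, holding the mean of insuredChange within that group.
by_winner <- aggregate(insuredChange ~ winner, data = toy, FUN = mean)
print(by_winner)
```

The result is an ordinary data frame with one row per group, so it can be passed to ggplot() the same way as the dplyr result.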
I faked up some data; here is the code:
library(ggplot2)
library(dplyr)

# Fake data standing in for the real CSV
set.seed(1)  # only so that the random example is reproducible
n <- 50
ss <- sprintf("State-%.2d", 1:n)
u15 <- 10 * (runif(n) + 0.5)
icg <- 4 * (runif(n) + 0.5)
w <- sample(c("Candidate-1", "Candidate-2"), n, replace = TRUE)
healthd <- data.frame(states = ss, uninsured2015 = u15, insuredChange = icg, winner = w)

# Plot 1: one bar per state, as in the question
ggplot(data = healthd, aes(x = states, y = insuredChange, fill = winner)) +
  xlab("State") + ylab("Change in uninsured rate (percentage points)") +
  scale_fill_manual(values = c("Candidate-1" = "red4", "Candidate-2" = "blue4")) +
  geom_bar(stat = "identity") + theme_bw() +
  theme(panel.border = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        axis.line = element_line(colour = "black"),
        axis.text.x = element_text(angle = 90, hjust = 1))

# Make a new aggregated data frame with dplyr: one row per winner,
# each column replaced by its group mean
aghealthd <- healthd %>%
  group_by(winner) %>%
  summarise(uninsured2015 = mean(uninsured2015),
            insuredChange = mean(insuredChange))

# Plot 2: the same plotting code on the aggregated data, with winner on the x-axis
ggplot(data = aghealthd, aes(x = winner, y = insuredChange, fill = winner)) +
  xlab("Winner") + ylab("Mean change in uninsured rate (percentage points)") +
  scale_fill_manual(values = c("Candidate-1" = "red4", "Candidate-2" = "blue4")) +
  geom_bar(stat = "identity") + theme_bw() +
  theme(panel.border = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        axis.line = element_line(colour = "black"),
        axis.text.x = element_text(angle = 90, hjust = 1))
Here is plot 1:
Here is plot 2:
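To connect this back to the real data, here is a rough sketch of the same aggregation applied to the columns shown in the question's head(healthd) output. It is only a sketch under a few assumptions: the data frame is rebuilt by hand from the six rows in the question (a real run would use the full result of read.csv()), the percent strings are cleaned with sub(), and the absolute value is taken as in the question's code. The names `by_winner` and `p` are just illustrative.

```r
library(dplyr)
library(ggplot2)

# The six rows from the question's head(healthd), typed in by hand;
# with the real file this would be the data frame returned by read.csv().
healthd <- data.frame(
  State = c("Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado"),
  Uninsured.Rate.Change..2010.2015. = c("-4.50%", "-5%", "-6.10%", "-8%", "-9.90%", "-7.80%"),
  X2016.Election.Winner = c("Trump", "Trump", "Trump", "Trump", "Clinton", "Clinton"),
  stringsAsFactors = FALSE
)

by_winner <- healthd %>%
  # Strip the "%" sign, convert to numeric, and take the absolute value
  mutate(insuredChange = abs(as.numeric(
    sub("%", "", Uninsured.Rate.Change..2010.2015., fixed = TRUE)))) %>%
  # Group on the winner column, renaming it to `winner` on the fly
  group_by(winner = X2016.Election.Winner) %>%
  # Collapse each group to its mean
  summarise(insuredChange = mean(insuredChange))

# One bar per winner
p <- ggplot(by_winner, aes(x = winner, y = insuredChange, fill = winner)) +
  geom_col() +
  scale_fill_manual(values = c("Trump" = "red4", "Clinton" = "blue4")) +
  xlab("2016 election winner") +
  ylab("Mean drop in uninsured rate, 2010-2015 (percentage points)") +
  theme_bw()
print(p)
```

Doing the cleaning inside mutate() keeps the numeric column in the data frame itself, so ggplot() does not have to pick up separate vectors from the global environment.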