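I have two columns, both factor variables. The first is the prisoner's race and the second is whether they recidivated. I want to break down the recidivism rate by race. How should I do this?
Here is what I have tried: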
library(dplyr)
library(ggplot2)

# palette_9_colors and plotTheme() are defined elsewhere in my script
df %>%
  group_by(race, Recidivated) %>%
  summarize(count = n()) %>%
  arrange(-count) %>%
  ggplot(aes(reorder(race, count, FUN = max),
             count, fill = race)) +
  geom_col() +
  coord_flip() +
  scale_fill_manual(values = palette_9_colors) +
  theme(legend.position = "none") +
  labs(x = "Charge", y = "Count",
       title = "Recidivism by Rates",
       subtitle = "Broward County - Source: Propublica",
       caption = "UrbanSpatialAnalysis.com") +
  plotTheme()
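The result is a bar chart that shows the number of people in each race. How can I get a chart that visualizes the recidivism rate by race? Thanks!
Here is some of the data: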
> head(df)
   sex age         age_cat             race priors_count two_year_recid
1 Male  69 Greater than 45            Other            0              0
2 Male  34         25 - 45 African-American            0              1
3 Male  24    Less than 25 African-American            4              1
4 Male  44         25 - 45            Other            0              0
5 Male  41         25 - 45        Caucasian           14              1
6 Male  43         25 - 45            Other            3              0
                   r_charge_desc                  c_charge_desc
1                                  Aggravated Assault w/Firearm
2    Felony Battery (Dom Strang) Felony Battery w/Prior Convict
3    Driving Under The Influence          Possession of Cocaine
4                                                       Battery
5 Poss of Firearm by Convic Felo      Possession Burglary Tools
6                                         arrest case no charge
  c_charge_degree r_charge_degree juv_other_count length_of_stay
1               F                               0              1
2               F            (F3)               0             10
3               F            (M1)               1              1
4               M                               0              1
5               F            (F2)               0              6
6               F                               0              1
    Recidivated
1 notRecidivate
2    Recidivate
3    Recidivate
4 notRecidivate
5    Recidivate
6 notRecidivate
Answer 0 (score: 0)
library(ggplot2)

# Dodged bar chart with value labels. This example uses a different data set
# (ideaths with age_group, deaths, fyear); swap in your own columns.
ggplot(data = ideaths, aes(x = age_group, y = deaths, fill = fyear)) +
  geom_col(position = position_dodge(width = 0.9)) +
  # Place each label slightly above its bar
  geom_text(aes(x = age_group, y = deaths + 3, label = deaths),
            position = position_dodge(width = 0.9)) +
  ggtitle("Figure 8.") +
  scale_fill_manual(values = c("#7F7F7F", "#94D451")) +
  scale_y_continuous(breaks = seq(0, 55, 5)) +
  theme_light() +
  theme(
    panel.border = element_blank(),
    panel.grid.major.x = element_blank(),
    panel.grid.minor.y = element_blank(),
    # linewidth needs ggplot2 >= 3.4.0; on older versions use size = .1
    panel.grid.major.y = element_line(linewidth = .1, color = "grey"),
    axis.title = element_blank(), legend.position = "bottom",
    legend.title = element_blank(), plot.title = element_text(size = 10)
  )
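If Recidivated is meant to be a logical variable, it should hold TRUE or FALSE; for a logical vector, mean() returns the proportion of TRUE values, which is the rate you want once the data is grouped by race.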
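Applied to the question's data, the idea might look roughly like the sketch below. It assumes the columns are named race and Recidivated as in the head(df) output, and that the factor level "Recidivate" marks a repeat offence. The toy data frame only copies the six rows shown in the question, so the resulting rates are illustrative, not real figures.

```r
library(dplyr)
library(ggplot2)

# Toy data: race and Recidivated copied from the six rows in the question.
df <- data.frame(
  race = factor(c("Other", "African-American", "African-American",
                  "Other", "Caucasian", "Other")),
  Recidivated = factor(c("notRecidivate", "Recidivate", "Recidivate",
                         "notRecidivate", "Recidivate", "notRecidivate"))
)

# One row per race. The comparison yields a logical vector, and its mean
# is the share of TRUE values, i.e. the recidivism rate for that race.
rates <- df %>%
  group_by(race) %>%
  summarize(n = n(), rate = mean(Recidivated == "Recidivate")) %>%
  arrange(desc(rate))

# Plot the rate (not the count) for each race, ordered by rate.
p <- ggplot(rates, aes(x = reorder(race, rate), y = rate, fill = race)) +
  geom_col() +
  coord_flip() +
  scale_y_continuous(labels = function(x) paste0(round(100 * x), "%")) +
  theme(legend.position = "none") +
  labs(x = "Race", y = "Recidivism rate")
print(p)
```

Grouping by race alone, instead of by race and Recidivated, is what turns the counts into one rate per race. The n column is kept so that very small groups can be spotted before their rates are compared.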
Hope this helps :)