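I want to sum across columns for each row in my dataset. However, one of the columns contains NA values, so the new column that should hold the row sums returns only NA.
The column I need, and the one causing the problem, is poc.Pop. I tried wrapping each column name in "sum" and adding "na.rm = T", but that did not work. I have included reproducible code below for reference. Thanks for your help!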
# load plyr before dplyr so that dplyr's mutate() and friends are not masked
library(plyr)
library(dplyr)
# read in data
dat <- read.csv("https://raw.githubusercontent.com/vera-institute/incarceration_trends/master/incarceration_trends.csv")
# subset data to more manageable size
dat_cali <- dat %>%
  filter(state == "CA")
# remove base dataset to free up memory
rm(dat)
# create new summed columns
dat_cali_sum <- dat_cali %>%
  mutate(poc.Inc.Pop = asian_jail_pop + black_jail_pop + latino_jail_pop +
           native_jail_pop + asian_prison_pop + black_prison_pop +
           latino_prison_pop + native_prison_pop + other_prison_pop,
         poc.Pop = asian_pop_15to64 + black_pop_15to64 + latino_pop_15to64 +
           native_pop_15to64 + other_pop_15to64,
         white.Inc.Pop = white_jail_pop + white_prison_pop,
         white.Pop = white_pop_15to64)
# select only relevant columns
dat_final <- dat_cali_sum %>%
  select(year, state, county_name, poc.Inc.Pop,
         poc.Pop, white.Inc.Pop, white.Pop)
# view problematic, broken data
View(dat_final)
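
For reference, here is a minimal sketch of the behaviour I am seeing and of one possible workaround. It uses a small toy data frame instead of the real file; the values are invented and only three of the population columns are included, so this is an illustration under those assumptions and not a tested fix for the full dataset. It relies on two things: `+` returns NA as soon as any operand is NA, and `sum(x, na.rm = TRUE)` collapses a whole column to a single number instead of working row by row. Base R's `rowSums()` adds across columns within each row and accepts `na.rm`.

```r
library(dplyr)

# Toy data standing in for the real columns; the values are invented.
toy <- data.frame(
  asian_pop_15to64  = c(10, 20, NA),
  black_pop_15to64  = c(5, NA, NA),
  latino_pop_15to64 = c(1, 2, NA)
)

toy_sum <- toy %>%
  mutate(
    # `+` propagates NA: one missing input makes the whole row NA
    plus_sum = asian_pop_15to64 + black_pop_15to64 + latino_pop_15to64,
    # rowSums() sums across columns within each row and can skip NAs
    poc.Pop = rowSums(cbind(asian_pop_15to64, black_pop_15to64,
                            latino_pop_15to64), na.rm = TRUE)
  )

print(toy_sum)
```

One caveat with this approach: a row in which every input is NA comes out as 0 and not NA, which may or may not be what is wanted for counties with no population data at all.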