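I want to loop over a data frame and pass its rows as arguments to a function, in order to summarise totals from a data frame named df3.
I tried writing this with a traditional for loop, but got no result.
I looked at pmap in https://adv-r.hadley.nz/functionals.html#pmap
but I can't see how to apply that example to my code.
Here is some of the original data: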
dput(head(df3,n=3))
structure(list(id = c("81", "83", "85"), look_work = c("yes",
"yes", "yes"), current_work = c("no", "yes", "no"), hf_l5k = c("",
"", ""), ac_l5k = c("", "", ""), hf_5_10k = c("", "1", "1"),
ac_5_10k = c("", "1", "1"), hf_11_20k = c("", "", ""), ac_11_20k = c("",
"", ""), hf_21_50k = c("", "", ""), ac_21_50k = c("", "",
""), hf_51_100k = c("", "", ""), ac_51_100k = c("", "", ""
), hf_m100k = c("", "", ""), ac_m100k = c("", "", ""), s_l1000 = c("",
"", ""), se_l1000 = c("", "", "1"), s_1001_1500 = c("", "1",
"1"), se_1001_1500 = c("", "", ""), s_2001_3000 = c("", "",
""), se_2001_3000 = c("", "1", ""), s_3001_4000 = c("", "",
""), se_3001_4000 = c("", "", ""), s_4001_5000 = c("", "",
""), se_4001_5000 = c("", "", ""), s_5001_6000 = c("", "",
""), se_5001_6000 = c("", "", ""), s_m6000 = c("", "", ""
), se_m6000 = c("", "", ""), s_n_ans = c("", "", ""), se_n_ans = c("",
"", ""), before_work = c("no", "NULL", "yes"), keen_move = c("yes",
"yes", "no"), city_size = c("village", "more than 500k inhabitants",
"more than 500k inhabitants"), gender = c("male", "female",
"female"), age = c("18 - 24 years", "18 - 24 years", "more than 50 years"
), education = c("secondary", "vocational", "secondary")), row.names = c(NA,
3L), class = "data.frame")
Here is the data frame of parameters, hf_names:
structure(list(hf_names = c("hf_l5k", "hf_5_10k", "hf_11_20k",
"hf_21_50k", "hf_51_100k", "hf_m100k"), job = c("hf_l5k_job",
"hf_5_10k_job", "hf_11_20k_job", "hf_21_50k_job", "hf_51_100k_job",
"hf_m100k_job"), tot = c("hf_l5k_tot", "hf_5_10k_tot", "hf_11_20k_tot",
"hf_21_50k_tot", "hf_51_100k_tot", "hf_m100k_tot")), class = "data.frame", row.names = c(NA,
-6L))
This is the code I tried, using a traditional for loop:
library(dplyr)
tot_function <- function(df, filter_tot, col_name1, col_name2) {
# filter desired columns for all jobs
filter_tot <- df %>% filter(col_name1=="1") %>%
summarise(col_name2 = n())
}
for (i in seq_along(hf_names3)) {
tot_function(df3, hf_names3$tot[i], hf_names3$hf_names[i], hf_names3$job[i])
}
The expected result would be a data frame or a vector:
hf_l5k_jobs hf_l5_10k_jobs
10 193
However, this code produces no output, and the pmap material I read only deals with simple functions such as trim and runif.
Answer 0 (score: 0)
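I don't think you need to overcomplicate this. You can take the column names from hf_names, subset each of those columns from df3, and count the number of 1s in that column.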
sapply(hf_names$hf_names, function(x) sum(df3[[x]] == 1))
# hf_l5k hf_5_10k hf_11_20k hf_21_50k hf_51_100k hf_m100k
# 0 2 0 0 0 0
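If the result should carry the names from the job column, as in the expected output, one possible sketch is to relabel the vector after counting. The small df3 and hf_names below are cut-down copies of the question's data, used only for illustration, and result is a name I made up.

```r
# Cut-down copies of the question's data (illustration only)
df3 <- data.frame(
  hf_l5k   = c("", "", ""),
  hf_5_10k = c("", "1", "1"),
  stringsAsFactors = FALSE
)
hf_names <- data.frame(
  hf_names = c("hf_l5k", "hf_5_10k"),
  job      = c("hf_l5k_job", "hf_5_10k_job"),
  stringsAsFactors = FALSE
)

# Count the "1" entries in each listed column
counts <- sapply(hf_names$hf_names, function(x) sum(df3[[x]] == "1"))

# Relabel with the job names, then reshape into a one-row data frame
names(counts) <- hf_names$job
result <- as.data.frame(t(counts))
result
```

With this sample data, result has the columns hf_l5k_job and hf_5_10k_job holding 0 and 2.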
If you prefer the tidyverse, you can replace sapply with one of the map_* variants:
purrr::map_int(hf_names$hf_names, ~sum(df3[[.]] == 1))
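As for why the original loop appears to produce nothing: as far as I can tell, filter(col_name1 == "1") compares the string stored in col_name1 with "1" instead of looking up the column, summarise(col_name2 = n()) names the new column literally col_name2, and the for loop discards each returned value. If you want to keep the function, the following is a minimal sketch, assuming a dplyr version that supports the .data pronoun and := in summarise. The sample data is a cut-down copy of the question's, and pieces and result are hypothetical names.

```r
library(dplyr)

# Cut-down copies of the question's data (illustration only)
df3 <- data.frame(
  hf_l5k   = c("", "", ""),
  hf_5_10k = c("", "1", "1"),
  stringsAsFactors = FALSE
)
hf_names <- data.frame(
  hf_names = c("hf_l5k", "hf_5_10k"),
  job      = c("hf_l5k_job", "hf_5_10k_job"),
  stringsAsFactors = FALSE
)

tot_function <- function(df, col_name1, col_name2) {
  df %>%
    # .data[[...]] looks up the column whose name is stored in col_name1
    filter(.data[[col_name1]] == "1") %>%
    # !!col_name2 := ... uses the string in col_name2 as the output column name
    summarise(!!col_name2 := n())
}

# Iterate over the rows of hf_names and keep every result
pieces <- vector("list", nrow(hf_names))
for (i in seq_len(nrow(hf_names))) {
  pieces[[i]] <- tot_function(df3, hf_names$hf_names[i], hf_names$job[i])
}

# Combine the one-column results side by side
result <- do.call(cbind, pieces)
result
```

Note that the loop runs over seq_len(nrow(hf_names)), so it visits each row of the parameter data frame. seq_along() on a data frame would iterate over its columns instead.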