Question

I am trying to apply the wtd.mean function from Hmisc to the rows of a data frame. It normally takes two vectors: one holding the means and one holding the weights for those means. I tried to find a dplyr / tidyr / purrr solution but could not figure it out.

Answer 1

You can do it like this:

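library(dplyr); library(Hmisc)  # Hmisc provides wtd.mean
# As indexed below, columns 1-20 are taken as the weights and 21-40 as the means
df %>% 
  mutate(weighted.means = apply(df, 1, function(x) wtd.mean(x = as.numeric(x[21:40]), 
                                                            weights = as.numeric(x[1:20]))))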

Or use this (long...) tidyverse solution:

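library(tidyverse); library(Hmisc)
# Relies on column names made of a prefix plus a number (weight1, mean1, ...)
df %>% 
  rownames_to_column("group") %>%              # keep the row id as a column
  gather(name, value, -group) %>%              # wide -> long
  extract(name, into = c("weight_mean", "number"), regex = "([[:alpha:]]+)(\\d+)") %>% 
  spread(weight_mean, value) %>%               # one column each for mean and weight
  group_by(group = as.numeric(group)) %>% 
  summarise(weighted.means = wtd.mean(x = mean, weights = weight))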

# A tibble: 10 x 2
#    group weighted.means
#    <dbl>          <dbl>
#  1 1               70.7
#  2 2               82.9
#  3 3               82.4
#  4 4               73.4
#  5 5               70.0
#  6 6               74.1
#  7 7               73.6
#  8 8               77.1
#  9 9               72.6
# 10 10              84.7
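
A side note, in case Hmisc is not a hard requirement: with no missing values, a weighted mean is just sum(w * x) / sum(w), so the same per-row result can be sketched in vectorised base R. The toy data frame below and its column names are made up for illustration (three weight columns followed by three mean columns); the real df is assumed to have 20 of each in that order, so the indices would need adjusting.

```r
# Hypothetical toy data: 3 weight columns followed by 3 mean columns.
# (Assumption: the real df has the same layout with 20 columns of each.)
toy <- data.frame(
  weight1 = c(1, 2),   weight2 = c(1, 1),   weight3 = c(2, 1),
  mean1   = c(10, 40), mean2   = c(20, 50), mean3   = c(30, 60)
)

w <- as.matrix(toy[, 1:3])  # weights
m <- as.matrix(toy[, 4:6])  # means

# Row-wise weighted mean: sum(w * x) / sum(w) for every row at once
toy$weighted.means <- unname(rowSums(w * m) / rowSums(w))
toy$weighted.means
# 22.5 47.5
```

This avoids calling a function once per row, which tends to matter on large data frames, but it does not handle NA values the way wtd.mean does.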

