Count combinations across columns, ignoring order (R)

Date: 2018-07-19 18:31:11

Tags: r combinations

dat <- data.frame(A = c("r","t","y","g","r"), B = c("g","r","r","t","y"), C = c("t","g","t","r","t"))

  A B C
1 r g t
2 t r g
3 y r t
4 g t r
5 r y t

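I want to list the sets of characters that appear together across the three columns of a row, ignoring their order. For example: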

Combinations  Freq
r t g         3
y t r         2

What if I also want to add frequency counts for a nominal variable (for example, gender)?

For example:

dat <- data.frame(A = c("r","t","y","g","r"), B = c("g","r","r","t","y"), C = c("t","g","t","r","t"),Gender = c("male", "female", "female", "male", "male"))

dat

  A B C Gender
1 r g t   male
2 t r g female
3 y r t female
4 g t r   male
5 r y t   male

To get this:

Combinations  Freq  Male  Female
r t g         3     2     1
y t r         2     1     1

2 answers:

Answer 0 (score: 4)

You can do...

data.frame(table(combo = sapply(split(as.matrix(dat), row(dat)), 
  function(x) paste(sort(x), collapse=" "))))

  combo Freq
1 g r t    3
2 r t y    2

For readability, I'd suggest splitting this over multiple lines and/or using magrittr:

d = as.matrix(dat)
library(magrittr)

d %>% split(., row(.)) %>% sapply(
  . %>% sort %>% paste(collapse = " ")
) %>% table(combo = .) %>% data.frame

  combo Freq
1 g r t    3
2 r t y    2
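The same sort-then-tabulate idea can also be sketched in base R without magrittr. This is only a minimal sketch, not the answerer's code: sorting each row gives every ordering of the same letters one canonical key, and `table()` then counts the keys. The names `combo` and `res` and the explicit `stringsAsFactors = FALSE` are assumptions added for illustration.

```r
# Example data from the question; stringsAsFactors = FALSE keeps plain character columns
dat <- data.frame(A = c("r","t","y","g","r"),
                  B = c("g","r","r","t","y"),
                  C = c("t","g","t","r","t"),
                  stringsAsFactors = FALSE)

# Sort the values within each row so that every ordering of the
# same letters collapses to one key, e.g. "r g t" and "t r g" -> "g r t"
combo <- apply(as.matrix(dat), 1, function(x) paste(sort(x), collapse = " "))

# Count how often each key occurs
res <- as.data.frame(table(combo = combo), stringsAsFactors = FALSE)
res
```

Because the key is built from the sorted row, the labels come out in sorted order ("g r t") rather than in the order written in the question ("r t g").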

Regarding the edit / new question, I would take a somewhat different approach, maybe something like...

# new example data
dat <- data.frame(A = c("r","t","y","g","r"), B = c("g","r","r","t","y"), C = c("t","g","t","r","t"),Gender = c("male", "female", "female", "male", "male"))

library(data.table)
setDT(dat)

dat[, combo := sapply(transpose(.SD), 
  . %>% sort %>% paste(collapse = " ")), .SDcols=A:C]

dat[, c(
  n = .N, 
  Gender %>% factor(levels=c("male", "female")) %>% table %>% as.list
), by=combo]

   combo n male female
1: g r t 3    2      1
2: r t y 2    1      1
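For comparison, the per-gender breakdown can be sketched in base R as a two-way `table()` of the combination key against gender. This is a hedged alternative to the data.table code above, not a replacement for it. The names `combo`, `gender`, `tab`, and `res` are illustrative assumptions.

```r
# Example data from the question, including the Gender column
dat <- data.frame(A = c("r","t","y","g","r"),
                  B = c("g","r","r","t","y"),
                  C = c("t","g","t","r","t"),
                  Gender = c("male", "female", "female", "male", "male"),
                  stringsAsFactors = FALSE)

# Build the order-independent key from columns A, B and C only
combo <- apply(as.matrix(dat[, c("A", "B", "C")]), 1,
               function(x) paste(sort(x), collapse = " "))

# Fix the level order so the columns come out as male, female
gender <- factor(dat$Gender, levels = c("male", "female"))

# Cross-tabulate key by gender; row sums give the overall frequency
tab <- table(combo = combo, gender = gender)
res <- data.frame(combo  = rownames(tab),
                  n      = as.vector(rowSums(tab)),
                  male   = as.vector(tab[, "male"]),
                  female = as.vector(tab[, "female"]),
                  stringsAsFactors = FALSE)
res
```

Setting the factor levels explicitly means a gender that does not occur for some combination still gets a column with a zero count.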

Answer 1 (score: 1)
