Creating frequency data of interactions between variables using R

Date: 2019-04-02 21:47:57

Tags: r dplyr data.table tidyverse zoo

I would like to calculate a frequency table of dyadic interactions using R. I need to count the number of interactions between animals for each month, and then in total. A sample of the data is provided below:

#Create sample data
B1 <- data.frame(
  Animal   = rep(c("A","B","C","D","E"), times = 5),
  Location = c(1,1,2,1,3,4,2,1,1,3,3,4,3,1,1,4,2,2,2,1,1,3,4,3,2),
  Month    = rep(c("Jan","Feb","Mar","Apr","May"), each = 5),
  stringsAsFactors = FALSE  #keep Animal and Month as character on R < 4.0
)

Using this data, I would like to show, in pairs, the animals present at a location in each month. For January, for example, the expected result would list the pairs of animals that share a location.

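After extracting each month, I would like to calculate the total number of interactions between each pair; for example, perhaps A and D interact 3 times in total.

What is the best way to do this?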

1 answer:

Answer 0: (score: 1)

Using data.table, you could do something like the following:

library(data.table)

#convert into data.table
setDT(B1)

#create interaction between animals in the same location & month    
ans <- B1[, if (.N > 1L) transpose(combn(unique(Animal), 2L, simplify=FALSE)), 
    by=.(Location, Month)]

#change column names to desired column names
setnames(ans, paste0("V", 1L:2L), paste0("Animal", 1L:2L))

#sort animals so that A, B and B, A are the same
ans[, paste0("Animal", 1L:2L) := .(pmin(Animal1, Animal2), pmax(Animal1, Animal2))]

#count the number of interactions as requested
ans[, .(NumInteract=.N), by=c(paste0("Animal", 1L:2L))]

Output:

   Animal1 Animal2 NumInteract
1:       A       B           1
2:       A       D           1
3:       B       D           3
4:       C       D           2
5:       A       C           1
6:       D       E           1
7:       B       C           1
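
The question also asks for counts per month, which the output above does not break out. The following is a minimal sketch of one way to get both tables, assuming the sample data from the question and the data.table package. The names pair_dt, monthly and total are illustrative and do not come from the original answer.

```r
library(data.table)

# Sample data from the question, built directly as a data.table
B1 <- data.table(
  Animal   = rep(c("A", "B", "C", "D", "E"), times = 5),
  Location = c(1,1,2,1,3, 4,2,1,1,3, 3,4,3,1,1, 4,2,2,2,1, 1,3,4,3,2),
  Month    = rep(c("Jan", "Feb", "Mar", "Apr", "May"), each = 5)
)

# One row per unordered pair of animals sharing a location in a month.
# Sorting before combn() guarantees Animal1 < Animal2, so (A, B) and
# (B, A) can never appear as two different pairs.
pair_dt <- B1[, if (uniqueN(Animal) > 1L) {
  p <- combn(sort(unique(Animal)), 2L)   # 2 x k character matrix
  .(Animal1 = p[1L, ], Animal2 = p[2L, ])
}, by = .(Month, Location)]

# Interactions per pair within each month
monthly <- pair_dt[, .(NumInteract = .N), by = .(Month, Animal1, Animal2)]

# Interactions per pair over all months
total <- pair_dt[, .(NumInteract = .N), by = .(Animal1, Animal2)]

print(monthly[Month == "Jan"])
print(total)
```

With this sample data each animal appears at only one location per month, so every row of monthly has a count of 1; the totals agree with the output above (for instance, B and D interact 3 times).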