Matching text across multiple rows in R

Asked: 2015-05-03 06:32:15

Tags: r

My data.frame (Networks) contains the following:

Location <- c("Farm", "Supermarket", "Farm", "Conference", 
         "Supermarket", "Supermarket")
Instructor <- c("Bob", "Bob", "Louise", "Sally", "Lee", "Jeff")
Operator <- c("Lee", "Lee", "Julie", "Louise", "Bob", "Louise")

Networks <- data.frame(Location, Instructor, Operator, stringsAsFactors=FALSE)

My question

I would like to build a new data.frame, Transactions, with a new column Transactions$Count that sums the exchanges between each Instructor and each Operator at each Location.

Expected output

Location <- c("Farm", "Supermarket", "Farm", "Conference", "Supermarket")
Person1 <- c("Bob", "Bob", "Louise", "Sally", "Jeff")
Person2 <- c("Lee", "Lee", "Julie", "Louise", "Louise")
Count <- c(1, 2, 1, 1, 1)
Transactions <- data.frame(Location, Person1, Person2, Count, 
            stringsAsFactors=FALSE) 

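For example, Bob and Lee had 2 exchanges in total at the Supermarket. It does not matter whether a person is the Instructor or the Operator; I am interested in the exchanges between them. In the expected output, the two Supermarket exchanges between Bob and Lee are counted together. Every other combination of Location and pair has a single exchange.


What I have tried
I thought grepl might be useful, but I need to iterate over 1,300 rows of this data, so it could be computationally expensive.

Thanks.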

1 answer:

Answer 0: (score: 4)

You could consider using "data.table" with pmin and pmax in the "by" argument.

Example:

Networks <- data.frame(Location, Instructor, Operator, stringsAsFactors = FALSE)
library(data.table)

as.data.table(Networks)[
  , TransCount := .N, 
  by = list(Location, 
            pmin(Instructor, Operator), 
            pmax(Instructor, Operator))][]
#       Location Instructor Operator TransCount
# 1:        Farm        Bob      Lee          1
# 2: Supermarket        Bob      Lee          2
# 3:        Farm     Louise    Julie          1
# 4:  Conference      Sally   Louise          1
# 5: Supermarket        Lee      Bob          2
# 6: Supermarket       Jeff   Louise          1
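
To see why grouping on pmin and pmax treats "Bob, Lee" and "Lee, Bob" as the same pair, here is a minimal base R sketch. The two-row vectors and the `key` variable are made up for illustration and are not part of the original answer; the ordering of names depends on the locale's collation, which is assumed to sort these plain names alphabetically.

```r
# Two exchanges between the same people, with the roles swapped
Instructor <- c("Bob", "Lee")
Operator   <- c("Lee", "Bob")

# pmin/pmax compare element-wise, so each row's names come out
# in a fixed (alphabetical) order regardless of role
first  <- pmin(Instructor, Operator)   # "Bob" "Bob"
second <- pmax(Instructor, Operator)   # "Lee" "Lee"

# An illustrative order-independent key for each row
key <- paste(first, second, sep = "|")
key[1] == key[2]                       # TRUE
```

Because both rows collapse to the same (first, second) pair, any grouping on those two values counts them together.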

Based on your update, it sounds like this might suit you better:

as.data.table(Networks)[
  # pmin/pmax are vectorized, so no row-wise "by" is needed here
  , c("Person1", "Person2") := list(
    pmin(Instructor, Operator), 
    pmax(Instructor, Operator))
][
  # count the rows for each Location and unordered pair
  , list(TransCount = .N), 
  by = .(Location, Person1, Person2)
]
#       Location Person1 Person2 TransCount
# 1:        Farm     Bob     Lee          1
# 2: Supermarket     Bob     Lee          2
# 3:        Farm   Julie  Louise          1
# 4:  Conference  Louise   Sally          1
# 5: Supermarket    Jeff  Louise          1
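
If adding a package is not an option, the same idea can probably be expressed in base R. The following is only a sketch, not part of the original answer: it swaps data.table's grouping for `aggregate`, and the intermediate `pairs` data.frame is a name chosen here for illustration. The row order of the result may differ from the data.table output above.

```r
# Rebuild the question's data so the block is self-contained
Location   <- c("Farm", "Supermarket", "Farm", "Conference",
                "Supermarket", "Supermarket")
Instructor <- c("Bob", "Bob", "Louise", "Sally", "Lee", "Jeff")
Operator   <- c("Lee", "Lee", "Julie", "Louise", "Bob", "Louise")
Networks <- data.frame(Location, Instructor, Operator,
                       stringsAsFactors = FALSE)

# Put each pair in a fixed order so roles no longer matter
pairs <- data.frame(
  Location = Networks$Location,
  Person1  = pmin(Networks$Instructor, Networks$Operator),
  Person2  = pmax(Networks$Instructor, Networks$Operator),
  Count    = 1L,
  stringsAsFactors = FALSE
)

# Sum the per-row counts within each Location/pair combination
Transactions <- aggregate(Count ~ Location + Person1 + Person2,
                          data = pairs, FUN = sum)
Transactions
```

With 1,300 rows either approach should be fast, since pmin and pmax work on whole columns at once rather than row by row.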