Create a data.table column based on the string contents of another column in R

Time: 2017-06-02 20:44:56

Tags: r data.table

I am working/wrestling/cursing with data similar to the following:

library(data.table)

names <- data.table(namesID = 1:3, fullName = c("bob so", "larry po", "sam ho"))

trips <- data.table(tripsID = 1:3, tripNames= c("Mexico", "Alaska", "New Jersey"), 
                    tripMembers = c("bob so|larry po|sam ho","bob so|sam ho", "bob so|larry po")
                   ) 

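I would like to create a new table like the one below: split tripMembers and attach the correct namesID to the correct tripsID and tripNames. I assume this is a join (I have tried many joins)?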

namesTrips 
tripsID   tripNames          namesID
1         "Mexico"            1
1         "Mexico"            2
1         "Mexico"            3
2         "Alaska"            1
2         "Alaska"            3
3         "New Jersey"        1
3         "New Jersey"        2

1 answer:

Answer 0 (score: 3)

You can do it like this:

# split the tripMembers column and unnest it; then join with names on the tripMembers
namesTrips <- trips[, .(tripMembers = unlist(strsplit(tripMembers, "\\|"))), 
                      by = .(tripsID, tripNames)][names, on = .(tripMembers = fullName)]

namesTrips[, tripMembers := NULL][order(tripsID)]

#   tripsID  tripNames namesID
#1:       1     Mexico       1
#2:       1     Mexico       2
#3:       1     Mexico       3
#4:       2     Alaska       1
#5:       2     Alaska       3
#6:       3 New Jersey       1
#7:       3 New Jersey       2
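
The same idea can be sketched as two explicit steps, which makes the intermediate "long" table visible. This is only a sketch under some assumptions that are not in the answer above: the intermediate name `long` is hypothetical, `fixed = TRUE` is used so that "|" is matched literally instead of being escaped as a regex, and `nomatch = NULL` (available in recent data.table versions) drops any member that has no entry in names rather than leaving an NA row.

```r
library(data.table)

names <- data.table(namesID = 1:3, fullName = c("bob so", "larry po", "sam ho"))

trips <- data.table(tripsID = 1:3,
                    tripNames = c("Mexico", "Alaska", "New Jersey"),
                    tripMembers = c("bob so|larry po|sam ho", "bob so|sam ho", "bob so|larry po"))

# Step 1: one row per (trip, member). fixed = TRUE treats "|" as a literal character.
long <- trips[, .(fullName = unlist(strsplit(tripMembers, "|", fixed = TRUE))),
              by = .(tripsID, tripNames)]

# Step 2: look up namesID for every row of `long`.
# nomatch = NULL (an assumption here) drops members that are missing from `names`.
namesTrips <- names[long, on = "fullName", nomatch = NULL][, .(tripsID, tripNames, namesID)]

# Sort in place so the stored table, not just the printed one, is ordered.
setorder(namesTrips, tripsID, namesID)

print(namesTrips)
```

Unlike the last line of the answer, which only prints the ordered result, setorder() reorders namesTrips by reference, so the stored table itself ends up sorted.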