I am working with (and wrangling, and cursing at) data similar to the following:
library(data.table)

names <- data.table(namesID = 1:3, fullName = c("bob so", "larry po", "sam ho"))
trips <- data.table(tripsID = 1:3, tripNames = c("Mexico", "Alaska", "New Jersey"),
                    tripMembers = c("bob so|larry po|sam ho", "bob so|sam ho", "bob so|larry po"))
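I want to create a new table like the one below that splits up tripMembers and links the correct namesID to the correct tripsID and tripNames. I assume this is a join (I have tried a lot of joins)?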
namesTrips
tripsID tripNames namesID
1 "Mexico" 1
1 "Mexico" 2
1 "Mexico" 3
2 "Alaska" 1
2 "Alaska" 3
3 "New Jersey" 1
3 "New Jersey" 2
Answer 0 (score: 3)
You can do it like this:
# split the tripMembers column and unnest it; then join with names on the tripMembers
namesTrips <- trips[, .(tripMembers = unlist(strsplit(tripMembers, "\\|"))),
by = .(tripsID, tripNames)][names, on = .(tripMembers = fullName)]
namesTrips[, tripMembers := NULL][order(tripsID)]
# tripsID tripNames namesID
#1: 1 Mexico 1
#2: 1 Mexico 2
#3: 1 Mexico 3
#4: 2 Alaska 1
#5: 2 Alaska 3
#6: 3 New Jersey 1
#7: 3 New Jersey 2
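The same split, unnest, and join steps can also be written one at a time, which may be easier to follow. This is only a sketch under a few assumptions: the intermediate name `long` is made up for illustration, `fixed = TRUE` is used instead of escaping the pipe in a regular expression, and the lookup is done as an update join (`:=` with the `i.` prefix) rather than the join used in the answer, so the rows stay in trip order without a separate sort.

```r
library(data.table)

names <- data.table(namesID = 1:3, fullName = c("bob so", "larry po", "sam ho"))
trips <- data.table(tripsID = 1:3, tripNames = c("Mexico", "Alaska", "New Jersey"),
                    tripMembers = c("bob so|larry po|sam ho", "bob so|sam ho", "bob so|larry po"))

# Step 1: split tripMembers on the literal "|" and unnest to one row per (trip, member).
# `long` is an illustrative name, not something from the original answer.
long <- trips[, .(fullName = unlist(strsplit(tripMembers, "|", fixed = TRUE))),
              by = .(tripsID, tripNames)]

# Step 2: update join. Adds namesID to `long` by reference, matching on fullName.
# Row order of `long` is untouched; members with no match in `names` would get NA.
long[names, on = "fullName", namesID := i.namesID]

# Step 3: keep only the columns asked for in the question.
namesTrips <- long[, .(tripsID, tripNames, namesID)]
print(namesTrips)
```

If a member name might be missing from names, checking `namesTrips[is.na(namesID)]` afterwards is a quick way to spot unmatched rows.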