How can I search for string patterns within another string and include a delimiter?

Time: 2016-01-08 02:24:46

Tags: r string

My data is structured as follows:

dput(head(CharacterAnalysis,5))
structure(list(Character = c("A", "a", "B", "b", "C"), 
Descriptor = c("Jog", "Change Direction", "Shuffle", "Walk", "Stop")), 
.Names = c("Character", "Descriptor"), 
row.names = c(NA, 5L), class = "data.frame")

I want to look up each Character and its associated Descriptor in the following data frame, but I am not sure how to do this:

dput(head(StringAnalysis,3))
structure(list(MovementString = c("ACb", "aAaB", "BbCa")), 
.Names = c("MovementString"), 
row.names = c(NA, 3L), class = "data.frame")

My expected result/data frame would be:

dput(head(Output,3))
structure(list(MovementString = c("ACb", "aAaB", "BbCa"), 
MovementPerformed = c("Jog/ Stop/ Walk", "Change Direction/ Jog/ Change Direction/ Shuffle", "Shuffle/ Walk/ Stop/ Change Direction")),
.Names = c("MovementString", "MovementPerformed"), 
row.names = c(NA, 3L), class = "data.frame")

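I would like a forward slash (/) or something similar to separate each Descriptor, since it marks a new movement. Any suggestions on how to accomplish this? My data frame CharacterAnalysis has more than 1 million rows, so I would prefer not to search for each MovementString separately.

Thanks.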

1 answer:

Answer 0 (score: 0)

CharacterAnalysis <- 
structure(list(Character = c("A", "a", "B", "b", "C"), 
Descriptor = c("Jog", "Change Direction", "Shuffle", "Walk", "Stop")),
.Names = c("Character", "Descriptor"), 
row.names = c(NA, 5L), class = "data.frame")

Output <-
structure(list(MovementString = c("ACb", "aAaB", "BbCa"), 
MovementPerformed = c("Jog/ Stop/ Walk", "Change Direction/ Jog/ Change Direction/ Shuffle", "Shuffle/ Walk/ Stop/ Change Direction")),
.Names = c("MovementString", "MovementPerformed"), 
row.names = c(NA, 3L), class = "data.frame")

# A simple approach based on names

# Build the lookup table just once
m <- CharacterAnalysis$Descriptor
names(m) <- CharacterAnalysis$Character

# Build the MovementPerformed column
Output$MovementPerformed <-
   sapply(strsplit(Output$MovementString,""), 
          FUN = function(x) paste(m[x], collapse = "/ "))
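
The answer above overwrites a column in Output, which already contains the expected result. The same named-vector lookup can presumably be applied directly to the StringAnalysis data frame from the question. The following is a minimal sketch under stated assumptions, not a definitive implementation: it assumes every character in MovementString appears in CharacterAnalysis$Character (an unmatched character would show up as "NA" in the pasted result), the name `lookup` is chosen here only for illustration, and `vapply` is swapped in for `sapply` so the result is always a character vector.

```r
# Sample data taken from the question
CharacterAnalysis <- data.frame(
  Character  = c("A", "a", "B", "b", "C"),
  Descriptor = c("Jog", "Change Direction", "Shuffle", "Walk", "Stop"),
  stringsAsFactors = FALSE
)
StringAnalysis <- data.frame(
  MovementString = c("ACb", "aAaB", "BbCa"),
  stringsAsFactors = FALSE
)

# Named character vector used as a lookup table (hypothetical name):
# names are the single-character codes, values are the descriptors.
# Indexing by name is case-sensitive, so "A" and "a" stay distinct.
lookup <- setNames(CharacterAnalysis$Descriptor, CharacterAnalysis$Character)

# Split each string into single characters, translate each character
# through the lookup, and rejoin the descriptors with "/ "
StringAnalysis$MovementPerformed <- vapply(
  strsplit(StringAnalysis$MovementString, ""),
  function(chars) paste(lookup[chars], collapse = "/ "),
  character(1)
)

print(StringAnalysis)
```

If many rows share the same MovementString, it may be worth translating only the unique values and matching them back; that is a suggestion that has not been measured on data of the size mentioned in the question.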