Creating a list from a pattern in a character vector

Time: 2019-07-10 11:15:36

Tags: r

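I am trying to build a list that holds location information. Right now I have a character vector of strings with the structure location, info, info, info, location2, info, info, info. I want a list in which each element looks like location1: info, info, info, and so on.

I tried writing a loop that identifies the locations in the data, but I cannot work out how to attach the info entries to their location dynamically (the number of locations and the number of info entries both vary, so I need a dynamic solution).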

list_of_locations = list()
locations = c("location1","location2")
original_vector = c("location1","July 123","August 345", "September 678", "location2","July 123","August 345")

for (word in original_vector){
  if(word %in% locations){
    list_of_locations[[word]] = word 
  } else {
    list_of_locations[[word]] = word
  }
}

The list I am looking for:

1: location1, July 123, August 345, September 678
2: location2, July 123, August 345...

1 answer:

Answer 0 (score: 1)

This is not a useful data format, but here you go:

split(original_vector, 
  cumsum(
    grepl("location", original_vector, fixed = TRUE) #search for the word "location"
  )
)
#$`1`
#[1] "location1"     "July 123"      "August 345"    "September 678"
#
#$`2`
#[1] "location2"  "July 123"   "August 345"
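
To see why this works, it can help to look at the intermediate values. The following is a minimal sketch using the vector from the question; the names is_location, group and result are made up here for illustration.

```r
original_vector <- c("location1", "July 123", "August 345", "September 678",
                     "location2", "July 123", "August 345")

# TRUE wherever an element contains the literal text "location"
is_location <- grepl("location", original_vector, fixed = TRUE)

# The running sum goes up by 1 at every location, so a location and the
# entries that follow it end up with the same group id
group <- cumsum(is_location)
print(group)
# [1] 1 1 1 1 2 2 2

# split() then collects the elements that share a group id
result <- split(original_vector, group)
```

Because the grouping depends only on where the locations appear, the number of info entries per location is free to vary.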

Alternatively (if you have a vector of locations; thanks to @Ronak):

split(original_vector, cumsum(original_vector %in% locations))
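
If what you want is closer to "location1: info, info, info", one possible follow-up is to use the first element of each group as its name. This is only a sketch: it assumes the vector starts with a location, so that every group begins with one, and the names groups, keys, values and named are invented for this example.

```r
locations <- c("location1", "location2")
original_vector <- c("location1", "July 123", "August 345", "September 678",
                     "location2", "July 123", "August 345")

groups <- split(original_vector, cumsum(original_vector %in% locations))

# First element of each group is the location, the rest are its info entries
keys   <- vapply(groups, function(g) g[1], character(1))
values <- lapply(groups, function(g) g[-1])

# Name each set of info entries after its location
named <- setNames(values, unname(keys))
named$location2
# [1] "July 123"   "August 345"
```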

If your data really is in the format described (1 location followed by 3 info entries), I would convert original_vector to a matrix:

original_vector = c("location1","July 123","August 345", "September 678", "location2","July 123","August 345", "September 678")
t(matrix(original_vector, 4))
#     [,1]        [,2]       [,3]         [,4]           
#[1,] "location1" "July 123" "August 345" "September 678"
#[2,] "location2" "July 123" "August 345" "September 678"

This format makes subsetting and other data processing easy.
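
As a rough illustration of the kind of subsetting meant here, assuming the regular 1 location plus 3 entries layout (the variable names m, location2_info and second_column are hypothetical):

```r
original_vector <- c("location1", "July 123", "August 345", "September 678",
                     "location2", "July 123", "August 345", "September 678")

# One row per location: column 1 is the location, columns 2 to 4 are its entries
m <- t(matrix(original_vector, 4))

# All info entries for one location
location2_info <- m[m[, 1] == "location2", -1]

# The first info entry of every location
second_column <- m[, 2]
```

This only works when every location has the same number of entries; with varying counts, the split() approach is the one that applies.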