How do I convert a vector of semicolon-delimited lists into a presence/absence matrix?

Time: 2018-01-19 20:17:42

Tags: r

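I have a vector in which each element is a string holding a list of attributes separated by semicolons and/or commas. I want to take that vector and convert it into a presence/absence matrix with one column for each attribute in the lists.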

The approach I have taken so far is to first collect all the semicolon-delimited items in the vector, as follows:

OrientationList <- c(NULL)
for (i in levels(stroller_attributes$Orientation))
{ OrientationList <- paste(OrientationList, ",", i)}

OrientationList <- unique(gsub("^[[:space:]]|[[:space:]]$", "", unlist(strsplit(OrientationList, split=";|,"))))

This gives me a list of every attribute that appears in the vector. Next I create a new matrix with length(OrientationList) columns and nrow(stroller_attributes) rows, like so:

OrientationFactorsMatrix <- matrix(ncol=length(OrientationList), nrow=nrow(stroller_attributes))
colnames(OrientationFactorsMatrix) <- OrientationList
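
The two steps above can be condensed somewhat. The following is a minimal sketch, assuming a small made-up factor `orientation` in place of stroller_attributes$Orientation; the sample values and the names `orientation`, `pieces`, `attrs` and `m` are illustrative only and do not come from the question.

```r
# Hypothetical sample standing in for stroller_attributes$Orientation
orientation <- factor(c("", "Forward Facing",
                        "Forward Facing, Parent Facing",
                        "Lie Flat; Forward Facing"))

# Split every level on ';' or ',', trim surrounding whitespace,
# and keep the distinct non-empty attribute names
pieces <- unlist(strsplit(levels(orientation), split = ";|,"))
attrs  <- unique(trimws(pieces))
attrs  <- attrs[attrs != ""]

# Pre-allocate a logical matrix: one row per observation, one column per attribute
m <- matrix(FALSE, nrow = length(orientation), ncol = length(attrs),
            dimnames = list(NULL, attrs))
```

Starting the matrix at FALSE rather than NA means only the TRUE cells have to be filled in later, and trimws() strips all leading and trailing whitespace rather than a single character.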

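Next, I need to go through the original vector stroller_attributes$Orientation, determine which attributes each element contains, and record their presence or absence in OrientationFactorsMatrix as TRUE or FALSE. My first instinct was that something like

OrientationList %in% stroller_attributes$Orientation[16]

would automatically generate the presence/absence values for each row of the matrix (hooray!). Sadly, if the element holds two or more items in its comma/semicolon-delimited list, it returns FALSE for all of them. Essentially, I want an %in%-style check that asks "does this contain the term?" rather than "is this exactly the term?".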
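
To make the difference concrete, here is a small sketch with made-up values (`attrs`, `cell` and `items` are illustrative names, not objects from the question). %in% compares whole strings, so it only behaves like a "contains" check once the cell has been split into its individual items.

```r
attrs <- c("Forward Facing", "Parent Facing", "Lie Flat")  # illustrative attribute list
cell  <- "Forward Facing; Parent Facing"                   # one multi-item element

# Whole-string comparison: no attribute equals the full cell text
attrs %in% cell
# FALSE FALSE FALSE

# Comparing against the cell's individual items gives a "contains" check
items <- trimws(strsplit(cell, split = ";|,")[[1]])
attrs %in% items
# TRUE TRUE FALSE
```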

I'd appreciate any help. Brad

structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 4L, 4L, 4L, 4L, 4L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 12L, 
2L, 2L, 2L, 2L, 2L, 2L, 12L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 12L, 2L, 2L, 12L, 2L, 21L, 21L, 23L, 22L, 17L, 17L, 17L, 
16L, 1L, 1L, 1L, 24L, 11L, 11L, 2L, 1L, 2L, 2L, 2L, 19L, 12L, 
17L, 17L, 19L, 19L, 17L, 17L, 21L, 17L, 1L, 17L, 1L, 1L, 2L, 
9L, 2L, 2L, 2L, 1L, 1L, 25L, 25L, 25L, 25L, 25L, 25L, 1L, 1L, 
1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 25L, 
13L, 2L, 25L, 1L, 26L, 2L, 25L, 25L, 13L, 2L, 2L, 1L, 25L, 25L, 
25L, 25L, 25L, 2L, 18L, 18L, 18L, 18L, 13L, 21L, 2L, 13L, 1L, 
6L, 1L, 1L, 2L, 1L, 2L, 12L, 2L, 12L, 12L, 12L, 2L, 2L, 10L, 
10L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 12L, 
2L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 2L, 12L, 12L, 2L, 
12L, 12L, 12L, 2L, 2L, 2L, 2L, 12L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 1L, 1L, 1L, 25L, 25L, 25L, 25L, 25L, 25L, 2L, 8L, 
14L, 14L, 14L, 8L, 8L, 7L, 8L, 15L, 15L, 8L, 8L, 8L, 15L, 14L, 
8L, 2L, 5L, 5L, 5L, 2L, 2L, 24L, 24L, 13L, 13L, 13L, 13L, 20L, 
20L, 20L, 20L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("", "Forward Facing", 
"Forward Facing ", "Forward Facing, Parent Facing", "Forward Facing; Full lie flat", 
"Forward Facing; Infant Car Seat", "Forward facing; Lie flat", 
"Forward Facing; Lie Flat", "Forward Facing; Lie flat option for Infants", 
"Forward Facing; Lie Flat; 2 Children Forward-Facing; 2 Children 1x Forward Facing, 1x Lie Flat; 2 Children 1x Forward Facing, 1x Parent Facing (Infant Car Seat); 1x Parent Facing (Infant Car Seat)", 
"Forward Facing; Lie-Flat Configuration For Newborns", "Forward Facing; Parent Facing", 
"Forward Facing; Parent Facing; Lie Flat", "Forward Facing; Parent Facing; Lie Flat On Buggy; Lie Flat Off Buggy", 
"Forward Facing; Parent Facing; Recline", "Forward Facing; Rear Facing; Lie Flat", 
"Lie Flat; Forward Facing", "Lie Flat; Forward Facing; Parent Facing", 
"Lie Flat; Forward Facing; Travel System", "Lie Flat; Forward-Facing", 
"Lie Flat; Parent Facing; Forward Facing", "Lie Flat; Travel System; Forward Facing; Second Seat", 
"Lie Flat; Travel System; Forward Facing; Second Seat; Parent Facing", 
"Off Stroller Bassinet; Forward Facing; Parent Facing; Lie Flat", 
"Reversible Seat", "Travel System; Forward Facing; Second Seat; Parent Facing"
), class = "factor")

1 Answer:

Answer 0 (score: 0)

OK, as often happens, writing the question out in detail helped me figure out the answer myself. Here is the solution:

member

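The key part is that I had to take the comma/semicolon-delimited list in each element of the original vector and turn it into a vector of items with unlist. I then clean it up by removing all whitespace and converting it to lowercase. I apply the same basic operations to the contents of OrientationList, after which the %in% operator produces the output I want.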
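
The steps described above might look roughly like the following. This is only a sketch of the described approach, not the answer's original code: the sample factor `orientation`, the helper `normalise()` and the result name `presence` are all hypothetical.

```r
# Hypothetical sample standing in for stroller_attributes$Orientation
orientation <- factor(c("Forward Facing",
                        "Forward Facing, Parent Facing",
                        "Lie flat; Forward Facing",
                        ""))

# Turn strings into a vector of items: split on ';' or ',',
# strip all whitespace, and lowercase
normalise <- function(x) {
  tolower(gsub("[[:space:]]", "", unlist(strsplit(x, split = ";|,"))))
}

# Distinct normalised attributes across all levels
attrs <- unique(normalise(levels(orientation)))

# One row per observation; each row flags which attributes that observation contains
presence <- t(vapply(as.character(orientation),
                     function(x) attrs %in% normalise(x),
                     logical(length(attrs))))
dimnames(presence) <- list(NULL, attrs)
```

Because both sides go through the same normalisation, variants such as "Lie flat" and "Lie Flat" end up in the same column. Spellings that differ in other ways (for example "Forward-Facing" with a hyphen) would still be treated as separate attributes under this sketch.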