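I have data in a structured list that I want to combine into one table, splitting V3 into multiple columns. I found approaches that split the column on ",", but they also break up the title.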
final <- structure(list(`1` = structure(list(V3 = structure(1L, .Label = "Some text one, 20:15 GMT, 16 April 2010, 341 words, (E)(D B)",
class = "factor")), .Names = "V3", row.names = c(NA, -1L), class = "data.frame"),
`2` = structure(list(V3 = structure(c(1L, 2L, 3L), .Label = c("Some text two, 18:50 GMT, 25 June 2010, 681 words, (E)(D M)",
"Some text three, 20:00 GMT, 25 June 2010, 628 words, (E)(D B)",
"Some text four, 18:50 GMT, 25 June 2010, 677 words, (E)(D MN)"),
class = "factor")), .Names = "V3", row.names = c(NA, -3L), class = "data.frame")), .Names = c("1", "2"))
The desired result looks like this:
List Title Words_count Source
1 Some text one, 20:15 GMT, 16 April 2010, 341 words (E)(D B)
2 Some text two, 18:50 GMT, 25 June 2010, 681 words (E)(D M)
2 Some text three, 20:00 GMT, 25 June 2010, 628 words (E)(D B)
2 Some text four, 18:50 GMT, 25 June 2010, 677 words (E)(D MN)
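As a minimal illustration of the problem described above (a sketch on a single string copied from the data, not code from the original post), splitting on every ", " returns five pieces, so the title ends up scattered across the first three:

```r
# One record in the same shape as the V3 values
x <- "Some text one, 20:15 GMT, 16 April 2010, 341 words, (E)(D B)"

# Splitting on every ", " cuts the title into three separate pieces
pieces <- strsplit(x, ", ", fixed = TRUE)[[1]]
pieces
length(pieces)
```

Only the last two pieces (the word count and the source) are fields in their own right, so the split has to be anchored on the word count rather than on each comma.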
Answer 0 (score: 1)
You can try this solution using base R:
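result_list <- lapply(names(final), function(x) {
  # V3 is a factor, so convert it to character before string processing
  strings_use <- as.character(final[[x]]$V3)
  # Extract the word count from each string, e.g. "341 words"
  wordcount <- regmatches(strings_use, regexpr("[0-9]{1,} words", strings_use))
  # Split around the word count: title on the left, source on the right
  split_list <- strsplit(strings_use, paste(paste0(" ", wordcount, ", "), collapse = "|"))
  split_mat <- do.call("rbind", split_list)
  # Prepend the list name and append the word count
  split_mat <- cbind(rep(x, nrow(split_mat)), split_mat, wordcount)
  # Reorder to List, Title, Words_count, Source; drop = FALSE keeps one-row results as a matrix
  split_mat[, c(1, 2, 4, 3), drop = FALSE]
})
result_mat <- as.data.frame(do.call("rbind", result_list),
                            stringsAsFactors = FALSE)
names(result_mat) <- c("List", "Title", "Words_count", "Source")
result_mat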
#   List                                     Title Words_count    Source
# 1    1  Some text one, 20:15 GMT, 16 April 2010,   341 words  (E)(D B)
# 2    2   Some text two, 18:50 GMT, 25 June 2010,   681 words  (E)(D M)
# 3    2 Some text three, 20:00 GMT, 25 June 2010,   628 words  (E)(D B)
# 4    2  Some text four, 18:50 GMT, 25 June 2010,   677 words (E)(D MN)
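A different way to get the same three fields, offered only as a sketch and not as the answer's method, is to capture them with one regular expression through `regexec` and `regmatches`. The helper name `parse_v3` and the stand-in list `example_list` are hypothetical, and the sketch assumes every string contains exactly one "<number> words" field followed by a comma.

```r
# Hypothetical helper (not from the original answer): parse a character
# vector of V3 strings into title, word count and source columns.
parse_v3 <- function(strings) {
  # Group 1: title (everything before the word count)
  # Group 2: the word count itself, e.g. "341 words"
  # Group 3: the source (everything after the word count)
  pattern <- "^(.*), ([0-9]+ words), (.*)$"
  matches <- regmatches(strings, regexec(pattern, strings))
  # Each match is c(full_match, group1, group2, group3); drop the full match.
  # Assumption: every string matches the pattern.
  parts <- do.call(rbind, lapply(matches, function(m) m[-1]))
  data.frame(Title = parts[, 1], Words_count = parts[, 2], Source = parts[, 3],
             stringsAsFactors = FALSE)
}

# Small stand-in for the `final` list from the question
example_list <- list(
  `1` = data.frame(V3 = "Some text one, 20:15 GMT, 16 April 2010, 341 words, (E)(D B)"),
  `2` = data.frame(V3 = c("Some text two, 18:50 GMT, 25 June 2010, 681 words, (E)(D M)",
                          "Some text three, 20:00 GMT, 25 June 2010, 628 words, (E)(D B)"))
)

# Parse each element and tag its rows with the element name
parsed <- do.call(rbind, lapply(names(example_list), function(nm) {
  data.frame(List = nm, parse_v3(as.character(example_list[[nm]]$V3)),
             stringsAsFactors = FALSE)
}))
rownames(parsed) <- NULL
parsed
```

One difference in behavior: because the comma before the word count is part of the pattern rather than of a capture group, `Title` here has no trailing comma, whereas the `strsplit` approach leaves one at the end of each title.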