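My data is in the following format. It is a text file, and its class is "character". I have posted a few lines from the file below; there are about 14,000 lines in total.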
"KEY: Aback"
"SYN: Backwards, rearwards, aft, abaft, astern, behind, back."
"ANT: Onwards, forwards, ahead, before, afront, beyond, afore."
"KEY: Abandon"
"SYN: Leave, forsake, desert, renounce, cease, relinquish,"
"discontinue, castoff, resign, retire, quit, forego, forswear,"
"depart_from, vacate, surrender, abjure, repudiate."
"ANT: Pursue, prosecute, undertake, seek, court, cherish, favor,"
"protect, claim, maintain, defend, advocate, retain, support, uphold,"
"occupy, haunt, hold, assert, vindicate, keep."
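Lines 6 and 7 are a continuation of line 5, and lines 9 and 10 are a continuation of line 8. What I am struggling with is how to merge lines 6 and 7 into line 5 and, likewise, lines 9 and 10 into line 8.
Any hints are much appreciated.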
Answer 0 (score: 3)
The first thing that comes to mind is this (with your text stored as x):
# Prefix each line starter (identified by the pattern `CAPS:`) with a newline (\n), then split on the newlines
strsplit(gsub("([A-Z]+:)", "\n\\1", paste(x, collapse = " ")),
split = "\n")[[1L]][-1L]
# [1] "KEY: Aback "
# [2] "SYN: Backwards, rearwards, aft, abaft, astern, behind, back. "
# [3] "ANT: Onwards, forwards, ahead, before, afront, beyond, afore. "
# [4] "KEY: Abandon "
# [5] "SYN: Leave, forsake, desert, renounce, cease, relinquish, discontinue, castoff, resign, retire, quit, forego, forswear, depart_from, vacate, surrender, abjure, repudiate. "
# [6] "ANT: Pursue, prosecute, undertake, seek, court, cherish, favor, protect, claim, maintain, defend, advocate, retain, support, uphold, occupy, haunt, hold, assert, vindicate, keep."
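The output above keeps a trailing space on every element except the last, and the `[A-Z]+:` pattern would also split on any capitalised word followed by a colon inside an entry. As a hedged alternative, here is a minimal sketch that groups continuation lines with the record above them instead. It assumes that every record starts with one of the three tags shown in the question (KEY, SYN, ANT); the names starts, group and merged are illustrative only.

```r
# Sample input copied from the question; in practice x might come from readLines().
x <- c(
  "KEY: Aback",
  "SYN: Backwards, rearwards, aft, abaft, astern, behind, back.",
  "ANT: Onwards, forwards, ahead, before, afront, beyond, afore.",
  "KEY: Abandon",
  "SYN: Leave, forsake, desert, renounce, cease, relinquish,",
  "discontinue, castoff, resign, retire, quit, forego, forswear,",
  "depart_from, vacate, surrender, abjure, repudiate.",
  "ANT: Pursue, prosecute, undertake, seek, court, cherish, favor,",
  "protect, claim, maintain, defend, advocate, retain, support, uphold,",
  "occupy, haunt, hold, assert, vindicate, keep."
)

# TRUE where a line opens a new record (assumption: the only tags are KEY, SYN, ANT).
starts <- grepl("^(KEY|SYN|ANT):", x)

# Running count of record starts; a continuation line gets the id of the record above it.
group <- cumsum(starts)

# Paste the lines of each record together with a single space between them.
merged <- unname(vapply(split(x, group), paste, character(1), collapse = " "))

print(merged)
```

Because each line is assigned to a record before anything is pasted, no stray whitespace is introduced and no second split is needed. If the real file uses other tags, the pattern passed to grepl() would need to be adjusted.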