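I need a regular expression that extracts each paragraph and stores it as a string, so that I can do further processing on a text buffer containing many similar paragraphs.
Example: say the text buffer looks like this: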
=== Jun 11 14:05:39 - Person Details ===
Person Name = "Hurlman"
Person Address = "2nd Street Benjamin Blvd NJ"
Persion Age = 25
=== Jun 11 14:05:39 - Person Details ===
Person Name = "Greg"
Person Address = "3rd Street Benjamin Blvd NJ"
Persion Age = 26
=== Jun 11 14:05:42 - Person Details ===
Person Name = "Michel"
Person Address = "4th Street Benjamin Blvd NJ"
Persion Age = 27
I need to iterate over all the paragraphs and store each one, so that I can then look up specific person details inside it.
Each paragraph I extract should have the format below:
=== Jun 11 14:05:42 - Person Details ===
Person Name = "Michel"
Person Address = "4th Street Benjamin Blvd NJ"
Persion Age = 27
Any help is greatly appreciated!
Answer 0 (score: 1)
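You can use this pattern: (===.*===[\s\S]*?)(?====|$)
It captures a "=== ... ===" header line plus everything after it, lazily, up to the next "===" or the end of the input.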
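As a rough illustration of how the pattern (===.*===[\s\S]*?)(?====|$) could be applied, here is a minimal Java sketch. The question does not name a language, so Java is an assumption, and the class name RecordBlockExtractor and the method extractBlocks are hypothetical names chosen for this example. Each match is trimmed because the lazy part of the pattern also picks up the line break before the next header.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class RecordBlockExtractor {
    // Pattern from the answer: a "=== ... ===" header line, then everything
    // (lazily) up to the next "===" or the end of the input.
    private static final Pattern BLOCK =
            Pattern.compile("(===.*===[\\s\\S]*?)(?====|$)");

    // Returns each block as one string, with surrounding whitespace trimmed.
    static List<String> extractBlocks(String buffer) {
        List<String> blocks = new ArrayList<>();
        Matcher matcher = BLOCK.matcher(buffer);
        while (matcher.find()) {
            blocks.add(matcher.group(1).trim());
        }
        return blocks;
    }

    public static void main(String[] args) {
        String buffer = String.join("\n",
                "=== Jun 11 14:05:39 - Person Details ===",
                "Person Name = \"Hurlman\"",
                "Person Address = \"2nd Street Benjamin Blvd NJ\"",
                "Persion Age = 25",
                "=== Jun 11 14:05:42 - Person Details ===",
                "Person Name = \"Michel\"",
                "Person Address = \"4th Street Benjamin Blvd NJ\"",
                "Persion Age = 27");
        // Print every extracted block, separated by a marker line.
        for (String block : extractBlocks(buffer)) {
            System.out.println(block);
            System.out.println("-----");
        }
    }
}
```

Note that the pattern treats any "===" as the start of a new block, so it would split a record early if a value ever contained that sequence.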
Answer 1 (score: 0)
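It is possible to solve this with a regular expression, but it will most likely give you a poor solution (inefficient, hard to understand, hard to maintain, and so on).
What you have is an informal record structure represented as lines of text. (It is not natural-language text, so describing it in terms of "paragraphs" does not make sense.)
The way to handle it is to read one line at a time and then use Scanner (or an equivalent) to parse each line into a name-value pair. You only need some simple logic to detect record boundaries and/or to check that they occur at the right place in the input stream.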
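The line-by-line approach described above could look roughly like the following Java sketch. It is only one possible reading of the answer: the class name PersonRecordReader, the method readRecords, and the choice of a Map per record are hypothetical, and the sketch assumes that every record starts with a line beginning with "===" and that every other non-empty line has the form name = value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;

// Hypothetical helper class, not part of the original answer.
class PersonRecordReader {
    // Reads the buffer one line at a time. A line starting with "===" opens a
    // new record; each following "name = value" line is added to that record.
    static List<Map<String, String>> readRecords(String buffer) {
        List<Map<String, String>> records = new ArrayList<>();
        Map<String, String> current = null;
        try (Scanner scanner = new Scanner(buffer)) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine().trim();
                if (line.startsWith("===")) {
                    // Record boundary: start collecting a new record.
                    current = new LinkedHashMap<>();
                    records.add(current);
                    continue;
                }
                int eq = line.indexOf('=');
                if (current == null || eq < 0) {
                    continue; // line without '=', or data before the first header
                }
                String name = line.substring(0, eq).trim();
                String value = line.substring(eq + 1).trim();
                // Strip the surrounding double quotes, if present.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                current.put(name, value);
            }
        }
        return records;
    }

    public static void main(String[] args) {
        String buffer = String.join("\n",
                "=== Jun 11 14:05:39 - Person Details ===",
                "Person Name = \"Hurlman\"",
                "Person Address = \"2nd Street Benjamin Blvd NJ\"",
                "Persion Age = 25",
                "=== Jun 11 14:05:42 - Person Details ===",
                "Person Name = \"Michel\"",
                "Person Address = \"4th Street Benjamin Blvd NJ\"",
                "Persion Age = 27");
        for (Map<String, String> record : readRecords(buffer)) {
            System.out.println(record);
        }
    }
}
```

The keys are taken verbatim from the input, so the age is stored under "Persion Age" exactly as it is spelled in the sample. This sketch silently skips lines it cannot parse; stricter checking of record boundaries, as the answer suggests, would replace those skips with error handling.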