Question

According to the CommonMark spec, text that cannot be classified as any other block element (heading, horizontal rule, list, quote, code block) becomes a paragraph. That is, a line that does not start with # (heading), - or * (horizontal rule, unordered list), > (quote), a digit (ordered list), or a space (code block).

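So I constructed the following pattern to extract the text:

/(?:^|\n{2,})((?:[^#>\-*\d ][^\n]+)+)(?:$|\n{2,})/gm

Below is most of the text I want to test:

    The quick brown fox jumps over the lazy dog
    Lorem Ipsum
    I should match
    - I should NOT match
    Le sigh
    > Why am I matching?
    1. Nonononono!
    * Aaaagh!
    # Stahhhp!
    Hello, World!

As I understand it, the pattern I made matches text delimited by two or more newlines (or by the start or end of a line). It then captures consecutive lines of text that do not start with one of #>\-*\d or a space. The pattern almost works: it correctly matches consecutive lines and splits blocks that are separated by two newlines. The problem is that it also matches lines that start with one of #>\-*\d, which it should not. What is wrong with this pattern?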

You can test this at https://regex101.com/ by setting the flavor to JavaScript and pasting in the pattern and text above.
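The behavior can also be reproduced outside regex101. The following is a minimal sketch in JavaScript; the pattern is copied from the question, while the sample text is a hypothetical excerpt whose line breaks are assumed rather than taken from the original post.

```javascript
// Pattern copied verbatim from the question.
const pattern = /(?:^|\n{2,})((?:[^#>\-*\d ][^\n]+)+)(?:$|\n{2,})/gm;

// Hypothetical sample: lines borrowed from the question, line breaks assumed.
const sample = [
  "Lorem Ipsum",
  "I should match",
  "- I should NOT match",
  "",
  "Hello, World!",
].join("\n");

// Collect capture group 1 of every match.
const captured = [...sample.matchAll(pattern)].map((m) => m[1]);

for (const block of captured) {
  console.log(JSON.stringify(block));
}
// "Lorem Ipsum\nI should match\n- I should NOT match"
// "Hello, World!"
```

With this sample, the first captured block includes the list line, which is the unwanted match described above.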

Answer 1

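Your multiline flag is causing this problem. If you look at the regions you did not want to match, the leading-character check is applied to only one line, not to both. However, if you drop the multiline flag, you may lose other matches.

I suggest keeping the multiline option and dropping the two-or-more-newlines part.

I tried to build a pattern that meets your conditions: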
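One way to check this diagnosis is with a few small probes. The sketch below uses made-up one-line inputs (they are assumptions, not text from the question). It shows two facts about JavaScript regular expressions that seem relevant here: with the m flag, ^ matches at the start of every line, including empty ones, and a negated character class such as [^#>\-*\d ] also matches a newline.

```javascript
// Probe 1: with the m flag, ^ matches at the start of every line.
const lineStarts = [..."a\n- b\n\nc".matchAll(/^/gm)].map((m) => m.index);
console.log(lineStarts); // [ 0, 2, 6, 7 ]

// Probe 2: the negated class from the question accepts a newline,
// because "\n" is not one of the excluded characters.
const classMatchesNewline = /[^#>\-*\d ]/.test("\n");
console.log(classMatchesNewline); // true

// Probe 3: combined, the newline is consumed by the class and the
// list line that follows is swallowed by [^\n]+.
const swallowed = /^[^#>\-*\d ][^\n]+/m.exec("\n- b");
console.log(JSON.stringify(swallowed[0])); // "\n- b"
```

The third probe suggests why a line starting with - can end up inside a match: the excluded-character check lands on the newline, not on the first character of the next line.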

/^(?![#>\-*\d ]).+\n?.+/gm

I am sure this is not optimized, but I think it works :)

Edit: a refined version is

/^(?![#>\-*\d ])((?![#>\-*\d ]).+\n?)+/gm
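As a quick check of the refined pattern, here is a minimal sketch in JavaScript. The sample text is hypothetical: the lines are borrowed from the question, but the line breaks and blank lines are assumed. Trailing newlines are trimmed because the pattern can include one at the end of a match.

```javascript
// Refined pattern from the answer above.
const refined = /^(?![#>\-*\d ])((?![#>\-*\d ]).+\n?)+/gm;

// Hypothetical sample: line breaks and blank lines are assumptions.
const sample = [
  "Lorem Ipsum",
  "I should match",
  "- I should NOT match",
  "",
  "> Why am I matching?",
  "",
  "1. Nonononono!",
  "",
  "# Stahhhp!",
  "",
  "Hello, World!",
].join("\n");

// With the g flag, match() returns every full match (or null if none).
const paragraphs = (sample.match(refined) || []).map((s) => s.trimEnd());

for (const p of paragraphs) {
  console.log(JSON.stringify(p));
}
// "Lorem Ipsum\nI should match"
// "Hello, World!"
```

On this sample the list, quote, ordered-list, and heading lines are all skipped, and the two consecutive plain lines are returned as one block.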

Cheers

Answer 2

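Wiktor Stribiżew's solution is good and close, but not quite right.

His suggestion still captures headers (underlined with =), numbered lists, and code blocks.

Here is a regex that works better. You can see it in action here.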

    (?<para_all>
        (?:\n{2,}|^)                # Needs 2 or more new lines or start of string
        (?<para_prefix>[ ]{0,3})    # Possibly some leading spaces, but less than a tab worth
        (?<para_content>
            (?:
                (?!                 # some line content not starting with those exceptions
                    [>*+\-=\#]
                    |
                    \d+\.
                    |
                    \`{3,}
                )
            )
            .+
            (?!\n(?:[=-]+|\`{3,}))  # Prevents from catching line followed by header markers
            (?:\n|$)
        )+                          # Allowing multiple occurrences
    )
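The pattern above is written in extended (free-spacing) syntax, which JavaScript does not support. The following is a rough JavaScript port, offered as a sketch under several assumptions: the whitespace and comments are removed, the g and m flags are used (the answer does not state its flags), the hyphen in the first character class is escaped so that +-= is not read as a range, and the sample text is hypothetical.

```javascript
// One-line port of the answer's pattern. Assumptions: g and m flags,
// hyphen escaped as \- in the first class, backticks left unescaped.
const paragraphRe =
  /(?<para_all>(?:\n{2,}|^)(?<para_prefix>[ ]{0,3})(?<para_content>(?:(?![>*+\-=#]|\d+\.|`{3,})).+(?!\n(?:[=-]+|`{3,}))(?:\n|$))+)/gm;

// Hypothetical sample: a setext heading, a two-line paragraph,
// an ordered list item, and a final paragraph.
const sample = [
  "Title",
  "=====",
  "",
  "First line",
  "second line",
  "",
  "1. item",
  "",
  "Last paragraph",
].join("\n");

// para_all holds the whole match, including surrounding newlines, so trim it.
const paragraphs = [...sample.matchAll(paragraphRe)].map((m) =>
  m.groups.para_all.trim()
);

for (const p of paragraphs) {
  console.log(JSON.stringify(p));
}
// "First line\nsecond line"
// "Last paragraph"
```

On this sample the heading text, its = underline, and the numbered item are skipped. Note that para_content sits inside a repeated group, so it holds only the last line of each match; para_all is the group to read for the full paragraph.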

Using regex to mark consecutive lines as a paragraph in Markdown. The pattern does not seem right