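RegEx for adding a character at specific positions

Time: 2019-05-15 16:29:11

Tags: python regex javadoc regex-group regex-greedy

I am trying to parse Javadoc comments in Python, and for that I need periods (full stops) to split on. How can I add periods at the right places in a Javadoc comment?

I want something like this. Input: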

/**
     * The addVertex method checks to see if the vertex isn't null, and then if
     * the graph does not contain the vertex, the vertex is then added and true
     * is returned
     *
     * @param vertex
     *
     * @throws NullPointerException.
     *
     * @return b
     */

Output:

/**
     * The addVertex method checks to see if the vertex isn't null, and then if
     * the graph does not contain the vertex, the vertex is then added and true
     * is returned.*here*
     *
     * @param vertex.*here*
     *
     * @throws NullPointerException.*here*
     *
     * @return b.*here*
     */

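Note: if a period, semicolon, or comma is already present, nothing needs to be added, because my program splits on these three punctuation marks.

Also, a Javadoc description can contain inline tags such as {@link ...}; no punctuation is needed before those.

It is only required before @param, @throws, and @return (and at the end).

Solution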

import re

test_str = ("/**\n"
    "     * The addVertex method checks to see if the vertex isn't null, and then if\n"
    "     * the graph does not contain the vertex, the vertex is then added and true\n"
    "     * is returned\n"
    "     *\n"
    "     * @param vertex\n"
    "     *\n"
    "     * @throws NullPointerException.\n"
    "     *\n"
    "     * @return b\n"
    "     */")

result = re.sub(r'(@param|@throws|@return)', r'.\1', test_str)
print(result)

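This adds a period at the required positions, except after the last tag, which is not a problem for splitting!

1 Answer:

Answer 0 (score: 1)

For the missing ., you can simply write an expression similar to: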

(\* @param|@return)(.*)

and replace it with $1$2. (the trailing period is part of the replacement string).

RegEx

You can modify or change the expression at regex101.com.

RegEx Circuit

You can also visualize your expression at jex.im.

JavaScript Demo

const regex = /(\* @param|@return)(.*)/gm;
const str = `/**
     * The addVertex method checks to see if the vertex isn't null, and then if
     * the graph does not contain the vertex, the vertex is then added and true
     * is returned
     *
     * @param vertex
     *
     * @throws NullPointerException.
     *
     * @return b
     */`;
const subst = `$1$2.`;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: ', result);

Python code:

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

regex = r"(\* @param|@return)(.*)"

test_str = ("/**\n"
    "     * The addVertex method checks to see if the vertex isn't null, and then if\n"
    "     * the graph does not contain the vertex, the vertex is then added and true\n"
    "     * is returned\n"
    "     *\n"
    "     * @param vertex\n"
    "     *\n"
    "     * @throws NullPointerException.\n"
    "     *\n"
    "     * @return b\n"
    "     */")

subst = "\\1\\2."

# count=0 (the default) replaces every match; pass a positive count to limit the number of replacements
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.

Output

/**
     * The addVertex method checks to see if the vertex isn't null, and then if
     * the graph does not contain the vertex, the vertex is then added and true
     * is returned
     *
     * @param vertex.
     *
     * @throws NullPointerException.
     *
     * @return b.
     */
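
Note that the expression above appends a period to every @param and @return line whether or not one is already there, and it does not look at @throws lines at all. A minimal sketch of one way to guard against existing punctuation is below; the pattern TAG_LINE and the helper punctuate_tag_lines are hypothetical names for illustration, not part of the original answer, and the sketch assumes each tag and its text sit on a single line.

```python
import re

# Hypothetical pattern (an assumption, not the answer's expression):
# a block-tag line whose last non-blank character is not '.', ';' or ','.
TAG_LINE = re.compile(
    r"^([ \t]*\*[ \t]*@(?:param|throws|return)\b.*?[^\s.;,])(?=[ \t]*$)",
    flags=re.MULTILINE,
)


def punctuate_tag_lines(javadoc):
    """Append '.' to @param/@throws/@return lines that lack end punctuation."""
    return TAG_LINE.sub(r"\1.", javadoc)


sample = (
    "/**\n"
    "     * @param vertex\n"
    "     * @throws NullPointerException.\n"
    "     * @return b\n"
    "     */"
)

print(punctuate_tag_lines(sample))
```

Because the last non-blank character is checked before anything is inserted, running the helper a second time on its own output should leave the text unchanged.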

Expression

If you want to add a . after the description, then this expression might work:

([\s\*]+@param)

Python code

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

regex = r"([\s\*]+@param)"

test_str = ("/**\n"
    "     * The addVertex method checks to see if the vertex isn't null, and then if\n"
    "     * the graph does not contain the vertex, the vertex is then added and true\n"
    "     * is returned\n"
    "     *\n"
    "     * @param vertex\n"
    "     *\n"
    "     * @throws NullPointerException.\n"
    "     *\n"
    "     * @return b\n"
    "     */")

subst = ".\\1"

# count=0 (the default) replaces every match; pass a positive count to limit the number of replacements
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.
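
The two expressions in this answer each cover part of what the question asks for. As a rough sketch only, the idea of inserting a period before a tag could be extended to all three block tags and to the closing */ with a lookahead, while skipping text that already ends in '.', ';' or ','. The names NEEDS_PERIOD and add_periods are hypothetical, and the sketch assumes block tags always start on their own line, so inline tags such as {@link ...} are never treated as split points.

```python
import re

# Hypothetical pattern (an assumption for illustration): a character that is
# not whitespace, '*', '/', or one of the three separators, and that is
# followed either by a block tag on a later line or by the closing "*/".
NEEDS_PERIOD = re.compile(
    r"([^\s*/.;,])"
    r"(?=[ \t]*\n[\s*]*@(?:param|throws|return)\b"
    r"|\s*\*/)"
)


def add_periods(javadoc):
    """Insert '.' after the last word before each block tag and before '*/'."""
    return NEEDS_PERIOD.sub(r"\1.", javadoc)


sample = (
    "/**\n"
    "     * Adds the vertex, see {@link Graph}\n"
    "     *\n"
    "     * @param vertex\n"
    "     *\n"
    "     * @throws NullPointerException.\n"
    "     *\n"
    "     * @return b\n"
    "     */"
)

print(add_periods(sample))
```

The lookahead keeps the whitespace and asterisks between the text and the next tag untouched, so only the period itself is added; a comment that consists of nothing but tags gets no period in front of its first tag because '/' and '*' are excluded from the character class.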