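Regular expression for curly braces

Time: 2017-02-16 18:33:07

Tags: regex

I want to convert Text1 into Text2. How can I write a regular expression for this? Is it even possible? The text contains nested subsections, and the new version should be comma-separated.

Text1: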

{Any
    {White-collar 
        {Exec-managerial} 
        {Prof-specialty} 
        {Sales} 
        {Adm-clerical}
    } 
    {Blue-collar 
        {Tech-support} 
        {Craft-repair} 
        {Machine-op-inspct} 
        {Handlers-cleaners} 
        {Transport-moving} 
        {Priv-house-serv}
    } 
    {Other 
        {Protective-serv} 
        {Armed-Forces} 
        {Farming-fishing} 
        {Other-service}
    }
}

Text2:

Exec-managerial,White-collar,Any
Prof-specialty,White-collar,Any
Sales,White-collar,Any
Adm-clerical,White-collar,Any
Tech-support,Blue-collar,Any
Craft-repair,Blue-collar,Any
Machine-op-inspct,Blue-collar,Any
Handlers-cleaners,Blue-collar,Any
Transport-moving,Blue-collar,Any
Protective-serv,Other,Any
Armed-Forces,Other,Any
Farming-fishing,Other,Any
Other-service,Other,Any

2 answers:

Answer 0 (score: 1)

You can convert the data structure to JSON and then traverse it with your favorite map/reduce approach...

// define input text
var Text1 = `{Any
    {White-collar 
        {Exec-managerial} 
        {Prof-specialty} 
        {Sales} 
        {Adm-clerical}
    } 
    {Blue-collar 
        {Tech-support} 
        {Craft-repair} 
        {Machine-op-inspct} 
        {Handlers-cleaners} 
        {Transport-moving} 
        {Priv-house-serv}
    } 
    {Other 
        {Protective-serv} 
        {Armed-Forces} 
        {Farming-fishing} 
        {Other-service}
    }
}`

// define output array to store lines
var output = []
// parse json string into plain javascript object
JSON.parse(
    // wrap input in array
    '[' + Text1
        // replace opening braces with name/children json structure
        .replace(/{([\w-]+)/g, '{"name": "$1", "children": [')
        // replace closing braces with array close
        .replace(/}/g, ']}')
        // add commas between closing and opening braces
        .replace(/}([\n\s]*){/g, '},$1{') + ']'
// loop through outer layer
).forEach(outer => outer.children
    // inner layer
    .forEach(middle => middle.children
        // and finally join all keys with comma and push to output
        .forEach(inner => output.push([inner.name, middle.name, outer.name].join(',')))
    )
)

// join output array with newlines, and assign to Text2
var Text2 = output.join('\n')

/* Text2 =>
Exec-managerial,White-collar,Any
Prof-specialty,White-collar,Any
Sales,White-collar,Any
Adm-clerical,White-collar,Any
Tech-support,Blue-collar,Any
Craft-repair,Blue-collar,Any
Machine-op-inspct,Blue-collar,Any
Handlers-cleaners,Blue-collar,Any
Transport-moving,Blue-collar,Any
Priv-house-serv,Blue-collar,Any
Protective-serv,Other,Any
Armed-Forces,Other,Any
Farming-fishing,Other,Any
Other-service,Other,Any
*/
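
The nested forEach calls above assume exactly three levels of nesting. Below is a minimal sketch of a depth-independent variant, assuming the same name/children shape that the three replacements produce. `toTree` and `leafPaths` are hypothetical helper names, not part of the original answer, and a node without children is treated as a leaf.

```javascript
// Hypothetical helper: reuse the answer's three replacements to build a tree.
function toTree(text) {
  const json = '[' + text
    // opening brace plus name becomes the start of a name/children object
    .replace(/{([\w-]+)/g, '{"name": "$1", "children": [')
    // closing brace closes the children array and the object
    .replace(/}/g, ']}')
    // siblings need a comma between them
    .replace(/}(\s*){/g, '},$1{') + ']';
  return JSON.parse(json);
}

// Hypothetical helper: build "leaf,parent,...,root" for every leaf node.
function leafPaths(nodes, ancestors = []) {
  return nodes.flatMap(node =>
    node.children.length === 0
      ? [[node.name, ...ancestors].join(',')]
      : leafPaths(node.children, [node.name, ...ancestors])
  );
}

const sampleTree = `{Any
    {White-collar
        {Sales}
        {Adm-clerical}
    }
    {Other
        {Armed-Forces}
    }
}`;

console.log(leafPaths(toTree(sampleTree)).join('\n'));
// Sales,White-collar,Any
// Adm-clerical,White-collar,Any
// Armed-Forces,Other,Any
```

Because the ancestors are passed down as an argument, the same function works whether the input has two levels or five.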

Answer 1 (score: 0)

If it's only the innermost braces you want to keep, this should do it.

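Find:    (?s)(?:.*?({[^{}]*})|.*)
Replace: $1\r\n

Expanded form of the Find regex:

(?s)
(?:
    .*?
    ( { [^{}]* } )    # (1)
  |
    .*
)

Compact form of the regex used by the recursive approach described further down:

\s*{([^\s{}]+)\s*|\s*{([^{}]+)}\s*|\s*}\s*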

Otherwise, you can't get the nesting information without a complex recursive regex.
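
For reference, the find/replace for the innermost braces can be tried in JavaScript. This is a minimal sketch under assumptions: JavaScript has no inline `(?s)` modifier, so the `s` (dotAll) flag stands in for it, `\n` is used instead of `\r\n`, and `innermost` is a hypothetical helper name.

```javascript
// Hypothetical helper: keep only the innermost {...} groups, one per entry.
function innermost(text) {
  return text
    // group 1 is an innermost brace pair; the second alternative swallows the tail
    .replace(/(?:.*?(\{[^{}]*\})|.*)/gs, '$1\n')
    // the tail and the final empty match leave blank lines behind, so drop them
    .split('\n')
    .filter(line => line !== '');
}

const sampleFlat = '{Any {White-collar {Sales} {Adm-clerical}} {Other {Armed-Forces}}}';
console.log(innermost(sampleFlat));
// [ '{Sales}', '{Adm-clerical}', '{Armed-Forces}' ]
```

The result keeps the braces and carries no parent names, which is the limitation the answer points out.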

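Alternatively, use a language with simple function recursion. You would write a recursive function and, in the function body, take the appropriate action depending on which part of this regex matched (shown in free-spacing form):

\s* { ( [^\s{}]+ )    # (1)
\s*
|
\s* { ( [^{}]+ )      # (2)
}
\s*
|
\s* } \s*
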
function recurse( string_thats_left )
{
    while ( match( string_thats_left, regex ) )
    {
        if ( $1 matched )
        {
            push $1 onto array
            recurse( match position to end of string );
        }
        else
        if ( $2 matched )
        {
            write $2 to output
            for ( each array element, last to first )
                append "," + element to output
            write a newline to output
        }
        else
        {
            pop the last array element
            return
        }
    }
}

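If $1 is not empty, push it onto the array, then call the same function (recursion).

If $2 is not empty, create a temporary string, append all the items in the array to it,
then get the next match.

If both $1 and $2 are empty, remove the last item that was added to the array,
then return from the function.

That's all there is to it (pseudocode).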


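There is actually a bit more to it than that, for example the matches have to be contiguous with no gaps between them, but this gives the idea.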
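
The recursive idea can be sketched in JavaScript as follows. This is an illustration under stated assumptions, not the answerer's own code: `flatten` and `TOKEN` are hypothetical names, the sticky `y` flag is used to force contiguous matches, and a lookahead `(?=\{)` is added to the first alternative because, as written, that alternative would also match the opening of a leaf such as `{Sales}` and the leaf alternative would never be reached.

```javascript
// Assumed variant of the answer's regex.
// y (sticky): every match must start exactly where the previous one ended.
// (?=\{): alternative 1 only matches a node that is followed by a child.
const TOKEN = /\s*\{([^\s{}]+)\s*(?=\{)|\s*\{([^{}]+)\}\s*|\s*\}\s*/y;

// Hypothetical helper: returns one "leaf,parent,...,root" string per leaf.
function flatten(text) {
  const stack = [];    // names of the currently open parent nodes
  const output = [];
  TOKEN.lastIndex = 0; // start scanning at the beginning of the text

  function recurse() {
    let m;
    // exec() advances TOKEN.lastIndex, so nested calls resume where we stopped
    while ((m = TOKEN.exec(text)) !== null) {
      if (m[1] !== undefined) {
        stack.push(m[1]); // $1: opening brace of a node with children
        recurse();        // consume tokens up to its closing brace
      } else if (m[2] !== undefined) {
        // $2: a leaf; emit it followed by its parents, innermost first
        output.push([m[2].trim(), ...stack.slice().reverse()].join(','));
      } else {
        stack.pop();      // closing brace: leave this level
        return;
      }
    }
  }

  recurse();
  return output;
}

const sampleNested = `{Any
    {White-collar
        {Exec-managerial}
        {Sales}
    }
    {Other
        {Armed-Forces}
    }
}`;

console.log(flatten(sampleNested).join('\n'));
// Exec-managerial,White-collar,Any
// Sales,White-collar,Any
// Armed-Forces,Other,Any
```

With the sticky flag, scanning simply stops at the first position where no alternative matches, so malformed input yields a truncated result rather than skipped text. Parent names containing whitespace are not handled by this sketch.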