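Replace spaces with commas, ignoring spaces inside quotes

Asked: 2012-01-17 10:52:30

Tags: python algorithm

Given a string of words separated by spaces, I need to replace the spaces with commas while leaving spaces inside quotes untouched.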

>>> some_string = 'one two "three four" five "six seven"'
>>> replace_func(some_string)
'one,two,"three four",five,"six seven"'

Here is a simple solution:

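def replace_func(some_str):
    lines = []
    # Splitting on '"' leaves unquoted text at even indices and quoted text at odd indices.
    for i, part in enumerate(some_str.split('"')):
        if i % 2 == 0:
            # Outside quotes: turn spaces into commas.
            lines.append(part.replace(' ', ','))
        else:
            # Inside quotes: keep the text unchanged.
            lines.append(part)

    return '"'.join(lines)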

Any suggestions?

4 answers:

Answer 0 (score: 11)

This is easy to do with the help of shlex.split:

>>> import shlex
>>> ','.join(shlex.split(some_string))
'one,two,three four,five,six seven'

If you need to preserve the quotes, you can do this:

>>> ','.join(['"{0}"'.format(fragment) if ' ' in fragment else fragment
...           for fragment in shlex.split(some_string)])
'one,two,"three four",five,"six seven"'
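Note that the re-quoting step above only restores quotes around fragments that contain a space, so a quoted single word such as "one" would come back unquoted. A minimal sketch of one way around that, assuming shlex's non-POSIX mode is acceptable for your input (the helper name join_preserving_quotes is made up for illustration):

```python
import shlex


def join_preserving_quotes(text):
    # With posix=False, shlex keeps the quote characters as part of each
    # quoted token instead of stripping them.
    tokens = shlex.split(text, posix=False)
    return ','.join(tokens)


print(join_preserving_quotes('one two "three four" five "six seven"'))
# prints: one,two,"three four",five,"six seven"
```

Because the quotes are never stripped in the first place, there is nothing to guess when putting them back.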

Answer 1 (score: 7)

Alternatively, you can try this simpler solution using a regular expression:

>>> import re
>>> ','.join(re.findall(r'"[^"]*"|\S+', some_string))
'one,two,"three four",five,"six seven"'

Answer 2 (score: 2)

An alternative approach using a regular expression:

result = re.sub(' (?=(?:[^"]*"[^"]*")*[^"]*$)', ",", subject)

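This matches a space only if it is followed by an even number of quotes, and replaces it with a comma. As a result, it only matches spaces outside quoted strings.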
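To see the lookahead in action, here is a small self-contained sketch. The names OUTSIDE_QUOTES_SPACE and replace_outside_quotes are made up for illustration, and the pattern assumes the quotes in the input are balanced:

```python
import re

# A space matches only if the rest of the string contains an even number of
# double quotes, i.e. the space is not inside an open quoted section.
OUTSIDE_QUOTES_SPACE = re.compile(r' (?=(?:[^"]*"[^"]*")*[^"]*$)')


def replace_outside_quotes(subject):
    # Replace every matching space with a comma; quoted spaces are skipped.
    return OUTSIDE_QUOTES_SPACE.sub(',', subject)


print(replace_outside_quotes('one two "three four" five "six seven"'))
# prints: one,two,"three four",five,"six seven"
```

Since each space is replaced individually, a run of two spaces outside quotes becomes two commas, which may or may not be what you want.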

Answer 3 (score: 1)

Pyparsing is often easier to read and understand than regular expressions:

>>> some_string = 'one two "three four" five "six seven"'
>>> from pyparsing import OneOrMore, quotedString, Word, printables
>>> ','.join(OneOrMore(quotedString | Word(printables)).parseString(some_string))
'one,two,"three four",five,"six seven"'