Parsing a file to move strings to the header and footer

Time: 2014-07-24 15:21:55

Tags: python regex

I am new to Python and want to parse a text file. I joined this site because of this parsing problem. Suppose I have a sample text file "s1.txt" containing:

Sample s1.txt

I like to play all games and sports
Game=chess
Sports=Baseball
I also like to play other games 
Game=carrom
Sports=cricket
Game=tennis

Desired output sample:

Game=chess
Game=carrom
Game=tennis
I like to play all games and sports
I also like to play other games 
Sports=Baseball
Sports=cricket

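I got some suggestions to use the regular expression (.*?)=(.*), but regular expressions confuse me. Is there a better way to solve this with plain string operations? Please help me get the desired output. Thanks for your answers!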

2 answers:

Answer 0 (score: 3)

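Create a function that determines the relative value of a given line. Lines starting with "Game=" get a lower value than ordinary lines, and lines starting with "Sports=" get a higher value. Use this function as the key when sorting the collection of lines.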

def value(line):
    if line.startswith("Game="):
        return 0
    elif line.startswith("Sports="):
        return 2
    else:
        return 1


text = """I like to play all games and sports
Game=chess
Sports=Baseball
I also like to play other games 
Game=carrom
Sports=cricket
Game=tennis"""

lines = text.split("\n")
lines.sort(key=value)
print("\n".join(lines))

Result:

Game=chess
Game=carrom
Game=tennis
I like to play all games and sports
I also like to play other games
Sports=Baseball
Sports=cricket
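
Since the question asks for plain string operations, the same idea can also be sketched with a lookup table instead of an if/elif chain. This is only an illustration: the names RANKS, DEFAULT_RANK and rank are made up for this example, and it assumes every prefix of interest is the text before the first "=".

```python
# Hypothetical rank table: lower numbers sort first.
# Lines whose prefix is not listed get DEFAULT_RANK.
RANKS = {"Game": 0, "Sports": 2}
DEFAULT_RANK = 1

def rank(line):
    # partition() splits on the first "=" and returns (head, sep, tail);
    # sep is the empty string when the line contains no "=".
    head, sep, _ = line.partition("=")
    if sep and head in RANKS:
        return RANKS[head]
    return DEFAULT_RANK

text = """I like to play all games and sports
Game=chess
Sports=Baseball
I also like to play other games
Game=carrom
Sports=cricket
Game=tennis"""

# sorted() is stable, so lines with the same rank keep their file order.
result = sorted(text.split("\n"), key=rank)
print("\n".join(result))
```

Adding another prefix then only means adding an entry to the table.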

Answer 1 (score: 0)

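Given the order you defined, you want to sort based on a) the element on the left-hand side of the "=", and b) the line order in the file.

Expanding your example, suppose you have: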

txt='''\
Pleasure=swimming
I like to play all games and sports
Game=chess
Sports=Baseball
I also like to play other games 
Game=carrom
Sports=cricket
Game=tennis
Pleasure=eating'''    

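If you want to use a regular expression, you can build on Kevin's sort approach and have the key function return a rank derived from the match object's groups().

Recall that when a regex made of several alternated capture groups matches, groups() returns the matched text for the group that matched and None for the others: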

>>> re.search(r'(^Game=)|(^Sports=)|(^Pleasure=)', 'Sports=').groups()
(None, 'Sports=', None)

You can then use a generator expression to find the index of the group that matched:

>>> next(i for i, e in enumerate((None, 'Sports=', None)) if e)
1 
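
As a side note, the match object also exposes lastindex, the 1-based number of the last capture group that matched. With a pattern like this one, where exactly one top-level group can match, that should give the same index without scanning groups(). The helper name group_index below is made up for this sketch.

```python
import re

PATTERN = re.compile(r'(^Game=)|(^Sports=)|(^Pleasure=)')

def group_index(s):
    """Return the 0-based index of the group that matched, or None."""
    m = PATTERN.search(s)
    # lastindex is 1-based, so subtract 1 to match the enumerate() result.
    return m.lastindex - 1 if m else None

print(group_index('Sports='))
```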

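Now write a key function for sorting:

import re

def kf(s, rank_of_none=1):
    # Rank a line by which alternative matched: Game=0, Sports=1, Pleasure=2.
    m = re.search(r'(^Game=)|(^Sports=)|(^Pleasure=)', s)
    if m:
        # Index of the first group that is not None.
        return next(i for i, e in enumerate(m.groups()) if e)
    else:
        # No match: slot just below rank_of_none (0.9 by default).
        return rank_of_none - .1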

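The key function returns an integer rank for each line that matches. For a line with no match it returns a float (0.9 by default), which places those lines between rank 0 and rank 1. Because Python's sort is stable, lines with equal keys stay in the order they have in the file: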

for line in txt.splitlines():
    print(kf(line), line)

Output:

2 Pleasure=swimming
0.9 I like to play all games and sports
0 Game=chess
1 Sports=Baseball
0.9 I also like to play other games 
0 Game=carrom
1 Sports=cricket
0 Game=tennis
2 Pleasure=eating

Producing a flexible sort based on the position of the matching group in the regex is now trivial:

print('\n'.join(sorted(txt.splitlines(), key=kf)))

Output:

Game=chess
Game=carrom
Game=tennis
I like to play all games and sports
I also like to play other games 
Sports=Baseball
Sports=cricket
Pleasure=swimming
Pleasure=eating
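
The ordering above relies on the sort being stable. If you would rather make the file-order tie-break explicit, one option is to sort tuples of (rank, line number, line). The following is only a sketch under that assumption; the names rank, decorated and ordered are made up for illustration.

```python
import re

PATTERN = re.compile(r'(^Game=)|(^Sports=)|(^Pleasure=)')

def rank(line, rank_of_none=0.9):
    # Same ranking idea as kf: index of the matching group, else a float
    # that falls between rank 0 and rank 1.
    m = PATTERN.search(line)
    if m:
        return next(i for i, e in enumerate(m.groups()) if e)
    return rank_of_none

txt = '''\
Pleasure=swimming
I like to play all games and sports
Game=chess
Sports=Baseball
I also like to play other games
Game=carrom
Sports=cricket
Game=tennis
Pleasure=eating'''

# Decorate each line with (rank, original line number), sort, then undecorate.
# The line number is unique, so ties on rank never fall through to the text.
decorated = [(rank(line), i, line) for i, line in enumerate(txt.splitlines())]
ordered = [line for _, _, line in sorted(decorated)]
print('\n'.join(ordered))
```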