How do I extract strings that start with a given symbol from a text file in Python?

Time: 2013-08-16 07:26:18

Tags: python string extract symbols

How can I extract all words that start with the symbol '$' from a text file?

File a (ASCII):

        @ExtendedAttr = nvp_add(@ExtendedAttr, "severity", $severity,  
 "description", $description, "eventID", $eventID,
             "eventURL", $eventURL, "alertLevel", $alertLevel, 
      "eventStart", $eventStart,
             "eventSourceCount", $eventSourceCount, "eventSourceTable", 
$eventSourceTable, "eventDestCount", $eventDestCount)

I want the output to look like this (each word on its own line):

$severity
$description
$eventID
$eventURL
$alertLevel
$eventStart
$eventSourceCount
$eventSourceTable
$eventDestCount

2 Answers:

Answer 0 (score: 2)

Use a regex:

>>> import re
>>> with open('filename') as f:
...     ans = []
...     for line in f:
...         matches = re.findall(r'(?<!\w)(\$\w+)', line)
...         ans.extend(matches)
...         
>>> print(ans)
['$severity', '$description', '$eventID', '$eventURL', '$alertLevel', '$eventStart', '$eventSourceCount', '$eventSourceTable', '$eventDestCount']

Now use str.join to get the expected output:

>>> print("\n".join(ans))
$severity
$description
$eventID
$eventURL
$alertLevel
$eventStart
$eventSourceCount
$eventSourceTable
$eventDestCount
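
The `(?<!\w)` part of the pattern above is a negative lookbehind: it rejects a `$` that sits directly after a word character. A minimal sketch of the difference, assuming a made-up sample line (the `price$usd` token is hypothetical and does not appear in the question's file):

```python
import re

# Pattern from the answer above: a '$' not preceded by a word character,
# followed by one or more word characters.
PATTERN = re.compile(r'(?<!\w)(\$\w+)')

# Hypothetical sample line, not taken from the question's file.
sample = 'nvp_add(@Attr, "severity", $severity, "cost", price$usd)'

with_lookbehind = PATTERN.findall(sample)
without_lookbehind = re.findall(r'\$\w+', sample)

print(with_lookbehind)     # ['$severity']
print(without_lookbehind)  # ['$severity', '$usd']
```

For the file in the question both patterns give the same result, since every `$` there follows a space or a newline.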

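Answer 1 (score: 0)

Use a regular expression, taking care to escape $ (which normally matches the end of a line) with \. Use f.read() to read the whole file at once (you could also pull that call out onto its own line for readability).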

import re

with open("filename", "r") as f:
    # Raw string so the backslashes in \$ and \w reach the regex engine as-is.
    matches = re.findall(r"(\$\w+)", f.read())
print(matches)
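
The same whole-file approach can be sketched for Python 3, with the read pulled onto its own line as suggested. The function names `extract_dollar_words` and `extract_from_file` are illustrative assumptions, and the first two lines of the question's file are inlined so the example runs without a file on disk:

```python
import re

def extract_dollar_words(text):
    """Return every '$'-prefixed word in text, in order of appearance."""
    # Raw string so that \$ and \w are passed to the regex engine unchanged.
    return re.findall(r'\$\w+', text)

def extract_from_file(path):
    """Read the whole file at once and extract its '$'-prefixed words."""
    with open(path, 'r') as f:
        contents = f.read()
    return extract_dollar_words(contents)

# First two lines of the question's file, inlined for the demonstration.
sample = '''@ExtendedAttr = nvp_add(@ExtendedAttr, "severity", $severity,
 "description", $description, "eventID", $eventID,'''

words = extract_dollar_words(sample)
print("\n".join(words))
```

Reading the whole file at once is fine for small configuration files like this one; for very large files, the line-by-line loop from the first answer keeps memory use lower.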