Cleaning up a piece of text in Python

Time: 2013-02-06 23:46:30

Tags: python compare steam

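I'm very new to Python, but I'd like to start learning by writing scripts for things I will actually use. When I type "status" in the Team Fortress 2 console, I get the text below. What I want is to reduce this text to only the STEAM_X:X:XXXXXXXX values, i.e. the Steam64 IDs.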

# userid name                uniqueid            connected ping loss state
#     31 "Atonement -Ai-"    STEAM_0:1:27464943  00:48      103    0 active
#     10 "?loop?"        STEAM_0:0:31072991  40:48       62    0 active
#     11 "爱 -Ai-"          STEAM_0:0:41992530  40:46       68    0 active
#     12 "MrKateUpton -Ai-"  STEAM_0:1:10894538  40:25       81    0 active
#     13 "Tacet -Ai-"        STEAM_0:1:52131782  39:59       83    0 active
#     14 "CottonBonbon-Ai-"  STEAM_0:1:47812003  39:39       51    0 active
#     15 "belt -Ai-"         STEAM_0:1:4941202   38:43      123    0 active
#     16 "boutros :3"        STEAM_0:0:32271324  38:21       65    0 active
#     17 "[tilt] Xikkari"    STEAM_0:1:41148798  38:14       92    0 active
#     24 "ElenaWitch"        STEAM_0:0:17495028  31:30       73    0 active
#     19 "[tilt] Batcan #boutros" STEAM_0:1:41205650 38:10   63    0 active
#     20 "[?l??]whatupmydiggas" STEAM_0:1:50559125 37:58  112    0 active
#     21 "[tilt] musicman"   STEAM_0:1:37758467  37:31       89    0 active
#     22 "Jack Frost"        STEAM_0:0:24206189  37:28       90    0 active
#     28 "[tilt-sub]deaf ears #best safet" STEAM_0:1:29612138 19:05   94    0 active
#     25 "? notez ?ai"    STEAM_0:1:29663879  31:23      113    0 active
#     27 "-Ai- Lord English" STEAM_0:1:44114633  24:08      116    0 active
#     29 "1.prototypes"      STEAM_0:0:42256202  17:41       83    0 active
#     30 "SourceTV  // name for SourceTV" BOT                        active
#     32 "PUT ME IN COACH"   STEAM_0:1:48004781  00:36      173    0 spawning

Is there a built-in function in Python that can carry out the following algorithm?

For all that is not (!) Steam_X:X:XXXXXXXX, delete/remove.

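I've done a fair amount of googling but found nothing really specific. If someone could point me to the right built-in Python functions, I'd be very grateful and could start coding.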

P.S. The output should look like this:

STEAM_0:1:27464943
STEAM_0:0:31072991
STEAM_0:1:10894538
etc
etc

1 answer:

Answer 0 (score: 4)

This sounds like a simple job for a regular expression, assuming the IDs always consist of digits in that layout:

>>> import re

>>> with open('/tmp/spam.txt') as f:
...   for steam64id in re.findall(r'STEAM_\d:\d:\d+', f.read()):
...     print(steam64id)
... 
STEAM_0:1:27464943
STEAM_0:0:31072991
STEAM_0:0:41992530
STEAM_0:1:10894538
STEAM_0:1:52131782
STEAM_0:1:47812003
STEAM_0:1:4941202
STEAM_0:0:32271324
STEAM_0:1:41148798
STEAM_0:0:17495028
STEAM_0:1:41205650
STEAM_0:1:50559125
STEAM_0:1:37758467
STEAM_0:0:24206189
STEAM_0:1:29612138
STEAM_0:1:29663879
STEAM_0:1:44114633
STEAM_0:0:42256202
STEAM_0:1:48004781
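
The same idea can be wrapped in a small function that works on a string rather than a file. This is only a minimal sketch: the function name `extract_steam_ids` and the shortened sample text are illustrative assumptions, not part of the original answer. The pattern is the one used above: the literal `STEAM_`, one digit, a colon, one digit, a colon, then one or more digits.

```python
import re

# Same pattern as in the answer, compiled once for reuse.
STEAM_ID_RE = re.compile(r'STEAM_\d:\d:\d+')


def extract_steam_ids(text):
    """Return every STEAM_X:Y:Z token found in text, in order of appearance."""
    return STEAM_ID_RE.findall(text)


# Shortened excerpt of the "status" output from the question (illustrative).
sample = '''\
# userid name                uniqueid            connected ping loss state
#     31 "Atonement -Ai-"    STEAM_0:1:27464943  00:48      103    0 active
#     30 "SourceTV  // name for SourceTV" BOT                        active
#     15 "belt -Ai-"         STEAM_0:1:4941202   38:43      123    0 active
'''

for steam_id in extract_steam_ids(sample):
    print(steam_id)
```

The header line and the BOT line contain nothing that matches, so they are dropped. One caveat: a player name that itself contains text shaped like `STEAM_0:1:123` would also match, so the pattern may need tightening if that matters.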

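The usual way to delete lines is not to remove them from the original file, but to write the lines you want to keep to a new file (and then, optionally, copy that file over the original if processing succeeded).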
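
A rough sketch of that write-then-replace approach might look like the following. The function name `keep_only_steam_ids` is hypothetical, and the UTF-8 encoding is an assumption about how the console output was saved. `os.replace` requires Python 3.3 or later.

```python
import os
import re
import tempfile

STEAM_ID_RE = re.compile(r'STEAM_\d:\d:\d+')


def keep_only_steam_ids(path):
    """Rewrite the file at path so that it holds one Steam ID per line."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the new file next to the original so the final rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, text=True)
    try:
        # Assumption: the saved console output is UTF-8; undecodable
        # bytes are replaced instead of raising an error.
        with os.fdopen(fd, 'w', encoding='utf-8') as out, \
                open(path, encoding='utf-8', errors='replace') as src:
            for line in src:
                for steam_id in STEAM_ID_RE.findall(line):
                    out.write(steam_id + '\n')
        # Reached only if processing succeeded: overwrite the original.
        os.replace(tmp_path, path)
    except Exception:
        # On failure, discard the partial file and leave the original as is.
        os.remove(tmp_path)
        raise
```

Calling `keep_only_steam_ids('/tmp/spam.txt')` would then leave only the IDs in that file. If anything fails partway through, the original is left untouched.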