Question

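The problem is extracting data from a pile of junk in a text file. For example, I first need to pull this particular section out of the text file:

%T 525 1:0.00:6425.12 2:0.01:6231.12 3:0.00:3234.51 and it goes on like that for a long time.

Then I need to take the third value out of each group, i.e. 6425.12, 6231.12 and 3234.51, write those values to a new text file, and then do some further editing on that data.

I was thinking of using regular expressions for this. Could anybody show sample code? It should be very straightforward for an experienced programmer.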

Answer 1

You don't need re to get the numbers...

s = '%T 525 1:0.00:6425.12 2:0.01:6231.12 3:0.00:3234.51'
columns = s.split()[2:]  # All whitespace-separated columns except the first two ('%T' and '525').
numbers = [c.split(':')[-1] for c in columns]  # Split each column on ':' and keep the last piece.

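However, we need more information about the structure of the file before we can work out how to pick out the string s in the first place.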
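Until the file structure is known, here is a minimal sketch of the whole read/extract/write step. It assumes that every line of interest starts with the `%T` marker shown in the question and that all other lines can be ignored. The names `extract_values` and `write_values`, and the idea of one value per output line, are my own placeholders, not something from the question.

```python
def extract_values(lines, marker="%T"):
    """Return the last ':'-separated piece of every column on marker lines."""
    values = []
    for line in lines:
        fields = line.split()
        # Assumption: a data line has the marker as its first field.
        if not fields or fields[0] != marker:
            continue
        # Skip the marker and the number that follows it (525 in the sample).
        for column in fields[2:]:
            values.append(column.split(":")[-1])
    return values


def write_values(in_path, out_path):
    """Copy the extracted values from in_path to out_path, one per line."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for value in extract_values(src):
            dst.write(value + "\n")
```

The values stay as strings here; wrap them in `float()` if the later editing needs numbers. If the real data lines are marked differently, only the `marker` check should need to change.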

Answer 2

I don't think I'd use a regex for this; it looks simple enough.

with open(...) as file:
    for line in file:
        for word in line.split():
            # Only the colon-separated groups carry data.
            if ':' in word:
                print(word.split(':')[2])  # do something with it here
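Since the question asked about regular expressions, here is a rough sketch of the same extraction with `re`, for comparison. It assumes every group has the shape integer:number:number with non-negative values, as in the sample line. The pattern and the name `third_fields` are my own assumptions and would need adjusting for signs, exponents, or other formats.

```python
import re

# Assumed shape of one group: <integer>:<number>:<number>, e.g. 1:0.00:6425.12
# Only the third field is captured.
ENTRY = re.compile(r"\b\d+:\d+(?:\.\d+)?:(\d+(?:\.\d+)?)")


def third_fields(text):
    """Return the third field of every group found in text, as strings."""
    return ENTRY.findall(text)
```

Calling `third_fields` on the sample line from the question picks out 6425.12, 6231.12 and 3234.51 and ignores the leading `%T 525`, because neither of those tokens contains a colon-separated group.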

Extracting data from a text file in Python
