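Parsing a text file in Python and printing the lines that match a string

Time: 2019-05-07 09:29:59

Tags: python pandas data-science

I have a file named file.txt that contains some text. There is another file, details.txt, which contains the strings to look for in file.txt. I want to print the lines of file.txt that match the strings in details.txt.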

files.txt

12345 04/04/19 07:06:55  entered | computer message|  ID WRE435TW: headway | | 
23456 04/04/19 07:10:00  entered | computer message|  Double vehicle logon  | | 
23567 04/04/19 07:06:55  entered | computer message|  ID EWFRSDE3: small   | | 
09872 04/04/19 07:07:47  entered | computer message|  Double vehicle logon  | | 
76789 04/04/19 07:10:05  entered | computer message|  Veh : logoff          | | 

details.txt

headway
small
logoff
logon

I tried parsing the text file, but I do not get output in the correct format.

import pandas as pd
import re
import os
import glob
import csv


os.chdir("file_path")

with open("file.txt", "r") as fp:
    with open("details.txt", 'r+') as f:
        for i in f:
            for line in fp:
                if i:
                    print(i)
                else:
                    print('no Event')

2 answers:

Answer 0 (score: 1)

Just use the power of pandas:

import pandas as pd
import numpy as np

# Read the file as CSV with custom delimiter
df = pd.read_csv(
    'files.txt',
    delimiter='|',
    header=None
)

This gives us:

    0                                   1                   2                     3 4
0   64834 04/04/19 07:06:55 entered     computer message    Veh SBS3797R: headway       
1   73720 04/04/19 07:10:00 entered     computer message    Double vehicle logon        
2   64840 04/04/19 07:06:55 entered     computer message    Veh SBS3755L: small         
3   67527 04/04/19 07:07:47 entered     computer message    Double vehicle logon        
4   73895 04/04/19 07:10:05 entered     computer message    Veh : logoff        

Select the third column (index 2) and convert it:

words = np.vectorize(lambda x: x.strip().split(' ')[-1])(df[2].values)

np.vectorize applies the function lambda x: x.strip().split(' ')[-1] (which strips the text and picks the last word) to the third column, df[2].values.

So you can write the words to the result file:

with open("details.txt", 'a+') as f:
    f.write('\n'.join(words))

Note that you should open the file in 'a+' mode to append to the result file; a write mode such as 'w+' would truncate it first.
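To see what the lambda passed to np.vectorize produces, here is a minimal sketch that applies the same strip-and-take-last-word step without pandas. It assumes the sample lines from the question; the helper name last_word_of_third_field is hypothetical and only used for illustration.

```python
def last_word_of_third_field(line):
    """Split on '|', then return the last word of the third field (index 2)."""
    third_field = line.split('|')[2]
    # Same expression as the lambda in the answer above.
    return third_field.strip().split(' ')[-1]


# Sample lines copied from the question.
sample_lines = [
    "12345 04/04/19 07:06:55  entered | computer message|  ID WRE435TW: headway | | ",
    "23456 04/04/19 07:10:00  entered | computer message|  Double vehicle logon  | | ",
    "23567 04/04/19 07:06:55  entered | computer message|  ID EWFRSDE3: small   | | ",
    "09872 04/04/19 07:07:47  entered | computer message|  Double vehicle logon  | | ",
    "76789 04/04/19 07:10:05  entered | computer message|  Veh : logoff          | | ",
]

words = [last_word_of_third_field(line) for line in sample_lines]
print(words)  # ['headway', 'logon', 'small', 'logon', 'logoff']
```

The strip() call matters here: without it, the padding spaces before the next '|' would make the last "word" an empty string.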

Answer 1 (score: 1)

Note that in Python any string other than '' is treated as True. So in your code:

with open("file.txt", "r") as fp:
    with open("details.txt", 'r+') as f:
        for i in f:
            for line in fp:
                if i:  # This is always true (for input you showed)
                    print(i)
                else:
                    print('no Event')
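A quick way to check the truthiness rule is to call bool() on a few strings. This is only an illustrative sketch; the sample values are assumptions chosen to resemble what iterating over details.txt would yield.

```python
# Lines read from a file keep their trailing newline, so even a blank
# line ("\n") is a non-empty string and therefore truthy.
samples = ["headway\n", "\n", ""]

truthiness = {repr(value): bool(value) for value in samples}
for text, is_true in truthiness.items():
    print(text, is_true)
```

Only the empty string is falsy, which is why the else branch is never reached for the input shown.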

You could try the following:

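with open("file.txt", "r") as fp:
    lines = fp.readlines()  # read once: a file object is exhausted after one pass

with open("details.txt", "r") as f:
    for i in f:
        i = i.strip()  # drop the trailing newline, otherwise `i in line` never matches
        if not i:
            continue  # skip blank lines in details.txt
        for line in lines:
            if i in line:
                print(line.rstrip("\n"))  # I assume you wanted to print line from files.txt
            else:
                print('no Event')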
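Two details are easy to miss with nested loops over files: a file object can only be iterated once, and each line it yields still ends with a newline. The following is a minimal sketch of one way to handle both, not the only approach. The function name find_matching_lines is hypothetical, the data is held in lists instead of real files, and the "Door open" line is made up to show a non-matching case.

```python
def find_matching_lines(log_lines, keywords):
    """Return the log lines that contain at least one keyword."""
    # Strip whitespace/newlines and drop blank keywords, because an
    # empty string is a substring of every line.
    cleaned = [k.strip() for k in keywords if k.strip()]
    # Single pass over the log lines, so a file object would work here too.
    return [
        line.rstrip("\n")
        for line in log_lines
        if any(k in line for k in cleaned)
    ]


log_lines = [
    "12345 04/04/19 07:06:55  entered | computer message|  ID WRE435TW: headway | | \n",
    "23456 04/04/19 07:10:00  entered | computer message|  Double vehicle logon  | | \n",
    "76789 04/04/19 07:10:05  entered | computer message|  Veh : logoff          | | \n",
    # Hypothetical line that matches none of the keywords.
    "11111 04/04/19 07:11:00  entered | computer message|  Door open             | | \n",
]
keywords = ["headway\n", "small\n", "logoff\n", "logon\n", "\n"]

matches = find_matching_lines(log_lines, keywords)
for match in matches:
    print(match)
```

With real files, the same function could be called as find_matching_lines(fp, f) inside the two with blocks, since iterating a file object yields its lines.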