Python: merging data from two files

Time: 2015-04-27 22:36:52

Tags: python regex

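I have two text files. The first, "numbers.txt", contains 10-digit phone numbers, one per line. The second, "users", contains data about a number of accounts. I only want to find the account information for the numbers listed in numbers.txt.

So, for each number in numbers.txt, search the users file for that number. If it is found, return that line of text and the next line (returning all text up to the next blank line would also be fine).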

numbers.txt looks like this:

1234567021
1234566792

The users file looks like this:

1234567021@host.com User-Password == "secret"
           Framed-IP-Address = 192.168.1.100,

The result I am looking for:

1234567021 1234567021@host.com User-Password == "secret" Framed-IP-Address = 192.168.1.100

I am stuck and having trouble working out how to approach this. What I have so far:

#!/usr/bin/env python

import os

# Load numbers text file
if os.path.isfile("numbers.txt"):
    print "Loaded Numbers"
    #### Open file, if exists
    numbers = open('numbers.txt', 'r')
else:
    print "ERROR: Unable to read numbers.txt"
    quit()

# Load user data file
if os.path.isfile("users.txt"):
    print "Loaded user data"
    #### Open file, if exists
    users_data = open('users.txt', 'r')
else:
    print "ERROR: Unable to read users_data"
    quit()


#### Search 
if any(str(users_data) in s for s in numbers):
    for line in numbers:
        if number in line:
            #### Produce sanitized list of output
            output = line.split(' ')
            #print output[0]
            print output
            # also need next line from users_data
            # after each match 

#### Close numbers file and quit
numbers.close()
users_data.close()
quit()

4 answers:

Answer 0: (score: 0)

This code is not optimal, because users_data has to be read once for every line in numbers.txt:

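#### Search
for number in numbers:
    number = number.strip()   # drop the trailing newline before comparing
    if not number:
        continue              # skip blank lines; '' would match every line
    users_data.seek(0)        # rewind so the users file is re-read for each number
    for data in users_data:
        if data.startswith(number):
            print (number, data)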
Answer 1: (score: 0)

I suggest sorting the data first; then you can loop over the numbers and look each one up in users_data.
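
One possible reading of this suggestion, as a rough sketch rather than the answerer's own code: keep the leading 10-digit key of every record line, sort the (number, line) pairs, and binary-search them with the standard bisect module. The helper names build_index and lookup are made up for illustration, and the sketch assumes a record line starts with the number immediately followed by "@".

```python
import bisect

def build_index(user_lines):
    """Return a sorted list of (number, line) pairs for lines that start a record."""
    index = []
    for line in user_lines:
        key = line[:10]
        # Assumption: a record starts with a 10-digit number followed by '@'.
        if len(key) == 10 and key.isdigit() and line[10:11] == '@':
            index.append((key, line.strip()))
    index.sort()
    return index

def lookup(index, number):
    """Binary-search the sorted index; return the matching line or None."""
    # (number,) sorts just before any (number, line) pair with the same key.
    pos = bisect.bisect_left(index, (number,))
    if pos < len(index) and index[pos][0] == number:
        return index[pos][1]
    return None

# Sample data taken from the question; replace with lines read from users.txt.
sample_users = [
    '1234567021@host.com User-Password == "secret"\n',
    '           Framed-IP-Address = 192.168.1.100,\n',
]
index = build_index(sample_users)
for number in ['1234567021', '1234566792']:
    print(number, lookup(index, number))
```

For a handful of numbers a set or dict lookup is simpler; sorting mainly pays off if the index is built once and queried many times.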

Answer 2: (score: 0)

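This is written in Python 3 to get the StringIO behavior I wanted.

Just change with StringIO(nums_txt) as f: to with open('numbers.txt') as f: to read your numbers file by name, and do the same for the users file section. It should be self-explanatory: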

nums_txt='''\
1234567021
1234566792'''

users='''
1234567021@host.com User-Password == "secret"
           Framed-IP-Address = 192.168.1.100,
''' 

import re
from io import StringIO

with StringIO(nums_txt) as f:   # with open('numbers.txt') as f:  ...
    nums={line.strip():'Not Found' for line in f}

nfs={}    
with StringIO(users) as f:      # with open('users.txt') as f: ...
    for m in re.finditer(r'(^\d{10})(@.*?)(?=(?:\d{10}@)|\Z)', f.read(), re.S | re.M):
        rec=re.sub(r'\s{2,}', ' ', ' '.join(m.group(2).splitlines()))
        if m.group(1) in nums:
            nums[m.group(1)]=rec
        else:
            nfs[m.group(1)]='Not Found'    
print(nums)

Prints:

{'1234567021': '@host.com User-Password == "secret" Framed-IP-Address = 192.168.1.100,', '1234566792': 'Not Found'}

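Notes:

  1. The format of the users file is not obvious from the question; adjust the regex accordingly.
  2. This only works if the numbers in numbers are unique.
  3. Records in users whose number does not appear in numbers are collected in the dictionary nfs.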
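
If the same number can appear in more than one record, one option is to collect a list of records per number instead of a single string. The following is only a sketch built on the regex above: the second sample record (other.com) is invented for illustration, and the names wanted and records are not part of the original answer.

```python
import re
from collections import defaultdict

# Sample data; the second record for the same number is invented for illustration.
users = '''\
1234567021@host.com User-Password == "secret"
           Framed-IP-Address = 192.168.1.100,
1234567021@other.com User-Password == "other"
           Framed-IP-Address = 192.168.1.101,
'''
wanted = {'1234567021', '1234566792'}

records = defaultdict(list)   # number -> every record found for that number
pattern = r'(^\d{10})(@.*?)(?=(?:\d{10}@)|\Z)'
for m in re.finditer(pattern, users, re.S | re.M):
    if m.group(1) in wanted:
        # Collapse line breaks and runs of whitespace into single spaces.
        records[m.group(1)].append(' '.join(m.group(2).split()))

print(dict(records))
```

Numbers that never match simply do not appear as keys, so the "Not Found" case can be detected with wanted - set(records).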

Answer 3: (score: 0)

Read the numbers into a set:

with open('numbers.txt') as f:
    numbers = {line.strip() for line in f if line.strip()}

Look at the first ten characters of each line in users.txt. If that string is in numbers, save the two lines in a container (a dict):

result = dict()
with open('users.txt') as f:
    for line in f:
        key = line[:10]
        if key in numbers:
            value = line + next(f, '')  # next() works in Python 2.6+ and 3; '' if the file ends here
            result[key] = value
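
To get the single-line format shown in the question, the same idea can be extended to gather every continuation line up to the next blank line. This is a minimal sketch, not a definitive implementation: collect_records is a hypothetical helper, the inputs are in-memory lists so the example runs as-is, and it assumes continuation lines are indented and that a trailing comma on them can be dropped.

```python
def collect_records(number_lines, user_lines):
    """Map each wanted number to its record, flattened onto one line."""
    wanted = {line.strip() for line in number_lines if line.strip()}
    result = {}
    current = None                     # number whose record is being collected
    for line in user_lines:
        text = line.strip()
        if not text:                   # a blank line ends the current record
            current = None
        elif line[:10] in wanted:      # start of a record for a wanted number
            current = line[:10]
            result[current] = text
        elif line[:1].isspace() and current is not None:
            # Assumption: continuation lines are indented.
            result[current] += ' ' + text.rstrip(',')
        else:                          # a record for a number we do not want
            current = None
    return result

# Sample data from the question; pass open file objects instead to use real files.
numbers = ['1234567021\n', '1234566792\n']
users = [
    '1234567021@host.com User-Password == "secret"\n',
    '           Framed-IP-Address = 192.168.1.100,\n',
    '\n',
]
for number, record in collect_records(numbers, users).items():
    print(number, record)
```

Because file objects are iterable line by line, collect_records(open('numbers.txt'), open('users.txt')) should work the same way, ideally inside with blocks so both files are closed.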