python3: map unique values between two files and merge the matching lines from both files

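Date: 2018-05-09 15:08:10

Tags: python-3.x

I am trying to find the unique values shared between two files and, when a unique value exists in both files, merge the corresponding lines from the two files into a single line.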

More specifically, I want to use the MAC address in column 3 of file2 as the key and look it up in file1; when there is a match, the matching lines from the two files should be merged into one line.

file2

192.168.100.1 0 001c.0718.1ed6 Vlan100, Port-Channel230
192.168.100.2 0 fa16.3e88.245d Vlan100, Port-Channel230
192.168.100.3 0 001c.0718.1f52 Vlan100, Port-Channel230
192.168.100.4 0 001c.0724.tb6a Vlan100, Port-Channel51
192.168.100.5 0 01c.0718.1t9c Vlan100, Port-Channel230
192.168.100.6 0 fa16.3ed8.dd6c Vlan100, Port-Channel27
192.168.100.7 0 fa16.3e22.20c3 Vlan100, Port-Channel230
192.168.100.8 0 fa16.3ecd.e1db Vlan100, Port-Channel27
192.168.100.9 0 001c.0718.9c8f Vlan100, Port-Channel230

file1

   4    001c.0724.tb6a    DYNAMIC     Po17       1       13 days, 22:08:51 ago
   4    001c.0718.1f52    DYNAMIC     Po15       1       12 days, 5:07:20 ago
   4    001c.0718.1ed6    DYNAMIC     Po11       1       12 days, 5:05:44 ago
   4    001c.0718.1t9c    DYNAMIC     Po9        1       12 days, 5:07:16 ago
   4    001c.0718.9c8f    STATIC      Po9        1       12 days, 5:07:16 ago

Code (port_details.py):

#!/usr/bin/python3
# port_details.py
mapping_dict = {}

INPUT_FILE_1 = 'file1'
INPUT_FILE_2 = 'file2'
with open(INPUT_FILE_1) as file1:
    while True:
        line = file1.readline()
        print(line)
        if not line:
            break
        #_, mac, _, port = line.strip()
        ip_addr, _, mac, _, status = line.split()

        mapping_dict[mac.lower()] = status

with open(INPUT_FILE_2) as file2:
    while True:
        line = file2.readline()
        if not line:
            break
        #ip_addr, _, mac, _ = line.strip()
        _, value_id, _, port = line.split()
        status = mapping_dict.get(mac, '')
        print(ip_addr, mac, port, status)

The code above is what I tried, adapted from patterns in examples I found through Google, but it throws an error when executed:

Traceback (most recent call last):
  File "./srdjan1.py", line 13, in <module>
    ip_addr, _, mac, _, status = line.split()
ValueError: not enough values to unpack (expected 5, got 4)

The error occurs at runtime. I tried different values but could not get it to run; any hints or suggestions would be much appreciated.

The expected output would be:

192.168.100.1   001c.0718.1ed6  Po11 Vlan100 Port-Channel230


1 answer:

Answer 0 (score: 1)

I found a solution: use a dictionary to hold the values, and re to split each line on whitespace.

import re

INPUT_FILE_1 = 'file1.txt'
INPUT_FILE_2 = 'file2.txt'


def fl2():
    # Build a lookup table from file2: field 2 is the key, field 4 the value.
    dict1 = {}
    with open(INPUT_FILE_2) as file2:
        for line in file2:
            data1 = re.split(r"\s+", line)
            dict1[data1[2]] = data1[4]
    # The with block closes the file, so no explicit close() is needed.
    return dict1


dict2 = fl2()
with open(INPUT_FILE_1) as file1:
    for line in file1:
        data = re.split(r"\s+", line)
        # Print only the lines whose field 2 is also a key from file2.
        if data[2] in dict2:
            print(data[0], data[2], dict2[data[2]])

The result, as expected:

./port_details.py
192.168.1.1 001001c Good
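
For comparison, here is a minimal sketch of the same dictionary lookup that produces the merged line format asked for in the question. It assumes the column positions visible in the sample data: MAC in column 2 and port in column 4 of file1, and IP, MAC, VLAN and port-channel in columns 1, 3, 4 and 5 of file2. The function name merge_by_mac and the inline sample lists are hypothetical, introduced here only for illustration.

```python
def merge_by_mac(mac_table_lines, arp_lines):
    """Join file2-style lines with file1-style lines on the MAC address."""
    # MAC address -> port, from columns 2 and 4 of the file1-style lines (assumed layout).
    port_by_mac = {}
    for line in mac_table_lines:
        fields = line.split()  # split() with no argument ignores leading/trailing whitespace
        if len(fields) >= 4:
            port_by_mac[fields[1].lower()] = fields[3]

    merged = []
    for line in arp_lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or short lines instead of failing on unpacking
        ip_addr = fields[0]
        mac = fields[2].lower()
        vlan = fields[3].rstrip(",")  # drop the trailing comma after the VLAN name
        channel = fields[4]
        if mac in port_by_mac:
            merged.append(" ".join([ip_addr, mac, port_by_mac[mac], vlan, channel]))
    return merged


# Sample lines copied from the question's file1 and file2.
mac_table = [
    "   4    001c.0718.1f52    DYNAMIC     Po15       1       12 days, 5:07:20 ago",
    "   4    001c.0718.1ed6    DYNAMIC     Po11       1       12 days, 5:05:44 ago",
]
arp_table = [
    "192.168.100.1 0 001c.0718.1ed6 Vlan100, Port-Channel230",
    "192.168.100.2 0 fa16.3e88.245d Vlan100, Port-Channel230",
]

for row in merge_by_mac(mac_table, arp_table):
    print(row)
```

Because the function only iterates over lines, open file objects can be passed in place of the lists when working with the real files. Indexing into the split fields, rather than unpacking a fixed number of names, avoids the ValueError from the question when the lines have a different number of columns.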