Python: How can I expand a Python list and print it line by line?

Asked: 2015-02-17 14:21:53

Tags: python list dictionary

I have a file like this:

2.nseasy.com.|['azeaonline.com']
ns1.iwaay.net.|['alchemistrywork.com', 'dha-evolution.biz', 'hidada.net', 'sonifer.biz']
ns2.hd28.co.uk.|['networksound.co.uk']

Expected result:

2.nseasy.com.|'azeaonline.com'
ns1.iwaay.net.|'alchemistrywork.com'
ns1.iwaay.net.|'dha-evolution.biz'
ns1.iwaay.net.|'hidada.net'
ns1.iwaay.net.|'sonifer.biz'
ns2.hd28.co.uk.|'networksound.co.uk'

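When I try this, I get the individual characters of the domain names instead of the list of domains. That means the value stored in dictionary d is not being treated as a list but as a string. Here is my code: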

d = defaultdict(list)
f = open(file,'r')
start = time()
for line in f:
    NS,domain_list = line.split('|')
    s = json.dumps(domain_list)
    d[NS] = json.loads(s)


for NS, domains in d.items():
    for domain in domains:
        print (NS, domain)

Sample of the current result:

w
o
o
d
l
a
n
d
f
a
r
m
e
r
s
m
a
r
k
e
t
.
o
r
g
'
]

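5 Answers:

Answer 0 (score: 4)

What you are doing with json is not correct. domain_list is already a string (the text after the |), so s = json.dumps(domain_list) just encodes that string as a JSON string s, and json.loads(s) decodes it back into the same string. You then iterate over that string and print it, hence the single characters in the output. Try something like: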

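import json
from collections import defaultdict
from time import time

d = defaultdict(list)
start = time()
with open(file, 'r') as f:  # `file` holds the path to the input file
    for line in f:
        NS, domain_list = line.strip().split('|')
        # JSON only accepts double-quoted strings, so swap the quotes first
        d[NS] = json.loads(domain_list.replace("'", '"'))

for NS, domains in d.items():
    for domain in domains:
        print(NS, domain)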
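
To make the explanation above concrete, here is a minimal sketch of the round trip, using one value from the question's sample data. The variable names are only illustrative.

```python
import json

# The text after '|' is a plain string, not a list (sample from the question).
domain_list = "['azeaonline.com']"

# dumps() wraps the string in a JSON string literal; loads() undoes exactly that.
round_tripped = json.loads(json.dumps(domain_list))
print(type(round_tripped).__name__)  # str
print(round_tripped == domain_list)  # True

# Iterating over a string yields one character at a time,
# which is where the single-character output comes from.
first_chars = list(round_tripped)[:3]
print(first_chars)
```

So the round trip never produces a list; the string has to be parsed, not re-encoded.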

Answer 1 (score: 2)

Here is another one (assuming names.txt contains your data):

with open('names.txt') as f:  # Open the file for reading
    for line in f:  # iterate over each line
        host, parts = line.strip().split('|')  # Split the parts on the |
        parts = parts.replace('[', '').replace(']', '')  # Remove the [] chars
        parts_a = map(str.strip, parts.split(','))  # Split on the comma, and remove any spaces
        for part in parts_a:  # for the split part, iterate through each one
            print('{0}|{1}'.format(host, part))  # print the host and part separated by a |

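Note: you could also replace lines 4 and 5 with parts_a = json.loads(parts), assuming the text after the | is JSON ... (the sample data uses single quotes, which JSON does not accept, so they would have to be replaced with double quotes first).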
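
As a rough sketch of that json.loads variant: the helper name parse_domains is hypothetical, and the fallback assumes that no domain name itself contains a quote character. json.loads raises a ValueError subclass on the single-quoted sample data, so the helper retries after swapping the quotes.

```python
import json

def parse_domains(parts):
    """Hypothetical helper: parse the text after '|' into a list of domains."""
    try:
        # Works only if the text is already valid JSON (double-quoted strings).
        return json.loads(parts)
    except ValueError:
        # Single-quoted input is not valid JSON; swap the quotes and retry.
        # Assumption: no domain contains a quote character.
        return json.loads(parts.replace("'", '"'))

print(parse_domains("['alchemistrywork.com', 'hidada.net']"))
```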

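Answer 2 (score: 2)

You don't need json in this case, since it doesn't solve your problem. You can use ast.literal_eval and itertools.repeat in a list comprehension to create the desired pairs (here s is assumed to hold the file contents as one string):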

>>> from itertools import repeat
>>> import ast
>>> sp_l=[(i.split('|')[0],ast.literal_eval(i.split('|')[1])) for i in s.split('\n')]
>>> for k in [zip(repeat(i,len(j)),j) for i,j in sp_l]:
...    for item in k:
...         print('|'.join(item))
... 
2.nseasy.com.|azeaonline.com
ns1.iwaay.net.|alchemistrywork.com
ns1.iwaay.net.|dha-evolution.biz
ns1.iwaay.net.|hidada.net
ns1.iwaay.net.|sonifer.biz
ns2.hd28.co.uk.|networksound.co.uk

Answer 3 (score: 2)

Try:

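import ast

with open(file, "r") as f:  # `file` holds the path to the input file
    # Map each name server to the list parsed from the text after the |
    d = {k: ast.literal_eval(v) for k, v in map(lambda s: s.split("|"), f)}

for NS, domains in d.items():
    for domain in domains:
        print("%s|'%s'" % (NS, domain))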

Or even just:

import ast

with open('file.xyz') as f:
    for thing in f:
        q, r = thing.split('|')
        r = ast.literal_eval(r)  # turn the text "['a', 'b']" into a real list
        for other in r:
            print('{}|{}'.format(q, other))
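
If the quoted format from the question's expected result is wanted, the same ast.literal_eval idea can be wrapped in a small generator. This is only a sketch: the name expand and the in-memory sample list are assumptions made for illustration, standing in for the real file.

```python
import ast

def expand(lines):
    """Yield one "NS|'domain'" string per domain (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        ns, raw = line.split('|', 1)
        # literal_eval safely parses the Python-style list literal
        for domain in ast.literal_eval(raw):
            yield "{}|'{}'".format(ns, domain)

# Two lines from the question's sample data, standing in for the file
sample = [
    "2.nseasy.com.|['azeaonline.com']\n",
    "ns2.hd28.co.uk.|['networksound.co.uk']\n",
]
for row in expand(sample):
    print(row)
```

Passing an open file object instead of sample should work the same way, since both are iterated line by line.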

Answer 4 (score: 1)

Here is a regex solution:

import re

data = '''2.nseasy.com.|['azeaonline.com']
ns1.iwaay.net.|['alchemistrywork.com', 'dha-evolution.biz', 'hidada.net', 'sonifer.biz']
ns2.hd28.co.uk.|['networksound.co.uk']'''

for line in data.split('\n'):
    left, right = line.split('|')
    # Capture each single-quoted name; digits are allowed as well as letters
    domains = re.findall(r"'([a-z0-9.-]+?)'", right)

    for domain in domains:
        print('{0}|{1}'.format(left, domain))

Output:

2.nseasy.com.|azeaonline.com
ns1.iwaay.net.|alchemistrywork.com
ns1.iwaay.net.|dha-evolution.biz
ns1.iwaay.net.|hidada.net
ns1.iwaay.net.|sonifer.biz
ns2.hd28.co.uk.|networksound.co.uk