Python: using dictionaries and files

Time: 2014-01-14 13:31:22

Tags: python algorithm dictionary

I have this code:

def merge_new_old_urls(urls_list, urls_file_path):
    url_dict = {}
    try:
        with open(urls_file_path, "r") as f:
            data = f.readlines()
        for line in data:
            #read what is already in file
            url_dict = { line.split()[0]: int(line.split()[1])}
        for new in urls_list:
            for key in url_dict.keys():
                if new == key:
                    print 'found'
                    url_dict[key] += 1
                else:
                    url_dict[new] = 1

    except IOError:
        logging.critical('no files to read from %s' % urls_file_path)
        raise IOError('no files to read from %s' % urls_file_path)
    return url_dict

This is supposed to read data from a file, merge it with a new list of URLs, and count how many times each URL occurs. The file with the old URLs looks like this:

http://aaa.com 1
http://bbb.com 2
http://ccc.com 1

If the new URL list contains http://aaa.com and http://bbb.com, the dict should be:

'http://aaa.com':2
'http://bbb.com':3
'http://ccc.com':1
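
In other words, a call like this (with the file above saved as, say, urls.txt; the path here is just for illustration) should produce:

# hypothetical path; the file contains the three lines shown above
merged = merge_new_old_urls(['http://aaa.com', 'http://bbb.com'], 'urls.txt')
# expected: {'http://aaa.com': 2, 'http://bbb.com': 3, 'http://ccc.com': 1}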

But my code doesn't work correctly. Can anyone help fix it?

1 Answer:

Answer 0 (score: 2)

You rebind url_dict on every iteration of the loop, discarding all previous entries:

url_dict = {line.split()[0]: int(line.split()[1])}

Add entries to the dictionary instead:

for line in data:
    key, val = line.split()
    if key in url_dict:
        url_dict[key] += int(val)  # counts in the file are strings, so convert
    else:
        url_dict[key] = int(val)
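
As a side note (this is an equivalent alternative, not something the original code requires), the same merge can be written more compactly with dict.get, which returns a default value when the key is missing:

for line in data:
    key, val = line.split()
    # get(key, 0) falls back to 0 for URLs not seen yet
    url_dict[key] = url_dict.get(key, 0) + int(val)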

There is no need at all to search through the dictionary by looping over its keys; you can use the same pattern as above:

for key in urls_list:
    if key in url_dict:
        url_dict[key] += 1  # each occurrence in the new list adds 1
    else:
        url_dict[key] = 1
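
If you are on Python 2.7 or newer, collections.Counter can replace the hand-rolled dict entirely; a minimal sketch of that alternative:

from collections import Counter

url_dict = Counter()
for line in data:
    key, val = line.split()
    url_dict[key] += int(val)   # missing keys default to 0
url_dict.update(urls_list)      # each URL in the list adds 1 to its count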

Finally, you shouldn't wrap so much code inside the try block:

try:
    with open(urls_file_path, "r") as f:
        data = f.readlines()
except IOError:
    logging.critical('no files to read from %s' % urls_file_path)
    raise IOError('no files to read from %s' % urls_file_path)
else:
    # rest of your code
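
Putting the pieces together, a minimal sketch of the whole corrected function might look like this (behaviour matching the expected output in the question; the logging import is assumed):

import logging

def merge_new_old_urls(urls_list, urls_file_path):
    url_dict = {}
    try:
        with open(urls_file_path, "r") as f:
            data = f.readlines()
    except IOError:
        logging.critical('no files to read from %s' % urls_file_path)
        raise
    # seed counts from the file: each line is "<url> <count>"
    for line in data:
        key, val = line.split()
        url_dict[key] = url_dict.get(key, 0) + int(val)
    # each URL in the new list adds 1 to its count
    for key in urls_list:
        url_dict[key] = url_dict.get(key, 0) + 1
    return url_dict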