Create a dictionary from each line in a file

Date: 2018-05-22 10:43:08

Tags: python dictionary key-value

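I'm trying to create a dictionary from this file: the key is the first word on each line, and the value is all the words that follow it.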

andrew fred
fred
judy andrew fred
george judy andrew
john george

Here is my code:

follows_file = open("C:\\Users\\Desktop\\Python\\follows.txt")
followers = {}
for line in follows_file:   #==> [Judy Andrew Fred]
    users = line.split(' ')     #==> [Judy, andrew, Fred, ....]
    follower = users[0]     #==> [Judy]
    followed_by = users[1:] #==> [Andrew, Fred]

    for user in followed_by:
        # Add the 'follower to the list of followers user
        if user not in followers:
            followers[user] = []
        followers[user].append(follower)
print(followers.items())

When I print the follower and followed_by variables they are correct, but I can't add them to the dictionary correctly. This is the output:

dict_items([('fred\n', ['andrew', 'judy']), ('andrew', ['judy']), ('judy', ['george']), ('andrew\n', ['george']), ('george', ['john'])])

The output I want is:

(Andrew[Fred])(Fred[])(judy[Andrew Fred])(George[Judy Fred])(john[george])

Any help is greatly appreciated!

4 Answers:

Answer 0: (score: 3)

You can use collections.defaultdict() as the dictionary factory and simply collect the users listed after each person, for example:

import collections

followers = collections.defaultdict(list)  # use a dict factory to save some time on checks
with open("path/to/your_file", "r") as f:  # open the file for reading
    for line in f:  # read the file line by line
        users = line.split()  # split on any white space
        followers[users[0]] += users[1:]  # append the followers for the current user

For your data, this produces:

{'andrew': ['fred'],
 'fred': [],
 'judy': ['andrew', 'fred'],
 'george': ['judy', 'andrew'],
 'john': ['george']}

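This also lets you accumulate multiple lists for a user when the file contains repeated records. Otherwise, you can use a plain dict for followers and assign followers[users[0]] = users[1:] instead.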
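
To illustrate the point about repeated records, here is a minimal sketch comparing the two approaches. The sample lines, including the second judy record, are made up for illustration and are not taken from the question's file; the blank-line check is likewise an added assumption.

```python
import collections
from io import StringIO

# Hypothetical input: 'judy' appears on two separate lines.
sample = StringIO("judy andrew\nfred\njudy fred\n")

merged = collections.defaultdict(list)  # accumulates across repeated keys
overwritten = {}                        # plain dict: the last record wins

for line in sample:
    users = line.split()
    if not users:  # assumption: skip blank lines
        continue
    merged[users[0]] += users[1:]      # extends the list already stored
    overwritten[users[0]] = users[1:]  # replaces any earlier list

print(dict(merged))  # {'judy': ['andrew', 'fred'], 'fred': []}
print(overwritten)   # {'judy': ['fred'], 'fred': []}
```

Which behaviour is preferable depends on whether a repeated key in the file should add to the earlier record or replace it.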

The data structure you show as your desired output is not valid Python. Do you really want it presented that way? If you insist, you can do it like this:

print("".join("({}[{}])".format(k, " ".join(v)) for k, v in followers.items()))
# (andrew[fred])(fred[])(judy[andrew fred])(george[judy andrew])(john[george])
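
As a side note on why the question's output contains keys such as 'fred\n': line.split(' ') splits only on single space characters and leaves the trailing newline attached to the last word, whereas line.split() with no argument splits on any run of whitespace and drops it. A small sketch using a line from the question's file:

```python
line = "judy andrew fred\n"

print(line.split(' '))  # ['judy', 'andrew', 'fred\n']
print(line.split())     # ['judy', 'andrew', 'fred']

# A line that holds only a key behaves the same way.
print("fred\n".split(' '))  # ['fred\n']
print("fred\n".split())     # ['fred']
```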

Answer 1: (score: 1)

Here is one solution that uses str.split and a try / except clause to catch the cases where only the key exists.

Note that io.StringIO lets us read from a string as if it were a file.

from io import StringIO
import csv

mystr = StringIO("""andrew fred
fred
judy andrew fred
george judy andrew
john george""")

# replace mystr with open("C:\\Users\\zacan\\Desktop\\Python\\follows.txt")
with mystr as follows_file:
    d = {}
    for users in csv.reader(follows_file):
        try:
            key, *value = users[0].split()
        except ValueError:
            key, value = users[0], []

        d[key] = value

print(d)

{'andrew': ['fred'],
 'fred': [],
 'george': ['judy', 'andrew'],
 'john': ['george'],
 'judy': ['andrew', 'fred']}
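
One observation, offered as a sketch rather than a correction: starred unpacking already yields an empty list when a line holds only a key, so the same dictionary can be built without csv or try / except. The blank-line check is an added assumption about how empty lines should be treated.

```python
from io import StringIO

follows_file = StringIO("""andrew fred
fred
judy andrew fred
george judy andrew
john george""")

d = {}
for line in follows_file:
    if not line.strip():  # assumption: ignore blank lines
        continue
    key, *value = line.split()  # value is [] when only the key is present
    d[key] = value

print(d)
```

The keys appear in file order here, since a dict preserves insertion order in Python 3.7 and later.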

Answer 2: (score: 0)

Edited answer, thanks to the comments from @PM2Ring and @IljaEverilä.

Here is my original solution, using a dictionary comprehension:

followers = {line.split()[0]: line.split()[1:] for line in follows_file}

A more efficient alternative, suggested by @IljaEverilä, avoids calling split twice:

followers = {follower: followees for follower, *followees in map(str.split, follows_file)}

Result:

{'andrew': ['fred'],
 'fred': [],
 'george': ['judy', 'andrew'],
 'john': ['george'],
 'judy': ['andrew', 'fred']}

Note that both solutions above assume your file contains no duplicate keys.
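
To make that assumption concrete, here is a minimal sketch of what the comprehension does when a key repeats. The two input lines are hypothetical and stand in for an open file.

```python
# Hypothetical input with a repeated key, standing in for an open file.
lines = ["judy andrew\n", "judy fred\n"]

followers = {follower: followees for follower, *followees in map(str.split, lines)}
print(followers)  # {'judy': ['fred']}
```

The later line silently replaces the earlier one, so the first record for judy is lost.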

Don't forget to close the file afterwards:

follows_file.close()

Or, better yet, just use a context manager, which handles closing the file for you:

with open('C:\\Users\\zacan\\Desktop\\Python\\follows.txt', 'r') as follows_file:
    followers = {follower: followees for follower, *followees in map(str.split, follows_file)}

Answer 3: (score: 0)

followers = dict()
with open('C:\\Users\\zacan\\Desktop\\Python\\follows.txt', 'r') as f:
    for line in f:
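        users = line.split()  # split on any whitespace, which also drops the trailing newline
        followers[users[0]] = users[1:]  # first word is the key, the remaining words are the value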

This should work, but I haven't tested it.