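I'm currently learning data mining. An example in the e-book I'm reading uses a dictionary to store each user and their song ratings. This is the dictionary initialization it gives: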
users = {"Angelica": {"Blues Traveler": 3.5, "Broken Bells": 2.0,
                      "Norah Jones": 4.5, "Phoenix": 5.0,
                      "Slightly Stoopid": 1.5,
                      "The Strokes": 2.5, "Vampire Weekend": 2.0},
         "Bill": {"Blues Traveler": 2.0, "Broken Bells": 3.5,
                  "Deadmau5": 4.0, "Phoenix": 2.0,
                  "Slightly Stoopid": 3.5, "Vampire Weekend": 3.0},
         "Chan": {"Blues Traveler": 5.0, "Broken Bells": 1.0,
                  "Deadmau5": 1.0, "Norah Jones": 3.0,
                  "Phoenix": 5, "Slightly Stoopid": 1.0}}
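I'm stuck on how to build the same dictionary when the values are in a text file where each line holds one user's information. This would be an example of the first line of the text file: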
Angelica, "Blues Traveler": 3.5, "Broken Bells": 2.0, "Norah Jones": 4.5, "Phoenix": 5.0, "Slightly Stoopid": 1.5, "The Strokes": 2.5, "Vampire Weekend": 2.0
What I have so far:
with open(text_file) as f:
    for line in f:
        songs = line.split(',')
        for current_song in songs:
            ratings = current_song.split(':')
I'm not sure how to build the dictionary from here. Nested dictionaries have had me confused for hours.
Answer 0: (score: 4)
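users = {}
with open(text_file) as f:
    for line in f:
        # Each line looks like: Name, "Band": score, "Band": score, ...
        parts = line.rstrip().split(', ')
        name = parts[0]
        users[name] = {}
        for rating in parts[1:]:
            song, score = rating.split(': ')
            song = song[1:-1]  # strip the surrounding double quotes
            users[name][song] = float(score)  # store a number, not a string
print(users)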
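As a rough sketch, the split-based parsing can be checked on a single line without a file on disk. The helper name `parse_line` is hypothetical, not part of the answer. It assumes no name contains `, ` or `: `. Since `str.split` returns strings, the score is passed through `float()` so the result matches the numeric values in the original dictionary.

```python
def parse_line(line):
    """Turn one 'Name, "Band": score, ...' line into (name, ratings)."""
    parts = line.rstrip().split(', ')
    name = parts[0]
    ratings = {}
    for rating in parts[1:]:
        band, score = rating.split(': ')
        # band[1:-1] drops the surrounding double quotes;
        # float() converts the text "3.5" into the number 3.5.
        ratings[band[1:-1]] = float(score)
    return name, ratings


sample = 'Angelica, "Blues Traveler": 3.5, "Broken Bells": 2.0\n'
name, ratings = parse_line(sample)
print(name, ratings)
```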
Answer 1: (score: 0)
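This can be done more concisely with the json library. The steps are as follows.
First, split the line to separate the user's name from the ratings data. The string Angelica, "Blues Traveler": 3.5, "Broken Bells": 2.0, "Norah Jones": 4.5,...
is split into two strings: Angelica
and a second string "Blues Traveler": 3.5, "Broken Bells": 2.0, "Norah Jones": 4.5,...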
username, songs = line.split(',', 1)
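The second argument to `split` is what makes this work: it limits the split to the first comma, so the commas between ratings are left alone. A small illustration on a shortened sample line (the sample is an assumption based on the file format in the question):

```python
line = 'Angelica, "Blues Traveler": 3.5, "Broken Bells": 2.0'

# maxsplit=1: split only at the first comma, giving exactly two pieces.
username, songs = line.split(',', 1)
print(repr(username))
print(repr(songs))

# Without maxsplit, every comma splits, so unpacking into two names would fail.
all_pieces = line.split(',')
print(len(all_pieces))
```

Note that `songs` keeps the space that followed the first comma, which is harmless for the JSON parsing step.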
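If you look closely, the second string could easily be converted to a dictionary by passing it to json.loads,
except that it lacks the opening {
and the closing }
needed to make it valid JSON. So we add them manually and then pass the result to json.loads.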
songs = "{%s}" % songs
json.loads(songs)
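To see why the braces matter, here is a minimal check on a shortened sample string (an assumption based on the question's format). Without braces the text is not a valid JSON document and `json.loads` raises an error; wrapped in braces it parses into a dictionary with numeric values.

```python
import json

songs = ' "Blues Traveler": 3.5, "Broken Bells": 2.0\n'

# The bare string is not valid JSON, so json.loads raises
# json.JSONDecodeError, which is a subclass of ValueError.
try:
    json.loads(songs)
    parsed_without_braces = True
except ValueError:
    parsed_without_braces = False

# Wrapped in braces it is a JSON object; surrounding whitespace
# and the trailing newline are ignored by the parser.
ratings = json.loads("{%s}" % songs)
print(parsed_without_braces, ratings)
```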
So the complete code is:
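import json

user = {}
with open('my.txt') as f:
    for line in f:
        # Split off the user name at the first comma only.
        username, songs = line.split(',', 1)
        # Wrap the rest in braces so it becomes a valid JSON object.
        songs = "{%s}" % songs
        user[username] = json.loads(songs)
print(user)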
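A slightly more defensive variant, offered only as a sketch: it skips blank lines (which would otherwise break the two-way unpacking) and strips whitespace from the user name. The function name `load_users` and the in-memory `io.StringIO` stand-in for the file are assumptions for illustration.

```python
import io
import json


def load_users(lines):
    """Build {username: {band: rating}} from an iterable of text lines."""
    users = {}
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        username, songs = line.split(',', 1)
        users[username.strip()] = json.loads("{%s}" % songs)
    return users


# io.StringIO behaves like an open text file, so no file on disk is needed.
fake_file = io.StringIO(
    'Angelica, "Blues Traveler": 3.5, "Phoenix": 5.0\n'
    '\n'
    'Bill, "Deadmau5": 4.0, "Phoenix": 2.0\n'
)
users = load_users(fake_file)
print(users)
```

With a real file, the same function can be called as `load_users(f)` inside a `with open(...) as f:` block.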