I have a data structure of this form:
Song = namedtuple('Song', ['fullpath', 'tags']) # tags is a dictionary
Album = namedtuple('Album', ['album_key', 'songs'])
The data_structure is a list of Albums
There are thousands of albums, and each album has 10-20 songs.
I am looking for matches:
for new_album in new_albums:
    for old_album in old_albums:
        if new_album.album_key == old_album.album_key:
            for new_song in new_album.songs:
                for old_song in old_album.songs:
                    if new_song.fullpath == old_song.fullpath:
                        # do something
                        break
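This is inefficient, mainly because it restarts the loop over old_albums for every new_album. One solution is to use a dictionary, but I need ordering, and OrderedDict only orders by key insertion. Another is to convert the list to a dictionary, process it, and then convert it back to a list, but that doesn't seem ideal.
Is there a better way?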
Answer 0 (score: 0)
You don't have to convert your data to a new format, but you can still use a dict to find matches:
paths = {}  # maps fullpath -> (album index, song index) of its first occurrence
for a_id, album in enumerate(albums):
    for s_id, song in enumerate(album.songs):
        if song.fullpath not in paths:
            paths[song.fullpath] = (a_id, s_id)
        else:
            # do something
            break
When you reach # do something, paths[song.fullpath] gives you the index of the matching album ([0]) and of the matching song ([1]). Like this:
matched_album, matched_song = paths[song.fullpath]
# Formatting through a one-element tuple prints the same text on Python 2 and 3
print("%s matches!" % (albums[matched_album].songs[matched_song],))
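For the two-list case in the question, the same idea can be applied by indexing old_albums once and leaving both lists in their original order. The following is only a sketch: the function name find_matches is made up for illustration, and it assumes that album_key is unique within old_albums and that fullpath is unique within an album (when there are duplicates, the first occurrence is kept).

```python
from collections import namedtuple

Song = namedtuple('Song', ['fullpath', 'tags'])  # tags is a dictionary
Album = namedtuple('Album', ['album_key', 'songs'])


def find_matches(new_albums, old_albums):
    """Return (new_song, old_song) pairs sharing album_key and fullpath.

    Hypothetical helper: assumes album keys are unique in old_albums.
    """
    # Build the lookup once. The lists themselves are not modified,
    # so their ordering is preserved.
    old_by_key = {}
    for old_album in old_albums:
        # setdefault keeps the first album seen for a given key
        old_by_key.setdefault(old_album.album_key, old_album)

    matches = []
    for new_album in new_albums:
        old_album = old_by_key.get(new_album.album_key)
        if old_album is None:
            continue  # no old album with this key
        # Per-album lookup of old songs by path (first occurrence wins)
        old_songs = {}
        for old_song in old_album.songs:
            old_songs.setdefault(old_song.fullpath, old_song)
        for new_song in new_album.songs:
            old_song = old_songs.get(new_song.fullpath)
            if old_song is not None:
                matches.append((new_song, old_song))  # "do something" goes here
    return matches
```

Each album and song is visited a constant number of times, so the work grows with the total number of songs rather than with the product of the two list sizes.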
Does this help?