Question

I have a data structure of this form:

from collections import namedtuple

Song = namedtuple('Song', ['fullpath', 'tags'])  # tags is a dictionary
Album = namedtuple('Album', ['album_key', 'songs'])

The data structure is a list of Albums.

There are thousands of albums, and each album has 10-20 songs.

I am looking for matches:

for new_album in new_albums:
    for old_album in old_albums:
        if new_album.album_key == old_album.album_key:
            for new_song in new_album.songs:
                for old_song in old_album.songs:
                    if new_song.fullpath == old_song.fullpath:
                        # do something
                        break

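This is inefficient, mainly because it restarts the old_album loop for every new_album. One solution is to use a dictionary, but I need the data to stay ordered, and an OrderedDict only orders keys by insertion. Another is to convert the lists to dictionaries, process them, and then convert back to lists, but that does not seem ideal.

Is there a better way?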

Answer 1

You don't have to convert your data to a new format, but you can still use a dict to look up the matches:

paths = {}

for a_id, album in enumerate(albums):
    for s_id, song in enumerate(album.songs):
        if song.fullpath not in paths:
            # first time this path is seen: remember where it lives
            paths[song.fullpath] = (a_id, s_id)
        else:
            # do something
            break

When you reach # do something, paths[song.fullpath] gives you the index of the matching album at [0] and the index of the matching song at [1]. Like this:

matched_album, matched_song = paths[song.fullpath]
print("{} matches!".format(albums[matched_album].songs[matched_song]))
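The snippet above walks a single list. For the two-list case in the question (new_albums against old_albums), the same dict idea could be sketched roughly as follows. This is only an illustration under some assumptions: the function name find_matches and the variable old_index are made up here, album_key is assumed to be unique within old_albums, and "do something" is stood in for by collecting the matching pairs. The lists themselves are never converted, so their order stays as it was.

```python
from collections import namedtuple

Song = namedtuple('Song', ['fullpath', 'tags'])  # tags is a dictionary
Album = namedtuple('Album', ['album_key', 'songs'])


def find_matches(new_albums, old_albums):
    """Return (new_song, old_song) pairs that share album_key and fullpath.

    Hypothetical helper; assumes album_key is unique within old_albums.
    """
    # Index the old songs once: album_key -> {fullpath -> song}.
    # Only the index is a dict; the album and song lists are left untouched.
    old_index = {}
    for old_album in old_albums:
        songs_by_path = old_index.setdefault(old_album.album_key, {})
        for old_song in old_album.songs:
            # Keep the first song seen for a path, like the break in the question.
            songs_by_path.setdefault(old_song.fullpath, old_song)

    matches = []
    for new_album in new_albums:
        # Albums with no counterpart in old_albums get an empty lookup table.
        songs_by_path = old_index.get(new_album.album_key, {})
        for new_song in new_album.songs:
            old_song = songs_by_path.get(new_song.fullpath)
            if old_song is not None:
                matches.append((new_song, old_song))  # "do something" goes here
    return matches
```

Each old song is visited once to build the index and each new song once to look it up, instead of rescanning old_albums for every new album.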

Does this help?
