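Is this really a valid algorithm?

Time: 2017-09-13 03:52:17

Tags: python

I have read it, but I don't understand it.

I would like to see code in something like Python or Java... but I can't find any... Could you give me some code?

1 answer:

Answer 0: (score: 0)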

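Spotify doesn't provide any code or a name for the algorithm they use, but I accepted your challenge and wrote some code that implements their algorithm.

For anyone reading this who can't be bothered to read the article: Spotify's shuffle feature used to play songs by the same artist 2-3 times in a row, and people complained that the algorithm was not random enough. Below is the algorithm outlined in their blog post, which is designed to keep two songs by the same artist from playing back to back. It also spreads each artist's songs out evenly across the lineup.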

import random

songs = {
    'Artist a': ['song a1', 'song a2'],
    'Artist b': ['song b1', 'song b2', 'song b3'],
    'Artist c': ['song c1', 'song c2'],
    'Artist d': ['song d1', 'song d2', 'song d3', 'song d4'],
    'Artist e': ['song e1', 'song e2', 'song e3']
}

def shuffle(artist_song_dict):
    """ Shuffles songs in a semi-random fashion while keeping songs by the same artist spread out, as described in
    https://labs.spotify.com/2014/02/28/how-to-shuffle-songs/
    artist_song_dict must be a dictionary where the key = the artist name and the value = a list of songs
    """
    lineup = {} #each song will be stored in this dictionary with a value between 0 and 1 representing the song's position in the lineup
    variation = .3 #a 30% variation will be added to all songs
    for artist in artist_song_dict:
        songs = list(artist_song_dict[artist]) #copy so the caller's list is not reordered in place
        random.shuffle(songs)

        #distance between songs in the lineup, if we were to space the songs out evenly
        spread = 1.0/len(songs) #1.0 keeps this a float division under Python 2 as well

        #this is referred to as the offset in the article, but I found it serves a different purpose than the article says
        #without it, the number of songs an artist has in the lineup affects the probability that their songs appear sooner/later in the lineup than other artists' songs
        #capped at spread*(1-variation) so that the offset plus the per-song variation never exceeds one spread
        artist_variation = random.uniform(0, spread*(1-variation))

        for i, song in enumerate(songs):
            #the random 30% variation
            song_variation = random.uniform(0, spread*variation)

            #assign this song the next evenly spaced spot in the lineup plus our variations
            lineup[song] = i*(spread) + artist_variation + song_variation

    return sorted(lineup, key=lineup.get)

print(shuffle(songs))
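
To see the complaint that motivated the blog post, it can help to measure how often a plain uniform shuffle puts two songs by the same artist next to each other. The following is a minimal sketch, not part of Spotify's algorithm; the helper names `count_adjacent_repeats` and `average_repeats_plain_shuffle` and the small `artist_of` library are made up for illustration.

```python
import random

def count_adjacent_repeats(order, artist_of):
    """Count positions where two consecutive songs share an artist."""
    return sum(
        1 for prev, cur in zip(order, order[1:])
        if artist_of[prev] == artist_of[cur]
    )

# Hypothetical library: song title -> artist name.
artist_of = {
    'song a1': 'Artist a', 'song a2': 'Artist a',
    'song b1': 'Artist b', 'song b2': 'Artist b', 'song b3': 'Artist b',
    'song c1': 'Artist c', 'song c2': 'Artist c',
}

def average_repeats_plain_shuffle(artist_of, trials=1000, seed=0):
    """Average number of back-to-back same-artist pairs under a uniform shuffle."""
    rng = random.Random(seed)
    titles = list(artist_of)
    total = 0
    for _ in range(trials):
        rng.shuffle(titles)
        total += count_adjacent_repeats(titles, artist_of)
    return float(total) / trials

print(average_repeats_plain_shuffle(artist_of))
```

The same `count_adjacent_repeats` helper can be applied to the list returned by `shuffle(songs)` above, given a matching title-to-artist mapping, to compare the two approaches.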


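As a generator

This was cool. I came up with a completely separate algorithm that implements the shuffle as a generator.

Advantages of this algorithm over the one above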

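  • You can dynamically add songs to or remove songs from the lineup. Added songs will be shuffled in appropriately.
  • Slightly faster
  • The algorithm yields songs one at a time, so if you find that you usually listen to only a few songs from your library, you get an extra performance boost because only the songs you actually listen to are shuffled.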
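
The one-at-a-time advantage is easiest to see with a toy generator. The sketch below is an assumption-laden stand-in: `lazy_random_order` is a hypothetical plain random picker that does not keep artists apart, and it only shows how a caller pays for the songs it actually requests.

```python
import random
from itertools import islice

def lazy_random_order(titles, seed=None):
    """Yield titles in random order, choosing each one only when it is requested.

    Hypothetical stand-in for a generator-based shuffle; it does NOT
    keep songs by the same artist apart.
    """
    rng = random.Random(seed)
    remaining = list(titles)
    while remaining:
        # Swap a random pick to the end so that pop() is O(1).
        idx = rng.randrange(len(remaining))
        remaining[idx], remaining[-1] = remaining[-1], remaining[idx]
        yield remaining.pop()

library = ['song %d' % i for i in range(1000)]
player = lazy_random_order(library, seed=42)

# Only three picks are made here; nothing is computed for the other 997 songs
# until the caller asks for more.
first_three = list(islice(player, 3))
print(first_three)
```

Calling `next(player)` later resumes where the generator left off, which is the behaviour the list above relies on.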

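Disadvantages

  • The algorithm can play songs by the same artist back to back once it gets down to the last few songs in the lineup
  • The algorithm is very prone to following the same sequence of artists. So a song by artist a is played, then artist b, then artist c, then artist a again, and so on. This is not always the case, but it usually is.
  • Rather than distributing songs by the same artist evenly, there is a chance that an artist gets skipped each time we cycle through all of the artists. The fewer songs an artist has left, the more likely they are to be skipped. There will still be adequate space between songs by the same artist (until we reach the end of the lineup, where the algorithm starts to break down).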

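OK, here's the algorithm. It mashes all of the songs into a single list, then keeps track of the largest number of songs any one artist still has in the lineup. We take that number and move forward through the lineup by that many positions plus a random variation.

This keeps the likelihood of playing a song by a particular artist proportional to the number of songs they have left in the lineup, and since we are stepping through a list that is organized by artist, some time passes before we can come back around to songs by the same artist. Because of these two properties, the lineup stays shuffled rather than purely random.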

from copy import deepcopy

def shuffle2(artist_song_dict):
    """ Returns a generator that shuffles the songs in a semi-random fashion while keeping some distance between songs by the same artist.
    artist_song_dict must be a dictionary where the key = the artist name and the value = a list of songs
    """

    # pairs is a list of all of the (artist, song) tuples
    pairs = []
    for artist, songs in artist_song_dict.items():
        random.shuffle(songs)
        pairs.extend([(artist, song) for song in songs])

    #cpy is a copy of the artist_song_dict
    #that we will use to keep track of how many songs each artist
    #has left that have not already been yielded
    cpy = deepcopy(artist_song_dict)
    cpy_vals = cpy.values()

    n = 0 #n is the index of which song from our pairs list gets played next
    while len(pairs):
        #move forwards by the most songs any artist has left to be shuffled
        #anything smaller than this could place two songs by the same artist back to back
        n += max([len(l) for l in cpy_vals])
        #scale by a random factor of up to 30% so that an artist gets skipped every once in a while
        n = int(n * random.uniform(1, 1.3))
        #wrap around to the start of the list if we exceed the length of our pairs list
        n = n % len(pairs)

        #now that we have our n number we can grab the artist and song
        artist = pairs[n][0]
        song = pairs[n][1]
        #we remove this info from the cpy dictionary
        cpy[artist].remove(song)
        #and we remove this info from the pairs list, and return the (artist, song) tuple.
        yield pairs.pop(n)

If you want a version that is more optimized for speed, and easier to follow, here you go.

def shuffle2_optimized(artist_song_dict):
    """ optimized version of the shuffle2 function """

    #flatten into one (artist, song) tuple per song, with each artist's songs shuffled
    pairs = [(k, s) for k, v in artist_song_dict.items() for s in random.sample(v, len(v))]
    lens = [len(l) for l in artist_song_dict.values()]
    artists = list(artist_song_dict.keys())

    n = 0
    numb_pairs = len(pairs)
    max_lens = max(lens) #most songs any artist has left, the same step size shuffle2 uses
    while numb_pairs:
        n = int((n+max_lens) * (1 + .3*random.random())) % numb_pairs
        ret = pairs.pop(n)

        art_idx = artists.index(ret[0])
        lens[art_idx] -= 1
        if lens[art_idx]+1 == max_lens:
            max_lens = max(lens)

        numb_pairs = len(pairs)
        yield ret
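
To check the claims above about spacing, the output of either generator can be inspected with a couple of small helpers. This is only a sketch: `check_shuffle` and `min_gap_by_artist` are hypothetical names, and the `sample` sequence is written by hand for illustration rather than produced by the code in this answer.

```python
from collections import Counter

def check_shuffle(sequence, artist_song_dict):
    """Return True if sequence contains every (artist, song) pair exactly once."""
    expected = Counter(
        (artist, song)
        for artist, songs in artist_song_dict.items()
        for song in songs
    )
    return Counter(sequence) == expected

def min_gap_by_artist(sequence):
    """Map each artist to the smallest distance between two of their songs.

    A gap of 1 means two songs by that artist were played back to back.
    Artists that appear only once are omitted.
    """
    last_seen = {}
    gaps = {}
    for position, (artist, _song) in enumerate(sequence):
        if artist in last_seen:
            gap = position - last_seen[artist]
            gaps[artist] = min(gap, gaps.get(artist, gap))
        last_seen[artist] = position
    return gaps

# Hand-written sample, not the output of either generator above.
library = {'Artist a': ['a1', 'a2'], 'Artist b': ['b1', 'b2']}
sample = [('Artist a', 'a1'), ('Artist b', 'b1'),
          ('Artist b', 'b2'), ('Artist a', 'a2')]

print(check_shuffle(sample, library))   # True
print(min_gap_by_artist(sample))        # {'Artist b': 1, 'Artist a': 3}
```

Passing something like `list(shuffle2(songs))` as `sequence` gives a quick way to see how often the back-to-back case near the end of the lineup actually occurs.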