Question

I am currently using music21 and midi2audio to generate .wav files for machine learning, and I have observed something very strange.

import subprocess

import music21
import numpy as np

path = '/Users/CatLover/Documents/DataScience/Insight/MusicDetector/music/'
npath = '/Users/CatLover/Documents/DataScience/Insight/MusicDetector/noise/'
nmpath = '/Users/CatLover/Documents/DataScience/Insight/MusicDetector/noisy_music/'
tpath = '/Users/CatLover/Documents/DataScience/Insight/MusicDetector/test_music/'
tnmpath = '/Users/CatLover/Documents/DataScience/Insight/MusicDetector/test_noisy_music/'

def build_dataset(char_list):
    dictionary = dict()
    for char in char_list:
        dictionary[char] = len(dictionary)
    reverse_dictionary = dict(zip(dictionary.values(), dictionary.keys()))
    return dictionary, reverse_dictionary

def repeating_music(char, length):
    # Build a tinyNotation string of `length` quarter notes of the same pitch.
    preamble = 'tinyNotation: 4/4'
    for i in range(length):
        preamble = preamble + ' ' + char + '4'
    music = music21.converter.parse(preamble)
    # Write the MIDI file, then render it to .wav with the midi2audio CLI.
    music.write('midi', fp=tpath + 'test1' + '.midi')
    subprocess.call(["midi2audio", tpath + 'test1' + '.midi', tpath + 'test1' + '.wav'])
    # segmenting() is not shown in the question.
    test_X = segmenting(tpath + 'test1' + '.wav')
    test_y = np.full(length, dic[char])
    return test_X, test_y

char_list = ["r","CCC","CCC#","DDD","DDD#","EEE","FFF","FFF#","GGG","GGG#","AAA","AAA#","BBB","CC","CC#","DD","DD#","EE","FF","FF#","GG","GG#","AA","AA#","BB","C","C#","D","D#","E","F","F#","G","G#","A","A#","B","c","c#","d","d#","e","f","f#","g","g#","a","a#","b","c'","c'#","d'","d'#","e'","f'","f'#","g'","g'#","a'","a'#","b'","c''","c''#","d''","d''#","e''","f''","f''#","g''","g''#","a''","a''#","b''"]
dic, rdic = build_dataset(char_list)

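Using librosa.get_duration(), I can see that the .wav file containing only 2 F notes is about 50% longer than the .wav file containing only 1 F note. Moreover, these ratios do not depend on which note is used, unless F is used. Why is this the case?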
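
To make the arithmetic behind the observation concrete, here is a minimal sketch of one possible model: each file's length is a fixed overhead plus a per-note duration. That the rendered files contain any such overhead is an assumption, not something the question establishes, and the function name and numeric values are hypothetical.

```python
def duration_ratio(per_note, overhead):
    """Length of a 2-note file divided by the length of a 1-note file,
    under the assumed model: length = overhead + n_notes * per_note."""
    one_note = overhead + 1 * per_note
    two_notes = overhead + 2 * per_note
    return two_notes / one_note


# With no overhead, doubling the notes doubles the length.
print(duration_ratio(per_note=0.5, overhead=0.0))  # 2.0

# With an overhead equal to one note, the 2-note file is 50% longer.
print(duration_ratio(per_note=0.5, overhead=0.5))  # 1.5
```

Under this model, a ratio of about 1.5 would correspond to an overhead roughly as long as one note. Measuring files with 3 or 4 notes would show whether the lengths actually grow linearly.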

What determines the length of the .wav file?
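
For an uncompressed PCM .wav file, the duration is the number of audio frames divided by the sample rate, so the length depends on how many frames the renderer wrote. The sketch below checks this with the standard library's `wave` module rather than librosa; `wav_duration` and `write_silence` are hypothetical helpers, and the silent file only stands in for a rendered one.

```python
import os
import tempfile
import wave


def wav_duration(path):
    """Return the duration in seconds: frame count / sample rate."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def write_silence(path, seconds, sample_rate=44100):
    """Write a mono 16-bit PCM file of silence (stand-in for rendered audio)."""
    n_frames = int(seconds * sample_rate)
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wf.setframerate(sample_rate)
        wf.writeframes(b"\x00\x00" * n_frames)


with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.wav")
    write_silence(demo, seconds=1.5)
    print(wav_duration(demo))  # 1.5
```

Running `wav_duration` on the 1-note and 2-note files would give the same kind of comparison as librosa.get_duration(), read directly from the file header.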

0 answers: