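I am trying to read MIDI files and append their notes to a list using multiprocessing pool.map(), because there are thousands of MIDI files. I am developing the project with a Python class, as shown below.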
import logging

# music21 is assumed here, based on converter.parse and instrument.partitionByInstrument
from music21 import converter, instrument
from tqdm import tqdm


class NoteProcessor:
    def __init__(self, df):
        # df is passed in by the main script and used as self.df below
        self.df = df
        self.note_sequences = list()
        self.notes = set()
        .....

    def process_midi(self, df):
        PATH_ROOT = 'clean_midi/'
        for row in tqdm(df.itertuples(), total=df.shape[0]):
            notes = set()
            path = PATH_ROOT + row.path + '/' + row.song + '.mid'
            try:
                midi = converter.parse(path)
                elements = instrument.partitionByInstrument(midi)
                encoded_notes = self.arrange_notes(elements)
                if len(encoded_notes) != 0:
                    # self.note_sequences should be a shared array
                    self.note_sequences.append(encoded_notes)
                else:
                    # row[0] is the index label of the current row
                    self.df = self.df.drop(row[0])
            except Exception as e:
                self.df = self.df.drop(row[0])
                logging.error("MIDI Error " + str(e))
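Main script:

from multiprocessing import Pool, cpu_count

if __name__ == '__main__':
    df = read_csv()
    note_p = NoteProcessor(df)
    num_processes = cpu_count()
    # at least one row per chunk, so the range() step is never zero
    chunk_size = max(1, df.shape[0] // num_processes)
    # df.ix was removed from recent pandas versions; df.loc selects the same labels
    chunks = [df.loc[df.index[i:i + chunk_size]] for i in range(0, df.shape[0], chunk_size)]
    pool = Pool()
    pool.map(note_p.process_midi, chunks)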
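Since self.note_sequences = list() is not a shared array, this has no effect: the notes appended in the worker processes never show up in the main process. I looked into Python multiprocessing arrays and found some answers (answer1, answer2). But I am confused about how to use a shared array as a class attribute, and when I replace note_sequences with a multiprocessing array I get the following error:

RuntimeError: SynchronizedArray objects should only be shared between processes through inheritance

How can I solve this?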
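The two symptoms described above can probably be reproduced without any MIDI files or worker processes. The following is only a minimal sketch, assuming that pool.map() pickles whatever it sends to a worker (including the instance behind a bound method). The SimpleNamespace object is a hypothetical stand-in for a NoteProcessor instance, and the note string is made-up sample data.

```python
import pickle
from multiprocessing import Array
from types import SimpleNamespace

# Hypothetical stand-in for a NoteProcessor instance holding a plain list.
processor = SimpleNamespace(note_sequences=[])

# Emulate the pickle round trip an object goes through on its way to a worker.
worker_copy = pickle.loads(pickle.dumps(processor))
worker_copy.note_sequences.append("C4 E4 G4")  # made-up sample data

print(processor.note_sequences)    # the list held by the "parent" object
print(worker_copy.note_sequences)  # the list held by the "worker" copy

# A synchronized multiprocessing Array refuses to be pickled unless a child
# process is being created at that moment, which is not the case here.
try:
    pickle.dumps(Array('i', 3))
    array_error = None
except RuntimeError as exc:
    array_error = str(exc)
print(array_error)
```

If this assumption holds, it would explain both observations: each worker appends to its own copy of the list, and a SynchronizedArray stored as an attribute fails as soon as the instance has to be pickled for pool.map().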