So I was cleaning up code in my codebase, and I found a notable bottleneck.

    /**
     * Gets an InputStream to MP3 data for the returned information from a request.
     * @param synthText List of Strings you want to be synthesized into MP3 data
     * @return Returns an input stream of all the MP3 data that is returned from Google
     * @throws IOException Throws exception if it cannot complete the request
     */
    public InputStream getMP3Data(List<String> synthText) throws IOException {
        // Uses an executor service pool for concurrency. Limit to 1000 threads max.
        ExecutorService pool = Executors.newFixedThreadPool(1000);
        // Stores the Futures (data that will be returned in the future) in submission order.
        Set<Future<InputStream>> set = new LinkedHashSet<Future<InputStream>>(synthText.size());
        for (String part : synthText) { // Iterates through the list
            Callable<InputStream> callable = new MP3DataFetcher(part); // Creates Callable
            Future<InputStream> future = pool.submit(callable); // Begins to run Callable
            set.add(future); // Adds the response that will be returned to a set.
        }
        // No more tasks are coming; already submitted tasks keep running, and the
        // pool's threads are released once they finish.
        pool.shutdown();
        List<InputStream> inputStreams = new ArrayList<InputStream>(set.size());
        for (Future<InputStream> future : set) {
            try {
                inputStreams.add(future.get()); // Blocks until this part's data is available.
            } catch (ExecutionException e) { // Thrown if the MP3DataFetcher encountered an error.
                Throwable ex = e.getCause();
                if (ex instanceof IOException) {
                    throw (IOException) ex; // Downcasts and rethrows it.
                }
                throw new IOException(ex); // Do not silently drop other failures.
            } catch (InterruptedException e) { // Will probably never be called, but just in case...
                Thread.currentThread().interrupt(); // Restores the interrupt flag for the caller.
            }
        }
        return new SequenceInputStream(Collections.enumeration(inputStreams)); // Sequences the streams.
    }

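The method is simple. It fetches a bunch of input streams from the internet, each corresponding to a different MP3, and sequences them together. Unfortunately, it has a long latency, around 250 ms, which becomes a problem when a large number of MP3s are being sequenced. The blocking call here is of course get(), which requires each thread to connect to the server and then start reading data onto the local machine. That is fine in itself, but it immediately creates a huge bandwidth bottleneck: a large amount of data is downloaded up front just so that SequenceInputStream can put it in order. Is there a way to make a class like SequenceInputStream evaluate lazily? Is there a library that does this automatically? Any help would be greatly appreciated.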
Also, in case you haven't realized, these audio files are generated dynamically, hence the latency. They are text-to-speech audio files.
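
To make the "lazy" idea concrete, here is a minimal sketch of what I have in mind, not a tested solution. `LazyFutureStreams` is a hypothetical name of my own, not a library class. It is an `Enumeration` that only calls `get()` on a `Future` when `SequenceInputStream` asks for the next stream, instead of resolving every future before the sequence is built.

```java
import java.io.InputStream;
import java.util.Enumeration;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

/** Hypothetical helper: hands out each stream only when the consumer asks for it. */
final class LazyFutureStreams implements Enumeration<InputStream> {
    private final Iterator<Future<InputStream>> futures;

    LazyFutureStreams(List<Future<InputStream>> futures) {
        this.futures = futures.iterator();
    }

    @Override
    public boolean hasMoreElements() {
        return futures.hasNext();
    }

    @Override
    public InputStream nextElement() {
        Future<InputStream> next = futures.next();
        try {
            // Blocks only for this one part, and only when it is actually needed.
            return next.get();
        } catch (ExecutionException e) {
            // Enumeration cannot throw checked exceptions, so the failure is
            // wrapped in an unchecked one (a choice made for this sketch).
            throw new IllegalStateException("Fetching a part failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // Restore the interrupt flag.
            throw new IllegalStateException("Interrupted while waiting for a part", e);
        }
    }
}
```

With this, the second loop in getMP3Data could in principle be replaced by `return new SequenceInputStream(new LazyFutureStreams(new ArrayList<Future<InputStream>>(set)));`. Two caveats: SequenceInputStream pulls the first element in its constructor, so the first get() still blocks up front, and errors now surface as unchecked exceptions at read time rather than as an IOException from getMP3Data. This also only defers the waiting. The pool still starts every download immediately, so the bandwidth spike would presumably remain unless the pool size is reduced as well.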