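I am running 2 separate instances of the Java equivalent of the following pseudo-program on a dual-core machine, with no other activity on the machine. Both instances log to the same file using log4j.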
for each url
step 1. download data from url to U //takes 3-5 seconds
step 2. perform computation on data f(U) = A. //0.1 sec
step 3. Store A to disk
step 4. perform computation on data g(U) = B. //0.5 sec
step 5. Store B to disk
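For concreteness, the loop above could look roughly like the following Java sketch. It is only an illustration under stated assumptions, not the actual program: the class name UrlPipeline, the short sleep standing in for the download, the string-based f and g, and the in-memory map standing in for the disk are all hypothetical, and log lines are collected in a list rather than written through log4j.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class UrlPipeline {
    // Hypothetical stand-in for the disk: key -> stored result.
    final Map<String, String> disk = new HashMap<>();
    // Log lines in the order the steps ran (the real program uses log4j).
    final List<String> log = new ArrayList<>();
    private final String name;

    UrlPipeline(String name) {
        this.name = name;
    }

    // Step 1 stand-in: sleeps briefly instead of a 3-5 second download.
    private String download(String url) {
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "data(" + url + ")";
    }

    // Steps 2 and 4 stand-ins: trivial string transformations.
    private String f(String u) { return "A[" + u + "]"; }
    private String g(String u) { return "B[" + u + "]"; }

    // Runs the five steps for each URL, strictly in order.
    void run(List<String> urls) {
        for (String url : urls) {
            log.add("process " + name + ": downloading data from " + url);
            String u = download(url);

            log.add("process " + name + ": performing computation f on " + url);
            String a = f(u);
            log.add("process " + name + ": storing to disk A for " + url);
            disk.put(url + "/A", a);

            log.add("process " + name + ": performing computation g on " + url);
            String b = g(u);
            log.add("process " + name + ": storing to disk B for " + url);
            disk.put(url + "/B", b);
        }
    }

    public static void main(String[] args) {
        UrlPipeline m = new UrlPipeline("M");
        m.run(Arrays.asList("url1", "url2"));
        m.log.forEach(System.out::println);
    }
}
```

Within one instance the five steps are strictly sequential; any interleaving in the shared log can only come from the second instance running at the same time.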
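Something strange is happening in the log. I expected the 2 instances to run completely independently: since step 1 is by far the longest operation, both should spend most of their time waiting for a URL response, and the log should look like this: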
process M: downloading data from url1
process N: downloading data from url2
//below in random order
process M: performing computation on data f(U1) = A1
process N: performing computation on data f(U2) = A2
process M: storing to disk A1
process M: performing computation on data g(U1) = B1
process N: storing to disk A2
//etc...
But most of the time the log appears strictly sequential, with no parallelism visible:
process M: downloading data from url1
process M: performing computation on data f(U1) = A1
process M: storing to disk A1
process M: performing computation on data g(U1) = B1
process M: storing to disk B1
process N: downloading data from url2
process N: performing computation on data f(U2) = A2
process N: storing to disk A2
process N: performing computation on data g(U2) = B2
process N: storing to disk B2
process M: downloading data from url3
//etc...
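What could be causing this? Some possibilities:
The two cores are not being used properly?
The two instances cannot access the internet at the same time?
Logging is somehow locking the processes?
Finally, how can I get them to run in parallel?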
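One way to narrow down these possibilities, rather than relying on log order alone, might be to record when each download starts and ends and check whether the two intervals overlap. The sketch below illustrates that check only; the class OverlapCheck and its methods are hypothetical, it uses two threads in one JVM as a stand-in for the two separate processes, and a sleep as a stand-in for the download.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class OverlapCheck {
    /** Start and end time (in nanoseconds) of one timed step. */
    static final class Interval {
        final long start;
        final long end;

        Interval(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    // Two intervals overlap if each one starts before the other ends.
    // Differences are compared, as recommended for System.nanoTime values.
    static boolean overlaps(Interval a, Interval b) {
        return a.start - b.end < 0 && b.start - a.end < 0;
    }

    // Stand-in for step 1: waits for the given time and reports when it ran.
    static Interval timedWait(long millis) {
        long start = System.nanoTime();
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new Interval(start, System.nanoTime());
    }

    // Runs two waits on two threads and reports whether they overlapped in time.
    static boolean twoWorkersOverlap(long millis) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<Interval> m = pool.submit(() -> timedWait(millis));
            Future<Interval> n = pool.submit(() -> timedWait(millis));
            return overlaps(m.get(), n.get());
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("downloads overlapped: " + twoWorkersOverlap(200));
    }
}
```

For the real two-process setup, System.nanoTime values are not comparable across JVMs, so wall-clock timestamps would be needed instead, for example by including the date conversion %d in the log4j pattern. If the timestamps show overlapping downloads while the lines still appear grouped per process, the ordering is more likely an effect of how the log is written than of the work itself being serialized.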