ExecutorService with multiple threads does not work correctly, but works fine in debug mode

Time: 2017-04-14 13:39:28

Tags: java multithreading

Basically, I have to process a large CSV file containing nearly 1 million records using multiple threads.

I created a class IngestionCallerThread:

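import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class IngestionCallerThread {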

public static void main(String[] args) {

    try {
        int count = 0;
        InputStream ios = IngestionCallerThread.class.getClassLoader().getResourceAsStream("aa10.csv");

        byte[] buff = new byte[8000];

        int bytesRead = 0;
        ByteArrayOutputStream bao = new ByteArrayOutputStream();

        while ((bytesRead = ios.read(buff)) != -1) {
            bao.write(buff, 0, bytesRead);
        }

        byte[] data = bao.toByteArray();

        ByteArrayInputStream bin = new ByteArrayInputStream(data);
        BufferedReader fileInputStreamBufferedReader = new BufferedReader(new InputStreamReader(bin));

        while ((fileInputStreamBufferedReader.readLine()) != null) {
            count++;
        }
        bin.reset();

        int numberOfThreads = 12;
        int rowsForEachThread = count / numberOfThreads;
        int remRows = count % numberOfThreads;
        int startPosition = 0;
        System.out.println(count);
        ExecutorService es = Executors.newCachedThreadPool();
        for (int i = 0; i < numberOfThreads && startPosition < count; i++) {
            if (remRows > 0 && i + 1 >= numberOfThreads)
                rowsForEachThread = remRows;

            IngestionThread ingThread = new IngestionThread(bin, startPosition, rowsForEachThread);
            es.execute(ingThread);
            startPosition = (startPosition + rowsForEachThread);
        }
        es.shutdown();
        if (es.isTerminated()) {
            System.out.println("Completed");
        }
        // t2.start();
    } catch (IOException e1) {
        // TODO Auto-generated catch block
        e1.printStackTrace();
    }

    catch (Exception e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }

}

}

which I use to call another class I have implemented, a Runnable:

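import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

public class IngestionThread implements Runnable {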

InputStream is;
long startPosition;
long length;

public IngestionThread(InputStream targetStream, long position, long length) {
    this.is = targetStream;
    this.startPosition = position;
    this.length = length;
}

@Override
public void run() {
    // TODO Auto-generated method stub
    int currentPosition = 0;
    try {
        is.reset();
    } catch (IOException e1) {
        // TODO Auto-generated catch block
        e1.printStackTrace();
    }
    BufferedReader fileInputStreamBufferedReader = new BufferedReader(new InputStreamReader(is));
    if (startPosition != 0) {   
        String line;
        try {
            while (((line = fileInputStreamBufferedReader.readLine())) != null) {
                if (currentPosition + 1 == startPosition)
                    break;
                currentPosition++;
            }
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
    try {
        int execLength = 0;
        String line;
        while ((line = fileInputStreamBufferedReader.readLine()) != null && execLength < length) {
            System.out.println(line);
            execLength++;
        }
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}

}

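I tested with a small CSV file of 20 records. The problem is that when I debug the class, almost all of the records are printed. But when I run the class normally, sometimes 15 records are read and sometimes 12. I am not sure what the problem is. Any help would be much appreciated. Thanks in advance.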

1 answer:

Answer 0 (score: 2)

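The reason you are having a problem is that you have many threads reading through different BufferedReader objects that all wrap the same shared ByteArrayInputStream. There is no synchronization, which means that each thread will consume parts of the stream that other threads are supposed to read.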
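The effect can be reproduced without any threads at all. The following is a minimal sketch, not code from the question: the class name SharedStreamDemo and the method readWithTwoReaders are made up for illustration. It wraps one ByteArrayInputStream in two readers and shows that the first reader's read-ahead buffering drains bytes the second reader was expected to see.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative, not from the question.
class SharedStreamDemo {

    // Returns { first line read by A, bytes left in the shared stream, first line read by B }.
    static Object[] readWithTwoReaders(String csv) {
        ByteArrayInputStream shared =
                new ByteArrayInputStream(csv.getBytes(StandardCharsets.UTF_8));
        // Two independent readers, but both pull bytes from the SAME stream.
        BufferedReader a = new BufferedReader(new InputStreamReader(shared, StandardCharsets.UTF_8));
        BufferedReader b = new BufferedReader(new InputStreamReader(shared, StandardCharsets.UTF_8));
        try {
            String fromA = a.readLine();      // A fills its buffer, reading ahead past line 1
            int left = shared.available();    // bytes still unread in the shared stream
            String fromB = b.readLine();      // B only sees what A did not already take
            return new Object[] {fromA, left, fromB};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Object[] r = readWithTwoReaders("row1\nrow2\nrow3\n");
        System.out.println("A read: " + r[0]);
        System.out.println("bytes left in shared stream: " + r[1]);
        System.out.println("B read: " + r[2]);
    }
}
```

With input this small, reader A normally pulls the whole stream into its buffer on the first readLine(), so reader B gets null even though A has returned only one line. With several threads the same thing happens in an unpredictable order, and each call to reset() on the shared stream rewinds it for every thread, which would explain why the number of printed records changes between runs and why slowing things down in a debugger hides the problem.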

Each thread needs its own ByteArrayInputStream.
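One way to do that is to share only the immutable byte[] and let every task build a private ByteArrayInputStream over it. The sketch below is an illustration under stated assumptions rather than a drop-in replacement: PerThreadStreamSketch, RangeReader, countLines and readAll are hypothetical names, lines are collected into a list instead of printed, and the remainder rows are spread across the tasks, which differs from the partitioning in the question. It also waits with awaitTermination, because isTerminated() called right after shutdown() will usually return false while tasks are still running.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; all names here are illustrative, not from the question.
class PerThreadStreamSketch {

    // Each call creates a fresh stream, so no read position is ever shared.
    static BufferedReader newReader(byte[] data) {
        return new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8));
    }

    static int countLines(byte[] data) {
        int count = 0;
        try (BufferedReader reader = newReader(data)) {
            while (reader.readLine() != null) count++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    // Reads `length` lines starting at 0-based line `start`, using a private stream.
    static class RangeReader implements Runnable {
        private final byte[] data;
        private final int start;
        private final int length;
        private final List<String> sink;

        RangeReader(byte[] data, int start, int length, List<String> sink) {
            this.data = data;
            this.start = start;
            this.length = length;
            this.sink = sink;
        }

        @Override
        public void run() {
            try (BufferedReader reader = newReader(data)) {
                for (int i = 0; i < start; i++) {
                    if (reader.readLine() == null) return; // fewer lines than expected
                }
                String line;
                for (int i = 0; i < length && (line = reader.readLine()) != null; i++) {
                    sink.add(line); // the question prints here; order across threads is not defined
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    static List<String> readAll(byte[] data, int threads) {
        int total = countLines(data);
        int base = total / threads;
        int rem = total % threads;
        List<String> sink = Collections.synchronizedList(new ArrayList<>());
        ExecutorService es = Executors.newFixedThreadPool(threads);
        int start = 0;
        for (int i = 0; i < threads; i++) {
            int len = base + (i < rem ? 1 : 0); // first `rem` tasks take one extra row
            if (len == 0) break;
            es.execute(new RangeReader(data, start, len, sink));
            start += len;
        }
        es.shutdown();
        try {
            // Block until the submitted tasks finish instead of polling isTerminated().
            if (!es.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sink;
    }
}
```

Every task still skips lines from the start of the data to reach its range, so this keeps the question's overall approach and only removes the shared stream. For a file with a million records, recording the byte offset of each range while counting lines would avoid the repeated skipping, but that is a separate optimization.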