Multithreaded file reading

Time: 2015-12-08 00:35:56

Tags: java multithreading file

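I am trying to write Java code that reads a file with several threads and counts the words in it. Each thread should read a different line. The word counting works fine (when I run a single thread), but my threads read the same line and increment the line counter at the same time. I was sure the synchronized keyword on the read method would fix it, but it did not. What should I do to fix it?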

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.*;
import java.util.concurrent.atomic.AtomicInteger;


public class WordCounterr implements Runnable {
    private static Hashtable<String, Integer> ht = new Hashtable<String, Integer>();
    private int lineCounter;
    private String path;
    private int tNumber;
    //private final AtomicInteger whichLine = new AtomicInteger();
    private static int whichLine;
    private static boolean flag;

    public WordCounterr(String path,int num){
        lineCounter = 0;
        //whichLine = 0;
        flag= false;
        this.path=path;
        tNumber = num;
    }

    public void countWords(String s) throws IOException{
        char[] c = s.toCharArray();
        String str="";  
        char ch;        
        for(int k=0;k<c.length;k++){                        

            ch=c[k];                    
            if((ch>40 && ch<91) ||(ch>96 && ch<123)){       
                if(ch>40 && ch<91)
                    ch+=32;             
                str+=ch;
            }           
            else if(ch==32 ||k==c.length-1){
                if(str.length()>1){ // check whether this word has been seen already
                    if(ht.containsKey(str))
                        ht.put(str,ht.get(str)+1); // seen before: increment the value stored under the key
                    else
                        ht.put(str,1);  // not seen before: add the word to the Hashtable

                }
                str="";
            }
        }
    }

    public synchronized void read(String path) throws IOException{  
        BufferedReader buf=new BufferedReader(new FileReader(path));

        String linia ;
        for(int i=0;i<whichLine;i++){
            linia=buf.readLine();
        }

        if((linia=buf.readLine())!=null){
            System.out.println(linia);
            countWords(linia);
            lineCounter++;
            System.out.println("watek nr:"+tNumber+"ktora linia:"+whichLine);               
            whichLine++;
            /*try{
                    Thread.sleep(100);

                }catch(InterruptedException el){
                    System.out.println(el.toString());
                }*/
        } else
            setFlag(true);

        buf.close();    // remember to close the file

    }

    public synchronized void print(){
        if(getFlag()){
            setFlag(false);         
            System.out.println(ht);
        }   
        System.out.println("watek nr: "+tNumber+", przeanalizowano "+ lineCounter+ "linii tekstu");
    }

    public void setFlag(boolean val){
        flag=val;
    }

    public boolean getFlag(){
        return flag;
    }

    @Override
    public void run() {
        try{    

            while(getFlag()==false) {   
                read(path);
                Thread.yield(); //let other thread read
                try {
                    Thread.sleep(100);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }catch(IOException ex){
            System.out.println(ex.toString());
        }//catch(InterruptedException el){
        //  System.out.println(el.toString());
        //}     
        print();
    }   

    public static void main(String[] args) throws IOException, InterruptedException{
        String path = args[0];
        int tNum = Integer.parseInt(args[1]);

        Thread[] thread = new Thread[tNum]; // array of threads
        for (int i = 0; i < tNum; i++){
            thread[i] =new Thread(new WordCounterr(path,i));
        }   

        for (int i = 0; i < tNum; i++) {
            thread[i].start();
        }
    }
}

2 answers:

Answer 0 (score: 1)

The synchronized modifier is defined as follows: it is not possible for two invocations of synchronized methods on the same object to interleave.

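You call the read method in each of your Threads.

However, you are not calling read on the same object, because you pass a new WordCounterr instance to each new Thread. This means the method is invoked on different objects, and each call locks a different monitor, so the synchronized modifier has no effect between threads.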
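To make the "same object" condition concrete, here is a minimal sketch (the class name LockIdentityDemo and its methods are hypothetical, not part of the question's code). It uses Thread.holdsLock to show that a synchronized instance method holds the monitor of its own instance only, and not that of a second instance of the same class.

```java
public class LockIdentityDemo {
    // A synchronized instance method acquires the monitor of 'this' only.
    public synchronized boolean holdsOwnLock() {
        return Thread.holdsLock(this);
    }

    // While inside a synchronized method of one instance, check whether
    // the current thread also holds the monitor of another object.
    public synchronized boolean holdsLockOf(Object other) {
        return Thread.holdsLock(other);
    }

    public static void main(String[] args) {
        LockIdentityDemo a = new LockIdentityDemo();
        LockIdentityDemo b = new LockIdentityDemo();
        System.out.println(a.holdsOwnLock());  // prints "true"
        System.out.println(a.holdsLockOf(b));  // prints "false"
    }
}
```

Because a thread inside a.read(...) holds nothing on b, another thread is free to enter b.read(...) at the same moment, which matches the behavior described in the question.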

To fix this, try:

// One shared instance, so every thread locks the same object in read().
// The thread number is now always 0: with a single shared object, a plain int
// field can no longer tell the threads apart.
WordCounterr reader = new WordCounterr(path,0);
Thread[] thread = new Thread[tNum]; // array of threads
for (int i = 0; i < tNum; i++){
    thread[i] =new Thread(reader);
}

instead of:

Thread[] thread = new Thread[tNum]; // array of threads
for (int i = 0; i < tNum; i++){
    thread[i] =new Thread(new WordCounterr(path,i));
}
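If the per-thread number should be kept, one alternative is to leave one instance per thread and synchronize on a lock that all instances share. The following is only a sketch under assumptions: the class LineClaimer, the LOCK field, and the runAll helper are hypothetical names, and the code claims line indices instead of reading a real file.

```java
public class LineClaimer implements Runnable {
    private static final Object LOCK = new Object(); // one lock shared by all instances
    private static int nextLine;                     // guarded by LOCK
    private static int totalLines;                   // guarded by LOCK

    private final int id;   // per-thread number, in the role of tNumber
    private int claimed;    // lines handled by this instance only

    public LineClaimer(int id) {
        this.id = id;
    }

    public int getId() {
        return id;
    }

    public int getClaimed() {
        return claimed;
    }

    @Override
    public void run() {
        while (true) {
            synchronized (LOCK) {        // same monitor for every instance
                if (nextLine >= totalLines) {
                    return;              // nothing left to claim
                }
                nextLine++;              // claim exactly one line index
            }
            claimed++;                   // per-instance state, no lock needed
        }
    }

    // Starts 'threads' workers over 'lines' line indices and returns how many were claimed in total.
    public static int runAll(int threads, int lines) {
        synchronized (LOCK) {
            nextLine = 0;
            totalLines = lines;
        }
        LineClaimer[] workers = new LineClaimer[threads];
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new LineClaimer(i);
            pool[i] = new Thread(workers[i]);
            pool[i].start();
        }
        int total = 0;
        try {
            for (int i = 0; i < threads; i++) {
                pool[i].join();          // wait, then read the per-thread count
                total += workers[i].getClaimed();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(runAll(4, 1000)); // prints "1000"
    }
}
```

Each line index is claimed exactly once regardless of how the threads interleave, and every worker still has its own id and its own count.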

I hope this helps :)

Answer 1 (score: 1)

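My guess is that it will still read the file contents inefficiently. Try changing the synchronization point. It currently sits on the read method, which opens the file and reads it from the beginning on every call. Instead, synchronize only the reading of the next line. You can achieve this by giving every WordCounterr instance the same file reader instance and synchronizing only the step that advances to the next line and reads its content. Counting the words in a line can be done without synchronization; only updating the Hashtable needs to be synchronized. Reading the file content in parallel can be synchronized as follows: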

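static class Reader implements Runnable {
    int lineReaded = 0;     // lines read by this thread
    final Scanner scanner;  // java.util.Scanner shared by all Reader instances

    Reader(Scanner scanner) {
        this.scanner = scanner;
    }

    public void run() {
        boolean hasNext = true;
        while (hasNext) {
            hasNext = false;
            String line = null;
            // Only fetching the next line is synchronized.
            synchronized (scanner) {
                if (scanner.hasNextLine()) { // hasNextLine() is the check that pairs with nextLine()
                    hasNext = true;
                    line = scanner.nextLine();
                    ++lineReaded;
                }
            }
            // Count the words of 'line' here, outside the lock.
            try {
                Thread.sleep((long) (Math.random() * 100));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the interrupt flag and stop
                return;
            }
        }
    }
}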
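Putting the pieces of this answer together, a fuller sketch might look like the following. It is not the asker's code: the class SharedReaderWordCount and its count method are hypothetical, the tokenizer is simplified to lowercase ASCII letters (the question's character ranges are wider), and ConcurrentHashMap.merge is swapped in for the containsKey/get/put sequence on Hashtable, since that sequence is not atomic as a whole.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SharedReaderWordCount {

    // Counts words from one shared reader using the given number of threads.
    public static Map<String, Integer> count(BufferedReader reader, int threads) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        Runnable worker = () -> {
            try {
                while (true) {
                    String line;
                    synchronized (reader) {      // only fetching the line is serialized
                        line = reader.readLine();
                    }
                    if (line == null) {
                        return;                  // end of input
                    }
                    // Splitting and counting run outside the lock.
                    for (String word : line.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
                        if (word.length() > 1) { // same minimum length as in the question
                            counts.merge(word, 1, Integer::sum); // atomic update per key
                        }
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            pool[i] = new Thread(worker);
            pool[i].start();
        }
        try {
            for (Thread t : pool) {
                t.join();                        // wait until every line is counted
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        BufferedReader in = new BufferedReader(new StringReader("the cat\nthe dog\nthe end"));
        System.out.println(count(in, 3).get("the")); // prints "3"
    }
}
```

For a real file, the same method could be given a BufferedReader wrapped around a FileReader for the path. The main thread joins the workers before reading the map, so the result is complete when it is printed.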