Compare two lines in a text file and print a single line for entries with the same date

Time: 2013-07-11 04:13:55

Tags: java date hashmap

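Lines 1 and 5 have the same username and date, but line 1 contains the in time and line 5 contains the out time. I want to read these two lines, check whether they have the same username and date and, if so, print them as a single line to another text file or store them in a hash map.

For example: "sangeetha-May 02, 2013 , -in-09:48:06:61 -out-08:08:19:27 (in Java)

Here is the content of the text file: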

line 1. "sangeetha-May 02, 2013 , -in-09:48:06:61
line 2. "lohith-May 01, 2013 , -out-09:10:41:61
line 3 . "sushma-May 02, 2013 , -in-09:48:06:61
line 4. "sangeetha-May 01, 2013 , -out-08:36:38:50
line 5. "sangeetha-May 02, 2013 , -out-08:08:19:27
line 6. "sushma-May 02, 2013 , -out-07:52:13:51
line 7. "sangeetha-Jan 01, 2013 , -in-09:27:17:52-out-06:47:48:00
line 8. "madhusudhan-Jan 01, 2013 , -in-09:38:59:31-out-07:41:06:40

The data above was generated by the following code:

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.HashSet;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Set;
import java.util.Map;
import java.util.TreeMap;

public class FlatFileParser 
{
    public static void main(String[] args)
    {    
        // The stream we're reading from
        BufferedReader in;
        BufferedWriter out1;
        BufferedWriter out2;
        // Return value of next call to next()
        String nextline;
        try 
        {
            if (args[0].equals("1"))
            {
                in = new BufferedReader(new FileReader(args[1]));
                nextline = in.readLine();
                while(nextline != null)
                {
                    nextline = nextline.replaceAll("\\<packet","\n<packet");
                    System.out.println(nextline);
                    nextline = in.readLine();
                }
                in.close();
            }
            else
            {
                in = new BufferedReader(new FileReader(args[1]));
                out1 = new BufferedWriter(new FileWriter("inValues.txt" , true));
                out2 = new BufferedWriter(new FileWriter("outValues.txt"));
                nextline = in.readLine();
                HashMap<String,String> inout = new HashMap<String,String>();
                while(nextline != null)
                {
                    try
                    {
                        if (nextline.indexOf("timetracker")>0)
                        {
                            String from = "";
                            String indate = "";
                            if (nextline.indexOf("of in")>0)
                            {

                                int posfrom = nextline.indexOf("from");
                                int posnextAt = nextline.indexOf("@", posfrom);
                                int posts = nextline.indexOf("timestamp");
                                from = nextline.substring(posfrom+5,posnextAt);
                                indate = nextline.substring(posts+11, posts+23);
                                String dd = indate.split(" ")[1];
                                String key = dd+"-"+from+"-"+indate;
                                //String key = from+"-"+indate;
                                String intime = "-in-"+nextline.substring(posts+24, posts+35);
                                inout.put(key, intime);    

                            }
                            else if (nextline.indexOf("of out")>0)
                            {
                                int posfrom = nextline.indexOf("from");
                                int posnextAt = nextline.indexOf("@", posfrom);
                                int posts = nextline.indexOf("timestamp");
                                from = nextline.substring(posfrom+5,posnextAt);
                                indate = nextline.substring(posts+11, posts+23);
                                String dd = indate.split(" ")[1];
                                String key = dd+"-"+from+"-"+indate;
                                String outtime = "-out-"+nextline.substring(posts+24, posts+35);
                                if (inout.containsKey(key))
                                {
                                    String val = inout.get(key);
                                    if (!(val.indexOf("out")>0))
                                        inout.put(key, val+outtime);                    
                                }
                                else
                                    inout.put(key, outtime);
                            }
                        }
                    }
                    catch(Exception e)
                    {
                        System.err.println(nextline);
                        System.err.println(e.getMessage());
                    }
                    nextline = in.readLine();    
                }
                in.close();

                for(String key: inout.keySet())
                {
                    String val = inout.get(key);
                    out1.write(key+" , "+val+"\n");
                    System.out.println(key + val);
                }
                out1.close();
                out2.close();
            }
        } 
        catch (IOException e)
        {
            throw new IllegalArgumentException(e);
        }
    }
}

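Description: these are employees' login and logout times, which I am reading from a log file. Some records are displayed correctly on a single line, such as lines 7 and 8, while others are split across different lines for the same date. I want those printed on one line, as in the example above. Records that already have both the in and out times on a single line should be kept as they are. Can anyone please help?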

3 Answers:

Answer 0: (score: 1)

Assume you have a list of all the lines of the file in lstFile.

You can do it like this:

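// Merged lines are collected here, one per line.
StringBuilder output = new StringBuilder();
String sep = System.getProperty("line.separator");
for (int i = 0; i < lstFile.size(); i++)
{
    String line1 = lstFile.get(i);
    // Match on "-in-" and "-out-" rather than "in"/"out", which could also occur inside a username.
    if (line1.contains("-in-") && line1.contains("-out-"))
    {
        // Already complete: keep the line unchanged.
        output.append(line1).append(sep);
        continue;
    }
    for (int j = i + 1; j < lstFile.size(); j++)
    {
        String line2 = lstFile.get(j);
        if (line2.contains("-in-") && line2.contains("-out-")) continue;

        // Same name and date, with line1 holding the in time and line2 the out time.
        if (line1.contains(getNameDate(line2)) && line2.contains("-out-") && line1.contains("-in-"))
        {
            // Append what follows the last comma of line2, i.e. " -out-08:08:19:27".
            output.append(line1).append(line2.substring(line2.lastIndexOf(",") + 1));
            output.append(sep);
            break;
        }
    }
}
// output.toString() now contains your desired result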

The following method returns the name and date:

public String getNameDate(String input)
{
    return input.substring(0,input.lastIndexOf(","));
}
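
The snippet above assumes that lstFile has already been filled and leaves open how the result reaches "another text file". A minimal sketch of both steps, assuming Java 7 or later and UTF-8 files; the class name LogFileIo and its method names are hypothetical, not part of the answer:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class LogFileIo {

    /** Reads every line of the input file into a list, usable as lstFile. */
    static List<String> readLines(Path input) throws IOException {
        return Files.readAllLines(input, StandardCharsets.UTF_8);
    }

    /** Writes the merged text to the target file, replacing any previous content. */
    static void writeOutput(Path target, String output) throws IOException {
        Files.write(target, output.getBytes(StandardCharsets.UTF_8));
    }
}
```

With these helpers, lstFile would come from something like readLines(Paths.get("log.txt")) and the merged text would be passed to writeOutput; the file name here is a placeholder.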

Answer 1: (score: 0)

This is basic data parsing. In pseudocode, this is how I would do it:

Create a class that holds 4 values: Employee, Date, InTime, OutTime
Instantiate a HashMap for all the final log lines
For each log line
    Parse the line using RegEx to find Employee, Date and in and/or out time
    Create the HashKey using Employee + Date
    See if the HashMap already contains such an object, else create one
    Populate with the in and/or out times found on the current line
Done, the HashMap now contains all parsed data.
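
The pseudocode above could be sketched in Java roughly as follows. This is only one possible reading of it: the names LogMerger and DayRecord and the regular expressions are assumptions based on the sample lines in the question, and a LinkedHashMap stands in for the HashMap so that records keep the order in which they first appear. When a key has several in or out times, the sketch keeps the first one seen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogMerger {

    /** Holds the four values from the pseudocode (hypothetical class name). */
    static final class DayRecord {
        final String employee;
        final String date;
        String inTime;   // null until an "-in-" time is seen
        String outTime;  // null until an "-out-" time is seen

        DayRecord(String employee, String date) {
            this.employee = employee;
            this.date = date;
        }

        @Override
        public String toString() {
            StringBuilder sb = new StringBuilder(employee + "-" + date + " ,");
            if (inTime != null) sb.append(" -in-").append(inTime);
            if (outTime != null) sb.append(" -out-").append(outTime);
            return sb.toString();
        }
    }

    // Assumed layout: name, "-", a date such as "May 02, 2013", then the rest of the line.
    private static final Pattern LINE =
            Pattern.compile("\"?([\\w.]+)-([A-Za-z]{3} \\d{2}, \\d{4})(.*)");
    private static final Pattern IN = Pattern.compile("-in-(\\d{2}(?::\\d{2}){3})");
    private static final Pattern OUT = Pattern.compile("-out-(\\d{2}(?::\\d{2}){3})");

    /** Merges lines that share employee and date into one output line per key. */
    static List<String> merge(List<String> lines) {
        Map<String, DayRecord> byKey = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (!m.find()) continue; // skip lines that do not fit the assumed layout
            String key = m.group(1) + "|" + m.group(2);
            DayRecord rec = byKey.computeIfAbsent(
                    key, k -> new DayRecord(m.group(1), m.group(2)));
            String rest = m.group(3);
            Matcher in = IN.matcher(rest);
            if (in.find() && rec.inTime == null) rec.inTime = in.group(1);
            Matcher out = OUT.matcher(rest);
            if (out.find() && rec.outTime == null) rec.outTime = out.group(1);
        }
        List<String> result = new ArrayList<>();
        for (DayRecord rec : byKey.values()) result.add(rec.toString());
        return result;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "\"sangeetha-May 02, 2013 , -in-09:48:06:61",
                "\"sangeetha-May 02, 2013 , -out-08:08:19:27");
        for (String s : merge(sample)) System.out.println(s);
        // prints: sangeetha-May 02, 2013 , -in-09:48:06:61 -out-08:08:19:27
    }
}
```

Keying the map on employee plus date means the order of the in and out lines in the file does not matter, and lines that already carry both times pass through unchanged.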

Answer 2: (score: 0)

Here is some sample code to help you get started.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ParseLogs {

    public static void main(String[] args) {
        BufferedReader br = null;
        try {
            String line;

            br = new BufferedReader(new FileReader("src/main/resources/log.txt"));
            while ((line = br.readLine()) != null) {
                String[] split = line.split(" ");

                if (split.length > 2) {
                    String name = split[2].split("-")[0];
                    name = name.replace("\"", "");
                    System.out.println(name);
                }

                // split[6] is read below, so at least 7 tokens are required.
                if (split.length > 6) {
                    String date = split[6];
                    System.out.println(date);
                }
            }

        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                if (br != null) {
                    br.close();
                }
            } catch (IOException ex) {
                ex.printStackTrace();
            }
        }
    }
}

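The other answer is correct: you need to use regular expressions and look at the pattern of the lines so that you can break them apart. If possible, try to standardize the logs up front so that they follow a consistent pattern and the data does not need massaging afterwards.

Here is the program's output: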

sangeetha
-in-09:48:06:61
lohith
-out-09:10:41:61
.
,
sangeetha
-out-08:36:38:50
sangeetha
-out-08:08:19:27
sushma
-out-07:52:13:51
sangeetha
-in-09:27:17:52-out-06:47:48:00
madhusudhan
-in-09:38:59:31-out-07:41:06:40

Note that you still need to clean up the "name" and "date" variables to catch errors, but hopefully this helps.
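
On the point about standardizing the logs: if the format cannot be fixed where the lines are written, one option is to normalize each line right after reading it. The sketch below is an assumption-laden illustration, not part of the answer. It assumes Java 8 or later (for java.time), English month abbreviations, and the line layout shown in the question; the class name LogNormalizer and the "name,yyyy-MM-dd,in,out" target format are hypothetical. Unlike split(" "), the pattern tolerates the irregular "line 3 ." prefix that produced the "." and "," entries in the output above.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogNormalizer {

    // Optional "line N." prefix and quote, then name-date, then optional in/out times.
    private static final Pattern RAW = Pattern.compile(
            "^(?:line\\s*\\d+\\s*\\.\\s*)?\"?(?<name>[\\w.]+)-(?<date>[A-Za-z]{3} \\d{2}, \\d{4})\\s*,"
            + "\\s*(?:-in-(?<in>[\\d:]+))?\\s*(?:-out-(?<out>[\\d:]+))?\\s*$");

    // Matches dates such as "May 02, 2013".
    private static final DateTimeFormatter LOG_DATE =
            DateTimeFormatter.ofPattern("MMM dd, yyyy", Locale.ENGLISH);

    /**
     * Returns "name,yyyy-MM-dd,in,out" with an empty field for a missing time,
     * or null if the line does not fit the assumed layout.
     */
    static String normalize(String line) {
        Matcher m = RAW.matcher(line.trim());
        if (!m.matches()) return null;
        LocalDate date = LocalDate.parse(m.group("date"), LOG_DATE);
        String in = m.group("in") == null ? "" : m.group("in");
        String out = m.group("out") == null ? "" : m.group("out");
        return m.group("name") + "," + date + "," + in + "," + out;
    }
}
```

Converting the date to ISO form also makes records sortable and comparable as plain strings, which the original "May 02, 2013" text is not.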