Formatting/separating names and dates in a string in Java?

Time: 2014-02-24 04:31:30

Tags: java string date simpledateformat names

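Basically, what I'm trying to do here is read a text file line by line and format each line as: last name, title, first name, middle name, followed by the birth/death dates as MM/DD/YYYY.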

The dates I read in look like this:

Month, day, year
Mon.  day, year
Mon  day,  year
MMDDYY
M/D/year
M-D-year

And the names look like this:

Last,   Title   First   Middle  (comma after name needed)

OR

Title   First   Middle   Last

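I've been working on this for a long time but can't figure it out. Below is my very messy code, which has already gone through a lot of changes. I'd like to get this solved; thanks for your time to anyone willing to help (I'm a student). Here is also a sample of the input being read in: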

Roger  Veium  MAY     12,  1908        JUNE 2, 1984
McDermott, James   D.     Jan.    4,  1914      Jul  1, 1970
Amy  Chamberlain   Sep.     28, 1975   09-06-95
Gross,  Adam M. 01-03-77
Joseph Lisota  April    9,  1964
Joseph   W. Eisel Sep   3, 1990

Code:

public String[] readLines(String filename) throws IOException {
    FileReader fileReader = new FileReader(filename);
    BufferedReader bufferedReader = new BufferedReader(fileReader);
    List<String> lines = new ArrayList<String>();
    List<String> names = new ArrayList<String>();
    String line = null;
    String name = "";
    int i;
    int ind;
    int indTemp;
    int indTemp2;
    boolean flag = false;
    String[] monthsLong = {"JANUARY", "FEBRUARY", "MARCH", "APRIL", "MAY", "JUNE", "JULY", "AUGUST", "SEPTEMBER", "OCTOBER", "NOVEMBER", "DECEMBER"};
    String[] monthsLongR = {" 01", "02", " 03", "04", "05", "06", "07", "08", " 09", "10", "11", "12"};
    String[] monthsLow = {"JAN\\.", "FEB\\.", "MAR\\.","APR\\.", "MAY", "JUN\\.", "JUL\\.", "AUG\\.", "SEP\\.", "OCT\\.", "NOV\\.", "DEC\\."};
    String[] monthsCaps = {"   JAN", "FEB", " MAR", "APR", "MAY", "JUN", "JUL", "AUG", " SEP", "OCT", "NOV", "DEC"};

    while ((line = bufferedReader.readLine()) != null) {
        line = line.replaceAll("null", "");
        line = line.replaceAll("-","/");
        line = line.toUpperCase() ;

        for(i = 0; i<12; i++)
        {
            line = line.replaceAll(monthsLong[i], monthsLongR[i]);
        }

        for(i = 0; i<12; i++)
        {
            line = line.replaceAll(monthsLow[i], monthsLongR[i]);
        }

        for(i = 0; i<12; i++)
        {
            line = line.replaceAll(monthsCaps[i], monthsLongR[i]);
        }

        line = line.replaceAll("\\s+", " ");
        if (Character.toString(line.charAt(0)).equals(" "))
            line = line.replaceFirst(" ", "");

 /*     name = line;

        ind = name.indexOf(".");
        indTemp = name.indexOf("0");
        indTemp2 = name.indexOf("1");

        if (ind > -1) {
            System.out.println(" period");
            ind = ind + 1;
            flag = true;
        }
        if(flag == false) {
            if(indTemp2 > indTemp){
                ind = indTemp2 -1;
                System.out.println(" 1");
            }
            if (indTemp > indTemp2){ 
                ind = indTemp - 1;
                System.out.println(" 2");
            }
        }
        flag = false;
    */
        // name = name.substring(0,ind);

        lines.add(line);
    }
    bufferedReader.close();
    return lines.toArray(new String[lines.size()]);
}

1 answer:

Answer 0 (score: 0)

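OK, then the only other way is to go line by line and create a list of rules, one for each different line format. There is some repetition, but many lines are very different from the others. Then, as you process the file, loop over the lines and look up which rule matches, so that the right rule is applied to each line.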
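One way to picture such a rule list for the date part is sketched below. It is only an illustration, not the asker's code or a definitive solution: the class `DateRules`, its method names, and the three regular expressions are hypothetical, and it assumes that two-digit years belong to the 1900s (as in the sample lines) and that month words are English. Each rule is a regex for one of the date formats listed in the question; matches are collected in the order they appear on the line and rewritten as MM/DD/YYYY.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class: one regex "rule" per date format from the question.
class DateRules {
    private static final String[] MONTHS = {
        "JANUARY", "FEBRUARY", "MARCH", "APRIL", "MAY", "JUNE",
        "JULY", "AUGUST", "SEPTEMBER", "OCTOBER", "NOVEMBER", "DECEMBER"};

    // Rule 1: "Month day, year", "Mon. day, year", "Mon day, year"
    private static final Pattern NAMED =
        Pattern.compile("\\b([A-Z]{3,9})\\.?\\s+(\\d{1,2}),\\s*(\\d{4})\\b");
    // Rule 2: "M/D/year" and "M-D-year" (two- or four-digit year)
    private static final Pattern NUMERIC =
        Pattern.compile("\\b(\\d{1,2})[/-](\\d{1,2})[/-](\\d{4}|\\d{2})\\b");
    // Rule 3: "MMDDYY"
    private static final Pattern PACKED =
        Pattern.compile("\\b(\\d{2})(\\d{2})(\\d{2})\\b");

    // Returns 1..12 if word is a prefix of a month name (JAN, SEPT, JUNE...), else -1.
    static int monthNumber(String word) {
        for (int i = 0; i < MONTHS.length; i++) {
            if (MONTHS[i].startsWith(word)) {
                return i + 1;
            }
        }
        return -1;
    }

    static String format(int month, int day, int year) {
        if (year < 100) {
            year += 1900; // assumption: two-digit years are 19xx
        }
        return String.format(Locale.ROOT, "%02d/%02d/%04d", month, day, year);
    }

    // Applies every rule to the line and returns the dates in left-to-right order.
    static List<String> extractDates(String line) {
        String upper = line.toUpperCase(Locale.ROOT);
        Map<Integer, String> found = new TreeMap<Integer, String>(); // key: match position

        Matcher m = NAMED.matcher(upper);
        while (m.find()) {
            int month = monthNumber(m.group(1));
            if (month > 0) { // skip words that are not month names
                found.put(m.start(), format(month,
                    Integer.parseInt(m.group(2)), Integer.parseInt(m.group(3))));
            }
        }
        for (Pattern rule : new Pattern[] {NUMERIC, PACKED}) {
            m = rule.matcher(upper);
            while (m.find()) {
                found.put(m.start(), format(Integer.parseInt(m.group(1)),
                    Integer.parseInt(m.group(2)), Integer.parseInt(m.group(3))));
            }
        }
        return new ArrayList<String>(found.values());
    }

    public static void main(String[] args) {
        System.out.println(extractDates("Amy  Chamberlain   Sep.     28, 1975   09-06-95"));
    }
}
```

With the sample line in `main`, this should print `[09/28/1975, 09/06/1995]`. The sketch does not validate day or month ranges, and a six-digit number that is not a date would still be picked up by the `MMDDYY` rule, so real data may need extra rules.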

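As far as I know, this is the best approach. I have experience with files like these, and they can be a nightmare if not handled properly. While working through the rules, you may actually find a pattern you can use; that is often the case.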
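As an example of such a pattern, the two name layouts in the question differ only in where the last name sits: either before a comma, or as the final word. A minimal sketch under that assumption follows; `NameRules` and `normalizeName` are hypothetical names, the input is assumed to be just the name part of a line (everything before the first date), and suffixes such as "Jr." are not handled.

```java
// Hypothetical helper: rewrites a name as "Last, Title First Middle".
class NameRules {
    static String normalizeName(String rawName) {
        // Collapse runs of whitespace so "Joseph   W. Eisel" becomes "Joseph W. Eisel".
        String name = rawName.trim().replaceAll("\\s+", " ");

        int comma = name.indexOf(',');
        if (comma >= 0) {
            // Layout 1: "Last, Title First Middle" - only spacing needs cleaning up.
            String last = name.substring(0, comma).trim();
            String rest = name.substring(comma + 1).trim();
            return rest.isEmpty() ? last : last + ", " + rest;
        }

        int space = name.lastIndexOf(' ');
        if (space < 0) {
            return name; // a single word: nothing to reorder
        }
        // Layout 2: "Title First Middle Last" - move the final word to the front.
        return name.substring(space + 1) + ", " + name.substring(0, space);
    }

    public static void main(String[] args) {
        System.out.println(normalizeName("Joseph   W. Eisel"));
        System.out.println(normalizeName("McDermott, James   D."));
    }
}
```

For the two calls in `main`, this should print `Eisel, Joseph W.` and `McDermott, James D.`. Because title, first, and middle name appear in the same order in both layouts, the sketch never has to tell them apart.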

I hope this helps.