How to split strings from a text file into different arrays (Java)

Time: 2014-09-23 07:27:07

Tags: java arrays regex string

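I have a text file consisting of strings. I want to separate the strings that contain "[ham]" from those that contain "[spam]" and put them into different arrays. How can I do that? I was thinking of using a regex to recognize the pattern (ham and spam), but I don't know where to start.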

Strings in the text file:

good [ham]
very good [ham]
bad [spam]
very bad [spam]
very bad, very bad [spam]

I want the output to look like this:

Ham array:

good
very good

Spam array:

bad
very bad
very bad, very bad

Please help.

2 answers:

Answer 0: (score: 2)

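I think you should use an ArrayList instead of an array:

// assumes `line` holds one line read from the text file
List<String> ham = new ArrayList<String>();
List<String> spam = new ArrayList<String>();
if (line.contains("[ham]")) {
    // trim() drops the space left between the text and the label
    ham.add(line.substring(0, line.indexOf("[ham]")).trim());
}
if (line.contains("[spam]")) {
    spam.add(line.substring(0, line.indexOf("[spam]")).trim());
}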
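For reference, a minimal self-contained sketch of this approach might look like the following. The class name `HamSpamSplitter` and the helpers `textBefore` and `collect` are hypothetical names introduced only for illustration, and the sample lines are hard-coded from the question rather than read from a file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HamSpamSplitter {

    // Returns the trimmed text in front of the label, or null if the label is absent.
    public static String textBefore(String line, String label) {
        int idx = line.indexOf(label);
        return idx < 0 ? null : line.substring(0, idx).trim();
    }

    // Collects the text of every line that carries the given label.
    public static List<String> collect(List<String> lines, String label) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            String text = textBefore(line, label);
            if (text != null) {
                result.add(text);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample data copied from the question (assumption: one entry per line).
        List<String> lines = Arrays.asList(
                "good [ham]",
                "very good [ham]",
                "bad [spam]",
                "very bad [spam]",
                "very bad, very bad [spam]");

        // Convert the lists to arrays only at the end, if arrays are really required.
        String[] hamArray = collect(lines, "[ham]").toArray(new String[0]);
        String[] spamArray = collect(lines, "[spam]").toArray(new String[0]);

        System.out.println("Ham array:");
        for (String s : hamArray) {
            System.out.println(s);
        }
        System.out.println();
        System.out.println("Spam array:");
        for (String s : spamArray) {
            System.out.println(s);
        }
    }
}
```

Collecting into lists first avoids having to know the number of ham and spam lines in advance, which a plain array would require.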

Answer 1: (score: 0)

If you really need to do it this way (with a regex, and arrays as the output), write code like this:

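import java.io.IOException;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StringResolve {

    public static void main(String[] args) {
        try {
            // read data from some source (here: a resource on the classpath)
            URL exampleTxt = StringResolve.class.getClassLoader().getResource("me/markoutte/sandbox/_25989334/example.txt");
            Path path = Paths.get(exampleTxt.toURI());
            List<String> strings = Files.readAllLines(path, StandardCharsets.UTF_8);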

            // init all my patterns & arrays
            Pattern ham = getPatternFor("ham");
            List<String> hams = new LinkedList<>();

            Pattern spam = getPatternFor("spam");
            List<String> spams = new LinkedList<>();

            // check all of them
            for (String string : strings) {
                Matcher hamMatcher = ham.matcher(string);
                if (hamMatcher.matches()) {
                    // we choose only text without label here
                    hams.add(hamMatcher.group(1));
                }
                Matcher spamMatcher = spam.matcher(string);
                if (spamMatcher.matches()) {
                    // we choose only text without label here
                    spams.add(spamMatcher.group(1));
                }
            }

            // output data through arrays
            String[] hamArray = hams.toArray(new String[hams.size()]);
            System.out.println("Ham array");
            for (String s : hamArray) {
                System.out.println(s);
            }
            System.out.println();

            String[] spamArray = spams.toArray(new String[spams.size()]);
            System.out.println("Spam array");
            for (String s : spamArray) {
                System.out.println(s);
            }

        } catch (URISyntaxException | IOException e) {
            e.printStackTrace();
        }
    }

    private static Pattern getPatternFor(String label) {
        // Regex pattern for string with same kind: some text [label]
        return Pattern.compile(String.format("(.+?)\\s(\\[%s\\])", label));
    }

}

If you need to read the file from somewhere on disk instead, you can use Paths.get("some/path/to/file").
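As a rough sketch of that disk-based variant: the class name `LabelFileReader` and the method `groupByLabel` are hypothetical, and the file path is taken from the first command-line argument (falling back to the placeholder path above). Note that this sketch swaps in a single pattern that captures the label itself, which is a variation on the code above rather than the same technique.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LabelFileReader {

    // Matches "some text [ham]" or "some text [spam]".
    // Group 1 is the text, group 2 is the label without brackets.
    private static final Pattern LINE = Pattern.compile("(.+?)\\s+\\[(ham|spam)\\]\\s*");

    // Groups the text of each matching line under its label; other lines are skipped.
    public static Map<String, List<String>> groupByLabel(List<String> lines) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        groups.put("ham", new ArrayList<>());
        groups.put("spam", new ArrayList<>());
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (m.matches()) {
                groups.get(m.group(2)).add(m.group(1).trim());
            }
        }
        return groups;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the path is passed as the first argument.
        Path path = Paths.get(args.length > 0 ? args[0] : "some/path/to/file");
        List<String> lines = Files.readAllLines(path, StandardCharsets.UTF_8);

        Map<String, List<String>> groups = groupByLabel(lines);
        String[] hamArray = groups.get("ham").toArray(new String[0]);
        String[] spamArray = groups.get("spam").toArray(new String[0]);

        System.out.println("Ham array");
        for (String s : hamArray) {
            System.out.println(s);
        }
        System.out.println();
        System.out.println("Spam array");
        for (String s : spamArray) {
            System.out.println(s);
        }
    }
}
```

Capturing the label in the pattern means each line is matched once instead of once per label, at the cost of listing the allowed labels inside the regex.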