How does the .split() function work in Java?

Time: 2014-10-24 16:17:05

Tags: java

I have been trying to get this block of code to work:

System.out.println("Please type a sentence.");
String input = scan.nextLine();

for (String u : input.split(" ")) {
    System.out.println(u);
}

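The code seems to work, in the sense that it prints each word on its own line, which is what I want. I was told to use the .split() function for this, but I don't understand why it works.

Can someone explain it to me?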

2 answers:

Answer 0: (score: 2)

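The most direct (and usually the most appropriate) way to learn how a particular method works is to read its API documentation and then look at its implementation.

The API documentation for String.split() can be found at, for example, http://docs.oracle.com/javase/6/docs/api/java/lang/String.html

Since the Java libraries are open source, you can always look at their implementation.

To do that, you can, for example, set a breakpoint on the line where you call String.split() and step into it.

(This requires the classpath and the library sources to be set up correctly.)

Alternatively, you can open the corresponding .java file from the src.zip archive located in the root directory of your JDK installation.

Most of these basic methods have short, lean implementations, so studying them should not be difficult.

To actually answer your question, here is the exact implementation, together with its documentation, as defined in Oracle's String class (String.java):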


/**
 * Splits this string around matches of the given
 * <a href="../util/regex/Pattern.html#sum">regular expression</a>.
 *
 * <p> The array returned by this method contains each substring of this
 * string that is terminated by another substring that matches the given
 * expression or is terminated by the end of the string.  The substrings in
 * the array are in the order in which they occur in this string.  If the
 * expression does not match any part of the input then the resulting array
 * has just one element, namely this string.
 *
 * <p> When there is a positive-width match at the beginning of this
 * string then an empty leading substring is included at the beginning
 * of the resulting array. A zero-width match at the beginning however
 * never produces such empty leading substring.
 *
 * <p> The {@code limit} parameter controls the number of times the
 * pattern is applied and therefore affects the length of the resulting
 * array.  If the limit <i>n</i> is greater than zero then the pattern
 * will be applied at most <i>n</i>&nbsp;-&nbsp;1 times, the array's
 * length will be no greater than <i>n</i>, and the array's last entry
 * will contain all input beyond the last matched delimiter.  If <i>n</i>
 * is non-positive then the pattern will be applied as many times as
 * possible and the array can have any length.  If <i>n</i> is zero then
 * the pattern will be applied as many times as possible, the array can
 * have any length, and trailing empty strings will be discarded.
 *
 * <p> The string {@code "boo:and:foo"}, for example, yields the
 * following results with these parameters:
 *
 * <blockquote><table cellpadding=1 cellspacing=0 summary="Split example showing regex, limit, and result">
 * <tr>
 *     <th>Regex</th>
 *     <th>Limit</th>
 *     <th>Result</th>
 * </tr>
 * <tr><td align=center>:</td>
 *     <td align=center>2</td>
 *     <td>{@code { "boo", "and:foo" }}</td></tr>
 * <tr><td align=center>:</td>
 *     <td align=center>5</td>
 *     <td>{@code { "boo", "and", "foo" }}</td></tr>
 * <tr><td align=center>:</td>
 *     <td align=center>-2</td>
 *     <td>{@code { "boo", "and", "foo" }}</td></tr>
 * <tr><td align=center>o</td>
 *     <td align=center>5</td>
 *     <td>{@code { "b", "", ":and:f", "", "" }}</td></tr>
 * <tr><td align=center>o</td>
 *     <td align=center>-2</td>
 *     <td>{@code { "b", "", ":and:f", "", "" }}</td></tr>
 * <tr><td align=center>o</td>
 *     <td align=center>0</td>
 *     <td>{@code { "b", "", ":and:f" }}</td></tr>
 * </table></blockquote>
 *
 * <p> An invocation of this method of the form
 * <i>str.</i>{@code split(}<i>regex</i>{@code ,}&nbsp;<i>n</i>{@code )}
 * yields the same result as the expression
 *
 * <blockquote>
 * <code>
 * {@link java.util.regex.Pattern}.{@link
 * java.util.regex.Pattern#compile compile}(<i>regex</i>).{@link
 * java.util.regex.Pattern#split(java.lang.CharSequence,int) split}(<i>str</i>,&nbsp;<i>n</i>)
 * </code>
 * </blockquote>
 *
 *
 * @param  regex
 *         the delimiting regular expression
 *
 * @param  limit
 *         the result threshold, as described above
 *
 * @return  the array of strings computed by splitting this string
 *          around matches of the given regular expression
 *
 * @throws  PatternSyntaxException
 *          if the regular expression's syntax is invalid
 *
 * @see java.util.regex.Pattern
 *
 * @since 1.4
 * @spec JSR-51
 */
public String[] split(String regex, int limit) {
    /* fastpath if the regex is a
     (1)one-char String and this character is not one of the
        RegEx's meta characters ".$|()[{^?*+\\", or
     (2)two-char String and the first char is the backslash and
        the second is not the ascii digit or ascii letter.
     */
    char ch = 0;
    if (((regex.value.length == 1 &&
         ".$|()[{^?*+\\".indexOf(ch = regex.charAt(0)) == -1) ||
         (regex.length() == 2 &&
          regex.charAt(0) == '\\' &&
          (((ch = regex.charAt(1))-'0')|('9'-ch)) < 0 &&
          ((ch-'a')|('z'-ch)) < 0 &&
          ((ch-'A')|('Z'-ch)) < 0)) &&
        (ch < Character.MIN_HIGH_SURROGATE ||
         ch > Character.MAX_LOW_SURROGATE))
    {
        int off = 0;
        int next = 0;
        boolean limited = limit > 0;
        ArrayList<String> list = new ArrayList<>();
        while ((next = indexOf(ch, off)) != -1) {
            if (!limited || list.size() < limit - 1) {
                list.add(substring(off, next));
                off = next + 1;
            } else {    // last one
                //assert (list.size() == limit - 1);
                list.add(substring(off, value.length));
                off = value.length;
                break;
            }
        }
        // If no match was found, return this
        if (off == 0)
            return new String[]{this};

        // Add remaining segment
        if (!limited || list.size() < limit)
            list.add(substring(off, value.length));

        // Construct result
        int resultSize = list.size();
        if (limit == 0) {
            while (resultSize > 0 && list.get(resultSize - 1).length() == 0) {
                resultSize--;
            }
        }
        String[] result = new String[resultSize];
        return list.subList(0, resultSize).toArray(result);
    }
    return Pattern.compile(regex).split(this, limit);
}
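
To make the Javadoc's "boo:and:foo" table concrete, here is a minimal sketch that runs a few of its rows and checks two of the equivalences the documentation states. The class name SplitLimitDemo and the helper show are made up for this illustration; only String.split, Pattern.compile, Pattern.split and Arrays come from the JDK.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitLimitDemo {
    // Hypothetical helper: renders the split result so empty strings stay visible.
    static String show(String input, String regex, int limit) {
        return Arrays.toString(input.split(regex, limit));
    }

    // The Javadoc says str.split(regex, n) yields the same result as
    // Pattern.compile(regex).split(str, n); this checks it for one case.
    static boolean sameAsPattern(String input, String regex, int limit) {
        return Arrays.equals(input.split(regex, limit),
                             Pattern.compile(regex).split(input, limit));
    }

    public static void main(String[] args) {
        String s = "boo:and:foo";

        // limit > 0: at most limit - 1 splits, the rest stays in the last entry
        System.out.println(show(s, ":", 2));   // [boo, and:foo]

        // limit < 0: split as often as possible, keep trailing empty strings
        System.out.println(show(s, "o", -2));  // [b, , :and:f, , ]

        // limit == 0: split as often as possible, drop trailing empty strings
        System.out.println(show(s, "o", 0));   // [b, , :and:f]

        // The one-argument split(regex) behaves like split(regex, 0)
        System.out.println(Arrays.equals(s.split("o"), s.split("o", 0))); // true

        System.out.println(sameAsPattern(s, "o", 5)); // true
    }
}
```

The call input.split(" ") from the question has no limit argument, so it behaves like limit 0: the input is split at every space and any trailing empty strings are discarded.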

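Answer 1: (score: 0)

The split method takes the regular expression you supply and splits the implicit argument (the string it is called on) wherever that expression occurs.

For example: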

"qwerty uiop asdfghjkl".split(" ");
       ^    ^

The string is broken at these points because we specified that we want to split on spaces, producing the following array of strings:

{"qwerty", "uiop", "asdfghjkl"}
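
Because the argument is a regular expression and not a plain string, two cases tend to surprise people: repeated spaces and regex metacharacters such as the dot. The sketch below illustrates both. The class SplitRegexDemo and its helpers words and splitLiteral are hypothetical names chosen for this example, and trimming the input first is just one possible way to avoid an empty leading element.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitRegexDemo {
    // Hypothetical helper: splits a sentence on runs of whitespace.
    // trim() removes leading whitespace that would otherwise yield an empty first element.
    static String[] words(String sentence) {
        return sentence.trim().split("\\s+");
    }

    // Hypothetical helper: treats the delimiter as literal text, not as a regex.
    static String[] splitLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // A single-space delimiter yields an empty string between two adjacent spaces
        System.out.println(Arrays.toString("qwerty  uiop".split(" ")));   // [qwerty, , uiop]

        // "\\s+" matches one or more whitespace characters as a single delimiter
        System.out.println(Arrays.toString(words("  qwerty   uiop asdfghjkl ")));
        // [qwerty, uiop, asdfghjkl]

        // An unescaped "." matches every character, so only empty strings remain
        // and all of them are discarded as trailing empties
        System.out.println("a.b.c".split(".").length);                    // 0

        // Escaping the dot, or quoting the delimiter, splits on the literal character
        System.out.println(Arrays.toString("a.b.c".split("\\.")));        // [a, b, c]
        System.out.println(Arrays.toString(splitLiteral("a.b.c", ".")));  // [a, b, c]
    }
}
```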