Question

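For some strange reason, when I try to fetch a web page's source using URLConnection, I get a "null" at the start of the output. Could someone shed some light on this?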

My method:

public String getPageSource()
        throws IOException
{
    URL url = new URL( this.getUrl().contains( "http://" ) ? this.getUrl() : "http://" + this.getUrl() );
    URLConnection urlConnection = url.openConnection();

    BufferedReader br = new BufferedReader( new InputStreamReader( urlConnection.getInputStream(), "UTF-8" ) );

    String source = null;
    String line;

    while ( ( line = br.readLine() ) != null )
    {
        source += line;
    }

    return source;
}

How I call it:

public static void main( String[] args )
        throws IOException
{
    WebPageUtil wpu = new WebPageUtil( "www.something.com" );

    System.out.println( wpu.getPageSource() );
}

The WebPageUtil constructor:

public WebPageUtil( String url )
{
    this.url = url;
}

The output always looks like this:

null<html><head>... //and then the rest of the source code, which is scraped correctly

Nothing difficult, right? But where does that damned "null" come from?!

Thanks for your advice!

Answer 1

You are initializing the String source to null, so on the first concatenation in the while loop its value is converted to the string literal "null".
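To see the effect in isolation, here is a minimal sketch of the same pattern without any networking. The class name NullConcatDemo and the helper appendToNull are made up for illustration and are not part of the question's code.

```java
public class NullConcatDemo {

    // Reproduces the pattern from the question: start from null, then use +=.
    // (Hypothetical helper, for illustration only.)
    static String appendToNull(String first) {
        String source = null;
        source += first; // string conversion turns the null reference into "null"
        return source;
    }

    public static void main(String[] args) {
        System.out.println(appendToNull("<html>")); // prints "null<html>"
    }
}
```

The += operator never throws here; Java's string concatenation simply renders a null reference as the four characters "null".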

Use an empty String instead:

String source = "";

or, better, use a StringBuilder.
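A rough sketch of the StringBuilder variant might look like the following. The class name PageSourceSketch and the method readAll are assumptions for illustration, and a StringReader stands in for the InputStreamReader around urlConnection.getInputStream() so the example runs without a network connection.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class PageSourceSketch {

    // Reads every line from the reader and joins them, mirroring the loop in the question.
    // Like the original, it does not restore the line terminators that readLine() strips.
    // (Hypothetical helper, for illustration only.)
    static String readAll(Reader reader) throws IOException {
        StringBuilder source = new StringBuilder(); // starts empty, so there is no "null" prefix
        try (BufferedReader br = new BufferedReader(reader)) {
            String line;
            while ((line = br.readLine()) != null) {
                source.append(line);
            }
        }
        return source.toString();
    }

    public static void main(String[] args) throws IOException {
        // StringReader is a stand-in for the stream obtained from the URLConnection.
        String page = readAll(new StringReader("<html>\n<head></head>\n</html>"));
        System.out.println(page); // prints "<html><head></head></html>"
    }
}
```

Besides avoiding the "null" prefix, StringBuilder appends to one buffer instead of creating a new String on every loop iteration, which matters for larger pages. The try-with-resources block also closes the reader, which the original method does not do.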
