How do I correctly download text from a website and put it into a .txt file?

Time: 2018-06-17 15:47:14

Tags: java text web connection printwriter

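I'm trying to download all the text from a website and put it into a text file. I have something like this, but it only downloads the first sentence instead of all the text. Can anyone tell me what I'm doing wrong? Thanks for your help, and sorry for my English; I'm a beginner.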

    Connection connect = Jsoup.connect("http://www.onet.pl/");

    try {
        Document document = connect.get();
        Elements links = document.select("span.title");
        PrintWriter out = new PrintWriter("popular_words.txt");

        for (Element elem : links) {
            if (elem.hasText()) {
                out.append(elem.text());
            }
            out.close();
        }
    } catch (Exception e) {
        e.printStackTrace();
    }


}

2 Answers:

Answer 0: (score: 1)

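Close the PrintWriter outside the loop; otherwise it is closed after the first iteration. PrintWriter does not throw on writes after close() (it only sets an internal error flag that checkError() reports), so every later append is silently dropped, which is why only the first element ends up in the file.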

    Connection connect = Jsoup.connect("http://www.onet.pl/");

    try {
        Document document = connect.get();
        Elements links = document.select("span.title");
        PrintWriter out = new PrintWriter("popular_words.txt");

        for (Element elem : links) {
            if (elem.hasText()) {
                out.append(elem.text());
            }
        }
        // Close once, after every element has been written.
        out.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
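
One way to make the "close once, after the loop" rule hard to get wrong is try-with-resources, which closes the writer automatically when the block exits. The sketch below is only an illustration under assumptions: the class name `TitleWriter` and the method `writeTitles` are hypothetical, and a plain list of strings stands in for the `elem.text()` values so that the example runs without Jsoup or a network connection. It also writes each entry on its own line, which the code above does not do.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class TitleWriter {

    // Writes every non-empty text on its own line. The writer is closed
    // automatically when the try block exits, even if an exception is thrown.
    static void writeTitles(List<String> titles, Path target) throws IOException {
        try (PrintWriter out = new PrintWriter(
                Files.newBufferedWriter(target, StandardCharsets.UTF_8))) {
            for (String title : titles) {
                if (!title.isEmpty()) {
                    out.println(title);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data standing in for the scraped element texts.
        Path target = Path.of("popular_words.txt");
        writeTitles(List.of("first title", "", "second title"), target);
        // Prints [first title, second title]
        System.out.println(Files.readAllLines(target, StandardCharsets.UTF_8));
    }
}
```

In the original program, the loop over `links` would call `out.println(elem.text())` inside the same try-with-resources block.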

Answer 1: (score: 1)

Move this line (presumably `out.close();`) outside the if statement. If that still does not work, check your selection criteria by printing the number of selected elements; an empty or too-narrow selection may be the cause.
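
Checking the selection can be sketched as follows. This is a minimal example under assumptions: the class `SelectorCheck`, the method `countMatches`, and the inline HTML are hypothetical stand-ins, and the markup is parsed from a string with `Jsoup.parse` rather than fetched from the real page, so the count shown says nothing about what `http://www.onet.pl/` actually returns.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class SelectorCheck {

    // Returns how many elements in the given HTML match the CSS selector.
    static int countMatches(String html, String cssSelector) {
        Document document = Jsoup.parse(html);
        Elements matches = document.select(cssSelector);
        return matches.size();
    }

    public static void main(String[] args) {
        // Hypothetical stand-in markup for the downloaded page.
        String html = "<div><span class=\"title\">One</span>"
                + "<span class=\"title\">Two</span>"
                + "<span class=\"other\">Three</span></div>";
        // Prints "Selected elements: 2"
        System.out.println("Selected elements: " + countMatches(html, "span.title"));
    }
}
```

In the original program, the equivalent check would be printing `links.size()` right after `document.select("span.title")`; a result of 0 or 1 would point at the selector rather than at the file writing.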