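BeautifulSoup and CSV: delimiter after every character

Time: 2019-02-14 10:37:54

Tags: python csv beautifulsoup

I want to scrape a Wikipedia page and write all h2 headings, without their tags, to a CSV file. I thought this would be a simple beginner task.

My problem is that a semicolon is inserted after every character in the CSV.

My code: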

from bs4 import BeautifulSoup
import requests
import csv

url = "https://de.wikipedia.org/wiki/%C3%84gypten"
r = requests.get(url).content


soup = BeautifulSoup(r, 'lxml')

for h2 in soup.find_all('h2'):
    # Output is okay
    print(h2.get_text())

    with open('Daten/Test.csv', mode='a') as csv_file:
        write_h2 = csv.writer(csv_file, delimiter=';')
        write_h2.writerow(h2)

The output in the CSV looks like this:

I;n;h;a;l;t;s;v;e;r;z;e;i;c;h;n;i;s

Ü;b;e;r;b;l;i;c;k

L;a;n;d;e;s;n;a;m;e

G;e;o;g;r;a;p;h;i;e

B;e;v;ö;l;k;e;r;u;n;g

G;e;s;c;h;i;c;h;t;e

P;o;l;i;t;i;k

M;i;l;i;t;ä;r

V;e;r;w;a;l;t;u;n;g;s;g;l;i;e;d;e;r;u;n;g

S;o;z;i;a;l;e; ;L;a;g;e

W;i;r;t;s;c;h;a;f;t

T;o;u;r;i;s;m;u;s; ;u;n;d; ;V;e;r;k;e;h;r

K;u;l;t;u;r

L;i;t;e;r;a;t;u;r

W;e;b;l;i;n;k;s

E;i;n;z;e;l;n;a;c;h;w;e;i;s;e

N;a;v;i;g;a;t;i;o;n;s;m;e;n;ü

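I'm very new to programming, so I'd appreciate answers that are easy for a beginner to understand.

The console output looks fine.

2 answers:

Answer 0: (score: 0)

writerow takes a list as input, where each element becomes one field of the row. You therefore have to pass a list of strings; if you pass a single string, it is treated as a list of characters and every character ends up in its own field.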
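The difference can be seen in a minimal sketch that writes to an in-memory buffer instead of a file. The helper name `row_as_written` and the sample heading are only illustrative and do not come from the question:

```python
import csv
import io

def row_as_written(row):
    """Return the CSV text that csv.writer produces for one writerow() call."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=';')
    writer.writerow(row)
    return buffer.getvalue()

# A bare string is iterated character by character: one field per character.
split_output = row_as_written("Politik")

# A one-element list yields a single field containing the whole string.
whole_output = row_as_written(["Politik"])

print(repr(split_output))
print(repr(whole_output))
```

The first call reproduces the symptom from the question, while the second writes the heading as one field.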


Answer 1: (score: 0)

You need to pass the string h2.get_text() wrapped in a list.

So you have to replace the last lines with:

with open('Daten/Test.csv', mode='a') as csv_file:
    write_h2 = csv.writer(csv_file, delimiter=';')
    write_h2.writerow([h2.get_text()])
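As a possible refinement, the file could be opened once and all headings written inside a single with block, rather than reopening it in append mode for every heading. The sketch below assumes the heading texts have already been collected into a list; the function name `write_headings`, the sample list, and the output file name are hypothetical. Note that it uses mode='w', which overwrites the file, whereas the question's code appends.

```python
import csv

def write_headings(headings, path):
    """Write each heading as its own single-field row of a semicolon-delimited CSV."""
    # newline='' lets the csv module control the line endings itself;
    # an explicit encoding keeps umlauts intact regardless of platform defaults.
    with open(path, mode='w', newline='', encoding='utf-8') as csv_file:
        writer = csv.writer(csv_file, delimiter=';')
        for heading in headings:
            # Wrap the string in a list so it is written as one field.
            writer.writerow([heading])

# Hypothetical sample data standing in for the scraped h2 texts.
sample_headings = ["Überblick", "Soziale Lage", "Tourismus und Verkehr"]
write_headings(sample_headings, "headings_example.csv")
```

With the scraper from the question, the list could be built from h2.get_text() for each h2 found by soup.find_all('h2').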