麻烦写BeautifulSoup输出到文件

时间:2014-11-18 20:05:29

标签: python python-2.7 web-scraping beautifulsoup

Hello stackoverflow社区!

我还在学习Python编码的细节,所以请原谅我即将发布的代码。

我目前正在尝试编写一个脚本,该脚本将使用BS4从http://kat.ph中删除最新的媒体种子列表并将其保存到文件中。但是,我无法从BS4输出打印到此文件。当您打开文本文件时,它是空白的,但是当您在终端中运行脚本时,它可以正常工作。最后,我想让python在一封电子邮件中发送bs4输出(这是我最初遇到这个问题的地方,并决定看看我是否可以写入.txt文件)。

我目前没有在家用电脑上制作的脚本,但是我重新创建了另一个我做了同样事情的脚本。

非常感谢任何帮助/建议!

from bs4 import BeautifulSoup
import requests
import time

#The goal of this script was to scrape the names of the latest media torrents and  write them to a text file.
#When I run the script on my computer, I can see the prompt give me the list of torrents just fine.
#When I try to write to a file or send an email, it doesn't print anything.

req = requests.get('http://kat.ph')

site = req.text

soup = BeautifulSoup(site) #Tried making this 'soup = str(BeautifulSoup(site)) to no avail.

def writingFunction():
    #I imported time module because I had my script display the time and date here.
    counter = 1
    for i in soup.find_all('div', {'class': 'markeredBlock torType filmType'}):

        print str(counter) + '.' + ' ' + i.text
        counter = counter + 1

textFile = open('C:/python27/file.txt', 'a')
textFile.write(writingFunction()) #I've tried making this a str and I've also tried  assigning the function to a variable
textFile.close()

1 个答案:

答案 0 :(得分:1)

只需写入函数,目前只是打印输出:

def writing_function():
    with open('C:/python27/file.txt', 'a') as f:
        for ind, i in enumerate(soup.find_all('div', {'class': 'markedBlock torType filmType'}),1):
            f.write("{}. {}".format(ind, i.text.replace("\n","")))

print str(counter) + '.' + ' ' + i.content.replace('\n', '')正在打印每一行,你没有返回任何内容,因此textFile.write(writingFunction())是徒劳的,只需在函数中写一下。

如果您不想对文件名进行硬编码,请将其作为参数传递:

def writing_function(my_file):
    with open(my_file, 'a') as f:

使用enumerate开头索引1将执行您的计数器变量正在执行的操作。

bs4中没有.content,如果您希望文本使用contents,则i.text是列表。