Question

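I have a bunch of URLs, and I want to download their text and do some further analysis. I am a Python newbie. I have two problems: (1) I get a very strange TypeError; (2) the results are not written to the data frame. My code is below: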

smallURL= ['http://www.walesonline.co.uk/business/business-news/more-70-jobs-created-bio-12836127','http://economictimes.indiatimes.com/articleshow/61006825.cms?utm_source=contentofinterest&utm_medium=text&utm_campaign=cppst','http://100seguro.com.ar/telefonica-pone-en-venta-su-aseguradora-antares-vida/','http://13wham.com/news/local/urmc-opens-newest-urgent-care-facility']

import pandas
import datetime


f = open('myfile', 'w')

#lista= ['http://www.walesonline.co.uk/business/business-news/more-70-jobs-created-bio-12836127','http://economictimes.indiatimes.com/articleshow/61006825.cms?utm_source=contentofinterest&utm_medium=text&utm_campaign=cppst','http://100seguro.com.ar/telefonica-pone-en-venta-su-aseguradora-antares-vida/','http://13wham.com/news/local/urmc-opens-newest-urgent-care-facility']

df = pandas.DataFrame(columns=('d', 'datetime', 'title', 'text','keywords', 'url'))

from newspaper import Article 

for index in range(len(smallURL)):

#url = "https://www.bloomberg.com/news/articles/2017-11-10/microsoft-and-google-turn-to-ai-to-catch-amazon-in-the-cloud"
    article = Article(smallURL[index])
#1 . Download the article
    #try:
    article.download()
    #f.write('article.title+\n')
    #except:
    #pass
#2. Parse the article
    try:
        article.parse()
        f.write(article.title + '\n')  # write the parsed title, not the literal text 'article.title+'
    except:
        pass
#Print article title
    #print(article.title)
    article.title
#3. Fetch Author Name(s)
    print(article.authors)
#4. Fetch Publication Date
    if article.publish_date is None:
        d = datetime.datetime.now().date()
    else:
        d = article.publish_date
#5. Print article text
    print(article.text)
#6. Natural Language Processing on Article to fetch Keywords
    #article.nlp()
    #Print Keywords
    print(article.keywords)
#7. Generate Summary of the article
    #print(article.url)
    print(article.url)
    df.loc[index]  = [d, datetime.datetime.now().date(), article.title, article.text,article.keywords,article.url]
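A note on problem (2): the loop above writes one row at a time with `df.loc[index] = [...]`, so an exception on any row stops the script before the remaining rows are stored. One alternative, shown here only as a minimal sketch, is to collect plain dicts in a list and build the DataFrame once after the loop. The helper `article_row` and the example.com values are hypothetical stand-ins for `article.title`, `article.text`, `article.keywords`, `article.url` and `article.publish_date`. Whether this alone avoids the TypeError has not been verified here.

```python
import datetime

import pandas

COLUMNS = ['d', 'datetime', 'title', 'text', 'keywords', 'url']


def article_row(title, text, keywords, url, publish_date=None):
    """Build one row as a dict keyed by column name (hypothetical helper)."""
    today = datetime.datetime.now().date()
    return {
        # Fall back to today's date when no publication date is known
        'd': publish_date if publish_date is not None else today,
        'datetime': today,
        'title': title,
        'text': text,
        # Store keywords as one string instead of a list inside a cell
        'keywords': ', '.join(keywords),
        'url': url,
    }


# Stand-in values; in the real loop these would come from the Article object.
rows = [
    article_row('First title', 'First text', ['jobs', 'bio'],
                'http://example.com/a', datetime.date(2017, 11, 10)),
    article_row('Second title', 'Second text', [], 'http://example.com/b'),
]

# Build the frame once, after all rows have been collected
df = pandas.DataFrame(rows, columns=COLUMNS)
print(df.shape)  # (2, 6)
```

Appending to a Python list inside the loop is cheap, and a failure on one URL can be caught without losing the rows gathered so far.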

My output includes:

[]
http://100seguro.com.ar/telefonica-pone-en-venta-su-aseguradora-antares-vida/
Traceback (most recent call last):

  File "", line 1, in
    runfile('C:/Users/theiman/Desktop/untitled7.py', wdir='C:/Users/theiman/Desktop')

  File "C:\Users\theiman\AppData\Local\Continuum\anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 710, in runfile
    execfile(filename, namespace)

  File "C:\Users\theiman\AppData\Local\Continuum\anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 101, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "C:/Users/theiman/Desktop/untitled7.py", line 57, in
    df.loc[index] = [d, datetime.datetime.now().date(), article.title, article.text, article.keywords, article.url]

  File "C:\Users\theiman\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexing.py", line 179, in __setitem__
    self._setitem_with_indexer(indexer, value)

  File "C:\Users\theiman\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexing.py", line 425, in _setitem_with_indexer
    self.obj._data = self.obj.append(value)._data

  File "C:\Users\theiman\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py", line 4533, in append
    other = other._convert(datetime=True, timedelta=True)

  File "C:\Users\theiman\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py", line 3472, in _convert
    copy=copy)).__finalize__(self)

  File "C:\Users\theiman\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals.py", line 3227, in convert
    return self.apply('convert', **kwargs)

  File "C:\Users\theiman\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals.py", line 3091, in apply
    applied = getattr(b, f)(**kwargs)

  File "C:\Users\theiman\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals.py", line 1892, in convert
    values = fn(values.ravel(), **fn_kwargs)

  File "C:\Users\theiman\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\dtypes\cast.py", line 740, in soft_convert_objects
    values = lib.maybe_convert_objects(values, convert_datetime=datetime)

  File "pandas/_libs/src\inference.pyx", line 1204, in pandas._libs.lib.maybe_convert_objects

TypeError: unhashable type: 'tzutc'

Any ideas about what went wrong and how to fix it? Thanks!!
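A possible reading of the traceback, offered as a guess rather than a confirmed diagnosis: `tzutc` is the UTC timezone class from `dateutil`, so the failing row probably holds a timezone-aware `article.publish_date`, and pandas fails while converting it during the `df.loc[index] = [...]` assignment. A minimal sketch of one workaround is to reduce the date to a plain, timezone-free `datetime.date` before storing it. The helper name `to_naive_date` is hypothetical, and the sketch uses the standard-library `datetime.timezone` in place of `dateutil` so that it runs on its own.

```python
import datetime


def to_naive_date(publish_date):
    """Return a plain datetime.date; fall back to today if the date is missing."""
    if publish_date is None:
        return datetime.datetime.now().date()
    if publish_date.tzinfo is not None:
        # Convert to UTC first, then drop the tzinfo object entirely
        publish_date = publish_date.astimezone(datetime.timezone.utc).replace(tzinfo=None)
    return publish_date.date()


# Stand-in for a timezone-aware publish_date (23:30 at UTC-5)
aware = datetime.datetime(
    2017, 11, 10, 23, 30,
    tzinfo=datetime.timezone(datetime.timedelta(hours=-5)),
)
print(to_naive_date(aware))  # 2017-11-11
```

In the loop from the question, this would replace the `if article.publish_date is None: ... else: ...` branch with `d = to_naive_date(article.publish_date)`, so that every value in column `d` has the same plain type.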

Python, newspaper, unhashable type: 'tzutc' and writing to a data frame

0 answers: