AttributeError: 'str' object has no attribute 'keys' when trying to use writerow

Time: 2019-05-19 19:15:56

Tags: python

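I am trying to write a Python scraper that scrapes data from a web page into a csv file.

I also tried changing the way I write the Python file. If I remove the lines dataFrameCleaned = cleanDataUp(dataFrame) and csvData(dataFrameCleaned), the code runs, but no data is written to the csv file.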

plt.show()
'''
write data to csv
'''
def csvData(dataFrame):
    with open('threads.csv', 'w+', newline='', encoding='utf8') as csvfile:
        fieldnames = ['post id', 'name', 'date of the post', 'post body']
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()
        for posts in dataFrame:
            writer.writerow(posts)
        print('file write complete')


'''
defaults
'''
if __name__ == "__main__":
    path = str(os.path.dirname(os.path.realpath(__file__)))+'/data/'
    reload(sys)
    fieldnames = ['post id', 'name', 'date of the post', 'post body']
    dataFrame = pd.DataFrame(columns=fieldnames)
    url = 'http://www.oldclassiccar.co.uk/forum/phpbb/phpBB2/viewtopic.php?t=12591'
    urlList = [url]

    soup = get_soup(url)

    while True:
        newUrlSuffix = getURL(soup)
        if newUrlSuffix == '':
            break
        newUrl = 'http://www.oldclassiccar.co.uk/forum/phpbb/phpBB2/' + newUrlSuffix
        print("Adding new URL to list..")
        urlList.append(newUrl)
        soup = get_soup(newUrl)

    for link in urlList:
        print("Getting data from URL:" + link+ '\n\n\n')
        dataFrameNew = extractData(link)
        dataFrame = pd.concat([dataFrame,dataFrameNew])
    dataFrameCleaned = cleanDataUp(dataFrame)
    csvData(dataFrameCleaned)

The cleanDataUp function:
def cleanDataUp(dataFrame):
    dataFrame = dataFrame.reset_index(drop=True).dropna()
    return dataFrame

1 answer:

Answer 0 (score: 1)

In writer.writerow(posts), where writer is a csv.DictWriter, the argument should be a dictionary, for example:

writer.writerow({'first_name': 'Baked', 'last_name': 'Beans'})

But as the error says, posts is a string rather than a dictionary, hence the error AttributeError: 'str' object has no attribute 'keys'.
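To make the difference concrete, here is a minimal sketch that reproduces the error with nothing but the standard library. The sample row and the in-memory buffer (used in place of threads.csv) are assumptions for illustration, not part of the original script.

```python
import csv
import io

fieldnames = ['post id', 'name', 'date of the post', 'post body']
buffer = io.StringIO()  # in-memory stand-in for threads.csv (assumption)
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()

# A dict keyed by the field names is written as one CSV row.
writer.writerow({'post id': '1', 'name': 'alice',
                 'date of the post': '2019-05-19', 'post body': 'hello'})

# A plain string has no .keys() method, so DictWriter cannot handle it.
try:
    writer.writerow('post id')
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
print(buffer.getvalue())
```

The failed call writes nothing, so the buffer ends up holding only the header and the one valid row.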

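Perhaps your cleanDataUp(dataFrame) returns a list of strings, whereas you want a list of dictionaries. Check that function to make sure it returns the right output to pass to csvData().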
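One likely cause, given that cleanDataUp in the question returns a pandas DataFrame: iterating over a DataFrame directly yields its column labels, which are strings. The sketch below shows that behaviour and one possible way around it, converting the rows to dictionaries with to_dict('records') before writing. The sample rows and the helper name write_records are hypothetical, and the buffer again stands in for the real file.

```python
import csv
import io

import pandas as pd

fieldnames = ['post id', 'name', 'date of the post', 'post body']

# Hypothetical sample rows standing in for the scraped posts.
frame = pd.DataFrame(
    [
        {'post id': '1', 'name': 'alice',
         'date of the post': '2019-05-19', 'post body': 'hello'},
        {'post id': '2', 'name': 'bob',
         'date of the post': '2019-05-20', 'post body': 'hi'},
    ],
    columns=fieldnames,
)

# Iterating over a DataFrame yields column labels (strings), not rows.
labels = list(frame)

# to_dict('records') returns one dict per row, which DictWriter accepts.
records = frame.to_dict('records')


def write_records(rows, csvfile):
    """Write a list of row dicts to an open text file (hypothetical helper)."""
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


buffer = io.StringIO()  # stand-in for open('threads.csv', 'w', newline='')
write_records(records, buffer)

print(labels)
print(buffer.getvalue())
```

If no per-row processing is needed, calling the DataFrame's own to_csv method with index=False may be simpler than going through csv.DictWriter at all.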