HTTP error when trying to insert a CSV file from a URL into MongoDB

Date: 2021-04-21 14:47:46

Tags: python mongodb csv http pymongo

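I am trying to send data from a CSV file to MongoDB using a Python insert script. I have to fetch the file directly from a URL; it is not stored locally. I am using pymongo together with pandas' read_csv for the insert, but I get "HTTP Error 500: Internal Server Error". I suspect this is related to the encoding or the headers. I have tried several combinations, but none of them worked. The code is below: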

try:
    import pymongo
    from pymongo import MongoClient
    import pandas as pd
    import json
except Exception as e:
    print("Some Modules are Missing ")

import requests
import urllib.request
from urllib.error import HTTPError


class MongoDB(object):

    def __init__(self, dBName=None, collectionName=None):

        self.dBName = dBName
        self.collectionName = collectionName

        #self.client = MongoClient("localhost", 27017)
        self.client = MongoClient("<connection_string>")

        self.DB = self.client[self.dBName]
        self.collection = self.DB[self.collectionName]



    def InsertData(self, path=None):

        df = pd.read_csv(path, sep=";", encoding='UTF-8', header="infer")
        data = df.to_dict('records')

        self.collection.insert_many(data, ordered=False)
        print("All the Data has been Exported to Mongo DB Server .... ")

if __name__ == "__main__":
    mongodb = MongoDB(dBName = 'vacinacao-covid', collectionName='teste')
    mongodb.InsertData(path = "https://www.saopaulo.sp.gov.br/wp-content/uploads/2021/04/20210420_percentual_primeira_dose.csv")

1 answer:

Answer 0 (score: 0)

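UTF-8 did not work for me, so I used latin-1 instead. I also downloaded the file with requests, sending a browser-like User-Agent header, and passed the downloaded bytes to read_csv.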

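import pandas as pd
import requests
from io import BytesIO


# Send a browser-like User-Agent header with the request.
user_agent = {'User-agent': 'Mozilla/5.0'}
path = "https://www.saopaulo.sp.gov.br/wp-content/uploads/2021/04/20210420_percentual_primeira_dose.csv"
r = requests.get(path, headers=user_agent)
r.raise_for_status()  # stop here if the server returned an HTTP error status
# Wrap the downloaded bytes in a file-like object and read them as latin-1,
# since UTF-8 did not work for this file.
f = BytesIO(r.content)
df = pd.read_csv(f, sep=";", encoding='latin-1', header="infer")
print(df)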
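
To connect this back to the insert step in the question, here is a minimal sketch of the same idea (explicit User-Agent header, latin-1 decoding, ";" separator) followed by an insert_many call. It deliberately uses only the standard library (urllib.request and csv) in place of requests and pandas, so every value stays a string instead of being type-inferred as pandas would do. The function names fetch_csv_bytes, csv_bytes_to_records and insert_records are hypothetical helpers, not part of the original code, and the collection argument is assumed to be a pymongo collection such as the self.collection object in the question.

```python
import csv
import io
import urllib.request


def fetch_csv_bytes(url, user_agent="Mozilla/5.0", timeout=30):
    """Download `url` with an explicit User-Agent header and return raw bytes.

    Hypothetical helper: the header value mirrors the one used in the answer.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def csv_bytes_to_records(raw, sep=";", encoding="latin-1"):
    """Decode CSV bytes and return one dict per row, keyed by the header line.

    Assumption: the file is latin-1 encoded and ';'-separated, as in the answer.
    All values are returned as strings.
    """
    text = raw.decode(encoding)
    reader = csv.DictReader(io.StringIO(text, newline=""), delimiter=sep)
    return [dict(row) for row in reader]


def insert_records(collection, records):
    """Insert the records into a pymongo-style collection.

    Returns the number of inserted documents. insert_many rejects an empty
    list, so that case is handled separately.
    """
    if not records:
        return 0
    result = collection.insert_many(records, ordered=False)
    return len(result.inserted_ids)
```

With the MongoDB class from the question, the three steps could be chained as insert_records(mongodb.collection, csv_bytes_to_records(fetch_csv_bytes(path))). Keeping the download separate from the parsing also makes it easier to tell whether a failure comes from the HTTP request or from the encoding.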