Question

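I am building a Django web application that includes a view for uploading data into the database via CSV import. Each import contains roughly 2,000 rows and 9 columns of DecimalFields and CharFields. Until now I have been using Django's SQLite database, and each upload took at most 1 minute. I switched to PostgreSQL (hosted via ElephantSQL), and now an upload takes at least 10 minutes. I have read in some posts that SQLite is faster than PostgreSQL, but I did not expect the difference to be this large. Is there a way to speed up the upload process in PostgreSQL? I assume one reason for the low speed might be that I am using ElephantSQL's free Tiny Turtle plan, but if I understand correctly, the paid plans differ only in the maximum database size, not in database speed? See also https://www.elephantsql.com/plans.html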

Could installing PostgreSQL locally instead of using a cloud provider be a solution? Is there anything else I can optimize to speed up the process?

My model:

class Testdata3(models.Model):
    key = models.CharField(max_length=100, primary_key=True)
    mnemonic = models.CharField(max_length=50)
    assetclass = models.CharField(max_length=50)
    value = models.DecimalField(max_digits=255, decimal_places=25)
    performance = models.DecimalField(max_digits=255, decimal_places=25)
    performance_exccy = models.DecimalField(max_digits=255, decimal_places=25)
    performance_abs = models.DecimalField(max_digits=255, decimal_places=25)
    performance_abs_exccy = models.DecimalField(max_digits=255, decimal_places=25)
    date = models.DateField()

    def __str__(self):
        return self.key

My view:

import csv
import datetime
import io
import re

from django.contrib import messages
from django.shortcuts import render

# Assumes the model lives in this app's models.py
from .models import Testdata3


def file_upload(request):
    template = "upload.html"
    prompt = {
        'order': 'Order of the CSV should be "placeholder_1", "placeholder_2", "placeholder_3" '
    }

    if request.method == "GET":
        return render(request, template, prompt)

    csv_file = request.FILES['file']

    # Reject non-CSV uploads instead of falling through to the import
    if not csv_file.name.endswith('.csv'):
        messages.error(request, 'This is not a csv file')
        return render(request, template, prompt)

    data_set = csv_file.read().decode('UTF-8')

    io_string = io.StringIO(data_set)

    # Skip the header row
    next(io_string, None)

    for column in csv.reader(io_string, delimiter=';', quotechar="|"):
        # Skip empty CSV rows
        if all(elem == "" for elem in column):
            continue

        # One call per row: each call issues its own queries against the database
        Testdata3.objects.update_or_create(
            key=column[0],
            defaults={
                # Get everything after the date part in the primary key
                'mnemonic': re.findall(r'AMCS#[0-9]*(.*)', column[0])[0],
                # Parse the date string into a date object
                'date': datetime.datetime.strptime(column[6], '%d/%m/%Y').date(),
                'assetclass': column[10],
                'value': column[16],
                'performance': column[19],
                'performance_abs': column[20],
                'performance_abs_exccy': column[30],
                'performance_exccy': column[31],
            },
        )

    context = {}
    return render(request, template, context)

Answer 1

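I don't think so. My guess is that the problem lies with your service provider, or that the CSV file you are importing is very large. I use AWS RDS with Postgres, and it is fast enough. This has nothing to do with SQLite vs. Postgres. It could also come down to disk I/O speed, which can be high on SSDs and high-end machines.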
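One thing that would fit the "provider, not engine" explanation: the view calls update_or_create once per row, and each call issues at least a SELECT followed by an INSERT or UPDATE. With SQLite those queries hit a local file; with a hosted database every one of them is a network round trip, so roughly 2,000 rows can turn into several thousand round trips. Below is a minimal sketch of sending the rows in a few batched upserts instead. It assumes Django 4.1 or newer (where bulk_create accepts update_conflicts), and the helper names parse_rows and bulk_upsert are made up for illustration. I have not measured this against ElephantSQL, so treat it as something to try rather than a guaranteed fix.

```python
import csv
import datetime
import io
import re
from decimal import Decimal

# Columns refreshed when a row with the same key already exists.
UPDATE_FIELDS = [
    "mnemonic", "date", "assetclass", "value", "performance",
    "performance_abs", "performance_abs_exccy", "performance_exccy",
]


def parse_rows(text):
    """Turn CSV text into a list of field dicts, skipping the header and empty rows."""
    reader = csv.reader(io.StringIO(text), delimiter=";", quotechar="|")
    next(reader, None)  # header row
    rows = []
    for column in reader:
        if all(elem == "" for elem in column):
            continue
        rows.append({
            "key": column[0],
            "mnemonic": re.findall(r"AMCS#[0-9]*(.*)", column[0])[0],
            "date": datetime.datetime.strptime(column[6], "%d/%m/%Y").date(),
            "assetclass": column[10],
            "value": Decimal(column[16]),
            "performance": Decimal(column[19]),
            "performance_abs": Decimal(column[20]),
            "performance_abs_exccy": Decimal(column[30]),
            "performance_exccy": Decimal(column[31]),
        })
    return rows


def bulk_upsert(model, rows, batch_size=500):
    """Hypothetical helper: insert-or-update all rows in a few statements.

    `model` is expected to be a Django model class such as Testdata3.
    Assumes Django >= 4.1 for the update_conflicts arguments.
    """
    # Keep only the last row per key, since one upsert statement
    # should not touch the same key twice.
    unique_rows = {row["key"]: row for row in rows}
    objs = [model(**row) for row in unique_rows.values()]
    model.objects.bulk_create(
        objs,
        batch_size=batch_size,
        update_conflicts=True,
        unique_fields=["key"],
        update_fields=UPDATE_FIELDS,
    )
    return len(objs)
```

In the view, the loop would then shrink to something like bulk_upsert(Testdata3, parse_rows(data_set)). With a batch size of 500, about 2,000 rows go out in around four statements rather than one or more queries per row.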

Is PostgreSQL (via ElephantSQL) a much slower database than Django's SQLite, and what can be done about it?
