How to validate two or more CSV files in Python Flask

Time: 2018-12-29 06:27:01

Tags: python csv flask

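Before downvoting this question, please ask me about anything that is unclear. Hi everyone, I have a data-generation program that performs a lot of computation, so I can't paste the whole thing here and will only describe the relevant part. All of the program's computation starts with reading files. When I select multiple CSV files through the "Choose File" option on the web page, I need to validate that all the CSV files have the same number of columns and that the column header names match as well. The program I wrote looks like this: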

from flask import Flask, render_template, request, redirect
import os
import csv
import pandas as pd
import numpy as np

app = Flask(__name__)

APP_ROOT = os.path.dirname(os.path.abspath(__file__))


@app.route("/")
def index():
    print("Loading the root file")
    return render_template("upload.html")


@app.route("/upload", methods=['POST'])
def upload():
    target = os.path.join(APP_ROOT, 'input/')
    print("target-", target)

    if not os.path.isdir(target):
        os.mkdir(target)

    for file in request.files.getlist("source_fileName"):
        print("file-", file)
        filename = file.filename
        print("filename-", filename)

        destination = "/".join([target, filename])
        print("destination-", destination)
        file.save(destination)
        print("file>", file)
        global tempFile
        tempFile = destination  # overwritten on every pass, so only the last file is kept
        print("tempFile - " + tempFile)
    return redirect("/compute")


@app.route("/compute")
def compute():
    readerForRowCheck = pd.read_csv(tempFile)
    # Note: iterating a DataFrame yields its column names (strings), not its rows
    for row in readerForRowCheck:
        if len(row) != 8:
            return render_template("Incomplete.html")

        headerColumn1 = row[0]
        headerColumn2 = row[1]
        headerColumn3 = row[2]
        headerColumn4 = row[3]
        headerColumn5 = row[4]
        headerColumn6 = row[5]
        headerColumn7 = row[6]
        headerColumn8 = row[7]

        if (headerColumn1 != "Asset_Id") or (headerColumn2 != "Asset Family") \
                or (headerColumn3 != "Asset Name") or (headerColumn4 != "Location") or (headerColumn5 != "Asset Component") \
                or (headerColumn6 != "Keywords") or (headerColumn7 != "Conditions") or (headerColumn8 != "Parts"):
            return render_template("incomplete.html")
    # ... and so on; after this the function goes on to perform other tasks

HTML template:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title> upload </title>
</head>
<body>
<div class="container">
    <h1>Large Data Generation</h1> 
<form id = "upload-form" action="{{ url_for('upload') }}" method="POST" enctype="multipart/form-data">
        <div id="file-selector">
            <p> 
                <strong>Source File: </strong>
                <input id="source_fileName" type="file" name="source_fileName" accept=".csv" multiple />
            </p> 
        </div>
    <input type="submit" value="Generate Data" id="upload-button"  >
</form>
</div>
</body>
</html>

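Note: **I have only included the important lines of code; the full program contains much more.** I would like to know how to validate that the CSV files all have the same number of columns and the same column names. I know my validation while reading the CSV files is incorrect, which is why I am here. Please help me. Thank you.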

1 Answer:

Answer 0: (Score: 1)

If you have multiple files, you need to create a DataFrame instance for each file. The upload function would look like this:

# requires: from flask import request, redirect, url_for
def upload():
    target = os.path.join(APP_ROOT, 'input/')
    print("target-", target)
    if not os.path.isdir(target):
        os.mkdir(target)
    abs_path_files = []
    for file in request.files.getlist("source_fileName"):
        print("file-", file)
        filename = file.filename
        print("filename-", filename)
        destination = "/".join([target, filename])
        print("destination-", destination)
        file.save(destination)
        print("file>", file)
        tempFile = os.path.abspath(destination)
        abs_path_files.append(tempFile)  # collect every saved path, not just the last one
        print("tempFile - " + tempFile)
    return redirect(url_for("compute", files_list=abs_path_files))
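
If you would rather reject a bad file before it is written to disk, the header can be read straight from the uploaded stream. The following is only a minimal sketch: `read_header` is a hypothetical helper name, it assumes the upload is UTF-8 (with or without a BOM), and it assumes the header row contains no quoted line breaks. In Flask the argument would be `file.stream` for each item from `request.files.getlist(...)`.

```python
import csv
import io


def read_header(binary_stream, encoding="utf-8-sig"):
    """Return the column names from the first line of a binary CSV stream.

    The stream is rewound afterwards so that a later save still writes
    the complete file.
    """
    first_line = binary_stream.readline().decode(encoding)
    binary_stream.seek(0)
    # csv.reader accepts any iterable of lines, so a one-item list is enough
    return next(csv.reader([first_line]), [])


if __name__ == "__main__":
    # Stand-in for an uploaded file; the leading bytes are a UTF-8 BOM
    upload = io.BytesIO(b"\xef\xbb\xbfAsset_Id,Asset Family\r\n1,pump\r\n")
    print(read_header(upload))
```

Because the stream position is reset to the start, calling `file.save(destination)` afterwards behaves as it did before.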

The compute function would look like this:

def compute(files_list):
    dataFrames = []
    for f in files_list:
        dataFrame = pd.read_csv(f)
        dataFrames.append(dataFrame)
    # Join each file's column names into one string; identical headers collapse into a single set entry
    col_in_files = set([",".join(list(df.columns.values)) for df in dataFrames])
    if len(col_in_files) == 1:
        # then process your data here
        pass
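
The set comparison tells you whether the headers differ, but not which file is at fault, and it does not check the names against the eight columns listed in the question. Here is a rough sketch of one way to do both using only the standard `csv` module. `find_bad_headers` and `EXPECTED_HEADER` are hypothetical names, the column list is copied from the question's code, and the files are assumed to be UTF-8.

```python
import csv

# Column names taken from the question's validation code
EXPECTED_HEADER = [
    "Asset_Id", "Asset Family", "Asset Name", "Location",
    "Asset Component", "Keywords", "Conditions", "Parts",
]


def find_bad_headers(paths, expected=EXPECTED_HEADER):
    """Return {path: header} for every file whose header row differs from `expected`."""
    bad = {}
    for path in paths:
        # utf-8-sig drops a leading BOM, which would otherwise stick to the first name
        with open(path, newline="", encoding="utf-8-sig") as handle:
            header = next(csv.reader(handle), [])  # [] for an empty file
        if header != list(expected):
            bad[path] = header
    return bad
```

An empty result means every file has the expected number of columns, with the expected names, in the expected order. Otherwise the returned dictionary can be passed to the error template so the user sees which upload to fix. Only the first row of each file is read, so this stays cheap even for large files.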