Error when trying to stream a file through csv.DictReader

Asked: 2019-07-17 20:54:05

Tags: python-3.x csv flask

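In my Flask app I have the following route, in which I try to stream an uploaded TSV file (flask.request.files['file']) and then iterate over its contents. The code below worked fine in Python 2, but now that I have switched to Python 3 I get this error: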

  File "/Users/cdastmalchi/Desktop/author_script/main.py", line 89, in process_file
    for line in contents:
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/csv.py", line 111, in __next__
    self.fieldnames
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/csv.py", line 98, in fieldnames
    self._fieldnames = next(self.reader)
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

I also tried the following, but it fails with TypeError: expected str, bytes or os.PathLike object, not FileStorage:

io.open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True)

Question: how do I correctly stream the uploaded file so that I can iterate over each line?

Flask code

@app.route('/process_file', methods=['POST'])
def process_file():

    # Run checks on the file
    if 'file' not in flask.request.files or not flask.request.files['file'].filename:
        return flask.jsonify({'result':'False', 'message':'no files selected'})
        return flask.redirect(url_for('home'))
    file = flask.request.files['file']

    # Stream file and check that places exist
    contents = csv.DictReader(file, delimiter='\t')
    for line in contents:
        print(line)
    return None

Update

I have also tried the following variants using StringIO, but they either come back empty or produce the same _csv.Error: iterator should return strings, not bytes (did you open the file in text mode?) error:

contents = csv.DictReader(str(file.read()), delimiter='\t')

contents = csv.DictReader(io.StringIO(str(file.read())), delimiter='\t')

contents = csv.DictReader(io.StringIO(file.read()), delimiter='\t')

contents = csv.DictReader(file.read().splitlines(), delimiter='\t')

csv_io = io.StringIO(str(file.read()))
csv_io.seek(0)

contents = csv.DictReader(csv_io, delimiter='\t')
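
For reference, a minimal sketch of why wrapping the result of read() in str() does not help: in Python 3, str() applied to bytes returns the printable representation (including the b'...' prefix) rather than the decoded text, whereas decode() returns real text. The sample bytes and the UTF-8 encoding here are assumptions for illustration, not taken from the actual upload.

```python
import csv
import io

# Hypothetical stand-in for the bytes returned by file.read() on an upload.
raw = b"name\tplace\nAda\tLondon\n"

# str() on bytes gives the repr, so tabs and newlines become literal
# backslash escapes and the csv module sees a single odd-looking line.
as_repr = str(raw)

# decode() gives the actual text; UTF-8 is an assumption about the file.
as_text = raw.decode("utf-8")

rows = list(csv.DictReader(io.StringIO(as_text), delimiter="\t"))
```

Note also that read() consumes the stream, so calling it a second time on the same upload returns empty bytes, which may explain the empty results.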

1 answer:

Answer 0 (score: 0)

I ran into the same problem and got the same error. I'm not sure whether this will help you, but I wanted to add it here just in case.

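I believe my problem was that the file contents were still encoded, that is, read() returned bytes rather than str. I was able to fix this by calling .decode() on the data read from the file.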

Here is my code:

f = request.files['data_file']
file_str = f.read()  # bytes in Python 3

data = []
# Below is the decode() that fixed the error
for row in csv.DictReader(file_str.decode().splitlines(), skipinitialspace=True):
    data.append(dict(row))

This code converts my uploaded .csv file into a list of dictionaries, with each row as its own item in the list. Hope this helps!
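
The code above reads the whole upload into memory before decoding. If the goal is to stream it line by line, one possible sketch is to decode lazily with codecs.iterdecode from the standard library and pass the result to csv.DictReader. The helper name iter_tsv_rows, the UTF-8 encoding, and the io.BytesIO stand-in for the uploaded file are all assumptions made for illustration; in a Flask view the argument would presumably be flask.request.files['file'] or its .stream attribute.

```python
import codecs
import csv
import io


def iter_tsv_rows(binary_lines, encoding="utf-8"):
    """Yield one dict per TSV row from an iterable of bytes lines.

    Hypothetical helper: the name and the default encoding are assumptions.
    """
    # iterdecode decodes each chunk as it is consumed, so the file is
    # never loaded into memory in one piece.
    decoded_lines = codecs.iterdecode(binary_lines, encoding)
    for row in csv.DictReader(decoded_lines, delimiter="\t"):
        yield dict(row)


# io.BytesIO stands in for the uploaded file in this sketch.
upload = io.BytesIO(b"name\tplace\nAda\tLondon\nLin\tTaipei\n")
for row in iter_tsv_rows(upload):
    print(row)
```

Because the helper is a generator, rows are produced one at a time as the loop consumes them.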