I am trying to modify this resume parser, https://github.com/bjherger/ResumeParser, by building an API for it. I used Flask to create an API where users can upload pdf/doc files. Here is the Flask code:
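from flask import Flask, request

UPLOAD_FOLDER = 'user_uploads'
ALLOWED_EXTENSIONS = set(['txt', 'pdf', 'doc', 'docx', 'png', 'jpg', 'jpeg', 'gif'])

app = Flask(__name__)
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER


def allowed_file(filename):
    # Compare the text after the last dot, so 4-letter extensions such as 'docx' and 'jpeg' match
    return '.' in filename and filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS


@app.route('/', methods=['GET', 'POST'])
def upload_file():
    if request.method == 'POST':
        file = request.files['file']
        print(type(file))  # a werkzeug FileStorage object
        if file and allowed_file(file.filename):
            print("File uploaded successfully. File name is " + file.filename)
            return 'File uploaded'
    # A Flask view has to return a response on every path
    return 'Send a file in a POST request'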
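After the user uploads a file, I want to pass it to the textract.process() method from the textract library. However, 'file' is a werkzeug.datastructures.FileStorage object, which textract cannot parse.
Is there a way to convert the FileStorage object back to a pdf/doc, without saving the file to the server, so that textract can parse it?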
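For context, here is a minimal sketch of the closest workaround, assuming textract.process() only accepts a file path and picks its parser from the file extension. It does not fully avoid the disk: it copies the upload into a short-lived temporary file and deletes it right after parsing. The function name `process_upload` and its parameters are hypothetical, and the parser is passed in as a callable, so nothing here is textract's own API.

```python
import os
import shutil
import tempfile


def process_upload(stream, filename, parse):
    """Copy an uploaded stream to a temporary file, parse it by path, then delete it.

    stream   -- binary file-like object (assumed: the FileStorage object's .stream)
    filename -- original upload name; only its extension is reused
    parse    -- callable that takes a file path (assumed: textract.process)
    """
    # Keep the original extension so a path-based parser can tell the file type
    suffix = os.path.splitext(filename)[1].lower()
    # delete=False lets another reader open the path after we close the handle
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        with tmp:
            shutil.copyfileobj(stream, tmp)
        return parse(tmp.name)
    finally:
        # Remove the temporary file whether or not parsing succeeded
        os.remove(tmp.name)
```

Inside the view above this would be called as `process_upload(file.stream, file.filename, textract.process)`, with `import textract` at the top. Nothing is kept in UPLOAD_FOLDER, but the bytes do touch the temp directory briefly.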