Question

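I want to download a file over HTTP, but every example I find online fetches the data and then writes it to a local file. The problem with that approach is that you have to set the local file's type explicitly.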

I want to download a file, but I don't know in advance what type of file I will receive.

This is what I have so far:

urllib.urlretrieve(fetch_url, "output.csv")

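However, if I download, say, an XML file, it is still saved as CSV. Is there any way to have Python detect the type of the file I am sent from a URL such as http://asassaassa.com/assaas?abc=123 ?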

Suppose the URL above returns XML, and I want Python to detect that.

Answer 1

You can use python-magic to detect the file type. It can be installed with "pip install python-magic".

I assume you are using Python 2.7, since you are calling urlretrieve. The example targets 2.7, but it is easy to adapt.

Here is a working example:

import mimetypes  # Maps MIME types to file extensions
import magic      # Uses magic numbers to detect the file type; much better than the built-in mimetypes guessing
import urllib     # Your library (Python 2.7)
import os         # For renaming your file
mime = magic.Magic(mime=True)
output = "output"  # Your file name without extension
urllib.urlretrieve("https://docs.python.org/3.0/library/mimetypes.html", output)  # This is just an example URL
mime_type = mime.from_file(output)  # Get the MIME type from the file contents
extensions = mimetypes.guess_all_extensions(mime_type)  # Candidate extensions; may be empty
ext = extensions[0] if extensions else ""  # Leave the name unchanged if the type is unknown
os.rename(output, output + ext)  # Rename the file
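
On Python 3, urlretrieve lives in urllib.request and also returns the response headers, so a standard-library-only variant can take the extension from the Content-Type header instead of inspecting the file contents. This is only a minimal sketch: it assumes the server reports an accurate Content-Type, and the names extension_for_content_type and download_with_extension, as well as the ".bin" fallback, are made up for illustration and are not part of any library.

```python
import mimetypes
import os
import urllib.request


def extension_for_content_type(content_type, default=".bin"):
    """Map a Content-Type header value to a file extension (hypothetical helper)."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    mime_type = content_type.split(";", 1)[0].strip().lower()
    # guess_extension returns None for unknown types, so fall back to the default.
    return mimetypes.guess_extension(mime_type) or default


def download_with_extension(url, basename="output"):
    """Download url to basename, then rename it using the server's Content-Type."""
    # urlretrieve returns the local path and the response headers.
    path, headers = urllib.request.urlretrieve(url, basename)
    ext = extension_for_content_type(headers.get("Content-Type", ""))
    final_path = path + ext
    os.replace(path, final_path)
    return final_path
```

Calling download_with_extension(fetch_url) would return the final file name, for example "output.csv" when the server answers with text/csv. Servers sometimes send a generic or wrong Content-Type, so checking the contents with python-magic as shown above remains the more robust option.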

Python - Download a file over HTTP and automatically detect the file type
