I want to retrieve a text file from an FTP server. This is the code I have so far:
from ftplib import FTP
import re
def my_function(data):
    print(data)
ftp = FTP('ftp.nasdaqtrader.com')
ftp.login()
nasdaq=ftp.retrbinary('RETR /SymbolDirectory/nasdaqlisted.txt', my_function)
#nasdaq contains the text file
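I have a few problems with this approach. For example, every time I run the script the whole file is printed, which I really don't want; I just need the contents stored as a string in the variable "nasdaq". Also, even though this line is printed: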
b'Symbol|Security Name|Market Category|Test Issue|Financial Status|Round Lot Size|ETF|NextShares\r\nAAAP|Advanced Accelerator Applications S.A. - American Depositary Shares
I cannot confirm that it is in "nasdaq":
print ("\r\nAAAP|Advanced Accelerator Applications S.A." in nasdaq)
Out: False
What is a more Pythonic way to do this?
Answer 0 (score: 1)
This is basically a duplicate of "Is it possible to read FTP files without writing them using Python?", but I want to show how to implement it for your specific case.
from ftplib import FTP
from io import BytesIO
data = BytesIO()
with FTP("ftp.nasdaqtrader.com") as ftp:  # use context manager to avoid
    ftp.login()                           # leaving connection open by mistake
    ftp.retrbinary("RETR /SymbolDirectory/nasdaqlisted.txt", data.write)
data.seek(0) # need to go back to the beginning to get content
nasdaq = data.read().decode() # convert bytes back to string
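nasdaq should now be a string containing the contents of the specified file, with \r\n Windows-style line endings. If you .split() on those two characters, you will get a list in which each line is one element.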
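As a rough sketch of what can be done once the contents are in a string, the snippet below splits the text into lines and parses the pipe-delimited fields. It does not touch the network: the nasdaq value here is a stand-in, with the header taken from the output shown in the question and a made-up placeholder row (XMPL is hypothetical, not real NASDAQ data). It also shows that the membership test from the question works on the decoded string. The original check failed because the return value of retrbinary is only the server's response line, not the file contents.

```python
import csv
from io import StringIO

# Stand-in for the decoded download. The header matches the question's output;
# the data row is a hypothetical placeholder, not real NASDAQ data.
nasdaq = (
    "Symbol|Security Name|Market Category|Test Issue|Financial Status|"
    "Round Lot Size|ETF|NextShares\r\n"
    "XMPL|Example Corp. - Common Stock|Q|N|N|100|N|N\r\n"
)

# The substring test from the question now operates on the file contents.
found = "\r\nXMPL|Example Corp." in nasdaq

# splitlines() handles \r\n and does not leave a trailing empty string.
lines = nasdaq.splitlines()
header = lines[0].split("|")

# newline="" keeps the \r\n endings intact so the csv module can handle them.
rows = list(csv.DictReader(StringIO(nasdaq, newline=""), delimiter="|"))

print(found)
print(header)
print(rows[0]["Security Name"])
```

Note that nasdaq.split("\r\n") would also work, but it leaves an empty string at the end of the list when the file ends with a line break, which is why splitlines() is used in this sketch.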