Read a CSV file stored on an FTP server in Python

Time: 2018-02-15 19:45:16

Tags: python pandas csv ftp

I have connected to an FTP server, and the connection was successful.

import ftplib
ftp = ftplib.FTP('***', '****','****')
listoffiles = ftp.nlst()  # ftp.dir() prints the listing itself and returns None
print(listoffiles)

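On this FTP server I have some CSV files and some folders that contain more CSV files.

I need to identify the list of folders in this location (home) and navigate into them. I think the cwd command should work.

I also need to read the CSV files stored on this FTP server. How can I do that? Is there a way to load a CSV directly into pandas?

2 answers:

Answer 0: (score: 1)

Based on the answer here (Python write create file directly in FTP) and my own knowledge of ftplib: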

If you know the structure of your FTP server, you can loop over a dictionary that describes the folder/file structure and download the files via ftplib or urllib.

If you do not, you can list the entries in the home directory, test which ones are folders, and read each file one level down into pandas through an in-memory buffer, as in this example:

from ftplib import FTP
import io, pandas

session = FTP('***', '****','****')

# get filenames on ftp home/root
remoteFilenames = session.nlst()
if ".." in remoteFilenames:
    remoteFilenames.remove("..")
if "." in remoteFilenames:
    remoteFilenames.remove(".")
# iterate over filenames and check which ones are folders
for filename in remoteFilenames:
    dirTest = session.nlst(filename)
    # This dir test does not work on certain servers
    if dirTest and len(dirTest) > 1:
        # it's a directory => go into it
        session.cwd(filename)
        # get the filenames one level deeper
        remoteFilenames2 = session.nlst()
        if ".." in remoteFilenames2:
            remoteFilenames2.remove("..")
        if "." in remoteFilenames2:
            remoteFilenames2.remove(".")
        for filename2 in remoteFilenames2:
            # check again if the filename is a directory and this time ignore it in this case
            dirTest = session.nlst(filename2)
            if dirTest and len(dirTest) > 1:
                continue

            # download the file but first create a virtual file object for it
            download_file = io.BytesIO()
            session.retrbinary("RETR {}".format(filename2), download_file.write)
            download_file.seek(0) # after writing go back to the start of the virtual file
            df = pandas.read_csv(download_file) # read virtual file into pandas
            ##########
            # do your thing with pandas here
            ##########
            download_file.close() # close virtual file
        session.cwd("..") # go back up so the next top-level folder can be found

session.quit() # close the ftp session
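
For the case where the server layout is already known, the dictionary idea might be sketched roughly as follows. The `STRUCTURE` dictionary, the folder and file names in it, and the helper names `remote_paths`, `fetch` and `read_all` are all made up for illustration; only `retrbinary` and `pandas.read_csv` are real library calls.

```python
import io

# Hypothetical layout: folder name -> list of CSV file names in that folder.
STRUCTURE = {"folder1": ["file1.csv", "file2.csv"], "folder2": ["file1.csv"]}


def remote_paths(structure):
    """Yield one absolute remote path per file in the structure dictionary."""
    for folder, filenames in structure.items():
        for filename in filenames:
            yield "/{}/{}".format(folder, filename)


def fetch(session, path):
    """Download one remote file into an in-memory buffer and rewind it."""
    buffer = io.BytesIO()
    session.retrbinary("RETR {}".format(path), buffer.write)
    buffer.seek(0)  # rewind so the caller can read from the start
    return buffer


def read_all(session, structure):
    """Return {remote path: DataFrame} for every file in the structure."""
    import pandas  # imported here so the helpers above work without pandas

    return {path: pandas.read_csv(fetch(session, path))
            for path in remote_paths(structure)}
```

With an open `ftplib.FTP` session, `read_all(session, STRUCTURE)` would then give one DataFrame per listed file. This assumes the paths are absolute from the FTP root, which may not match every server's login directory.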

Both solutions can be improved by making them recursive or, more generally, by supporting multiple folder levels.
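
A recursive version could look something like the sketch below. It is only one possible way to do it: `is_directory` and `walk_csv` are hypothetical helper names, and the directory test (try `cwd`, treat a permission error as "not a folder") is an assumption that, like the `nlst` length test, may not behave the same on every server.

```python
import ftplib
import io
import posixpath


def is_directory(session, path):
    """Return True if `path` can be entered with CWD; restore the old directory."""
    original = session.pwd()
    try:
        session.cwd(path)
    except ftplib.error_perm:
        return False  # the server refused, so treat it as a file
    session.cwd(original)
    return True


def walk_csv(session, path="/"):
    """Yield (remote_path, BytesIO) for every .csv file at or below `path`."""
    try:
        entries = session.nlst(path)
    except ftplib.error_perm:
        entries = []  # some servers answer 550 for an empty folder
    for entry in entries:
        # nlst may return bare names or full paths, so keep only the last part
        name = posixpath.basename(entry.rstrip("/"))
        if name in ("", ".", ".."):
            continue
        child = posixpath.join(path, name)
        if is_directory(session, child):
            yield from walk_csv(session, child)  # descend one level
        elif name.lower().endswith(".csv"):
            buffer = io.BytesIO()
            session.retrbinary("RETR " + child, buffer.write)
            buffer.seek(0)  # rewind before handing the buffer to a reader
            yield child, buffer
```

Each yielded buffer can be passed straight to `pandas.read_csv(buffer)`. Because every command uses an absolute path, the sketch does not depend on which directory the session happens to be in.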

Answer 1: (score: 0)

Better late than never... I was able to read the file directly into pandas. Not sure if this is useful to anyone.

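import io
import pandas as pd
from ftplib import FTP
ftp = FTP('ftp.[domain].com') # you need to put in your correct ftp domain
ftp.login() # i don't need login info for my ftp
ftp.cwd('[Directory]') # change directory to where the file is
# pd.read_csv cannot see the FTP session, so fetch the file into memory first
buffer = io.BytesIO()
ftp.retrbinary('RETR [file.csv]', buffer.write)
buffer.seek(0) # go back to the start of the in-memory file
df = pd.read_csv(buffer, delimiter = "|", encoding='latin1') # i needed to specify delimiter and encoding
ftp.quit()
df.head()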
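
Another way to get a file "directly" into pandas, without opening an ftplib session at all, is to hand `read_csv` an `ftp://` URL, since pandas accepts URLs with the ftp scheme. The sketch below is an illustration only: `ftp_csv_url` and `read_ftp_csv` are hypothetical helper names, and `ftp.example.com` in the usage note is a placeholder host.

```python
from urllib.parse import quote


def ftp_csv_url(host, directory, filename, user=None, password=None):
    """Build an ftp:// URL for one file, percent-encoding unsafe characters."""
    credentials = ""
    if user is not None:
        credentials = quote(user, safe="")
        if password is not None:
            credentials += ":" + quote(password, safe="")
        credentials += "@"
    # join the non-empty parts; quote() leaves "/" between folders untouched
    path = "/".join(part for part in (directory.strip("/"), filename) if part)
    return "ftp://{}{}/{}".format(credentials, host, quote(path))


def read_ftp_csv(url, **read_csv_kwargs):
    """Read a CSV from an ftp:// URL, forwarding options such as delimiter."""
    import pandas as pd  # imported here so ftp_csv_url works without pandas

    return pd.read_csv(url, **read_csv_kwargs)
```

For example, `read_ftp_csv(ftp_csv_url("ftp.example.com", "some/dir", "file.csv"), delimiter="|", encoding="latin1")` would mirror the delimiter and encoding used above. Servers can interpret the URL path relative to the login directory, so the path may need adjusting.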