Sync folders based on date and time

Date: 2014-07-04 12:30:15

Tags: python

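I'm new to Python and am trying to use this script to pull data from multiple FTP sites and download yesterday's data into my local folder, based on the date directory. But if the transfer fails on any given day, that day's records are never updated, and the script simply moves on to the next day. I want to sync the files so that, even if a day was missed, the new files still end up in my local folder. I tried looking at rsync, but I need your help working it into the script. Here is my script.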

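    import datetime
    import os
    import re
    import sys
    import time

    import nprint  # site-specific module providing celDb (not shown in the question)

    MAX_CHILDREN = 16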
    ftp_site_prog = "/usr/local/bin/ftp_site.py"

    class SpawnQ:
        def __init__(self, max_pids):
            self.max_pids = max_pids 
            self.queue = []
            tmp = re.split("/", ftp_site_prog)
            self.my_name = tmp[-1]


        def addQ(self, site_id):
            self.queue.append(site_id)
            return

        def runQ(self):
            while (len(self.queue) != 0):
                # Check how many sessions are running
                cmd = """ps -ef | grep "%s" | grep -v grep""" % self.my_name
                num_pids = 0
                for line in os.popen(cmd).readlines():
                    num_pids = num_pids + 1

                if (num_pids < self.max_pids):
                    site_id = self.queue.pop()
                    # print site_id
                    # print "Forking........"
                    fpid = os.fork()
                    if fpid:
                        # print "Created child: ", fpid
                        os.waitpid(fpid, os.WNOHANG)
                    else:
                        # print "This is the Child"

                        # Exec the ftp_site
                        arg_string = "%s" % site_id
                        args = [arg_string]


                        # Question: how do I call rsync from my script here? My attempt:
                        # os.system("rsync -ftp_site_prog, (ftp_site_prog,)+ tuple(args))
                        os.execvp(ftp_site_prog, (ftp_site_prog,) + tuple(args))
                        sys.exit(0)
                else:
                    # print "Waiting for a spare process...."
                    time.sleep(10)
            return

    # Get a list of the sites
    db_obj = nprint.celDb()
    site_list = db_obj.get_all_site_ids()

    myQ = SpawnQ(MAX_CHILDREN)

    for site_id in site_list:
        myQ.addQ(site_id)

    myQ.runQ()

    # Wait until we only have the parent left
    # Check how many sessions are running
    tmp = re.split("/",ftp_site_prog)
    ftp_name = tmp[-1]
    cmd = """ps -ef | grep "%s" | grep -v grep""" % ftp_name

    num_pids = MAX_CHILDREN
    while (num_pids > 0):
        num_pids = 0
        for line in os.popen(cmd).readlines():
            num_pids = num_pids + 1

        time.sleep(60)

    today = datetime.date.today()
    daydelta = datetime.timedelta(days=1)
    yesterday = today - daydelta

1 Answer:

Answer 0 (score: 0)

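Most of what you need for retrieving files from a standard FTP server can be done with the ftplib module. If you are dealing with an SFTP server, you can use the paramiko library instead.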
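Note that rsync does not speak FTP (it works over SSH or an rsync daemon), so it cannot pull from a plain FTP server directly. A rough sketch of the catch-up behaviour with ftplib might look like the following. It is Python 3 and rests on several assumptions that are not in the question: the date directories are named `YYYYMMDD`, `remote_root` is an absolute path on the server, and the function names (`dates_to_sync`, `missing_files`, `sync_date_dir`, `sync_site`) and the `lookback_days` parameter are made up for illustration. Instead of fetching only yesterday, it walks back over the last few days and downloads whatever is not yet present locally, so a day that failed gets filled in on the next run.

```python
import datetime
import ftplib
import os


def dates_to_sync(today, lookback_days):
    """Return the `lookback_days` dates before `today`, oldest first."""
    return [today - datetime.timedelta(days=n)
            for n in range(lookback_days, 0, -1)]


def missing_files(remote_names, local_dir):
    """Return the remote file names not yet present in local_dir, sorted."""
    local_names = set(os.listdir(local_dir)) if os.path.isdir(local_dir) else set()
    return sorted(set(remote_names) - local_names)


def sync_date_dir(ftp, remote_root, local_root, day):
    """Download the files of one date directory that are missing locally."""
    day_name = day.strftime("%Y%m%d")  # assumed directory naming scheme
    local_dir = os.path.join(local_root, day_name)
    os.makedirs(local_dir, exist_ok=True)
    ftp.cwd("%s/%s" % (remote_root, day_name))
    fetched = []
    for name in missing_files(ftp.nlst(), local_dir):
        # Write to a temporary name first so that an interrupted transfer
        # is not mistaken for a complete file on the next run.
        tmp_path = os.path.join(local_dir, name + ".part")
        with open(tmp_path, "wb") as fh:
            ftp.retrbinary("RETR " + name, fh.write)
        os.rename(tmp_path, os.path.join(local_dir, name))
        fetched.append(name)
    return fetched


def sync_site(host, user, password, remote_root, local_root, lookback_days=7):
    """Catch up on the last `lookback_days` date directories of one site."""
    with ftplib.FTP(host) as ftp:
        ftp.login(user, password)
        for day in dates_to_sync(datetime.date.today(), lookback_days):
            try:
                sync_date_dir(ftp, remote_root, local_root, day)
            except ftplib.error_perm:
                # Typically the date directory is absent on the server;
                # skip it and try again on the next run.
                continue
```

Something like `sync_site("ftp.example.com", "user", "secret", "/data", "/local/data")` (all placeholder values) would then be what each child process runs for its site. The comparison is by file name only, so a file that changes on the server after it was downloaded would not be fetched again; comparing sizes or timestamps would be needed for that.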