Question

Sorry for my poor English.

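Hi. I am having some trouble with a Python loop. My script goes through all the weekdays between two dates and, for each day, copies all the text from x.txt into textfile.txt, inserting the date before the text. x stands for a small set of text files numbered from 1 to n. For example, the start date is 1,1,2017 and the end date is 5,1,2017. 1.txt contains aaa, 2.txt contains bbb, 3.txt contains ccc, and so on. The output file should be: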

2017-01-02 (because 1 January is a Sunday, which is a weekend day)

aaa

2017-01-03

bbb

2017-01-04

ccc

2017-01-05

ddd

But it looks like this:

2017-01-05

aaa

2017-01-05

bbb

2017-01-05

ccc (only 3 files are processed when there should be 4, and only the last date is inserted everywhere)

I added print n after for x in range (1, n): to try to find the problem, and it gave me

Here is my code:

# -*- coding: utf-8 -*-
from datetime import datetime
from datetime import timedelta
import shutil
import os


def dc(d1, m1, y1, d2, m2, y2):
    start = datetime(y1, m1, d1)
    end = datetime(y2, m2, d2)
    delta = timedelta(days=1)
    d = start
    n = 1
    weekend = ([5, 6])
    while d <= end:
        if d.weekday() not in weekend:
            with open('textfile.txt','wb') as destination:
                for x in range (1, n):
                    print n
                    with open(str(x) + '.txt','rb') as source:
                        destination.write(str(d)[:10])
                        destination.write(os.linesep)
                        shutil.copyfileobj(source, destination)
                        destination.write(os.linesep*2)
        n += 1
        d += delta

dc(1,1,2017,5,1,2017)

So the main question is: how do I make this work as intended?

And a few other questions:

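How can I store the input data some other way? Having 30 text files is fine for me now, but in the future I would like to choose the input source, and with 5 folders of 30 files each it will be hard to keep them organized. Maybe I could use lists or something similar?
How can I reformat the date written to the output file? yyyy-mm-dd is fine, but I would prefer dd-mm-yyyy.
Most of the time this script will run with n = 30, but if n < 30, I would like the last 5 copied files to always be 25.txt-30.txt.

I would be glad to hear any suggestions. Thank you for your help.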

Answer 1

You should use either a 'while' loop or a 'for' loop.

import requests


domain = "https://www.python.org/"


response = requests.get(domain)
page = response.text
all_urls = set()
params = ["src", "href"]


def getURL(page, param):

    start_link = page.find(param)
    if start_link == -1:
        return None, 0
    start_quote = page.find('"', start_link)
    end_quote = page.find('"', start_quote + 1)
    url = page[start_quote + 1: end_quote]
    return url, end_quote

for param in params:

    while True:
        url, n = getURL(page, param)
        page = page[n:]
        #count += 1
        if url:
            if url.startswith('/') or url.startswith('#!'):
                all_urls.add(domain + url)
            elif url.startswith('http'):
                all_urls.add(url)
            else:
                continue
        else:
            break


print("all urls length:", len(all_urls))

P.S. I am not sure whether keeping the file open causes any problems; programmers usually close files after use.
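On the open-file point, here is a minimal sketch of how the question's script could be restructured. It assumes Python 3, and it assumes the intent is one source file per weekday (1.txt for the first weekday, 2.txt for the second, and so on). The names weekdays, build_output, src_dir and dest_path are made up for illustration and are not from the original script. The destination is opened once, before the loop, because reopening it in 'wb' mode on every iteration truncates it, which would explain why only the last date survives. The with blocks close both files automatically. The sketch also writes the date as dd-mm-yyyy with strftime, and uses a plain "\n" rather than os.linesep so the output is the same on every platform.

```python
from datetime import date, timedelta
import os
import shutil


def weekdays(start, end):
    """Yield every date from start to end (inclusive) that falls on Monday to Friday."""
    d = start
    while d <= end:
        if d.weekday() < 5:  # 0 = Monday ... 4 = Friday
            yield d
        d += timedelta(days=1)


def build_output(start, end, src_dir=".", dest_path="textfile.txt"):
    """Write one dated entry per weekday, taking 1.txt, 2.txt, ... in order.

    src_dir and dest_path are hypothetical parameters, not part of the
    original script.
    """
    # Open the destination once, outside the loop, so it is not truncated
    # again for every day; the with block closes it at the end.
    with open(dest_path, "wb") as destination:
        # enumerate pairs the n-th weekday with n.txt (assumed mapping).
        for n, d in enumerate(weekdays(start, end), start=1):
            src_path = os.path.join(src_dir, "%d.txt" % n)
            with open(src_path, "rb") as source:
                # dd-mm-yyyy, as preferred in the question
                destination.write(d.strftime("%d-%m-%Y").encode("ascii"))
                destination.write(b"\n")
                shutil.copyfileobj(source, destination)
                destination.write(b"\n\n")
```

With 1.txt to 4.txt containing aaa, bbb, ccc and ddd, calling build_output(date(2017, 1, 1), date(2017, 1, 5)) should produce four entries dated 02-01-2017 through 05-01-2017, one per file. A missing source file raises an error here; whether that is the right behavior depends on how the input is organized.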

Happy coding!
