Question

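There is a Python script that supports batch downloads by processing a file containing a list of identifiers (DOIs / URLs). However, Sci-hub is very good at tracking this kind of activity, so you need to process small batches over a fairly long period of time and/or keep changing proxies to stay undetected.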

My idea is that this could be automated. Based on my almost non-existent Python knowledge, this should be the key part of the script for doing so:

    elif args.file:
        with open(args.file, 'r') as f:
            identifiers = f.read().splitlines()
            for identifier in identifiers:
                result = sh.download(identifier, args.output)
                if 'err' in result:
                    logger.debug('%s', result['err'])
                else:
                    logger.debug('Successfully downloaded file with identifier %s', identifier)

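So, is it possible to:

count the number of strings processed
run a loop that performs a specific action after every 10 iterations (for example, starts a 3-5 minute pause)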

Answer 1

You can keep a counter in the for loop. You can either use the modulo operator or reset the counter each time:

    count = 0
    for identifier in identifiers:
        result = sh.download(identifier, args.output)
        if 'err' in result:
            logger.debug('%s', result['err'])
        else:
            logger.debug('Successfully downloaded file with identifier %s', identifier)
            count += 1  # count only successful downloads
            if count == 10:
                action()  # e.g. pause for a few minutes
                count = 0  # reset and start the next batch of 10
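
The `action()` call is left undefined here. As a minimal sketch of one possibility, matching the 3-5 minute pause mentioned in the question, it could sleep for a random duration. The parameters `min_minutes`, `max_minutes`, and `sleep` are assumptions added for illustration, not part of the original script:

```python
import random
import time


def action(min_minutes=3.0, max_minutes=5.0, sleep=time.sleep):
    """Pause for a random duration between min_minutes and max_minutes.

    `sleep` is injectable (hypothetical parameter) so the function can be
    tried out without actually waiting several minutes.
    """
    seconds = random.uniform(min_minutes * 60, max_minutes * 60)
    sleep(seconds)
    return seconds
```

Calling `action()` with no arguments blocks for somewhere between 180 and 300 seconds; passing `sleep=print` instead shows the chosen duration without waiting.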

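The loop above counts only successful downloads, because the counter lives in the else branch. To count every iteration regardless of whether the download succeeded, use enumerate with the modulo operator: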

    for count, identifier in enumerate(identifiers, start=1):
        result = sh.download(identifier, args.output)
        if 'err' in result:
            logger.debug('%s', result['err'])
        else:
            logger.debug('Successfully downloaded file with identifier %s', identifier)
        # count starts at 1, so this fires after the 10th, 20th, ... identifier
        if count % 10 == 0:
            action()
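
To try the pattern outside the original script, here is a self-contained sketch of the same idea. The function `run_with_pauses` and its parameters (`download`, `every`, `pause_seconds`, `sleep`) are hypothetical names chosen for this example; `download` stands in for `sh.download`:

```python
import time


def run_with_pauses(identifiers, download, every=10, pause_seconds=180,
                    sleep=time.sleep):
    """Call download(identifier) for each item, pausing after every `every` items.

    Returns the number of pauses taken. No pause is taken after the final
    item, since there is nothing left to process.
    """
    pauses = 0
    total = len(identifiers)
    for count, identifier in enumerate(identifiers, start=1):
        download(identifier)
        # Pause after each full batch, except when the list is exhausted.
        if count % every == 0 and count < total:
            sleep(pause_seconds)
            pauses += 1
    return pauses
```

With 25 identifiers and `every=10`, this pauses twice: after the 10th and after the 20th item. Skipping the pause after the last item is a design choice that avoids a pointless wait at the end of the run.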

