Question

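I'm having trouble sorting URLs. The .jpg files end in "xxxx-xxxx.jpg", and the URLs need to be sorted alphabetically by the second group of characters. So far I have only been able to sort alphabetically by the first group, which is not what I need.

For example:

http://code.google.com/edu/languages/google-python-class/images/puzzle/p-babf-bbac.jpg

comes before

http://code.google.com/edu/languages/google-python-class/images/puzzle/p-babh-bajc.jpg

when it should come after it, since "bajc" sorts before "bbac".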

#!/usr/bin/python
# Copyright 2010 Google Inc.
# Licensed under the Apache License, Version 2.0
# http://www.apache.org/licenses/LICENSE-2.0

# Google's Python Class
# http://code.google.com/edu/languages/google-python-class/

import os
import re
import sys
import requests

"""Logpuzzle exercise
Given an apache logfile, find the puzzle urls and download the images.

Here's what a puzzle url looks like:
10.254.254.28 - - [06/Aug/2007:00:13:48 -0700] "GET /~foo/puzzle-bar-aaab.jpg HTTP/1.0" 302 528 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6"
"""

def url_sort_key(url):
    print url [-8:]
#Extract the puzzle urls from inside a logfile
def read_urls(filename):
    """Returns a list of the puzzle urls from the given log file,
    extracting the hostname from the filename itself.
    Screens out duplicate urls and returns the urls sorted into
    increasing order."""
    # +++your code here+++



# Use open function to search for the urls containing "puzzle/p"
# Use a line split to pick out the 6th section of the filename
# Sort out all repeated urls, and return sorted list
    with open(filename) as f:
        out = set()
        for line in f:
            if re.search("puzzle/p", line):
                url = "http://code.google.com" + line.split(" ")[6]
                print line.split(" ")
                out.add(url)
    return sorted(list(out))



# Complete the download_images function, which takes a sorted
# list of urls and a directory
def download_images(img_urls, dest_dir):
    """Given the urls already in the correct order, downloads
    each image into the given directory.
    Gives the images local filenames img0, img1, and so on.
    Creates an index.html in the directory
    with an img tag to show each local image file.
    Creates the directory if necessary.
    """
    # ++your code here++
    if not os.path.exists(dest_dir):
        os.makedirs(dest_dir)

    # Create an index
    index = file(os.path.join(dest_dir, 'index.html'), 'w')
    index.write('<html><body>\n')

    i = 0
    for img_url in img_urls:
        i += 1
        local_name = 'img%d' %i
        print "Retrieving...", local_name
        print local_name 
        print dest_dir
        print img_url

        response = requests.get(img_url)
        if response.status_code == 200:
            f = open(os.path.join(dest_dir,local_name + ".jpg"), 'wb')
            f.write(response.content)
            f.close()

        index.write ('<img src="%s">' % (local_name + ".jpg"))


    index.write('\n</body></html>\n')
    index.close()

def main():
    args = sys.argv[1:]

    print args
    if not args:
        print ('usage: [--todir dir] logfile ')
        sys.exit(1)

    todir = None
    if args[0] == '--todir':
        todir = args[1]
        del args[0:2]


    img_urls = read_urls(args[0])

    if todir:
        download_images(img_urls, todir)
    else:
        print ('\n'.join(img_urls))

if __name__ == '__main__':
    main()

I think the error is in what the read_urls function returns, but I'm not positive.

Answer 1

Given that the URLs end in the format "xxxx-yyyy.jpg" and you want to sort them on the second key, i.e. "yyyy":

def read_urls(filename):
    with open(filename) as f:
        s = {el.rstrip() for el in f if 'puzzle' in el}
    return sorted(s, key=lambda u: u[-8:-4]) # use u[-13:-9] to sort on the first key

For example, with an input file containing

http://localhost/p-xxxx-yyyy.jpg
http://code.google.com/edu/languages/google-python-class/images/puzzle/p-babf-bbac.jpg
http://code.google.com/edu/languages/google-python-class/images/puzzle/p-babh-bajc.jpg
http://localhost/p-xxxx-yyyy.jpg

it produces the list

['http://code.google.com/edu/languages/google-python-class/images/puzzle/p-babh-bajc.jpg', 'http://code.google.com/edu/languages/google-python-class/images/puzzle/p-babf-bbac.jpg']

i.e. "bajc" comes before "bbac". The localhost lines are skipped because they do not contain 'puzzle'.

If you want to sort on the first key instead, see the comment in the code.
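The fixed slice only works while both keys are exactly four characters long and the input lines are bare URLs. As a rough sketch of how the same idea might be applied to the Apache log lines from the question, the version below pulls the key out with a regular expression instead. The names SECOND_KEY and puzzle_urls and the host parameter are made up for this illustration, and url_sort_key simply reuses the name of the stub in the question; none of this is from the original exercise solution.

```python
import re

# Hypothetical pattern: matches names such as "p-babf-bbac.jpg" and
# captures the two word groups separately.
SECOND_KEY = re.compile(r'-(\w+)-(\w+)\.jpg$')


def url_sort_key(url):
    """Return the second word group if the URL has one, else the whole URL."""
    match = SECOND_KEY.search(url)
    return match.group(2) if match else url


def puzzle_urls(log_lines, host='http://code.google.com'):
    """Collect unique puzzle URLs from log lines, sorted by the second key."""
    urls = set()
    for line in log_lines:
        # The request path is the token between "GET" and "HTTP".
        match = re.search(r'GET (\S*puzzle\S*) HTTP', line)
        if match:
            urls.add(host + match.group(1))
    return sorted(urls, key=url_sort_key)
```

Passing an open file object as log_lines should work, since iterating over a file yields its lines. Falling back to the whole URL in url_sort_key keeps the sort well defined for names that do not follow the two-group pattern.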
