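I'm having a problem sorting URLs. The .jpg files end in "xxxx-xxxx.jpg", and the URLs need to be sorted alphabetically by the second group of characters. So far I have only been able to sort alphabetically by the first group (which is not what is needed).
For example:
http://code.google.com/edu/languages/google-python-class/images/puzzle/p-babf-bbac.jpg
is coming before
http://code.google.com/edu/languages/google-python-class/images/puzzle/p-babh-bajc.jpg
when it should be the other way around (by the second key, bajc sorts before bbac).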
#!/usr/bin/python
# Copyright 2010 Google Inc.
# Licensed under the Apache License, Version 2.0
# http://www.apache.org/licenses/LICENSE-2.0
# Google's Python Class
# http://code.google.com/edu/languages/google-python-class/
import os
import re
import sys
import requests
"""Logpuzzle exercise
Given an apache logfile, find the puzzle urls and download the images.
Here's what a puzzle url looks like:
10.254.254.28 - - [06/Aug/2007:00:13:48 -0700] "GET /~foo/puzzle-bar-aaab.jpg HTTP/1.0" 302 528 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6"
"""
def url_sort_key(url):
    print url[-8:]
#Extract the puzzle urls from inside a logfile
def read_urls(filename):
    """Returns a list of the puzzle urls from the given log file,
    extracting the hostname from the filename itself.
    Screens out duplicate urls and returns the urls sorted into
    increasing order."""
    # +++your code here+++
    # Use open function to search for the urls containing "puzzle/p"
    # Use a line split to pick out the 6th section of the filename
    # Sort out all repeated urls, and return sorted list
    with open(filename) as f:
        out = set()
        for line in f:
            if re.search("puzzle/p", line):
                url = "http://code.google.com" + line.split(" ")[6]
                print line.split(" ")
                out.add(url)
        return sorted(list(out))
# Complete the download_images function, which takes a sorted
# list of urls and a directory
def download_images(img_urls, dest_dir):
    """Given the urls already in the correct order, downloads
    each image into the given directory.
    Gives the images local filenames img0, img1, and so on.
    Creates an index.html in the directory
    with an img tag to show each local image file.
    Creates the directory if necessary.
    """
    # ++your code here++
    if not os.path.exists(dest_dir):
        os.makedirs(dest_dir)
    # Create an index
    index = file(os.path.join(dest_dir, 'index.html'), 'w')
    index.write('<html><body>\n')
    i = 0
    for img_url in img_urls:
        i += 1
        local_name = 'img%d' % i
        print "Retrieving...", local_name
        print local_name
        print dest_dir
        print img_url
        response = requests.get(img_url)
        if response.status_code == 200:
            f = open(os.path.join(dest_dir, local_name + ".jpg"), 'wb')
            f.write(response.content)
            f.close()
        index.write('<img src="%s">' % (local_name + ".jpg"))
    index.write('\n</body></html>\n')
    index.close()
def main():
    args = sys.argv[1:]
    print args
    if not args:
        print ('usage: [--todir dir] logfile ')
        sys.exit(1)
    todir = None
    if args[0] == '--todir':
        todir = args[1]
        del args[0:2]
    img_urls = read_urls(args[0])
    if todir:
        download_images(img_urls, todir)
    else:
        print ('\n'.join(img_urls))

if __name__ == '__main__':
    main()
I think the error is in what the read_urls function returns, but I'm not positive.
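The ordering described above can be reproduced in isolation. This is only a small sketch, written for Python 3 rather than the Python 2 of the script, and the variable names default_order and second_keys are made up for illustration. It suggests that sorted() without a key compares the whole string, so the first key decides the order before the second key is ever reached:

```python
urls = [
    "http://code.google.com/edu/languages/google-python-class/images/puzzle/p-babh-bajc.jpg",
    "http://code.google.com/edu/languages/google-python-class/images/puzzle/p-babf-bbac.jpg",
]

# Without a key, sorted() compares the full strings character by character,
# so the first differing character ("f" vs "h" in the first key) decides.
default_order = sorted(urls)

# The four characters just before ".jpg" hold the second key.
second_keys = [u[-8:-4] for u in default_order]

for url, key in zip(default_order, second_keys):
    print(key, url)
```

Here the second keys come out as bbac followed by bajc, which is not alphabetical order.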
Answer 0 (score: 1)
Given that the URLs end in the format
xxxx-yyyy.jpg
and you want to sort the URLs by the second key, i.e. yyyy:
def read_urls(filename):
    with open(filename) as f:
        s = {el.rstrip() for el in f if 'puzzle' in el}
    return sorted(s, key=lambda u: u[-8:-4])  # u[-13:-9] if need to sort on the first key
For example, with an input file containing
http://localhost/p-xxxx-yyyy.jpg
http://code.google.com/edu/languages/google-python-class/images/puzzle/p-babf-bbac.jpg
http://code.google.com/edu/languages/google-python-class/images/puzzle/p-babh-bajc.jpg
http://localhost/p-xxxx-yyyy.jpg
it produces the list
['http://code.google.com/edu/languages/google-python-class/images/puzzle/p-babh-bajc.jpg',
 'http://code.google.com/edu/languages/google-python-class/images/puzzle/p-babf-bbac.jpg']
i.e. bajc appears before bbac. The localhost lines are dropped because they do not contain 'puzzle'.
If you want to sort by the first key (xxxx) instead, the comment in the code above gives the slice to use: u[-13:-9].
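The fixed slices above rely on both keys being exactly four characters long, and the function expects one URL per line rather than raw log lines. As a sketch only, assuming Python 3 and assuming the log lines follow the Apache format shown in the question's docstring, the key could instead be extracted with a regular expression and combined with the question's log parsing. The names second_key and read_urls_from_lines are hypothetical, the host prefix is the one hard-coded in the question, and the sample log lines are made up for illustration:

```python
import re

HOST = "http://code.google.com"  # prefix hard-coded in the question's read_urls


def second_key(url):
    """Return yyyy from a URL ending in -xxxx-yyyy.jpg (hypothetical helper)."""
    match = re.search(r"-(\w+)-(\w+)\.jpg$", url)
    # Fall back to the whole URL so non-matching entries still sort consistently.
    return match.group(2) if match else url


def read_urls_from_lines(lines, host=HOST):
    """Collect unique puzzle URLs from Apache log lines, sorted by second key."""
    urls = set()
    for line in lines:
        # Assumption: the request looks like "GET <path> HTTP/1.0".
        match = re.search(r"GET (\S*puzzle\S*) HTTP", line)
        if match:
            urls.add(host + match.group(1))
    return sorted(urls, key=second_key)


# Made-up sample lines in the docstring's log format.
PREFIX = '10.254.254.28 - - [06/Aug/2007:00:13:48 -0700] "GET '
PATH = "/edu/languages/google-python-class/images/puzzle/"
sample = [
    PREFIX + PATH + 'p-babf-bbac.jpg HTTP/1.0" 302 528',
    PREFIX + PATH + 'p-babh-bajc.jpg HTTP/1.0" 302 528',
    PREFIX + PATH + 'p-babf-bbac.jpg HTTP/1.0" 302 528',  # duplicate
]

for url in read_urls_from_lines(sample):
    print(url)
```

With these sample lines, the p-babh-bajc.jpg URL is printed before p-babf-bbac.jpg, matching the order in the answer, and the duplicate is dropped by the set. To run it on a real log, an open file object can be passed as lines.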