How to easily extract the ID from an iTunes URL using Python

Asked: 2011-12-17 00:43:19

Tags: python regex

iTunes URLs look like this:

http://itunes.apple.com/us/album/break-of-dawn/id472335316?ign-mpt=uo%3D
http://itunes.apple.com/us/app/monopoly-here-now-the-world/id299110947?mt=8
http://itunes.apple.com/es/app/revista-/id397781759?mt=8%3Futm_so%3Dtwitter
http://itunes.apple.com/app/id426698291&mt=8
http://itunes.apple.com/us/album/respect-the-bull-single/id4899
http://itunes.apple.com/us/album/id6655669

How can I easily extract the ID number?

Example:

get_id("http://itunes.apple.com/us/album/brawn/id472335316?ign-mpt=uo")

#returns 472335316

3 answers:

Answer 0 (score: 9)

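import re

def get_id(toParse):
    # Raw string so that \d reaches the regex engine unchanged;
    # group(1) is the run of digits that follows 'id'.
    return re.search(r'id(\d+)', toParse).group(1)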

I'll let you figure out the error handling...
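
One possible way to fill in that error handling, as a minimal sketch: `re.search` returns `None` when nothing matches, so the one-liner above would raise `AttributeError` on such input. The name `get_id_or_none` is hypothetical, and returning `None` instead of raising is an assumption about what the caller wants.

```python
import re

# Compiled once; matches 'id' followed by one or more digits.
_ID_RE = re.compile(r'id(\d+)')

def get_id_or_none(url):
    """Return the numeric id as a string, or None if no id is found.

    Hypothetical helper: the None return value is an assumption,
    raising ValueError would be an equally reasonable choice.
    """
    match = _ID_RE.search(url)
    if match is None:
        return None
    return match.group(1)
```

Note that the id comes back as a string; wrap it in `int()` if a number is needed.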

Answer 1 (score: 2)

You can use a regular expression like "/id(\\d+).*"; the first capture group will contain the id number. I think in Python you can also write it as r"/id(\d+).*"
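
As a rough sketch of this suggestion, the pattern can be tried against a few of the URLs from the question. The trailing `.*` is left out here because `re.search` does not need it, and the helper name `extract_id` is made up for illustration.

```python
import re

# Leading slash so that 'id' is only matched at the start of a path segment.
PATTERN = re.compile(r"/id(\d+)")

def extract_id(url):
    """Return the first capture group (the digits after '/id'), or None."""
    match = PATTERN.search(url)
    return match.group(1) if match else None

# Sample URLs taken from the question.
urls = [
    "http://itunes.apple.com/us/album/break-of-dawn/id472335316?ign-mpt=uo%3D",
    "http://itunes.apple.com/app/id426698291&mt=8",
    "http://itunes.apple.com/us/album/id6655669",
]
ids = [extract_id(u) for u in urls]
print(ids)  # ['472335316', '426698291', '6655669']
```

Requiring the slash keeps the pattern from picking up an "id" that merely appears inside a longer word in the URL.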

Answer 2 (score: 1)

Without regular expressions (for no particular reason):

try:
    from urllib.parse import urlsplit  # Python 3
except ImportError:
    from urlparse import urlsplit  # Python 2

def get_id(url):
    """Extract an integer id from iTunes `url`.

    Raise ValueError for invalid strings
    """
    parts = urlsplit(url)
    if parts.hostname == 'itunes.apple.com':
        idstr = parts.path.rpartition('/')[2]  # extract 'id123456'
        if idstr.startswith('id'):
            try:
                return int(idstr[2:])
            except ValueError:
                pass
    raise ValueError("Invalid url: %r" % (url,))

Example

print(get_id("http://itunes.apple.com/us/album/brawn/id472335316?ign-mpt=uo"))
# -> 472335316