Get all possible combinations from a dictionary and a list

Time: 2020-10-19 14:55:14

Tags: python

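I am building a web scraper and want to generate all the URLs I need to request.
The URL has three parameters:

  • date
  • facility_id
  • sport_id

I need to combine every date in a list with every facility/sport row in a dictionary.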


    dates = ['2020-10-21', '2020-10-22']
    db = {'facility_id': [184, 4, 3, 3], 'sport_id': [1, 2, 1, 5]}

The resulting URL looks like this (the first of eight results: 2 dates × 4 rows in the dictionary):


    https://www.website.se/subsite?date=2020-10-21&facility_id=184&sport_id=1

I tried nested for loops but got stuck.


    url = 'https://www.website.se/subsite?'
    dates = ['2020-10-21', '2020-10-22']
    db = {'facility_id': [184, 4, 3, 3], 'sport_id': [1, 2, 1, 5]}
    
    for date in dates:
        url = url + date + ','
        
        for col in db:
            url = url + col + ','
            
            for values in db[col]:
                url = url + str(values) + ','
        print(url)

Are nested for loops the way to go, or is there a better approach?

The full result I want to generate:

    https://www.website.se/subsite?date=2020-10-21&facility_id=184&sport_id=1
    https://www.website.se/subsite?date=2020-10-21&facility_id=4&sport_id=2
    https://www.website.se/subsite?date=2020-10-21&facility_id=3&sport_id=1
    https://www.website.se/subsite?date=2020-10-21&facility_id=3&sport_id=5
    https://www.website.se/subsite?date=2020-10-22&facility_id=184&sport_id=1
    https://www.website.se/subsite?date=2020-10-22&facility_id=4&sport_id=2
    https://www.website.se/subsite?date=2020-10-22&facility_id=3&sport_id=1
    https://www.website.se/subsite?date=2020-10-22&facility_id=3&sport_id=5

3 answers:

Answer 0 (score: 4)

You can use itertools.product:

    from itertools import product

    dates = ['2020-10-21', '2020-10-22']
    db = {'facility_id': [184, 4, 3, 3], 'sport_id': [1, 2, 1, 5]}

    # zip pairs each facility_id with the sport_id at the same position;
    # product then combines every date with every such pair.
    for d, (f, s) in product(dates, zip(db['facility_id'], db['sport_id'])):
        print('https://www.website.se/subsite?date={}&facility_id={}&sport_id={}'.format(d, f, s))

Prints:

    https://www.website.se/subsite?date=2020-10-21&facility_id=184&sport_id=1
    https://www.website.se/subsite?date=2020-10-21&facility_id=4&sport_id=2
    https://www.website.se/subsite?date=2020-10-21&facility_id=3&sport_id=1
    https://www.website.se/subsite?date=2020-10-21&facility_id=3&sport_id=5
    https://www.website.se/subsite?date=2020-10-22&facility_id=184&sport_id=1
    https://www.website.se/subsite?date=2020-10-22&facility_id=4&sport_id=2
    https://www.website.se/subsite?date=2020-10-22&facility_id=3&sport_id=1
    https://www.website.se/subsite?date=2020-10-22&facility_id=3&sport_id=5
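
If the URLs should be collected for later requests rather than printed, the same product/zip pairing can feed urllib.parse.urlencode from the standard library, which builds the query string and escapes the values. This is only a minimal sketch: the names BASE_URL and build_urls are illustrative and do not come from the question.

```python
from itertools import product
from urllib.parse import urlencode

# Base URL from the question, without the trailing '?' (BASE_URL is an illustrative name).
BASE_URL = 'https://www.website.se/subsite'


def build_urls(dates, db):
    """Return one URL per combination of a date and a facility/sport row."""
    # zip pairs the two lists row by row: (184, 1), (4, 2), (3, 1), (3, 5)
    rows = list(zip(db['facility_id'], db['sport_id']))
    urls = []
    for date, (facility_id, sport_id) in product(dates, rows):
        query = urlencode({
            'date': date,
            'facility_id': facility_id,
            'sport_id': sport_id,
        })
        urls.append(f'{BASE_URL}?{query}')
    return urls


dates = ['2020-10-21', '2020-10-22']
db = {'facility_id': [184, 4, 3, 3], 'sport_id': [1, 2, 1, 5]}

urls = build_urls(dates, db)
print(len(urls))  # 8
print(urls[0])
```

Returning a list keeps URL generation separate from the code that sends the requests, which tends to make a scraper easier to test.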

Answer 1 (score: 1)

Try this:

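    dates = ['2020-10-21', '2020-10-22']
    db = {'facility_id': [184, 4, 3, 3], 'sport_id': [1, 2, 1, 5]}

    for date in dates:
        # zip walks both lists in step, so each facility keeps its own sport
        for fac_id, sport_id in zip(db['facility_id'], db['sport_id']):
            res = f'https://www.website.se/subsite?date={date}&facility_id={fac_id}&sport_id={sport_id}'
            print(res)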
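
The same two loops can also be wrapped in a generator, so a scraper can consume the URLs one at a time instead of printing them. A rough sketch, where iter_urls and template are hypothetical names introduced only for illustration:

```python
def iter_urls(dates, db):
    """Yield URLs lazily, one per date and facility/sport row."""
    template = 'https://www.website.se/subsite?date={}&facility_id={}&sport_id={}'
    for date in dates:
        # zip keeps each facility_id aligned with the sport_id at the same index
        for fac_id, sport_id in zip(db['facility_id'], db['sport_id']):
            yield template.format(date, fac_id, sport_id)


dates = ['2020-10-21', '2020-10-22']
db = {'facility_id': [184, 4, 3, 3], 'sport_id': [1, 2, 1, 5]}

# Materialize the generator only when the full list is needed.
all_urls = list(iter_urls(dates, db))
for url in all_urls:
    print(url)
```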

Answer 2 (score: 0)

With your current code, you are inserting commas into the URL. Here is a solution:

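    dates = ['2020-10-21', '2020-10-22']
    db = {'facility_id': [184, 4, 3, 3], 'sport_id': [1, 2, 1, 5]}

    for date in dates:
        # Looping over db itself yields the key names, not the rows,
        # so pair the two value lists with zip instead.
        for facility_id, sport_id in zip(db['facility_id'], db['sport_id']):
            url = f"https://www.website.se/subsite?date={date}&facility_id={facility_id}&sport_id={sport_id}"
            print(url)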

I'm not sure whether there is a way to avoid the nested loops.
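
One possible way to flatten the loops, and to stop hard-coding the column names, is to transpose db into rows first and then combine the rows with the dates through itertools.product. The sketch below assumes every list in db has the same length; rows, params and query are illustrative names.

```python
from itertools import product

dates = ['2020-10-21', '2020-10-22']
db = {'facility_id': [184, 4, 3, 3], 'sport_id': [1, 2, 1, 5]}

# Iterating a dict yields its keys, so `for col in db` visits
# 'facility_id' and then 'sport_id', not one row at a time.
# zip(*db.values()) transposes the column lists into rows instead.
rows = [dict(zip(db, values)) for values in zip(*db.values())]
# rows[0] == {'facility_id': 184, 'sport_id': 1}

urls = []
for date, row in product(dates, rows):
    params = {'date': date, **row}
    # A plain join is enough here because these values contain
    # no characters that need escaping in a query string.
    query = '&'.join(f'{key}={value}' for key, value in params.items())
    urls.append('https://www.website.se/subsite?' + query)

for url in urls:
    print(url)
```

Because the parameter names come from the dictionary keys, adding another column to db would add another query parameter without touching the loop.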