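I am parsing a website to get the available products and sizes. There are 3 products. I have a list named "find_id_1" with 3 elements, each of which contains a product name and its variant IDs. I also made two other lists, one named keywords and one named negative. The keywords list contains the keywords that the title of the product I want should have. If a product's name contains any element of the negative list, I don't want that product.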
found_product = []
keywords = ['YEEZY','BOOST','700']
negative = ['INFANTS','KIDS']
find_id_1 = ['{"id":2069103968384,"title":
"\nYEEZY BOOST 700 V2","handle":**"yeezy-boost-700-v2-vanta-june-6"**,
[{"id":19434310238336,"parent_id":2069103968384,"available":true,
"sku":"193093889925","featured_image":null,"public_title":null,
"requires_shipping":true,"price":30000,"options"',
'{"id":2069103935616,"title":"\nYEEZY BOOST 700 V2 KIDS","handle":
"yeezy-boost-700-v2-vanta-kids-june-6",`
["10.5k"],"option1":"10.5k","option2":"",
`"option3":"","option4":""},{"id":19434309845120,"parent_id":2069103935616,
"available":false,"sku":"193093893625","featured_image":null,
"public_title":null,"requires_shipping":true,"price":18000,"options"',
'{"id":2069104001152,"title":"\nYEEZY BOOST 700 V2 INFANTS",
"handle":**"yeezy-boost-700-v2-vanta-infants-june-6"***,`
["4K"],"option1":"4k","option2":"",`
"option3":"","option4":""},{"id":161803398876,"parent_id":2069104001152,
"available":false,"sku":"193093893724",
"featured_image":null,"public_title":null,
"requires_shipping":true,"price":15000,"options"']
I tried using a for loop to go through each element of find_id_1, with nested for loops over each element of keywords and negative, but I get the wrong product. Here is my code:
for product in find_id_1:
    for key in keywords:
        for neg in negative:
            if key in product:
                if neg not in product:
                    found_product = product
It prints the following:
'{"id":2069104001152,"title":"\nYEEZY BOOST 700 V2 INFANTS",
"handle":"yeezy-boost-700-v2-vanta-infants-june-6,`
["4K"],"option1":"4k","option2":"",`
"option3":"","option4":""},
{"id":161803398876,"parent_id":2069104001152,
"available":false,"sku":"193093893724",
"featured_image":null,"public_title":null,
"requires_shipping":true,"price":15000,"options"']
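I am trying to get it to return element 0 of find_id_1, since that is the only one that does not contain any element of the negative list. Is a for loop the best and fastest way to go through my list? Thanks! Any help is welcome!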
Answer 0 (score: 0)
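First, you should not treat JSON data as a string. Parse it with the json library, and then you can check each product's title directly. Searching the raw strings also gets slower as the list of products and the number of fields per product grow.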
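As a rough illustration of the parsing approach, here is a minimal sketch. The strings in the question are truncated and would not parse as they stand, so `raw_products` below is a hypothetical, well-formed stand-in that keeps only the `id` and `title` fields. It follows the same "any keyword, no negative word" rule, but applies it to the parsed title rather than to the whole string.

```python
import json

keywords = ['YEEZY', 'BOOST', '700']
negative = ['INFANTS', 'KIDS']

# Hypothetical, well-formed stand-ins for the truncated strings in the question.
raw_products = [
    '{"id": 2069103968384, "title": "\\nYEEZY BOOST 700 V2"}',
    '{"id": 2069103935616, "title": "\\nYEEZY BOOST 700 V2 KIDS"}',
    '{"id": 2069104001152, "title": "\\nYEEZY BOOST 700 V2 INFANTS"}',
]

found_product = []
for raw in raw_products:
    product = json.loads(raw)          # turn the JSON text into a dict
    title = product["title"].upper()   # compare case-insensitively
    has_keyword = any(key in title for key in keywords)
    has_negative = any(neg in title for neg in negative)
    if has_keyword and not has_negative:
        found_product.append(product)

print([p["id"] for p in found_product])  # prints [2069103968384]
```

Checking only the title avoids accidental matches in other fields such as the handle or the size options.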
To answer your question, you can simply do:
for product in find_id_1:
    if any(key in product for key in keywords):
        if not any(neg in product for neg in negative):
            found_product.append(product)
This will give you the elements that match your criteria. Note that I made a few changes to your data, just to make it valid Python code.
found_product = []
keywords = ['YEEZY','BOOST','700']
negative = ['INFANTS','KIDS']
find_id_1 = [""""'{"id":2069103968384,"title":
"\nYEEZY BOOST 700 V2","handle":**"yeezy-boost-700-v2-vanta-june-6"**,
[{"id":19434310238336,"parent_id":2069103968384,"available":true,
"sku":"193093889925","featured_image":null,"public_title":null,
"requires_shipping":true,"price":30000,"options"'""",
""""'{"id":2069103935616,"title":"\nYEEZY BOOST 700 V2 KIDS","handle":
"yeezy-boost-700-v2-vanta-kids-june-6",`
["10.5k"],"option1":"10.5k","option2":"",
`"option3":"","option4":""},{"id":19434309845120,"parent_id":2069103935616,
"available":false,"sku":"193093893625","featured_image":null,
"public_title":null,"requires_shipping":true,"price":18000,"options"'""",
""""'{"id":2069104001152,"title":"\nYEEZY BOOST 700 V2 INFANTS",
"handle":**"yeezy-boost-700-v2-vanta-infants-june-6"***,`
["4K"],"option1":"4k","option2":"",`
"option3":"","option4":""},{"id":161803398876,"parent_id":2069104001152,
"available":false,"sku":"193093893724",
"featured_image":null,"public_title":null,
"requires_shipping":true,"price":15000,"options"'"""]
for product in find_id_1:
    if any(key in product for key in keywords):
        if not any(neg in product for neg in negative):
            found_product.append(product)
print(found_product)
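For context, the nested loops in the question overwrite found_product whenever a single keyword/negative pair passes. The INFANTS entry contains 'YEEZY' and does not contain 'KIDS', so it is the last one assigned. Also note that any() accepts a product containing just one keyword. If every keyword is required, all() may be closer to what you want. The sketch below wraps both variants in a helper; the name `matches`, its `require_all` flag, and the 'YEEZY SLIDE' title are made up for illustration.

```python
def matches(text, keywords, negative, require_all=True):
    """Return True if text has the wanted keywords and none of the negative ones."""
    combine = all if require_all else any
    has_keywords = combine(key in text for key in keywords)
    has_negative = any(neg in text for neg in negative)
    return has_keywords and not has_negative


keywords = ['YEEZY', 'BOOST', '700']
negative = ['INFANTS', 'KIDS']

# Illustrative titles only; 'YEEZY SLIDE' is not in the question's data.
titles = [
    'YEEZY BOOST 700 V2',
    'YEEZY BOOST 700 V2 KIDS',
    'YEEZY BOOST 700 V2 INFANTS',
    'YEEZY SLIDE',
]

strict = [t for t in titles if matches(t, keywords, negative)]
loose = [t for t in titles if matches(t, keywords, negative, require_all=False)]

print(strict)  # prints ['YEEZY BOOST 700 V2']
print(loose)   # prints ['YEEZY BOOST 700 V2', 'YEEZY SLIDE']
```

With require_all=False the helper behaves like the any()-based loop above, which is why 'YEEZY SLIDE' passes in the second list.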