Question

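I have the following script and would like to change it so that, instead of being fed a list (i.e. sample_rows), it goes through the rows of a dataframe, where the keywords are in one column and the URLs are in another.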

I found a similar question, but none of its answers were useful for this particular task.

Any ideas?

import re

sample_rows = [
    ("hyundai sonata rebate", "https://www.edmunds.com/hyundai/sonata/2018/deals"),
    ("2017 jeep wrangler", "https://www.edmunds.com/jeep/wrangler/2017/deals"),
    ("2019 honda accord", "https://www.edmunds.com/honda/accord/2019/deals"),
    ("1985 some old car", "https://www.edmunds.com/some/oldcar/1985/deals")
]

for row in sample_rows:
    keywords = row[0]
    url = row[1]
    # the url
    if "/2019/" in url:
        new_url = url
        print(f"new_url {new_url}")
    elif re.search(r"/(?:(?:20)|(?:19))\d{2}/", url):
        old_url = url
        print(f"old_url {old_url}")
    # the "words"
    if "2019" in keywords:
        new_word = keywords
        print(f"new_word {new_word}")
    elif re.search(r"(?:(?:20)|(?:19))\d{2}", keywords) is None:
        new_word = keywords
        print(f"new_word {new_word}")

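Edit: this is the dataframe I have and want to incorporate:

Edit: this is the output of the script above.

Desired output:

Landing_page_type is the output of this part of the script, evaluated for each row: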

    # the url
    if "/2019/" in url:
        new_url = url
        print(f"new_url {new_url}")
    elif re.search(r"/(?:(?:20)|(?:19))\d{2}/", url):
        old_url = url
        print(f"old_url {old_url}")

ideal_target_page_type is the output of this part:

    # the "words"
    if "2019" in keywords:
        new_word = keywords
        print(f"new_word {new_word}")
    elif re.search(r"(?:(?:20)|(?:19))\d{2}", keywords) is None:
        new_word = keywords
        print(f"new_word {new_word}")
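The two fragments above could be wrapped in small functions so that each one returns a value per row instead of printing. This is only a sketch: the function names classify_url and classify_keywords and the labels "new" and "old" are assumptions made for illustration, since the desired output table is not reproduced here.

```python
import re

# Same patterns as in the script: a 19xx or 20xx year, with or without slashes.
YEAR_IN_URL = re.compile(r"/(?:19|20)\d{2}/")
YEAR_IN_TEXT = re.compile(r"(?:19|20)\d{2}")


def classify_url(url):
    """Return "new" for a 2019 URL, "old" for any other 19xx/20xx URL, else None."""
    if "/2019/" in url:
        return "new"
    if YEAR_IN_URL.search(url):
        return "old"
    return None


def classify_keywords(keywords):
    """Return "new" if the keywords mention 2019 or no year at all, else None."""
    if "2019" in keywords or YEAR_IN_TEXT.search(keywords) is None:
        return "new"
    return None


print(classify_url("https://www.edmunds.com/jeep/wrangler/2017/deals"))
print(classify_keywords("hyundai sonata rebate"))
```

Returning a label rather than printing makes it straightforward to collect one result per row later on.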

Answer 1

So, if I understood correctly, this is how you do this kind of iteration with pandas (plus zip):

for url, kwords in zip(df.url, df.keywords):
    # the url
    # your code here
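Filled in with the checks from the question, the loop might look like the sketch below. It assumes the dataframe has columns named keywords and url (built here from the question's sample_rows) and collects the matches in lists (new_urls, old_urls, new_words are names chosen for this example) instead of printing them.

```python
import re

import pandas as pd

sample_rows = [
    ("hyundai sonata rebate", "https://www.edmunds.com/hyundai/sonata/2018/deals"),
    ("2017 jeep wrangler", "https://www.edmunds.com/jeep/wrangler/2017/deals"),
    ("2019 honda accord", "https://www.edmunds.com/honda/accord/2019/deals"),
    ("1985 some old car", "https://www.edmunds.com/some/oldcar/1985/deals"),
]

# Assumed column names; adjust them to match the real dataframe.
df = pd.DataFrame(sample_rows, columns=["keywords", "url"])

new_urls, old_urls, new_words = [], [], []

for url, kwords in zip(df.url, df.keywords):
    # the url
    if "/2019/" in url:
        new_urls.append(url)
    elif re.search(r"/(?:19|20)\d{2}/", url):
        old_urls.append(url)
    # the "words"
    if "2019" in kwords or re.search(r"(?:19|20)\d{2}", kwords) is None:
        new_words.append(kwords)

print(new_urls)
print(old_urls)
print(new_words)
```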

If you prefer, you can also use the dict-like syntax:

for url, kwords in zip(df["url"], df["keywords"]):
    # the url
    # your code here
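To get the two columns named in the question, the same loop can build one label per row and assign the resulting lists back to the dataframe. This is a sketch under assumptions: the column names Landing_page_type and ideal_target_page_type come from the question, but the label values "new", "old" and None are placeholders, because the expected values are not shown.

```python
import re

import pandas as pd

df = pd.DataFrame(
    {
        "keywords": ["hyundai sonata rebate", "2017 jeep wrangler", "2019 honda accord"],
        "url": [
            "https://www.edmunds.com/hyundai/sonata/2018/deals",
            "https://www.edmunds.com/jeep/wrangler/2017/deals",
            "https://www.edmunds.com/honda/accord/2019/deals",
        ],
    }
)

landing, ideal = [], []

for url, kwords in zip(df["url"], df["keywords"]):
    # the url: "new" for 2019, "old" for any other 19xx/20xx year
    if "/2019/" in url:
        landing.append("new")
    elif re.search(r"/(?:19|20)\d{2}/", url):
        landing.append("old")
    else:
        landing.append(None)
    # the "words": "new" when 2019 or no year at all is mentioned
    if "2019" in kwords or re.search(r"(?:19|20)\d{2}", kwords) is None:
        ideal.append("new")
    else:
        ideal.append(None)

# One entry was appended per row, so the lists line up with the dataframe.
df["Landing_page_type"] = landing
df["ideal_target_page_type"] = ideal

print(df)
```

Appending exactly one value per row in every branch keeps the lists the same length as the dataframe, which the column assignment requires.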

I hope this answers your question.
