Question

I am trying to change this for loop into an apply method, because iterrows / itertuples are both too slow. I have a fairly large dataset. Is this possible?

for index, row in df2.iterrows():
    startDateString = str(row['Date'].replace("/",""))
    endDateString = str(row['Date'].replace("/",""))
    zipcode = str(row['Zip'])
    #startDateString = str(startDate)
    #endDateString = str(endDate)
    print("zip: " + "%s" %zipcode + ", daterange: " + startDateString + " - " + endDateString )

Answer 1

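Why are startDate and endDate taken from the same column?

The str calls are redundant: the %s format specifier already converts its argument to a string, and str.replace already returns a string. Removing them gives: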

for index, row in df2.iterrows():
    startDateString = row['Date'].replace("/", "")
    endDateString = row['Date'].replace("/", "")
    zipcode = row['Zip']
    print("zip: " + "%s" % zipcode + ", daterange: " + startDateString + " - " + endDateString)

Answer 2

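apply() is one of the slowest methods in the pandas library. You can do the same thing with the .str accessor, and you don't need to create all those variables.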

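    # Build the column with vectorized string operations; an f-string would embed the repr of the whole Series in every row
    dates = df2['Date'].str.replace("/", "", regex=False)
    df2['new_column'] = "zip: " + df2['Zip'].astype(str) + ", daterange: " + dates + " - " + dates
    for x in df2['new_column']:
        print(x)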
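
For comparison, here is a minimal self-contained sketch that builds the same strings both row-wise with apply (what the question asks for) and with vectorized string operations. The sample rows and the helper name describe_row are assumptions for illustration; only the column names Date and Zip come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df2 = pd.DataFrame({"Date": ["01/15/2020", "02/20/2020"], "Zip": [90210, 10001]})


def describe_row(row):
    # Hypothetical helper: formats one row the same way the original loop does.
    date_string = row["Date"].replace("/", "")
    return "zip: %s, daterange: %s - %s" % (row["Zip"], date_string, date_string)


# Row-wise version: apply calls describe_row once per row (axis=1).
via_apply = df2.apply(describe_row, axis=1)

# Vectorized version: operates on whole columns at once.
dates = df2["Date"].str.replace("/", "", regex=False)
vectorized = "zip: " + df2["Zip"].astype(str) + ", daterange: " + dates + " - " + dates

# Print everything in one call instead of one print per row.
print("\n".join(vectorized))
```

Both versions should produce identical strings. On a large dataset the vectorized form is usually the faster of the two, since apply with axis=1 still runs a Python function for every row, but it is worth timing on the real data.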

Hope this works for your data.

How to convert a for loop to an apply method in Python