Question

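I'm new to pandas and am working with DataFrames. I have a fairly simple problem that I think should have a straightforward solution, but it isn't clear to me (I don't know pandas very well).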

My DataFrame contains several rows that share the same index:

                     Glucose   Insulin  Carbs
Hour
2018-05-16 06:43:00    156.0       0.0    0.0
2018-05-16 06:43:00      NaN       0.0   65.0
2018-05-16 06:43:00      NaN       7.0    0.0

I want to merge them to get this, with one row containing all the information available for the given time index:

                     Glucose   Insulin  Carbs
Hour
2018-05-16 06:43:00    156.0       7.0   65.0
2018-05-16 06:43:00      NaN       0.0   65.0
2018-05-16 06:43:00      NaN       7.0    0.0

Afterwards I would drop every row that contains NaN in any column:

                     Glucose   Insulin  Carbs
Hour
2018-05-16 06:43:00    156.0       7.0   65.0

The catch is that the same DataFrame also has duplicates that carry less information, perhaps only carbs or only insulin:

                     Glucose   Insulin  Carbs
Hour
2018-05-19 06:15:00      NaN       1.5    0.0
2018-05-19 06:15:00    229.0       0.0    0.0

I already know the indices of these entries:

bad_indices = _df[ _df.Glucose.isnull() ].index

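What I'd like to know is whether there is a nice, Pythonic way to do this (for both the two-row and the three-row case). Maybe a pandas built-in method, something semi-standard, or at least something readable, because I don't want to write ugly (and fragile) code that handles each case explicitly.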

Answer 1

You can replace 0 with NaN and then take the first non-NaN value for each group.
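
A minimal sketch of that approach, assuming the sample rows from the question and that a 0 in any column means "nothing recorded" rather than a real measurement. It relies on `DataFrame.replace`, `groupby(level=0)` and `GroupBy.first`, which returns the first non-null value per column within each group. The final `fillna` step is an extra assumption of mine and is not stated in the question.

```python
import numpy as np
import pandas as pd

# Rebuild the sample rows shown in the question.
idx = pd.to_datetime(
    ["2018-05-16 06:43:00"] * 3 + ["2018-05-19 06:15:00"] * 2
)
df = pd.DataFrame(
    {
        "Glucose": [156.0, np.nan, np.nan, np.nan, 229.0],
        "Insulin": [0.0, 0.0, 7.0, 1.5, 0.0],
        "Carbs": [0.0, 65.0, 0.0, 0.0, 0.0],
    },
    index=pd.Index(idx, name="Hour"),
)

# Treat 0 as "missing", then keep the first non-NaN value
# per column for every distinct timestamp in the index.
merged = df.replace(0, np.nan).groupby(level=0).first()
print(merged)

# Assumption: no insulin/carbs entry at a timestamp means 0,
# so put the zeros back before dropping rows without glucose.
filled = merged.fillna({"Insulin": 0.0, "Carbs": 0.0}).dropna()
print(filled)
```

This handles the two-row and three-row cases the same way, since the grouping does not care how many rows share a timestamp. If a genuine 0 reading can occur in your data, the `replace(0, np.nan)` step would discard it, so this only fits when 0 is a placeholder.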
