x["next_dtScraped"] =
    if x["away_team"] == x["away_team"].shift(-1):
        x["dtScraped"].shift(-1)
    else:
        None
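So basically I want to create a column that returns the next row's value from one column, but only if another column has the same value in the current row and in the next row. The code above does not work because of a syntax error. I'm not sure whether this is even possible.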
+--------+---------------+-------------------+---------------------+---------------------+
| | home_team | away_team | dtScraped | next_dtScraped |
+--------+---------------+-------------------+---------------------+---------------------+
| 81965 | APOEL Nicosia | Tottenham Hotspur | 2017-09-26 17:40:48 | 2017-09-26 17:54:38 |
| 76817 | APOEL Nicosia | Tottenham Hotspur | 2017-09-26 17:54:38 | 2017-09-26 17:56:05 |
| 236234 | APOEL Nicosia | Tottenham Hotspur | 2017-09-26 17:56:05 | 2017-09-26 18:04:43 |
| 192767 | APOEL Nicosia | Tottenham Hotspur | 2017-09-26 18:04:43 | 2017-09-26 18:08:38 |
| 13448 | APOEL Nicosia | Tottenham Hotspur | 2017-09-26 18:08:38 | 2017-09-26 18:17:56 |
| 38306 | APOEL Nicosia | Tottenham Hotspur | 2017-09-26 18:17:56 | 2017-09-26 18:23:14 |
| 106907 | APOEL Nicosia | Tottenham Hotspur | 2017-09-26 18:23:14 | 2017-09-26 18:36:36 |
| 235751 | APOEL Nicosia | Tottenham Hotspur | 2017-09-26 18:36:36 | 2017-09-26 18:45:56 |
| 143897 | APOEL Nicosia | Tottenham Hotspur | 2017-09-26 18:45:56 | 2017-09-26 18:47:34 |
| 206117 | APOEL Nicosia | Tottenham Hotspur | 2017-09-26 18:47:34 | 2017-09-28 19:22:49 |
| 112775 | AS Monaco | Besiktas JK | 2017-09-28 19:22:49 | 2017-09-28 19:37:41 |
| 128744 | AS Monaco | Besiktas JK | 2017-09-28 19:37:41 | 2017-09-28 19:49:06 |
| 238778 | AS Monaco | Besiktas JK | 2017-09-28 19:49:06 | 2017-09-28 19:54:15 |
| 37271 | AS Monaco | Besiktas JK | 2017-09-28 19:54:15 | 2017-09-28 20:13:15 |
| 81647 | AS Monaco | Besiktas JK | 2017-09-28 20:13:15 | 2017-09-28 20:17:44 |
| 65930 | AS Monaco | Besiktas JK | 2017-09-28 20:17:44 | 2017-09-28 20:20:31 |
| 45845 | AS Monaco | Besiktas JK | 2017-09-28 20:20:31 | 2017-09-28 20:21:50 |
| 110165 | AS Monaco | Besiktas JK | 2017-09-28 20:21:50 | 2017-09-28 20:35:16 |
| 4856 | AS Monaco | Besiktas JK | 2017-09-28 20:35:16 | 2017-09-28 20:40:36 |
| 148769 | AS Monaco | Besiktas JK | 2017-09-28 20:40:36 | 2017-09-28 20:54:01 |
| 34760 | AS Monaco | Besiktas JK | 2017-09-28 20:54:01 | 2017-09-28 21:02:34 |
| 182633 | AS Monaco | Besiktas JK | 2017-09-28 21:02:34 | 2017-09-28 21:13:20 |
| 230996 | AS Monaco | Besiktas JK | 2017-09-28 21:13:20 | 2017-09-28 21:20:41 |
| 66761 | AS Monaco | Besiktas JK | 2017-09-28 21:20:41 | 2017-09-28 21:25:49 |
| 243059 | AS Monaco | Besiktas JK | 2017-09-28 21:25:49 | 2017-09-28 21:43:19 |
+--------+---------------+-------------------+---------------------+---------------------+
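So when the team changes, I don't want the value to carry over from a different team. Index 206117, the last row for APOEL x Tottenham, should therefore be empty in the next_dtScraped column.
Answer 0 (score: 1)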
import numpy as np
x["next_dtScraped"] = np.where(x["away_team"] == x["away_team"].shift(-1),x["dtScraped"].shift(-1),None)
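If `dtScraped` holds real datetimes rather than strings, `Series.where` may be a convenient variant of the same idea, since it keeps the datetime dtype and fills the non-matching rows with `NaT`. A minimal sketch, assuming a small made-up frame that reuses the question's column names:

```python
import pandas as pd

# Small made-up sample using the question's column names
x = pd.DataFrame({
    "away_team": ["Tottenham Hotspur", "Tottenham Hotspur", "Besiktas JK", "Besiktas JK"],
    "dtScraped": pd.to_datetime([
        "2017-09-26 18:45:56", "2017-09-26 18:47:34",
        "2017-09-28 19:22:49", "2017-09-28 19:37:41",
    ]),
})

# True where the next row belongs to the same away team
same_team = x["away_team"] == x["away_team"].shift(-1)

# Keep the next row's timestamp only where the team is unchanged;
# every other row becomes NaT
x["next_dtScraped"] = x["dtScraped"].shift(-1).where(same_team)
```

The last row of each team, and the last row of the frame, end up empty because the comparison with the following row is false there.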
Answer 1 (score: 0)
You can use reduce:
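from functools import reduce

# Compare each row's away team with the next row's
mask = x["away_team"] == x["away_team"].shift(-1)
# Collapse the boolean Series into a single True/False value
all_equal = reduce(lambda a, b: a and b, mask)
if all_equal:
    result = x["dtScraped"].shift(-1)
else:
    result = None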
x["away_team"] == x["away_team"].shift(-1)
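gives you a list of booleans, so with reduce you can tell whether every comparison is strictly equal.
As for your error, I don't know what you are trying to do, but you should post the error message so we know how to help.
Edit: thinking about what your syntax error might be, try this: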
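To make the effect of `reduce` concrete, the sketch below (using a made-up three-row column) shows that it collapses the per-row comparison into a single boolean, roughly what `mask.all()` would return. Because the last row has no following row, its comparison is against a missing value and comes out false, so the collapsed result is false here even though every row has the same team.

```python
from functools import reduce

import pandas as pd

# Made-up sample: the same away team in every row
away_team = pd.Series(["Besiktas JK", "Besiktas JK", "Besiktas JK"])

# Row-by-row comparison with the following row
mask = away_team == away_team.shift(-1)

# reduce folds the whole column into one boolean, like mask.all()
collapsed = reduce(lambda a, b: a and b, mask)

print(mask.tolist())  # one boolean per row
print(collapsed)      # a single boolean for the whole column
```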
x["next_dtScraped"] = x["dtScraped"].shift(-1) if (x["away_team"] == x["away_team"].shift(-1)) else None
Answer 2 (score: 0)
This should do it:
df["next_dtScraped"] = df["next_dtScraped"].apply(lambda x: df["dtScraped"].shift(-1) if df["away_team"] == df["away_team"].shift(-1) else x)
Or:
x["next_dtScraped"] = x.apply(lambda c: c["dtScraped"].shift(-1) if c["away_team"] == c["away_team"].shift(-1) else None)
Not sure which one you need :)
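A row-wise lambda only sees one row at a time, so looking at the following row is usually done with shifted columns instead. One possible sketch uses `groupby(...).shift(-1)`, which shifts within each team. The sample data is made up, and this only matches the adjacent-row comparison when each team's rows are contiguous, as they are in the question's table:

```python
import pandas as pd

# Made-up sample using the question's column names
df = pd.DataFrame({
    "away_team": ["Tottenham Hotspur", "Tottenham Hotspur", "Besiktas JK", "Besiktas JK"],
    "dtScraped": pd.to_datetime([
        "2017-09-26 18:45:56", "2017-09-26 18:47:34",
        "2017-09-28 19:22:49", "2017-09-28 19:37:41",
    ]),
})

# Shift within each away team, so the last row of a team has no
# next value and is left as NaT
df["next_dtScraped"] = df.groupby("away_team")["dtScraped"].shift(-1)
```

If the same team can appear in several separate blocks, this would pull the value from the team's next block, so the explicit comparison with the following row may be the safer choice in that case.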