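I am trying to iterate over a column that holds lists of dates. If the second date is 10 minutes or more later than the first, return '1', otherwise '0'; if the third date is 10 minutes or more later than the second, return '1', otherwise '0', and so on.
Apologies if this has already been answered; I could not find anything that helped.
The lists vary in size. Does anyone know how I could do this?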
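from pyspark.sql import functions as F

# Collect every start_dt of a customer into one list per customer_id
df = df_data_collective.groupBy("customer_id").agg(
    F.expr("collect_list(start_dt)").alias("start_times")
)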
This outputs the customer ID and a list of datetimes like this:
['2020-04-02T08:15:50+01:00', '2020-04-02T08:15:53+01:00', '2020-04-02T08:15:56+01:00', '2020-04-02T08:16:01+01:00', '2020-04-02T08:16:07+01:00', '2020-04-02T08:21:05+01:00', '2020-04-02T08:21:17+01:00', '2020-04-02T08:21:30+01:00', '2020-04-02T08:21:43+01:00', '2020-04-02T08:21:49+01:00', '2020-04-02T08:22:11+01:00', '2020-04-02T08:22:16+01:00', '2020-04-02T08:24:02+01:00', '2020-04-02T08:24:09+01:00', '2020-04-02T08:24:37+01:00', '2020-04-02T08:36:26+01:00', '2020-04-02T08:39:25+01:00', '2020-04-02T08:39:41+01:00', '2020-04-02T08:39:52+01:00', '2020-04-02T08:40:18+01:00', '2020-04-02T08:40:27+01:00', '2020-04-02T08:40:33+01:00', '2020-04-02T08:40:49+01:00', '2020-04-02T08:41:03+01:00', '2020-04-02T08:41:29+01:00', '2020-04-02T08:42:00+01:00', '2020-04-02T08:42:23+01:00', '2020-04-02T08:42:57+01:00', '2020-04-02T08:44:43+01:00', '2020-04-02T08:44:49+01:00']
I have a very basic understanding of for loops, but I am still in training. Can anyone offer any advice?
Answer 0: (score: 0)
from datetime import datetime, timedelta
dt_str_list = ['2020-04-02T08:15:50+01:00', '2020-04-02T08:15:53+01:00',
'2020-04-02T08:15:56+01:00', '2020-04-02T08:16:01+01:00',
'2020-04-02T08:16:07+01:00', '2020-04-02T08:21:05+01:00',
'2020-04-02T08:21:17+01:00', '2020-04-02T08:21:30+01:00',
'2020-04-02T08:21:43+01:00', '2020-04-02T08:21:49+01:00',
'2020-04-02T08:22:11+01:00', '2020-04-02T08:22:16+01:00',
'2020-04-02T08:24:02+01:00', '2020-04-02T08:24:09+01:00',
'2020-04-02T08:24:37+01:00', '2020-04-02T08:36:26+01:00',
'2020-04-02T08:39:25+01:00', '2020-04-02T08:39:41+01:00',
'2020-04-02T08:39:52+01:00', '2020-04-02T08:40:18+01:00',
'2020-04-02T08:40:27+01:00', '2020-04-02T08:40:33+01:00',
'2020-04-02T08:40:49+01:00', '2020-04-02T08:41:03+01:00',
'2020-04-02T08:41:29+01:00', '2020-04-02T08:42:00+01:00',
'2020-04-02T08:42:23+01:00', '2020-04-02T08:42:57+01:00',
'2020-04-02T08:44:43+01:00', '2020-04-02T08:44:49+01:00']
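# Parse the ISO 8601 strings (with UTC offset) into timezone-aware datetimes
dt_list = [datetime.strptime(dt_str, '%Y-%m-%dT%H:%M:%S%z')
           for dt_str in dt_str_list]
minute_10 = timedelta(minutes=10)
# flags[k] compares element k+1 with element k; >= matches "10 minutes or more"
flags = [1 if dt_list[i] - dt_list[i-1] >= minute_10 else 0
         for i in range(1, len(dt_list))]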
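Since the lists vary in size, the same comparison can be wrapped in a function that accepts any list. The following is only a sketch under a few assumptions: the name gap_flags and its threshold parameter are made up for illustration, the list is already in chronological order, and datetime.fromisoformat (Python 3.7+) is used in place of strptime.

```python
from datetime import datetime, timedelta


def gap_flags(timestamps, threshold=timedelta(minutes=10)):
    """Return one flag per consecutive pair: 1 if the gap is >= threshold, else 0.

    gap_flags is a hypothetical helper name, not part of any library.
    """
    # fromisoformat understands offsets such as +01:00 on Python 3.7+
    parsed = [datetime.fromisoformat(ts) for ts in timestamps]
    # zip pairs each element with its successor, so a list of n items gives n-1 flags
    return [1 if later - earlier >= threshold else 0
            for earlier, later in zip(parsed, parsed[1:])]


sample = ['2020-04-02T08:24:09+01:00', '2020-04-02T08:24:37+01:00',
          '2020-04-02T08:36:26+01:00', '2020-04-02T08:39:25+01:00']
print(gap_flags(sample))  # [0, 1, 0]
```

An empty or single-element list simply yields an empty result, so no special casing is needed for short lists.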
Answer 1: (score: 0)
You can use the str.split() method:
from datetime import datetime, timedelta
lst = ['2020-04-02T08:15:50+01:00', '2020-04-02T08:15:53+01:00', '2020-04-02T08:15:56+01:00', '2020-04-02T08:16:01+01:00', '2020-04-02T08:16:07+01:00', '2020-04-02T08:21:05+01:00', '2020-04-02T08:21:17+01:00', '2020-04-02T08:21:30+01:00', '2020-04-02T08:21:43+01:00', '2020-04-02T08:21:49+01:00', '2020-04-02T08:22:11+01:00', '2020-04-02T08:22:16+01:00', '2020-04-02T08:24:02+01:00', '2020-04-02T08:24:09+01:00', '2020-04-02T08:24:37+01:00', '2020-04-02T08:36:26+01:00', '2020-04-02T08:39:25+01:00', '2020-04-02T08:39:41+01:00', '2020-04-02T08:39:52+01:00', '2020-04-02T08:40:18+01:00', '2020-04-02T08:40:27+01:00', '2020-04-02T08:40:33+01:00', '2020-04-02T08:40:49+01:00', '2020-04-02T08:41:03+01:00', '2020-04-02T08:41:29+01:00', '2020-04-02T08:42:00+01:00', '2020-04-02T08:42:23+01:00', '2020-04-02T08:42:57+01:00', '2020-04-02T08:44:43+01:00', '2020-04-02T08:44:49+01:00']
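def seconds_of_day(d):
    # 'YYYY-MM-DDTHH:MM:SS+01:00' splits into ['...THH', 'MM', 'SS+01:00']
    h, m, s = d.split(':', 2)
    return int(h[-2:]) * 3600 + int(m) * 60 + int(s[:2])

# Current minus previous; the first element has no predecessor, so it gets 0.
# Only the time of day is compared, so all entries must share a date and UTC offset.
c = [1 if i and seconds_of_day(d) - seconds_of_day(lst[i-1]) >= 600 else 0
     for i, d in enumerate(lst)]
print(c)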
Answer 2: (score: 0)
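First, start_dt must be converted to timestamp format. Then, after collecting the list, we can apply the transform function (with the index as i) together with unix_timestamp to get the desired output. (transform is available from Spark 2.4.)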
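A rough sketch of what this could look like, not necessarily the exact code the answer had in mind. The helper names gap_flag_expr and flag_gaps are hypothetical, the column names are taken from the question, and sorting the collected list is an added assumption because collect_list does not guarantee order. Unlike Answer 0, the result has one flag per element, with the first always 0.

```python
def gap_flag_expr(array_col, threshold_seconds=600):
    """Build a Spark SQL expression flagging gaps of threshold_seconds or more.

    gap_flag_expr is a hypothetical helper; it only assembles a string.
    """
    return (
        f"transform({array_col}, (x, i) -> CASE "
        f"WHEN i = 0 THEN 0 "  # the first element has no predecessor
        f"WHEN unix_timestamp(x) - unix_timestamp({array_col}[i - 1]) "
        f">= {threshold_seconds} THEN 1 "
        f"ELSE 0 END)"
    )


def flag_gaps(df_data_collective):
    """Add a flags array per customer_id; assumes Spark 2.4 or later."""
    # Imported lazily so gap_flag_expr can be used without PySpark installed
    from pyspark.sql import functions as F

    return (
        df_data_collective
        # Cast the ISO 8601 strings to timestamps before collecting
        .withColumn("start_ts", F.to_timestamp("start_dt"))
        .groupBy("customer_id")
        # Assumption: sort so that neighbours in the array are consecutive in time
        .agg(F.sort_array(F.collect_list("start_ts")).alias("start_times"))
        .withColumn("flags", F.expr(gap_flag_expr("start_times")))
    )
```

Calling flag_gaps(df_data_collective) in a Spark session should return one row per customer_id with start_times and a matching flags array.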