Hi, I have a DataFrame that looks like this:
starttime endtime positions
0 2019-05-16 05:34:26.870 2019-05-16 05:34:41.721 [7, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24...
1 2019-05-16 05:33:56.143 2019-05-16 05:34:10.995 [9, 11, 12, 15, 16, 17, 18, 19, 20, 21, 22, 23...
2 2019-05-16 05:33:35.659 2019-05-16 05:33:50.510 [13, 14, 15, 16, 17, 18, 19, 20, 21, 23, 24, 2...
3 2019-05-16 05:33:04.933 2019-05-16 05:33:19.784 [8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,...
4 2019-05-16 05:34:11.507 2019-05-16 05:34:26.358 [3, 4, 9, 10, 11, 12, 15, 16, 17, 18, 19, 20, ...
I want to filter the rows so that only those whose list contains consecutive values, i.e. has the form
list(range(min(val),max(val))), are kept.
I tried
df[df["positions"] == list(range(min(df["positions"],max(df["positions"]))))]
but I get the following error:
ValueError: Lengths must match to compare
Is it because each list has a different length? If so, how can I fix it?
Answer 0: (score: 2)
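One approach is to use .apply on the list column, so that each list is compared against its own range (note the + 1, since range excludes its upper bound):
df['positions'].apply(lambda x: x == list(range(min(x), max(x) + 1)))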
import pandas as pd

# Example input
df = pd.DataFrame({'starttime': list(range(3)),
                   'endtime': list(range(1, 4)),
                   'positions': None})
# Manually insert lists into the 'positions' column entries
df.iat[0, 2] = [1, 4, 9]
df.iat[1, 2] = list(range(6))
df.iat[2, 2] = list(range(-4, 3))
# Get a boolean Series
df['positions'].apply(lambda x: x == list(range(min(x), max(x) + 1)))
0 False
1 True
2 True
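
To actually keep only the matching rows, the boolean Series can be used as a mask on the DataFrame. Below is a minimal sketch, assuming the same toy data as in the example above; is_consecutive is a helper name introduced here for illustration and is not part of pandas. It also treats an empty list as non-consecutive, which is an assumption rather than something the question specifies.

```python
import pandas as pd

# Toy data mirroring the example above (not the asker's real data)
df = pd.DataFrame({
    "starttime": [0, 1, 2],
    "endtime": [1, 2, 3],
    "positions": [[1, 4, 9], list(range(6)), list(range(-4, 3))],
})


def is_consecutive(values):
    """Hypothetical helper: True if values is exactly min..max, ascending, no gaps."""
    # Guard against empty lists, for which min() and max() would raise
    if len(values) == 0:
        return False
    return list(values) == list(range(min(values), max(values) + 1))


# One boolean per row, evaluated list by list
mask = df["positions"].apply(is_consecutive)

# Boolean indexing keeps only the rows where the mask is True
consecutive_rows = df[mask]
print(consecutive_rows)
```

Because the check runs per row, lists of different lengths are not a problem here. Comparing the whole column against a single list instead asks pandas for an elementwise comparison, which needs both sides to have the same length.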