Question

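For example, I have a boolean column in a CSV file:

As you can see, the 1 recurs every 5 rows. I want to detect this repeating pattern [1,0,0,0] in Python once it repeats more than 10 times (I have ~20,000 rows per file). The pattern can start at any position. How can I solve this in Python while avoiding.....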

Answer 1

import numpy as np
import pandas as pd

# Generate 20000 random 0s and 1s
data = pd.Series(np.random.randint(0, 2, 20000))

# Keep the indices of the 1s
idx = data[data > 0].index

# Check whether the distance from each 1 to the next 1 is exactly 4.
# For example, if positions 2 and 6 both hold a 1, then 6 - 2 = 4.
found = []
for i, v in enumerate(idx):
    if i == len(idx) - 1:
        break
    next_value = idx[i + 1]
    if (next_value - v) == 4:
        found.append(v)

print(found)
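
The loop above reports every 1 that is followed by another 1 four rows later, but it does not check how many times in a row that happens. A possible extension is sketched below. It assumes that "repeats more than 10 times" means more than 10 consecutive gaps of exactly 4 between neighbouring 1s, and that positions are counted from 0 by row number rather than by index label. The function name find_repeated_runs and its parameters period and min_repeats are made up for this illustration.

```python
import numpy as np
import pandas as pd


def find_repeated_runs(series, period=4, min_repeats=10):
    """Return (start_position, gap_count) for every run in which the 1s
    recur every `period` rows more than `min_repeats` times in a row.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Row positions of the 1s
    ones = np.flatnonzero(series.to_numpy() > 0)
    if len(ones) < 2:
        return []

    # True where the gap to the next 1 equals the period
    hits = np.diff(ones) == period

    # Pad with False so that every run has an explicit start and end
    padded = np.concatenate(([False], hits, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = edges[::2], edges[1::2]

    runs = []
    for s, e in zip(starts, ends):
        count = e - s  # number of consecutive gaps equal to `period`
        if count > min_repeats:
            runs.append((int(ones[s]), int(count)))
    return runs


# Some noise, then [1, 0, 0, 0] twelve times, then a closing 1 and more noise
data = pd.Series([0, 1, 1] + [1, 0, 0, 0] * 12 + [1, 0, 1])
print(find_repeated_runs(data))  # prints [(3, 12)]
```

Each tuple holds the row position of the first 1 in a run and the number of consecutive gaps of 4 that follow it, so the pattern can start anywhere in the column. Because the work is done on NumPy arrays, ~20,000 rows per file should not be a problem.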

How to identify the repeating pattern [1, X, X, X, 1] in a pandas Series
