Finding the start and end points of the longest run of consecutive digits in a large binary set in Python

Time: 2017-08-24 09:53:06

Tags: python python-3.x binary

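I am trying to find the start and end points of the longest run of consecutive digits in a large binary set using Python 3. So far I have found the lengths of the longest runs of 1s and 0s; now I need to find where each of those runs starts and ends. My code so far: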

For 1s:

def getMaxSegmentLength(readable):
    current_length = 0
    max_length = 0


    for x in readable:
        if x == '1':
            current_length += 1
        else:
            max_length = max(max_length, current_length)
            current_length = 0

    # return only after the whole input has been scanned
    return max(max_length, current_length)


def main():
    with open('C:/01.txt', 'r') as inputf:
        s = inputf.read()
        n = getMaxSegmentLength(s)
    print("The longest streak of 1's = " + str(n))


if __name__ == '__main__':
    main()

For 0s:

def getMaxSegmentLength(readable):
    current_length = 0
    max_length = 0


    for x in readable:
        if x == '0':
            current_length += 1
        else:
            max_length = max(max_length, current_length)
            current_length = 0

    # return only after the whole input has been scanned
    return max(max_length, current_length)


def main():
    with open('C:/01.txt', 'r') as inputf:
        s = inputf.read()
        m = getMaxSegmentLength(s)
    print("The longest streak of 0's = " + str(m))


if __name__ == '__main__':
    main()

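This code finds the longest run of consecutive digits in a very large binary set stored in a separate file. I also know how many 0s and 1s there are in total. I have not yet started on the next step, finding the start and end points. Any help is much appreciated, as I am new to Python 3.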

2 answers:

Answer 0: (score: 0)

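Simple: keep track of the index where the current streak starts (starts_at), plus a variable that holds the starting point of the longest streak so far (max_starts_at). Whenever a longer streak is found, update max_starts_at. The same function handles both digits by taking the digit as a parameter.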

def getMaxSegmentLength(readable, digit):
    '''find the longest streak of digit in the readable string'''
    current_length = 0
    max_length = 0

    starts_at= -1
    max_starts_at= -1

    for i, x in enumerate(readable):
        if x == digit:
            current_length += 1
            if current_length == 1:
                starts_at = i

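        else:
            # a streak just ended: record it if it is the longest so far
            if max_length < current_length:
                max_length = current_length
                max_starts_at = starts_at
            # always reset, even when the streak was not a new maximum
            current_length = 0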

    if max_length < current_length:
        max_length = current_length
        max_starts_at = starts_at

    max_ends_at = max_starts_at+max_length-1

    # return a tuple of start point and end point index
    return max_starts_at, max_ends_at


def main():
    with open('F:/input.txt', 'r') as inputf:
        s = inputf.read()

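        # longest streak of 1's, as (start index, end index)
        n = getMaxSegmentLength(s, '1')
        print("The longest streak of 1's (start, end) = " + str(n))

        # longest streak of 0's, as (start index, end index)
        n = getMaxSegmentLength(s, '0')
        print("The longest streak of 0's (start, end) = " + str(n))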

if __name__ == '__main__':
    main()
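
For comparison, the same bookkeeping can be sketched with itertools.groupby from the standard library, which splits the string into runs of identical characters. This is only a rough alternative, not the code above: the function name longest_run and the sample string are made up for illustration, and it returns None instead of an index pair when the digit never occurs.

```python
from itertools import groupby


def longest_run(bits, digit):
    """Return (start, end) of the longest run of `digit` in `bits`, or None."""
    best = None      # (start, end) of the longest run found so far
    best_len = 0
    pos = 0          # index of the first character of the current run
    for ch, group in groupby(bits):
        run_len = sum(1 for _ in group)
        # strict '>' keeps the earliest run when two runs tie in length
        if ch == digit and run_len > best_len:
            best_len = run_len
            best = (pos, pos + run_len - 1)
        pos += run_len
    return best


print(longest_run("0011101", "1"))  # (2, 4)
print(longest_run("0011101", "0"))  # (0, 1)
```

Both indices are zero-based and the end index is inclusive, matching the tuple returned by getMaxSegmentLength.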

Answer 1: (score: 0)

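You can use a regular expression to match each run of identical characters, then update a dictionary entry for the digit concerned: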

import re

# example input string (named bits to avoid shadowing the built-in input())
bits = "00111101100010100010101111011011011"

# longest run seen so far for each digit
best = {
    "0": { "start": 0, "len": 0 },
    "1": { "start": 0, "len": 0 }
}
# (.)\1* matches one character followed by repeats of that same character
for m in re.compile(r"(.)\1*").finditer(bits):
    run = m.group()
    if best[run[0]]["len"] < len(run):
        best[run[0]] = { "start": m.start(), "len": len(run) }

print(best)

Output:

{'1': {'start': 2, 'len': 4}, '0': {'start': 9, 'len': 3}}
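
The question asks for an end point as well, and the dictionary above stores only a start and a length. A possible variation, assuming an inclusive end index is wanted, is to store the span directly: m.end() is one past the last matched character, so the inclusive end is m.end() - 1. The function name longest_runs is made up for this sketch.

```python
import re


def longest_runs(bits):
    """Map each character in `bits` to the (start, end) span of its longest run."""
    spans = {}
    for m in re.finditer(r"(.)\1*", bits):
        ch = m.group(1)
        length = m.end() - m.start()
        prev = spans.get(ch)
        # strict '>' keeps the earliest run when two runs tie in length
        if prev is None or length > prev[1] - prev[0] + 1:
            spans[ch] = (m.start(), m.end() - 1)
    return spans


print(longest_runs("00111101100010100010101111011011011"))
# {'0': (9, 11), '1': (2, 5)}
```

Because "." does not match a newline by default, a trailing newline in text read from a file is simply skipped, but any other stray character would get its own entry in the result.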