How do I calculate the average number of characters between letters in a string?

Time: 2019-05-22 09:32:42

Tags: python string average

I have a string like this:

F-F-F+F+F+F-F-F-F+F+F+F-F+F+F-F+F-F-F-F

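I want to calculate the average number of characters between consecutive occurrences of 'F', '+', and '-'.
For this example, that would be: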

Average chars between Fs:   1
Average chars between +s:   2.25
Average chars between -s:   3

What is the most efficient way to do this?

3 answers:

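Answer 0 (score: 4)

Here is one variant.

First I collect the indices at which each character occurs; then I take the mean of the differences between consecutive indices and subtract 1, because two occurrences that are d positions apart have d - 1 characters between them.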

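from collections import defaultdict
from itertools import islice
from statistics import mean

strg = "F-F-F+F+F+F-F-F-F+F+F+F-F+F+F-F+F-F-F-F"

# Map each character to the list of indices at which it occurs.
dct = defaultdict(list)

for i, char in enumerate(strg):
    dct[char].append(i)

for char, occurrences in dct.items():
    # Pair each index with the next one; the gap is their distance minus 1.
    # Note: mean() raises StatisticsError for a character that occurs only once.
    avg = mean(b - a for a, b in zip(occurrences, islice(occurrences, 1, None))) - 1
    print(f"Average chars between {char}s:  {avg}")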

This prints:

Average chars between Fs:  1
Average chars between -s:  3
Average chars between +s:  2.25

After the first for loop, dct contains entries like this:

'-': [1, 3, 11, 13, 15, 23, 29, 33, 35, 37]

and, as described above, the second for loop computes the mean of the differences between consecutive indices.
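To make the arithmetic concrete, here is a small sketch that works through the '-' entry shown above by hand. The variable names (minus_indices, steps, gaps) are made up for illustration and do not appear in the answer's code.

```python
from statistics import mean

# Indices of '-' copied from the dct entry shown above.
minus_indices = [1, 3, 11, 13, 15, 23, 29, 33, 35, 37]

# Distance between each pair of neighbouring occurrences.
steps = [b - a for a, b in zip(minus_indices, minus_indices[1:])]

# The number of characters strictly between two occurrences
# is one less than the distance between them.
gaps = [step - 1 for step in steps]

print(steps)       # [2, 8, 2, 2, 8, 6, 4, 2, 2]
print(gaps)        # [1, 7, 1, 1, 7, 5, 3, 1, 1]
print(mean(gaps))  # 3
```

Subtracting 1 from each difference before averaging gives the same result as averaging first and subtracting 1 afterwards, which is what the answer's code does.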

Answer 1 (score: 1)

I would use regular expressions (the re module) for this, as follows:

import re
txt = "F-F-F+F+F+F-F-F-F+F+F+F-F+F+F-F+F-F-F-F"
chars = set(txt)
between = dict()
for ch in chars:
    # Non-greedy match of whatever lies between two consecutive occurrences of ch.
    between[ch] = re.findall('(?<='+re.escape(ch)+').*?(?='+re.escape(ch)+')',txt)
for ch in chars:
    if len(between[ch])==0:
        # A character that occurs only once has no gaps.
        between[ch] = 0.0
    else:
        between[ch] = sum([len(gap) for gap in between[ch]])/len(between[ch])
print(between)

Output (the key order may vary between runs, because chars is a set):

{'F': 1.0, '+': 2.25, '-': 3.0}

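Explanation: I look for the substrings between occurrences of a given character, scanning left to right in a non-greedy way and using zero-length assertions (a lookbehind and a lookahead), so "F-F-F" gives ["-", "-"] rather than ["-F-"]. I then simply compute the average of their lengths. Note that I use re.escape to handle characters with a special meaning in regular expressions, such as +.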
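The difference between the non-greedy and greedy patterns can be checked on the short "F-F-F" sample from the explanation. This is only an illustrative sketch; the names sample, non_greedy, and greedy are not part of the answer's code.

```python
import re

sample = "F-F-F"
ch = re.escape("F")  # "F" has no special meaning, so it is returned unchanged

# Non-greedy: stop at the nearest following "F".
non_greedy = re.findall("(?<=" + ch + ").*?(?=" + ch + ")", sample)

# Greedy: stretch as far as possible while an "F" still follows.
greedy = re.findall("(?<=" + ch + ").*(?=" + ch + ")", sample)

print(non_greedy)      # ['-', '-']
print(greedy)          # ['-F-']
print(re.escape("+"))  # \+
```

Without re.escape, a bare + inside the lookbehind would be read as a quantifier with nothing to repeat, and the pattern would fail to compile.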

Answer 2 (score: 0)

Another way, without using any libraries:

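string = 'F-F-F+F+F+F-F-F-F+F+F+F-F+F+F-F+F-F-F-F'
string = list(string)
chars = set(string)
for char in chars:
    # Indices at which this character occurs.
    ind = [i for i, x in enumerate(string) if x == char]
    # Number of characters between each pair of consecutive occurrences.
    diff = [ind[i+1]-ind[i] - 1 for i in range(len(ind)-1)]
    # Note: diff is empty for a character that occurs only once,
    # in which case this division raises ZeroDivisionError.
    print(f'Average chars between {char}s:  {sum(diff) / len(diff)}')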

Output (the line order may vary between runs, because chars is a set):

Average chars between +s:  2.25
Average chars between Fs:  1.0
Average chars between -s:  3.0
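Since the question asks about efficiency, one possible alternative is to scan the string only once and keep a running total per character, rather than rescanning it for each distinct character. The sketch below is not from any of the answers: average_gaps is a hypothetical helper name, and returning None for a character that occurs only once is an assumption made here.

```python
def average_gaps(text):
    """Map each character to the average number of characters
    between its consecutive occurrences (single pass over text)."""
    last_seen = {}  # char -> index of its most recent occurrence
    totals = {}     # char -> [sum of gaps, number of gaps]
    for i, ch in enumerate(text):
        if ch in last_seen:
            entry = totals.setdefault(ch, [0, 0])
            entry[0] += i - last_seen[ch] - 1
            entry[1] += 1
        last_seen[ch] = i
    # Assumption: a character seen only once has no gap, so it maps to None.
    return {
        ch: (totals[ch][0] / totals[ch][1] if ch in totals else None)
        for ch in last_seen
    }


result = average_gaps("F-F-F+F+F+F-F-F-F+F+F+F-F+F+F-F+F-F-F-F")
print(result)  # {'F': 1.0, '-': 3.0, '+': 2.25}
```

Because it builds plain dicts instead of iterating over a set, the keys come out in order of first appearance, so the output order is stable between runs.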