Python: Reducing cyclomatic complexity

Date: 2016-08-24 11:55:12

Tags: python cyclomatic-complexity

I need help reducing the cyclomatic complexity of the following code:

import numpy


def avg_title_vec(record, lookup):
    avg_vec = []
    word_vectors = []
    # Collect the vector of every title word that has an entry in the lookup.
    for tag in record['all_titles']:
        titles = clean_token(tag).split()
        for word in titles:
            if word in lookup.value:
                word_vectors.append(lookup.value[word])
    # Average component-wise; avg_vec stays empty when no word matched.
    if len(word_vectors):
        avg_vec = [
            float(val) for val in numpy.mean(
                numpy.array(word_vectors),
                axis=0)]

    output = (record['id'],
              ','.join([str(a) for a in avg_vec]))
    return output

Sample input:

record = {'all_titles': ['hello world', 'hi world', 'bye world']}

lookup.value = {'hello': [0.1, 0.2], 'world': [0.2, 0.3], 'bye': [0.9, -0.1]}

def clean_token(input_string):
    return input_string.replace("-", " ").replace("/", " ").replace(
        ":", " ").replace(",", " ").replace(";", " ").replace(
        ".", " ").replace("(", " ").replace(")", " ").lower()

So for every word that appears in lookup.value, I take its vector, and the result is the average of those vectors.
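To make the intended computation concrete, here is a minimal sketch that averages the vectors for the sample input above. It uses plain Python instead of numpy so that it has no dependencies, and the names `lookup_value`, `titles`, and `average_vectors` are hypothetical stand-ins rather than part of the original code. Since the sample titles contain no punctuation, `clean_token` is approximated by `lower()`.

```python
# Sample lookup table from the question (word -> vector).
lookup_value = {'hello': [0.1, 0.2], 'world': [0.2, 0.3], 'bye': [0.9, -0.1]}
titles = ['hello world', 'hi world', 'bye world']


def average_vectors(vectors):
    """Component-wise mean of equal-length vectors; [] if there are none."""
    if not vectors:
        return []
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# 'hi' has no entry in the lookup, so it is skipped.
found = [lookup_value[word]
         for title in titles
         for word in title.lower().split()
         if word in lookup_value]

# Five vectors are averaged: hello, world, world, bye, world.
avg = average_vectors(found)
print(avg)
```

The two components come out at roughly 0.32 and 0.2 (up to floating-point rounding), because 'world' contributes three times.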

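1 answer:

Answer 0 (score: 0)

This probably doesn't count as a proper answer, because in the end the cyclomatic complexity isn't reduced.

This variant is a bit shorter, but I can't see any way to generalize it. It also seems that you need the ifs you already have.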

def avg_title_vec(record, lookup):
    word_vectors = [lookup.value[word] for tag in record['all_titles']
                    for word in clean_token(tag).split() if word in lookup.value]
    if not word_vectors:
        return (record['id'], None)
    avg_vec = [float(val) for val in numpy.mean(
               numpy.array(word_vectors),
               axis=0)]

    output = (record['id'],
              ','.join([str(a) for a in avg_vec]))
    return output

According to this, your CC is 6, which is already good. You can reduce the CC of your function by using helper functions, for example:

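import re


def get_tags(record):
    return [tag for tag in record['all_titles']]

def sanitize_and_split_tags(tags):
    # Replace the same punctuation as clean_token with spaces, then lowercase and split.
    return [word for tag in tags for word in
            re.sub(r'[\-/:,;\.()]', ' ', tag).lower().split()]

def get_vectors_words(words):
    # Assumes lookup is available in the enclosing scope.
    return [lookup.value[word] for word in words if word in lookup.value]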

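This lowers the average CC per function, but the overall CC stays the same or increases. I don't see how to get rid of the ifs that check whether a word is in lookup.value or whether we have any vectors to work with.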
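As a rough illustration of how such helpers could be composed, here is a self-contained sketch. Several things in it are assumptions and not part of the original code: the helper names `split_words`, `known_vectors`, and `mean_vector` are hypothetical, `lookup` is passed explicitly and mimicked with a `SimpleNamespace` that has a `.value` attribute, the `'id'` values are invented for the example, and `numpy.mean` is swapped for plain Python so the sketch runs without dependencies.

```python
import re
from types import SimpleNamespace


def split_words(titles):
    """Replace clean_token's punctuation with spaces, lowercase, and split."""
    return [word for title in titles
            for word in re.sub(r'[-/:,;.()]', ' ', title).lower().split()]


def known_vectors(words, table):
    """Keep only the vectors of words that are present in the lookup table."""
    return [table[word] for word in words if word in table]


def mean_vector(vectors):
    """Component-wise mean; zip(*[]) yields nothing, so no vectors gives []."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def avg_title_vec(record, lookup):
    vectors = known_vectors(split_words(record['all_titles']), lookup.value)
    return (record['id'], ','.join(str(x) for x in mean_vector(vectors)))


# Hypothetical stand-in for the real lookup object and a sample record.
lookup = SimpleNamespace(
    value={'hello': [0.1, 0.2], 'world': [0.2, 0.3], 'bye': [0.9, -0.1]})
record = {'id': 1, 'all_titles': ['hello world', 'hi-world', 'bye world']}

result = avg_title_vec(record, lookup)
empty = avg_title_vec({'id': 2, 'all_titles': ['nothing here']}, lookup)
print(result)
print(empty)
```

Each function here has only a few branches, so the per-function CC is low, while the membership test in `known_vectors` is still present. In this plain-Python variant the explicit emptiness check happens to be unnecessary, because averaging over zero columns produces an empty list and the joined string is then empty, as in the question's original function.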