Question

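I have a dataset in which one column contains preprocessed strings (comments). I'm struggling to get a single readability score for the text of the whole column; what I get now is a readability score per row.

I tried converting the entire column into one string and then using textstat to get the readability score: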

'''

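    import textstat

    # Copy the comments, then render the whole column as one string.
    # Assigning that string back broadcasts the same text to every row.
    data["new"] = data["comments"]
    data["new"] = data.to_string(columns=["new"])
    mess = data["new"]

    def text_proces(mess):
        # Score the text passed in; the scores are printed, not returned.
        score1 = textstat.flesch_reading_ease(mess)
        score = textstat.automated_readability_index(mess)
        print(score1)
        print(score)

    # apply() calls text_proces once per row of "comments".
    print(data["comments"].apply(text_proces))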

'''

The output I get:

'''

    score1

    7.3

    score

    10.1

    score1

    6.6

    score

    7.4

    0    None
    1    None
    2    None
    3    None
    4    None

'''

Also, I don't understand what None means in the output.

Expected: only two or more unique scores for the whole "comments" column:
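
A minimal sketch of where the None values may come from, assuming the cause is that the function passed to `apply` prints its result without returning it. The DataFrame contents and the helper names `prints_only` and `returns_value` are made up for illustration, and `len` stands in for a readability score so the example needs nothing beyond pandas.

```python
import pandas as pd

# Hypothetical stand-in for the real dataset.
data = pd.DataFrame(
    {"comments": ["The cat sat on the mat.", "Dogs bark loudly at night."]}
)

def prints_only(text):
    # The value is shown on screen, but the function returns nothing,
    # so Python hands back None for every call.
    print(len(text))

def returns_value(text):
    # Returning the value lets apply() collect it into the result.
    return len(text)

no_return = data["comments"].apply(prints_only)      # Series of None
with_return = data["comments"].apply(returns_value)  # Series of numbers

print(no_return)
print(with_return)
```

If that assumption holds, the same would apply to `text_proces`: `apply` builds its result from each call's return value, and a function without a `return` statement yields None for every row.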

'''

   score1 = 89.3

   score = 35.4

'''

How can I get a readability score, such as the SMOG index or Flesch reading ease, for a whole column in a DataFrame?

0 answers: