Question

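I have the following function for tagging different phrases in a sentence, for example:

due to [word],
with [word],
without [word],
[word] followed by a number
a number followed by a [word]

I get an error message when I run the code below:
sentence = list[w1, w2, ...] for word in w ^ SyntaxError: invalid syntax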

    def features(sentence, index):
        # The next two lines are the ones that fail: they are not valid Python.
        sentence=list[w1, w2, ...]
        index= w[index] for i in w
        return {
            'word': sentence[index],
            # Each test below checks for a substring inside a single word.
            'prefix-1': "due to/due" in sentence[index],
            'prefix-2': "other" in sentence[index],
            'prefix-3': "with" in sentence[index],
            'prefix-4': "without" in sentence[index],
            'is_numeric': sentence[index].isdigit(),
            'prev_word': '' if index == 0 else sentence[index - 1],
            'next_word': '' if index == len(sentence) - 1 else sentence[index + 1],
        }

Answer 1

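Sorry, I can't comment yet.

This sounds a lot like homework :p I'm not sure I understand your problem/question, but I did notice one thing. Could a result containing 'word': 'c' be the unexpected output you are seeing? The docstring of features reads """ sentence: [w1, w2, ...], index: the index of the word """, so you need to call it with a list. If you pass a plain string instead, sentence[index] returns a single character rather than a word.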
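To make the difference concrete, here is a minimal sketch. It assumes a trimmed-down version of features that keeps only two of the keys and drops the two invalid lines; it is not the asker's full function.

```python
def features(sentence, index):
    """sentence: [w1, w2, ...], index: the index of the word"""
    # Trimmed-down illustration: only two of the original keys are kept.
    return {
        'word': sentence[index],
        'is_numeric': sentence[index].isdigit(),
    }

text = 'cholera due virus A 890'

# Passing the raw string: sentence[0] is the first character.
from_string = features(text, 0)

# Passing a list of words: sentence[0] is the first word.
from_list = features(text.split(), 0)

print(from_string)  # {'word': 'c', 'is_numeric': False}
print(from_list)    # {'word': 'cholera', 'is_numeric': False}
```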

Try: features(['cholera', 'due', 'virus', 'A', '890'], 0)

Or: features('cholera due virus A 890'.split(), 0). You can use the split method to break a string into a list, using whitespace as the delimiter.

l = 'cholera due virus A 890'.split()
l == ['cholera', 'due', 'virus', 'A', '890'] # is True
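One more thing that may matter once the function runs: after splitting, each list element is a single word, so a test such as "due to/due" in sentence[index] cannot match a two-word phrase. A rough sketch of one way to cover the patterns from the question is to look at the neighbouring tokens instead. The helper name context_features and its keys are hypothetical, made up for this illustration.

```python
def context_features(sentence, index):
    """Hypothetical helper: flags patterns that span neighbouring tokens.

    sentence: [w1, w2, ...], index: the index of the word
    """
    word = sentence[index]
    prev_word = sentence[index - 1] if index > 0 else ''
    prev_prev = sentence[index - 2] if index > 1 else ''
    next_word = sentence[index + 1] if index < len(sentence) - 1 else ''
    return {
        'word': word,
        # "due to [word]": the two tokens before this one are "due", "to"
        'after_due_to': prev_prev.lower() == 'due' and prev_word.lower() == 'to',
        # "with [word]" / "without [word]"
        'after_with': prev_word.lower() == 'with',
        'after_without': prev_word.lower() == 'without',
        # "[word] followed by a number" / "a number followed by a [word]"
        'followed_by_number': next_word.isdigit(),
        'preceded_by_number': prev_word.isdigit(),
    }

tokens = 'cholera due to virus A 890'.split()
virus = context_features(tokens, 3)   # the word after "due to"
letter = context_features(tokens, 4)  # "A", which is followed by "890"
print(virus['after_due_to'], letter['followed_by_number'])
```

Comparing whole tokens also avoids the substring trap where "with" in "without" is True.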
