NLP POS challenge

Time: 2016-06-09 13:41:41

Tags: python python-3.x nlp nltk

The original problem can be found on HackerRank.

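My solution is incomplete, though. Can someone help me understand where I went wrong? (In the second function, the tagger returns a 2-letter tag even though the problem asks for a 3-letter tag.) Thanks!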

import re
import nltk

final_tagged = ""
strs = input()  # sentence containing /?? and /??? placeholders

def tokenize_two(i):
    """Remove /?? from the token and POS-tag the bare word."""
    temp = i
    global strs
    for ch in ['/??']:
        if ch in i:
            i = i.replace(ch, "")
    # pos tagging
    tag = nltk.pos_tag([i])
    for item in tag:
        for ch in ['??']:
            if ch in temp:
                temp = temp.replace(ch, item[1])
    replace = i + "/??"
    strs = strs.replace(replace, temp)
    return temp

def tokenize_three(i):
    """Remove /??? from the token and POS-tag the bare word."""
    temp = i
    global strs
    for ch in ['/???']:
        if ch in i:
            i = i.replace(ch, "")
    # pos tagging
    tag = nltk.pos_tag([i])
    for item in tag:
        for ch in ['???']:
            if ch in temp:
                temp = temp.replace(ch, item[1])
    replace = i + "/???"
    strs = strs.replace(replace, temp)
    return temp

a = re.split(r'\s+', strs)
for i in a:
    if i.endswith("/??"):
        tagged = tokenize_two(i)
    if i.endswith("/???"):
        final_tagged = tokenize_three(i)
print(strs)

1 answer:

Answer 0 (score: 1)

POS tagging is context-dependent. You need to pass the whole tokenized sentence as the argument to pos_tag, rather than calling pos_tag once for each unknown word.
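A minimal sketch of that idea, assuming the input is a single line of whitespace-separated `word/TAG` tokens in which the unknown tags appear as `/??` or `/???`. The names `fill_unknown_tags` and `nltk_tagger` are made up for illustration and are not part of NLTK; the only NLTK call used is `nltk.pos_tag`, which takes a list of tokens and returns `(word, tag)` pairs. The tagger is passed in as a parameter so the sentence is tagged in one call and the logic can be tried without NLTK installed.

```python
def fill_unknown_tags(sentence, tagger):
    """Replace /?? and /??? placeholders with tags from `tagger`.

    `tagger` is any callable that takes a list of words and returns a
    list of (word, tag) pairs. It is called once, on the whole sentence,
    so it can use the surrounding words as context.
    """
    tokens = sentence.split()
    # Strip the trailing "/TAG" part to recover the bare words.
    words = [tok.rsplit("/", 1)[0] for tok in tokens]
    tagged = tagger(words)  # single call with full sentence context

    out = []
    for tok, (word, tag) in zip(tokens, tagged):
        if tok.endswith("/??") or tok.endswith("/???"):
            out.append(word + "/" + tag)  # fill in the unknown tag
        else:
            out.append(tok)  # keep tags that were already given
    return " ".join(out)


def nltk_tagger(words):
    """Hypothetical wrapper around nltk.pos_tag (needs NLTK and its tagger model)."""
    import nltk  # imported lazily so the function above works without NLTK
    return nltk.pos_tag(words)
```

With NLTK and its tagger data installed, this would be used as `print(fill_unknown_tags(input(), nltk_tagger))`. Whether the resulting tags match the ones the challenge expects depends on the tagset and model, so treat the output as a starting point.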