Removing plurals using the Stanford POS tagger

Time: 2013-08-05 08:09:35

Tags: stanford-nlp pos-tagger

I am trying to use the Stanford tagger to replace plurals with singulars (e.g. from "girls" to "girl").

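// Requires: import java.util.regex.Pattern;

// Penn Treebank tags: VBN = past participle, VBD = past-tense verb, JJ = adjective.
private static final String vbnTag = "VBN";
private static final String vbdTag = "VBD";
private static final String jjTag = "JJ";
// Verb suffixes.
private static final String edSuff = "ed";
private static final String enSuff = "en";
private static final String oneSt = "1";
private static final String naWord = "NA";

// Matches coordinating conjunctions and separator punctuation (case-insensitive).
private static final Pattern stopper = Pattern.compile("(?i:and|or|but|,|;|-|--)");
// Matches forms of have/be/get, including clitics such as 've and 's (case-insensitive).
private static final Pattern vbnWord = Pattern.compile("(?i:have|has|having|had|is|am|are|was|were|be|being|been|'ve|'s|s|'d|'re|'m|gotten|got|gets|get|getting)"); // cf. list in EnglishPTBTreebankCorrector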

Am I doing this right?

1 answer:

Answer 0: (score: 0)

I think you can do this with the help of the lemmatization annotation provided in Stanford CoreNLP.
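To illustrate the idea, here is a minimal sketch of the replacement step: every token tagged as a plural noun (Penn Treebank tags NNS and NNPS) is swapped for its lemma, and everything else is left alone. The class `PluralToSingular`, the type `TaggedToken`, and the `lemmatizer` parameter are hypothetical names made up for this example and are not part of CoreNLP. In a real setup the tag and lemma would presumably come from a CoreNLP pipeline run with the `tokenize, ssplit, pos, lemma` annotators (read per token via `PartOfSpeechAnnotation` and `LemmaAnnotation`); the lemmatizer is kept abstract here so the sketch runs without the library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical helper for illustration only; not part of Stanford CoreNLP.
class PluralToSingular {

    /** A word paired with the Penn Treebank POS tag a tagger assigned to it. */
    static final class TaggedToken {
        final String word;
        final String tag;

        TaggedToken(String word, String tag) {
            this.word = word;
            this.tag = tag;
        }
    }

    /** True for the Penn Treebank plural noun tags: NNS (common) and NNPS (proper). */
    static boolean isPluralNoun(String tag) {
        return "NNS".equals(tag) || "NNPS".equals(tag);
    }

    /**
     * Returns the words of the sentence with each plural noun replaced by
     * whatever the supplied lemmatizer returns for it. Tokens with any other
     * tag are copied unchanged, so verbs such as "runs" (VBZ) are not touched.
     *
     * Assumption: the lemmatizer stands in for a real one, e.g. a lookup of
     * the lemma that CoreNLP attached to the token.
     */
    static List<String> singularize(List<TaggedToken> tokens,
                                    UnaryOperator<String> lemmatizer) {
        List<String> result = new ArrayList<>();
        for (TaggedToken token : tokens) {
            if (isPluralNoun(token.tag)) {
                result.add(lemmatizer.apply(token.word));
            } else {
                result.add(token.word);
            }
        }
        return result;
    }
}
```

Filtering on the tag before lemmatizing matters because a lemmatizer applied to every token would also reduce verbs and other inflected words to their base forms, which is more than the question asks for.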