Extracting pronouns from text in R

Time: 2016-07-18 12:53:27

Tags: r tm

sample_text <- ' Ramesh is my frien. He is a very good man' 

Now I need to extract all the pronouns (PRP, PRP$) from the text.

acqTag <- tagPOS(sample_text)

I get the following:

$POStagged 
 [1] "Ramesh/NNP is/VBZ my/PRP$ frien/NN ./. He/PRP is/VBZ a/DT very/RB good/JJ man/NN"
$POStags
 [1] "NNP"  "VBZ"  "PRP$" "NN"   "."    "PRP"  "VBZ"  "DT"   "RB"   "JJ"   "NN"  

How do I now get the pronouns (tagged PRP or PRP$) out of this?

1 answer:

Answer 0 (score: 1)

What do you want as output? This seems to give what I think you want:

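library("stringr")

# Match every word that is followed by a /PRP or /PRP$ tag
prp <- str_extract_all(acqTag$POStagged, "\\w+/PRP\\$?")
# Strip the tag so that only the pronouns remain
str_replace(unlist(prp), "/PRP\\$?", "")
#[1] "my" "He"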
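
If you also want to keep track of which tag each pronoun carried, a base-R variant might look like the sketch below. It assumes the tagged string has exactly the word/TAG layout shown in the question and hard-codes that string instead of calling tagPOS. The variable names (tagged, tokens, pronouns) are only illustrative.

```r
# Tagged string copied from the question's output (acqTag$POStagged).
tagged <- "Ramesh/NNP is/VBZ my/PRP$ frien/NN ./. He/PRP is/VBZ a/DT very/RB good/JJ man/NN"

# Split into word/TAG tokens on single spaces.
tokens <- strsplit(tagged, " ", fixed = TRUE)[[1]]

# Cut each token at its last "/" so a word that itself contains "/" still works.
words <- sub("/[^/]*$", "", tokens)
tags  <- sub("^.*/", "", tokens)

# Keep only personal (PRP) and possessive (PRP$) pronouns.
is_pronoun <- tags %in% c("PRP", "PRP$")
pronouns <- data.frame(word = words[is_pronoun],
                       tag  = tags[is_pronoun],
                       stringsAsFactors = FALSE)
print(pronouns)
```

Comparing the tags with %in% instead of a regular expression avoids having to escape the $ in PRP$, and the result keeps each word next to its tag in case you need to separate the two pronoun types later.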