abstract="Thyroid-associated orbitopathy (TO) is an autoimmune-mediated orbital inflammation that can lead to disfigurement and blindness. Multiple genetic loci have been associated with Graves' disease, but the genetic basis for TO is largely unknown. This study aimed to identify loci associated with TO in individuals with Graves' disease, using a genome-wide association scan (GWAS) for the first time to our knowledge in TO.Genome-wide association scan was performed on pooled DNA from an Australian Caucasian discovery cohort of 265 participants with Graves' disease and TO (cases) and 147 patients with Graves' disease without TO (controls). "
import nltk

sent = nltk.tokenize.wordpunct_tokenize(abstract)
pos_tag = nltk.pos_tag(sent)
nes = nltk.ne_chunk(pos_tag)

# Collect every chunk that NLTK labels as a geopolitical entity (GPE).
places = []
for ne in nes:
    if type(ne) is nltk.tree.Tree:
        if ne.label() == 'GPE':
            places.append(' '.join(i[0] for i in ne.leaves()))
if len(places) == 0:
    print('no GPE entities found')
else:
    print(places)
# Output: ['Thyroid', 'Australian', 'Caucasian', 'Graves']
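Answer 0 (score: 5)
Following the helpful comments, I looked into different NER tools to find the one that best recognizes mentions of nationalities and countries, and found that spaCy has a NORP entity type that extracts nationalities effectively. https://spacy.io/docs/usage/entity-recognition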
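A minimal sketch of the NORP approach described above, assuming spaCy and the `en_core_web_sm` model are installed. The helper names `norp_entities` and `extract_nationalities` are made up for illustration. Note that NORP covers nationalities as well as religious and political groups, so the result may include more than nationalities.

```python
def norp_entities(doc):
    """Return the text of each entity that spaCy labels NORP."""
    return [ent.text for ent in doc.ents if ent.label_ == "NORP"]


def extract_nationalities(text, model="en_core_web_sm"):
    """Run a spaCy pipeline over text and keep only its NORP entities."""
    # Imported here so norp_entities() can be used without spaCy installed.
    import spacy

    nlp = spacy.load(model)
    return norp_entities(nlp(text))
```

Calling `extract_nationalities(abstract)` on the abstract from the question returns whatever spans the chosen model tags as NORP; the exact result depends on the model version.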
Answer 1 (score: 2)
Check out the Stanford NER tagger!
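# Newer NLTK releases expose this class as StanfordNERTagger (formerly NERTagger).
from nltk.tag import StanfordNERTagger

# Arguments: path to the trained model, then path to the Stanford NER jar.
st = StanfordNERTagger('../ner-model.ser.gz', '../stanford-ner.jar')
# text is the string to tag, e.g. the abstract from the question.
tagging = st.tag(text.split())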
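The tagger returns a list of `(token, tag)` pairs, so multi-word names arrive as separate tokens. A rough sketch of merging them into whole entities follows. The helper `group_entities` and the sample data are hypothetical, and the `LOCATION` tag assumes a model that uses Stanford's usual PERSON/ORGANIZATION/LOCATION label set.

```python
from itertools import groupby


def group_entities(tagged, wanted=("LOCATION",)):
    """Merge runs of consecutive tokens that share a wanted tag."""
    entities = []
    for tag, run in groupby(tagged, key=lambda pair: pair[1]):
        if tag in wanted:
            entities.append((" ".join(tok for tok, _ in run), tag))
    return entities


# Hand-written sample in the (token, tag) shape; not real tagger output.
sample = [
    ("Offices", "O"), ("in", "O"),
    ("San", "LOCATION"), ("Francisco", "LOCATION"),
    ("and", "O"), ("Sydney", "LOCATION"),
]
print(group_entities(sample))
# [('San Francisco', 'LOCATION'), ('Sydney', 'LOCATION')]
```

One limitation of this grouping: two distinct entities with the same tag that sit directly next to each other are merged into one.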
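Answer 2 (score: 1)
Here is geograpy, which uses NLTK to perform the entity extraction. It stores all places and locations in a gazetteer, then looks the extracted entities up in that gazetteer to find the relevant places and locations. See the documentation for more usage details: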
from geograpy import extraction

text = (
    "Thyroid-associated orbitopathy (TO) is an autoimmune-mediated orbital "
    "inflammation that can lead to disfigurement and blindness. Multiple genetic "
    "loci have been associated with Graves' disease, but the genetic basis for TO "
    "is largely unknown. This study aimed to identify loci associated with TO in "
    "individuals with Graves' disease, using a genome-wide association scan "
    "(GWAS) for the first time to our knowledge in TO.Genome-wide association "
    "scan was performed on pooled DNA from an Australian Caucasian discovery "
    "cohort of 265 participants with Graves' disease and TO (cases) and 147 "
    "patients with Graves' disease without TO (controls)."
)
e = extraction.Extractor(text=text)
e.find_entities()  # fills e.places, which is a list attribute, not a method
print(e.places)
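The gazetteer lookup this answer describes can be illustrated with a toy example. This is only a sketch of the idea, not geograpy's actual implementation: the `GAZETTEER` dictionary and the `lookup_places` helper are invented here, and a real gazetteer would be far larger.

```python
# Hypothetical, hand-made gazetteer: lowercase surface form -> country.
GAZETTEER = {
    "australia": "Australia",
    "australian": "Australia",
    "san francisco": "United States",
    "california": "United States",
}


def lookup_places(candidates, gazetteer=GAZETTEER):
    """Keep only the candidates found in the gazetteer, mapped to a country."""
    found = {}
    for name in candidates:
        country = gazetteer.get(name.lower())
        if country is not None:
            found[name] = country
    return found


# Candidate entities as reported by the NLTK code in the question.
candidates = ['Thyroid', 'Australian', 'Caucasian', 'Graves']
print(lookup_places(candidates))
# {'Australian': 'Australia'}
```

Filtering against a known list is what discards false positives such as 'Thyroid' and 'Graves', which the chunker alone mislabels.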
Answer 3 (score: 0)
import spacy
nlp = spacy.load('en_core_web_sm')
doc = nlp(u"Apple is opening its first big office in San Francisco and California.")
print([(ent.text, ent.label_) for ent in doc.ents])