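I have a program that finds abbreviations (that is, text in parentheses) and then, based on the number of characters in the abbreviation, goes back that many words to recover the definition. So far it works when the preceding words all start with a capital letter, or when most of them do; in the latter case it skips lowercase words such as "in" and moves on to the next word. My problem is the case where the corresponding words are all lowercase.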
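Current output:
All Awesome Dudes (AAD)
Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials (IMMPACT)
Trials (IMMPACT). Some patient perfer the usual care (UC)
Desired output:
All Awesome Dudes (AAD)
Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials (IMMPACT)
usual care (UC)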
import re

s = """Too many people, but not All Awesome Dudes (AAD) only care about the
Initiative on Methods, Measurement, and Pain Assessment in Clinical
Trials (IMMPACT). Some patient perfer the usual care (UC) approach of
doing nothing"""
allabbre = []

for match in re.finditer(r"\((.*?)\)", s):
    start_index = match.start()
    abbr = match.group(1)
    size = len(abbr)
    words = s[:start_index].split()
    count = 0
    for k, i in enumerate(words[::-1]):
        if i[0].isupper():
            count += 1
        if count == size:
            break
    words = words[-k-1:]
    definition = " ".join(words)
    abbr_keywords = definition + " " + "(" + abbr + ")"
    pattern = '[A-Z]'
    if re.search(pattern, abbr):
        if abbr_keywords not in allabbre:
            allabbre.append(abbr_keywords)
        print(abbr_keywords)
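Answer 0 (score: 1)
The flag handles cases such as All are Awesome Dudes (AAD). When the word immediately before the parenthesis is capitalized, first letters are compared case-sensitively, so a lowercase filler word like "are" is skipped. Otherwise each first letter is uppercased before the comparison, which lets an all-lowercase definition such as "usual care" match UC.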
import re

s = """Too many people, but not All Awesome Dudes (AAD) only care about the
Initiative on Methods, Measurement, and Pain Assessment in Clinical
Trials (IMMPACT). Some patient perfer the usual care (UC) approach of
doing nothing
"""
allabbre = []

for match in re.finditer(r"\((.*?)\)", s):
    start_index = match.start()
    abbr = match.group(1)
    size = len(abbr)
    words = s[:start_index].split()
    # Index of the next abbreviation letter to match, walking right to left.
    count = size - 1
    # True when the word right before "(" is capitalized.
    flag = words[-1][0].isupper()
    for k, i in enumerate(words[::-1]):
        # Compare as-is for capitalized definitions, uppercased otherwise.
        first_letter = i[0] if flag else i[0].upper()
        if first_letter == abbr[count]:
            count -= 1
        if count == -1:
            break
    # Keep the words from the one that matched the first letter onward.
    words = words[-k-1:]
    definition = " ".join(words)
    abbr_keywords = definition + " " + "(" + abbr + ")"
    pattern = '[A-Z]'
    if re.search(pattern, abbr):
        if abbr_keywords not in allabbre:
            allabbre.append(abbr_keywords)
        print(abbr_keywords)
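If you want to reuse this, the same backward letter matching can be wrapped in a function. The following is only a sketch: the name `find_abbreviation_definitions` is made up for illustration, and the two guards (skipping a parenthesis with no words before it, and dropping an abbreviation whose letters cannot all be matched) are my own assumptions rather than part of the code above, which would raise an `IndexError` in the first case and return the whole preceding text in the second.

```python
import re


def find_abbreviation_definitions(text):
    """Return 'definition (ABBR)' strings for parenthesized abbreviations.

    Hypothetical helper: same right-to-left letter matching as above,
    with extra guards that are assumptions, not part of the original.
    """
    results = []
    for match in re.finditer(r"\((.*?)\)", text):
        abbr = match.group(1)
        words = text[:match.start()].split()
        # Skip parentheses without an uppercase letter or without preceding words.
        if not words or not re.search(r"[A-Z]", abbr):
            continue
        # Case-sensitive only when the word right before "(" is capitalized.
        case_sensitive = words[-1][0].isupper()
        remaining = len(abbr) - 1  # index of the next letter to match
        start = None
        for offset, word in enumerate(reversed(words)):
            first = word[0] if case_sensitive else word[0].upper()
            if first == abbr[remaining]:
                remaining -= 1
            if remaining < 0:
                start = len(words) - offset - 1
                break
        if start is None:
            # Assumption: drop abbreviations that cannot be fully matched.
            continue
        entry = " ".join(words[start:]) + " (" + abbr + ")"
        if entry not in results:
            results.append(entry)
    return results


if __name__ == "__main__":
    sample = "Some patients prefer the usual care (UC) approach"
    for line in find_abbreviation_definitions(sample):
        print(line)
```

Returning a list instead of printing inside the loop makes it easier to check the result against the desired output from the question.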