Question

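I have text files in the following format, and I need to extract every Range of Motion and Location value. In some files the value is given on the next line, and in some files it is not given at all.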

File1.txt:

Functional Assessment: Patient currently displays the following functional 
limitations and would benefit from treatment to maximize functional use and 
pain reduction: Range of Motion: limited . ADLs: limited . Gait: limited . 
Stairs: limited . Squatting: limited . Work participation status: limited . 
Current Status: The patient's current status is improving. 

Location: Right side

Expected output: limited | Right side

File2.txt:

Functional Assessment: Patient currently displays the following functional 
limitations and would benefit from treatment to maximize functional use and 
pain reduction: 
Range of Motion: 
painful 
and
limited

Strength: 
limited

Expected output: painful and limited | not given

This is the code I am trying:

if "Functional Assessment:" in line:
    result=str(line.rsplit('Functional Assessment:'))
    romvalue = result.rsplit('Range of Motion:')[-1].split()[0]
    outputfile.write(romvalue)
    partofbody = result.rsplit('Location:')[-1].split()[0]
    outputfile.write(partofbody)

This code does not produce the desired output. Can anyone help?

Answer 1

You can collect every line from the one that starts with Functional Assessment: onward, join them into a single string, and apply the following regex:

(?sm)\b(Location|Range of Motion):\s*([^\W_].*?)\s*(?=(?:\.\s*)?[^\W\d_]+:|\Z)

See the regex demo.

Details

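(?sm) - the re.S and re.M modifiers
\b - a word boundary
(Location|Range of Motion) - Group 1: Location or Range of Motion
:\s* - a colon and 0+ whitespace characters
([^\W_].*?) - Group 2: a letter or digit, then any 0+ characters, as few as possible
\s* - 0+ whitespace characters
(?=(?:\.\s*)?[^\W\d_]+:|\Z) - a positive lookahead that requires one of the following immediately to the right of the current position:
- (?:\.\s*)? - an optional sequence of . and 0+ whitespace characters
- [^\W\d_]+: - 1+ letters followed by :
- | - or
- \Z - the end of the string.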
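To make the two capture groups concrete, here is a minimal sketch that runs the same pattern over a short string. The sample text is made up for illustration (it only imitates the shape of the files in the question), and the names PATTERN, sample and pairs are hypothetical.

```python
import re

# Same pattern as above, compiled with re.S (dot matches newlines) and re.M.
PATTERN = re.compile(
    r'\b(Location|Range of Motion):\s*([^\W_].*?)\s*(?=(?:\.\s*)?[^\W\d_]+:|\Z)',
    re.S | re.M,
)

# Hypothetical sample text, shaped like the question's files.
sample = "Range of Motion: limited . Gait: limited . \n\nLocation: Right side\n"

# Group 1 is the label, Group 2 is the value that follows it.
pairs = [(m.group(1), m.group(2)) for m in PATTERN.finditer(sample)]
for label, value in pairs:
    print(label, "->", value)
```

The lookahead is what stops the lazy Group 2 early: the value ends just before an optional dot and the next "Label:" token, or at the end of the string.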

Here is the Python demo:

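import re

# `files` is assumed to be a list of strings, each holding the full text of one file.
reg = re.compile(r'\b(Location|Range of Motion):\s*([^\W_].*?)\s*(?=(?:\.\s*)?[^\W\d_]+:|\Z)', re.S | re.M)
for file in files:
    flag = False  # becomes True once the "Functional Assessment:" line is seen
    tmp = ""
    for line in file.splitlines():
        if line.startswith("Functional Assessment:"):
            tmp = tmp + line + "\n"
            flag = not flag
        elif flag:
            tmp = tmp + line + "\n"
    # findall returns (label, value) tuples, which dict() accepts directly
    print(dict(reg.findall(tmp)))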

Output (for the two texts you posted):

{'Location': 'Right side', 'Range of Motion': 'limited'}
{'Range of Motion': 'painful \nand\nlimited'}
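The dictionaries above still contain embedded newlines and simply omit a missing key, while the question asks for a single "value | value" line. One possible way to get there is sketched below. The helper name summarize, its missing parameter and the shortened sample strings are assumptions made for illustration, and the function expects text that has already been collected from the Functional Assessment: line onward.

```python
import re

PATTERN = re.compile(
    r'\b(Location|Range of Motion):\s*([^\W_].*?)\s*(?=(?:\.\s*)?[^\W\d_]+:|\Z)',
    re.S | re.M,
)

def summarize(text, missing="not given"):
    """Return 'range of motion | location' for one file's text (hypothetical helper)."""
    found = dict(PATTERN.findall(text))

    def clean(key):
        value = found.get(key)
        # Collapse runs of whitespace, including newlines, into single spaces.
        return " ".join(value.split()) if value else missing

    return f"{clean('Range of Motion')} | {clean('Location')}"

# Shortened stand-ins for the two files in the question.
file1 = "pain reduction: Range of Motion: limited . ADLs: limited . \n\nLocation: Right side\n"
file2 = "Range of Motion: \npainful \nand\nlimited\n\nStrength: \nlimited\n"

print(summarize(file1))
print(summarize(file2))
```

Using dict.get with a fallback keeps the "not given" case in one place, so files without a Location line need no special handling.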

Extract certain values from a text file in Python