How do I search a long string for substrings and build a list from them in Python?

Asked: 2018-03-20 10:48:42

Tags: python string substring

I have a very long string:

query = """PREFIX pht: <http://datalab.rwth-aachen.de/vocab/pht/>
         PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

         SELECT ?Age, ?SexTypes, ?Chest_Pain_Type, ?trestbpsD, ?cholD,
                    ?Fasting_Glucose_Level, ?Resting_ECG_Type, ?thalachD,
                    ?Exercise_Induced_Angina, ?oldpeakD, ?caD, ?Slope, ?Thallium_Scintigraphy, ?Diagnosis
                      WHERE {?URI a sct:125676002. }"""

Now I need to build a list of all the substrings that start with '?'. The list should look like this:

schema = ['Age', 'Sex', 'Chest_Pain_Type', 'Trestbps', 'Chol', 'Fasting_Glucose_Level', 'Resting_ECG_Type', 'ThalachD', 
             'Exercise_Induced_Angina', 'OldpeakD', 'CaD', 'Slope', 'Thallium_Scintigraphy', 'Diagnosis']

I tried str.startswith(str, beg=0,end=len(string))

but it did not behave the way I expected. How can I do this in Python?

1 Answer:

Answer 0: (score: 5)

Use a regular expression:

import re
query = """PREFIX pht: <http://datalab.rwth-aachen.de/vocab/pht/>
         PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 

         SELECT ?Age, ?SexTypes, ?Chest_Pain_Type, ?trestbpsD, ?cholD, 
                    ?Fasting_Glucose_Level, ?Resting_ECG_Type, ?thalachD, 
                    ?Exercise_Induced_Angina, ?oldpeakD, ?caD, ?Slope, ?Thallium_Scintigraphy, ?Diagnosis
                      WHERE {?URI a sct:125676002. }"""

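# re.findall returns every "?name" token; the raw string keeps "\?" and "\w"
# from being treated as string escapes. Strip the leading "?" from each match.
print([i.replace("?", "") for i in re.findall(r"\?\w+", query)])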

Output:

['Age', 'SexTypes', 'Chest_Pain_Type', 'trestbpsD', 'cholD', 'Fasting_Glucose_Level', 'Resting_ECG_Type', 'thalachD', 'Exercise_Induced_Angina', 'oldpeakD', 'caD', 'Slope', 'Thallium_Scintigraphy', 'Diagnosis', 'URI']
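
The output above also contains 'URI', which comes from the WHERE clause and is not in the list the question asks for. A minimal sketch of one way to avoid that, assuming the variables of interest always sit between SELECT and WHERE: isolate that clause first, then use a capture group so the leading '?' never needs to be stripped. The helper name select_variables and the shortened query are hypothetical, chosen only for illustration.

```python
import re

# Shortened version of the query from the question, for illustration only.
query = """PREFIX pht: <http://datalab.rwth-aachen.de/vocab/pht/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT ?Age, ?SexTypes, ?Chest_Pain_Type
WHERE {?URI a sct:125676002. }"""


def select_variables(sparql):
    """Return the variable names listed between SELECT and WHERE.

    Hypothetical helper; assumes a single SELECT ... WHERE pair.
    """
    # Isolate the projection clause; DOTALL lets "." span line breaks.
    clause = re.search(r"SELECT(.*?)WHERE", sparql, flags=re.DOTALL | re.IGNORECASE)
    if clause is None:
        return []
    # With one capture group, findall returns only the group,
    # i.e. the name without the leading "?".
    return re.findall(r"\?(\w+)", clause.group(1))


schema = select_variables(query)
print(schema)  # ['Age', 'SexTypes', 'Chest_Pain_Type']
```

Note that the names come back exactly as they are written in the query ('SexTypes', not 'Sex'), so any renaming to match the desired schema would still have to be done separately.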