I have a very long string:
query = """PREFIX pht: <http://datalab.rwth-aachen.de/vocab/pht/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?Age, ?SexTypes, ?Chest_Pain_Type, ?trestbpsD, ?cholD,
?Fasting_Glucose_Level, ?Resting_ECG_Type, ?thalachD,
?Exercise_Induced_Angina, ?oldpeakD, ?caD, ?Slope, ?Thallium_Scintigraphy, ?Diagnosis
WHERE {?URI a sct:125676002. }"""
Now I need to create a list of all the substrings that start with '?'. The list should look like this:
schema = ['Age', 'Sex', 'Chest_Pain_Type', 'Trestbps', 'Chol', 'Fasting_Glucose_Level', 'Resting_ECG_Type', 'ThalachD',
'Exercise_Induced_Angina', 'OldpeakD', 'CaD', 'Slope', 'Thallium_Scintigraphy', 'Diagnosis']
I tried str.startswith(str, beg=0,end=len(string))
but it did not work the way I expected. How can I do this in Python?
Answer 0 (score: 5)
Use a regular expression:
import re
query = """PREFIX pht: <http://datalab.rwth-aachen.de/vocab/pht/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?Age, ?SexTypes, ?Chest_Pain_Type, ?trestbpsD, ?cholD,
?Fasting_Glucose_Level, ?Resting_ECG_Type, ?thalachD,
?Exercise_Induced_Angina, ?oldpeakD, ?caD, ?Slope, ?Thallium_Scintigraphy, ?Diagnosis
WHERE {?URI a sct:125676002. }"""
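# Raw strings keep the backslashes in the pattern from being read as string escapes.
# print(re.findall(r"\?\w+", query))  # matches still carry the leading '?'
print([i.replace("?", "") for i in re.findall(r"\?\w+", query)])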
Output:
['Age', 'SexTypes', 'Chest_Pain_Type', 'trestbpsD', 'cholD', 'Fasting_Glucose_Level', 'Resting_ECG_Type', 'thalachD', 'Exercise_Induced_Angina', 'oldpeakD', 'caD', 'Slope', 'Thallium_Scintigraphy', 'Diagnosis', 'URI']
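Note that the output above also contains 'URI', because ?URI in the WHERE clause matches the pattern too. If only the selected variables are wanted, one possible variation is sketched below. It rests on an assumption that is not part of the original answer: the variables of interest all appear before the first WHERE keyword. The names all_vars, select_part and schema are chosen here for illustration. A capture group is used so that re.findall returns the names without the '?' and no replace step is needed.

```python
import re

query = """PREFIX pht: <http://datalab.rwth-aachen.de/vocab/pht/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?Age, ?SexTypes, ?Chest_Pain_Type, ?trestbpsD, ?cholD,
?Fasting_Glucose_Level, ?Resting_ECG_Type, ?thalachD,
?Exercise_Induced_Angina, ?oldpeakD, ?caD, ?Slope, ?Thallium_Scintigraphy, ?Diagnosis
WHERE {?URI a sct:125676002. }"""

# With a capture group, re.findall returns only the captured part,
# i.e. the variable name without the leading '?'.
all_vars = re.findall(r"\?(\w+)", query)

# Assumption: everything of interest sits before the first WHERE keyword,
# so split once there and search only the first part.
select_part = re.split(r"\bWHERE\b", query, maxsplit=1, flags=re.IGNORECASE)[0]
schema = re.findall(r"\?(\w+)", select_part)

print(schema)
```

As for the original attempt: str.startswith only checks whether the whole string begins with a given prefix and returns True or False, so it cannot collect substrings on its own.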