I have a dataset with a string column 'Useful_crit', stored with dtype "object".
Pat_ID  Useful_crit
1       **inclusive**:age 35 to 75 - type 2 diabetes **exclusive**: type 1 diabetes
2       **inclusive**:patients aged 21 and above **exclusive**:patients who are mentally
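Every string in this column contains two recurring markers: inclusive and exclusive. I want to create two new columns, 'inclusive' and 'exclusive', from that string, so that the output looks like this:
Pat_ID  inclusive                       exclusive
1       age 35 to 75 - type 2 diabetes  type 1 diabetes
2       patients aged 21 and above      patients who are mentally
How can I do this in Python?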
Answer 0 (score: 0)
Here is one way:
In [2519]: (df.Useful_crit.str.split(r'(\**inclusive\**:|\**exclusive\**:)')
              .apply(pd.Series)[[2, 4]])
Out[2519]:
                                 2                          4
0  age 35 to 75 - type 2 diabetes             type 1 diabetes
1      patients aged 21 and above   patients who are mentally
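The column labels 2 and 4 follow from how splitting on a capturing group behaves: the matched delimiters are kept in the result, so the pieces alternate between text and delimiter. A minimal sketch with the standard-library `re` module on the first sample row may make this clearer (the variable names `pattern`, `row` and `parts` are only illustrative):

```python
import re

# Same pattern as in the answer; the parentheses form a capturing group,
# so the delimiters themselves are kept in the split result.
pattern = r'(\**inclusive\**:|\**exclusive\**:)'
row = '**inclusive**:age 35 to 75 - type 2 diabetes **exclusive**: type 1 diabetes'

parts = re.split(pattern, row)

# Show each piece with its position; positions 2 and 4 hold the wanted text.
for i, part in enumerate(parts):
    print(i, repr(part))
```

Position 0 is the empty text before the first marker, positions 1 and 3 are the markers themselves, and positions 2 and 4 are the text that follows each marker. The pieces keep their surrounding whitespace.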
In [2520]: df.join(df.Useful_crit.str.split(r'(\**inclusive\**:|\**exclusive\**:)')
                     .apply(pd.Series)[[2, 4]]
                     .rename(columns={2: 'inclusive', 4: 'exclusive'}))
Out[2520]:
   Pat_ID                                        Useful_crit  \
0       1  **inclusive**:age 35 to 75 - type 2 diabetes *...
1       2  **inclusive**:patients aged 21 and above **exc...

                        inclusive                  exclusive
0  age 35 to 75 - type 2 diabetes            type 1 diabetes
1      patients aged 21 and above  patients who are mentally
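As an alternative to splitting, the same result can probably be obtained with `Series.str.extract` and named groups, which names the new columns directly. This is only a sketch under stated assumptions: the sample frame is rebuilt from the two rows shown in the question, every row is assumed to contain both markers with the inclusive one first, and the pieces are assumed to need whitespace stripping.

```python
import pandas as pd

# Sample data rebuilt from the two rows in the question (assumption: the
# real column follows the same '**inclusive**: ... **exclusive**: ...' layout).
df = pd.DataFrame({
    'Pat_ID': [1, 2],
    'Useful_crit': [
        '**inclusive**:age 35 to 75 - type 2 diabetes **exclusive**: type 1 diabetes',
        '**inclusive**:patients aged 21 and above **exclusive**:patients who are mentally',
    ],
})

# Named groups become the column names of the extracted frame.
# Assumes the inclusive marker always comes before the exclusive one.
pattern = r'\*\*inclusive\*\*:(?P<inclusive>.*?)\*\*exclusive\*\*:(?P<exclusive>.*)'
extracted = df['Useful_crit'].str.extract(pattern)

# Remove the whitespace left around each extracted piece.
extracted = extracted.apply(lambda col: col.str.strip())

result = df.join(extracted)
print(result[['Pat_ID', 'inclusive', 'exclusive']])
```

Rows that do not match the pattern would come back as NaN in both new columns, so this variant is stricter than the split-based one about the layout of each string.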