I want to update a column with the ISIN or CUSIP part of a string contained in another column:
import pandas as pd

my_DestSystemNote1_string = 'ISIN=XS1906311763|CUSIP= |CalTyp=1'
dfDest = [('DestSystemNote1', ['ISIN=XS1906311763|CUSIP= |CalTyp=1',
                               'ISIN=XS0736418962|CUSIP= |CalTyp=1',
                               'ISIN=XS1533910508|CUSIP= |CalTyp=1',
                               'ISIN=US404280AS86|CUSIP=404280AS8|CalTyp=1',
                               'ISIN=US404280BW89|CUSIP=404280BW8|CalTyp=21',
                               'ISIN=US06738EBC84|CUSIP=06738EBC8|CalTyp=21',
                               'ISIN=XS0736418962|CUSIP= |CalTyp=1',]),
          ]
# create pandas df (DataFrame.from_items was removed in pandas 1.0, so build from a dict)
dfDest = pd.DataFrame(dict(dfDest))
display(dfDest)  # display() is available in Jupyter/IPython
print("")
DestSystemNote1 contains the source strings from which the ISIN or CUSIP needs to be extracted:
DestSystemNote1                               Found_ISIN    Found_CUSIP
ISIN=XS1906311763|CUSIP= |CalTyp=1            XS1906311763
ISIN=XS0736418962|CUSIP= |CalTyp=1            XS0736418962
ISIN=XS1533910508|CUSIP= |CalTyp=1            XS1533910508
ISIN=US404280AS86|CUSIP=404280AS8|CalTyp=1    US404280AS86  404280AS8
ISIN=US404280BW89|CUSIP=404280BW8|CalTyp=21   US404280BW89  404280BW8
ISIN=US06738EBC84|CUSIP=06738EBC8|CalTyp=21   US06738EBC84  06738EBC8
ISIN=XS0736418962|CUSIP= |CalTyp=1            XS0736418962
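The ISIN always starts with "ISIN=" and ends at the character before the "|".
The CUSIP always starts with "CUSIP=" and ends at the character before the "|".
My attempt is as follows: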
my_DestSystemNote1_string = 'ISIN=XS1906311763|CUSIP= |CalTyp=1'
code = my_DestSystemNote1_string.split("ISIN=",1)[1]
code = code[:12]
print(code)
XS1906311763
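So I'm getting there, but I'd like to parameterize it: find the nth occurrence of a passed string (strStart), then take every character from the one just after that match up to, but not including, the nth occurrence of another string (strEnd).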
Pete
Answer 0 (score: 1)
Based on this answer (Find the nth occurrence of substring in a string):
def findnth(haystack, needle, n):
    parts = haystack.split(needle, n + 1)
    if len(parts) <= n + 1:
        return -1
    return len(haystack) - len(parts[-1]) - len(needle)
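To make the indexing concrete, here is a small sketch that runs the helper on the sample string from the question. Note that n is zero-based here, and that -1 comes back when the string has fewer occurrences than requested. The variable name sample is only for illustration.

```python
def findnth(haystack, needle, n):
    # Split at most n + 1 times; the last piece is everything after occurrence n.
    parts = haystack.split(needle, n + 1)
    if len(parts) <= n + 1:
        # Fewer than n + 1 occurrences of needle were found.
        return -1
    return len(haystack) - len(parts[-1]) - len(needle)

sample = "ISIN=XS1906311763|CUSIP= |CalTyp=1"

print(findnth(sample, "ISIN=", 0))  # 0   (first "ISIN=" starts the string)
print(findnth(sample, "|", 0))      # 17  (first "|")
print(findnth(sample, "|", 1))      # 25  (second "|")
print(findnth(sample, "|", 2))      # -1  (there is no third "|")
```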
You can do it like this:
def split_between(input_string, start_str, start_occurence, end_str, end_occurence):
    start_index = findnth(input_string, start_str, start_occurence-1) + len(start_str)
    end_index = findnth(input_string, end_str, end_occurence-1)
    return input_string[start_index:end_index]

input_string = "ISIN=111111|ISIN=222222|333333|ISIN=444444"
split_between(input_string, "ISIN=", 2, "|", 2)
# returns "222222"
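To tie this back to the DataFrame in the question, one possible sketch is to apply split_between row by row. Two things here are my assumptions and not part of the answer above: the guard that returns an empty string when either marker is missing, and the .strip() call that turns the blank "CUSIP= " value into an empty string. Only two of the question's rows are used to keep it short.

```python
import pandas as pd

def findnth(haystack, needle, n):
    parts = haystack.split(needle, n + 1)
    if len(parts) <= n + 1:
        return -1
    return len(haystack) - len(parts[-1]) - len(needle)

def split_between(input_string, start_str, start_occurence, end_str, end_occurence):
    start = findnth(input_string, start_str, start_occurence - 1)
    end = findnth(input_string, end_str, end_occurence - 1)
    # Assumption: return "" when a marker is missing instead of slicing with -1.
    if start == -1 or end == -1:
        return ""
    return input_string[start + len(start_str):end]

dfDest = pd.DataFrame({"DestSystemNote1": [
    "ISIN=XS1906311763|CUSIP= |CalTyp=1",
    "ISIN=US404280AS86|CUSIP=404280AS8|CalTyp=1",
]})

# The ISIN sits between the 1st "ISIN=" and the 1st "|".
dfDest["Found_ISIN"] = dfDest["DestSystemNote1"].apply(
    lambda s: split_between(s, "ISIN=", 1, "|", 1).strip())
# The CUSIP sits between the 1st "CUSIP=" and the 2nd "|".
dfDest["Found_CUSIP"] = dfDest["DestSystemNote1"].apply(
    lambda s: split_between(s, "CUSIP=", 1, "|", 2).strip())

print(dfDest[["Found_ISIN", "Found_CUSIP"]])
```

Occurrences passed to split_between are 1-based, so the end marker for the CUSIP is the second "|" in the whole string, not the first one after "CUSIP=".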