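How can I extract a specific number from a long data string using Oracle SQL?

Time: 2016-02-27 18:03:43

Tags: oracle

I need to extract a number from a data string stored in a table column.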

Example string:

<strong>Customer Name</strong>: Hit - julaifnaf afbafbaf Caraballo Pichardo vs PICHARDO ALBERTO<br /> <strong>Address</strong>: NA - abdcinfainaf 42982542542 vs xx<br /> <strong>Country of citizenship</strong>: NA<br /> <strong>Country of residency</strong>: NA<br /> <strong>Date of birth</strong>: NA - xx vs Nov-72<br /> <strong>Place of birth</strong>: NA<br /> <strong>Identification Number</strong>: **1**<br /> <strong>emailDetails</strong>: <br/> <b>Subject: </b>abcdejnfanfa <br/> <b>Sent To: </b>abced@test.com<br/>

In the example string above, the number I need to extract is 1. The length of the string and the position of the record vary, but the number to extract always comes after Identification Number</strong>: and before the following <br />.

Which function can I use to extract this data?

2 answers:

Answer 0: (score: 0)

Try this:

select 
    regexp_replace(column_name,'.*<strong>Identification Number</strong>:[^>\d]*(\d+)[^>\d]*<br\s*/>.*', '\1', 1, 0, 'inm') as id 
from html;

PS: This is not a very robust solution, because regular expressions cannot parse arbitrary HTML.

Output:

         ID
-----------
          1
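
One behavior of the query above is worth keeping in mind: REGEXP_REPLACE returns its input unchanged when the pattern does not match, so a row without an Identification Number field would come back as the whole HTML string instead of NULL. A minimal sketch of one way to guard against that, assuming the same html table and column_name column used in the answer; the CASE / REGEXP_LIKE wrapper is an addition for illustration, not part of the original answer:

```sql
-- Sketch only: return NULL when the field is absent instead of the full input string.
-- The table (html) and column (column_name) are the names assumed by the answer above.
SELECT CASE
         -- Only attempt the replacement when the field is actually present
         WHEN REGEXP_LIKE(
                column_name,
                '<strong>Identification Number</strong>:[^>\d]*\d+[^>\d]*<br\s*/>',
                'inm'
              )
         THEN REGEXP_REPLACE(
                column_name,
                '.*<strong>Identification Number</strong>:[^>\d]*(\d+)[^>\d]*<br\s*/>.*',
                '\1', 1, 0, 'inm'
              )
         -- No ELSE branch: rows without the field yield NULL
       END AS id
FROM   html;
```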

Answer 1: (score: 0)

SELECT TO_NUMBER(
         REGEXP_SUBSTR(
           column_name,
           '<strong>Identification Number</strong>:.*?(\d+).*?<br />',
           1,
           1,
           NULL,
           1
         )
       ) AS id_number
FROM   table_name;
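
To try the query above without a real table, the sample string from the question can be fed in through DUAL. This is a sketch under stated assumptions: the sample_data subquery and its shortened string are hypothetical test data, and the sixth argument of REGEXP_SUBSTR (the capture-group index) requires Oracle 11g or later.

```sql
-- Hypothetical inline test data: the relevant part of the sample string from the question
WITH sample_data AS (
  SELECT '<strong>Place of birth</strong>: NA<br /> '
         || '<strong>Identification Number</strong>: **1**<br /> '
         || '<strong>emailDetails</strong>: <br/>' AS column_name
  FROM   dual
)
SELECT TO_NUMBER(
         REGEXP_SUBSTR(
           column_name,
           '<strong>Identification Number</strong>:.*?(\d+).*?<br />',
           1,     -- start searching at the first character
           1,     -- take the first match
           NULL,  -- default matching behavior
           1      -- return capture group 1, i.e. the digits
         )
       ) AS id_number
FROM   sample_data;
```

For this sample the lazy .*? skips the leading " **", the group captures 1, and TO_NUMBER converts it to a number. When the field is missing, REGEXP_SUBSTR returns NULL, so id_number is NULL as well.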