I am using Postgres and I have this query:
SELECT
row_number() OVER (ORDER BY corresp.ID_CORRESP) as rNUM ,
transfers.id_transfer AS TRANSFER_ID_TRANSFER,
corresp.id_corresp as ID_CORRESP,
corresp.ORDERNBR_CORRESP as ORDERNBR_CORRESP,
transfers.text_transfer AS TEXT
FROM Transfers transfers
left outer join correspondence corresp on corresp.id_corresp = transfers.id_corresp
left outer join tranf_corresp_tocc_employee on tranf_corresp_tocc_employee.id_transfer = transfers.id_transfer
left outer join employee on tranf_corresp_tocc_employee.id_employe = employee.id_employe
left outer join employee_lang on employee.id_employe = employee_lang.id_employe
left outer join unit on employee.id_unit = unit.id_unit
left outer join unit_lang on unit_lang.id_unit =unit.id_unit
left outer join action on action.id_action = transfers.id_action
left outer join action_lang on action_lang.id_action = action.id_action
WHERE transfers.status_transfer ='P'
The problem is that transfers.text_transfer AS TEXT returns results like this:
<div align="right"><font color="3366FF"><b><font size="3">it's test</font></b></font></div>
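I am looking for a way to extract the actual data from this result, that is, to extract it's test.
So I want to add some code to my query that extracts the data from the HTML tags. I think I should use the REGEXP_REPLACE function.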
Updated:
When I try to run this statement:
CREATE LANGUAGE plperlu;
I get this error:
ERROR: could not load library "C:/Program Files/PostgreSQL/9.2/lib/plperl.dll": %1 is not a valid Win32 application.
********** Error **********
ERROR: could not load library "C:/Program Files/PostgreSQL/9.2/lib/plperl.dll": %1 is not a valid Win32 application.
SQL state: 58P01
The file plperl.dll does exist under C:/Program Files/PostgreSQL/9.2/lib.
Updated:
I tried another approach, using this example:
CREATE FUNCTION testFunction
(@HTMLText VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @Start INT
DECLARE @End INT
DECLARE @Length INT
SET @Start = CHARINDEX('<',@HTMLText)
SET @End = CHARINDEX('>',@HTMLText,CHARINDEX('<',@HTMLText))
SET @Length = (@End - @Start) + 1
WHILE @Start > 0
AND @End > 0
AND @Length > 0
BEGIN
SET @HTMLText = STUFF(@HTMLText,@Start,@Length,'')
SET @Start = CHARINDEX('<',@HTMLText)
SET @End = CHARINDEX('>',@HTMLText,CHARINDEX('<',@HTMLText))
SET @Length = (@End - @Start) + 1
END
RETURN LTRIM(RTRIM(@HTMLText))
END
But I get this error:
ERROR: syntax error at or near "@"
LINE 2: (@HTMLText VARCHAR(MAX))
^
********** Error **********
ERROR: syntax error at or near "@"
SQL state: 42601
Character: 31
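Answer 0 (score: 1)
If you want to do this in the database, use PL/Perl, PL/Python, or a similar procedural language to do proper HTML stripping.
For example, if you install HTML::Strip from CPAN, or from the libhtml-strip-perl (Debian/Ubuntu) or perl-HTML-Strip (Fedora/RHEL) packages: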
CREATE LANGUAGE plperlu;
CREATE OR REPLACE FUNCTION striphtml(html text) RETURNS text
LANGUAGE plperlu
AS $$
use strict; use warnings; use 5.10.1;
use HTML::Strip;
my $hs = HTML::Strip->new(decode_entities => 1);
my $stripped = $hs->parse($_[0]);
$hs->eof;
return $stripped;
$$;
Then:
regress=> SELECT striphtml('<div align="right"><font color="3366FF"><b><font size="3">it''s test</font></b></font></div>');
striphtml
-----------
it's test
(1 row)
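Applied to the query from the question, only the last item of the select list needs to change. This is a sketch that assumes the striphtml function above has been created successfully; everything else is copied from the question unchanged:

```sql
SELECT
    row_number() OVER (ORDER BY corresp.ID_CORRESP) AS rNUM,
    transfers.id_transfer AS TRANSFER_ID_TRANSFER,
    corresp.id_corresp AS ID_CORRESP,
    corresp.ORDERNBR_CORRESP AS ORDERNBR_CORRESP,
    -- Strip the markup instead of returning the raw HTML column.
    striphtml(transfers.text_transfer) AS TEXT
FROM Transfers transfers
LEFT OUTER JOIN correspondence corresp ON corresp.id_corresp = transfers.id_corresp
LEFT OUTER JOIN tranf_corresp_tocc_employee ON tranf_corresp_tocc_employee.id_transfer = transfers.id_transfer
LEFT OUTER JOIN employee ON tranf_corresp_tocc_employee.id_employe = employee.id_employe
LEFT OUTER JOIN employee_lang ON employee.id_employe = employee_lang.id_employe
LEFT OUTER JOIN unit ON employee.id_unit = unit.id_unit
LEFT OUTER JOIN unit_lang ON unit_lang.id_unit = unit.id_unit
LEFT OUTER JOIN action ON action.id_action = transfers.id_action
LEFT OUTER JOIN action_lang ON action_lang.id_action = action.id_action
WHERE transfers.status_transfer = 'P';
```

The function is called once per returned row, so on large result sets it may be worth stripping the HTML once at write time instead.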
Or you could use HTML::Parser to do the job more cleanly.
There are plenty of other options. Pick an existing one and use it.
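If PL/Perl cannot be loaded at all, one cruder option is the built-in regexp_replace that the question mentions. This is a minimal sketch, assuming the stored markup is simple and well formed; the function name strip_tags_regex is hypothetical. It does not decode entities such as &amp;amp;, and it will mangle text that contains a literal < or > outside a tag, so it is not a substitute for a real HTML parser:

```sql
-- Hypothetical helper: removes anything that looks like a tag.
CREATE OR REPLACE FUNCTION strip_tags_regex(html text) RETURNS text
LANGUAGE sql IMMUTABLE
AS $$
    -- Delete every run from '<' up to the next '>'; the 'g' flag
    -- replaces all matches, and btrim removes surrounding spaces.
    SELECT btrim(regexp_replace($1, '<[^>]*>', '', 'g'));
$$;

SELECT strip_tags_regex('<div align="right"><font color="3366FF"><b><font size="3">it''s test</font></b></font></div>');
```

For the sample value from the question this returns it's test. In the query, strip_tags_regex(transfers.text_transfer) would take the place of the raw column.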