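I have dirty data in a column where the alphabetic prefix varies in length. I just want to strip out anything that is not 0-9.
I do not want to run a function or a proc. I have a script that similarly grabs the number after the text; it looks like this: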
Update TableName
set ColumntoUpdate=cast(replace(Columnofdirtydata,'Alpha #','') as int)
where Columnofdirtydata like 'Alpha #%'
And ColumntoUpdate is Null
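I thought it would work, until I found that some of the data fields I assumed were in the format Alpha # 12345789 are not.
Examples of data that needs to be stripped: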
AB ABCDE # 123
ABCDE# 123
AB: ABC# 123
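I only want the 123. All the data fields do have the # before the number.
I tried substring and PatIndex, but my syntax is not quite right or something. Does anyone have advice on the best way to address this?
Thanks!
Answer 0 (score: 65)
See this blog post on extracting numbers from strings in SQL Server. Below is a sample using a string from your example: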
DECLARE @textval NVARCHAR(30)
SET @textval = 'AB ABCDE # 123'
SELECT LEFT(SUBSTRING(@textval, PATINDEX('%[0-9.-]%', @textval), 8000),
PATINDEX('%[^0-9.-]%', SUBSTRING(@textval, PATINDEX('%[0-9.-]%', @textval), 8000) + 'X') -1)
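As a rough sketch of how the same expression might be applied to a whole column rather than a single variable, the following uses a hypothetical table variable @Sample with a DirtyCol column (both names are assumptions for illustration). It narrows the pattern to [0-9] so that stray periods or hyphens are not picked up, and it returns only the first run of digits in each row.

```sql
-- Hypothetical sample data; @Sample and DirtyCol are illustrative names
DECLARE @Sample TABLE (DirtyCol varchar(100));
INSERT INTO @Sample VALUES ('AB ABCDE # 123'), ('ABCDE# 123'), ('AB: ABC# 123');

SELECT s.DirtyCol,
       -- keep everything up to the first non-digit in the tail;
       -- the appended 'X' guarantees PATINDEX finds a non-digit
       LEFT(t.Tail, PATINDEX('%[^0-9]%', t.Tail + 'X') - 1) AS FirstNumber
FROM @Sample AS s
CROSS APPLY (
    -- the tail of the string, starting at the first digit
    SELECT SUBSTRING(s.DirtyCol, PATINDEX('%[0-9]%', s.DirtyCol), 8000) AS Tail
) AS t
WHERE s.DirtyCol LIKE '%[0-9]%';  -- skip rows that contain no digit at all
```

Computing the tail once in CROSS APPLY avoids repeating the SUBSTRING/PATINDEX expression, and the WHERE clause keeps rows without any digit out of the result.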
Answer 1 (score: 37)
Answer 2 (score: 21)
If there may be other characters between the digits (for example, thousands separators), you can try the following:
declare @table table (DirtyCol varchar(100))
insert into @table values
('AB ABCDE # 123')
,('ABCDE# 123')
,('AB: ABC# 123')
,('AB#')
,('AB # 1 000 000')
,('AB # 1`234`567')
,('AB # (9)(876)(543)')
;with tally as (select top (100) N=row_number() over (order by @@spid) from sys.all_columns),
data as (
select DirtyCol, Col
from @table
cross apply (
select (select C + ''
from (select N, substring(DirtyCol, N, 1) C from tally where N<=datalength(DirtyCol)) [1]
where C between '0' and '9'
order by N
for xml path(''))
) p (Col)
where p.Col is not NULL
)
select DirtyCol, cast(Col as int) IntCol
from data
The output is:
DirtyCol IntCol
--------------------- -------
AB ABCDE # 123 123
ABCDE# 123 123
AB: ABC# 123 123
AB # 1 000 000 1000000
AB # 1`234`567 1234567
AB # (9)(876)(543) 9876543
To do an update, add ColToUpdate to the select list of the data cte:
;with num as (...),
data as (
select ColToUpdate, /*DirtyCol, */Col
from ...
)
update data
set ColToUpdate = cast(Col as int)
Answer 3 (score: 15)
This works well for me:
CREATE FUNCTION [dbo].[StripNonNumerics]
(
@Temp varchar(255)
)
RETURNS varchar(255)
AS
Begin
Declare @KeepValues as varchar(50)
Set @KeepValues = '%[^0-9]%'
While PatIndex(@KeepValues, @Temp) > 0
Set @Temp = Stuff(@Temp, PatIndex(@KeepValues, @Temp), 1, '')
Return @Temp
End
Then call the function like this to see the original value next to the sanitized one:
SELECT Something, dbo.StripNonNumerics(Something) FROM TableA
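Answer 4 (score: 9)
Here is an elegant solution if your server supports the TRANSLATE function (on SQL Server it is available in SQL Server 2017+ and in SQL Azure).
First, it replaces any non-numeric character with the @ character. Then it removes all @ characters. You may need to add other characters that you know may be present to the second parameter of the TRANSLATE call.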
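A minimal sketch of the TRANSLATE approach described in this answer, assuming SQL Server 2017 or later. The variable names and the list of characters to strip are assumptions for illustration; extend the list to cover whatever actually appears in your data. UPPER() is applied first so that only upper-case letters need to be listed, whatever the collation.

```sql
DECLARE @dirty varchar(100) = 'AB: ABC# 123';
-- Assumed list of unwanted characters; it must not end in a space,
-- because LEN() ignores trailing spaces
DECLARE @strip varchar(100) = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ #:';

-- TRANSLATE requires its 2nd and 3rd arguments to have the same length,
-- so the replacement string is built with REPLICATE
SELECT REPLACE(
         TRANSLATE(UPPER(@dirty), @strip, REPLICATE('@', LEN(@strip))),
         '@', '') AS DigitsOnly;
```

Note that any @ already present in the data is removed as well, which is harmless when only the digits are wanted.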
Answer 5 (score: 1)
To add on to Ken's answer, this handles commas, spaces and parentheses:
--Handles parentheses, commas, spaces, hyphens..
declare @table table (c varchar(256))
insert into @table
values
('This is a test 111-222-3344'),
('Some Sample Text (111)-222-3344'),
('Hello there 111222 3344 / How are you?'),
('Hello there 111 222 3344 ? How are you?'),
('Hello there 111 222 3344. How are you?')
select
replace(LEFT(SUBSTRING(replace(replace(replace(replace(replace(c,'(',''),')',''),'-',''),' ',''),',',''), PATINDEX('%[0-9.-]%', replace(replace(replace(replace(replace(c,'(',''),')',''),'-',''),' ',''),',','')), 8000),
PATINDEX('%[^0-9.-]%', SUBSTRING(replace(replace(replace(replace(replace(c,'(',''),')',''),'-',''),' ',''),',',''), PATINDEX('%[0-9.-]%', replace(replace(replace(replace(replace(c,'(',''),')',''),'-',''),' ',''),',','')), 8000) + 'X') -1),'.','')
from @table
Answer 6 (score: 1)
Late to the party, but I found the following, which I thought worked brilliantly, in case anyone is still looking:
Answer 7 (score: 0)
Declare @MainTable table(id int identity(1,1),TextField varchar(100))
INSERT INTO @MainTable (TextField)
VALUES
('6B32E')
declare @i int=1
Declare @originalWord varchar(100)=''
WHile @i<=(Select count(*) from @MainTable)
BEGIN
Select @originalWord=TextField from @MainTable where id=@i
Declare @r varchar(max) ='', @len int ,@c char(1), @x int = 0
Select @len = len(@originalWord)
declare @pn varchar(100)=@originalWord
while @x <= @len
begin
Select @c = SUBSTRING(@pn,@x,1)
if(@c!='')
BEGIN
if ISNUMERIC(@c) = 0 and @c <> '-'
BEGIN
Select @r = cast(@r as varchar) + cast(replace((SELECT ASCII(@c)-64),'-','') as varchar)
end
ELSE
BEGIN
Select @r = @r + @c
END
END
Select @x = @x +1
END
Select @r
Set @i=@i+1
END
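Answer 8 (score: 0)
Here is a version that pulls all the digits out of a string; i.e. given I'm 35 years old; I was born in 1982. The average family has 2.4 children. it returns 35198224. That is, it works well where you have numeric data that may have been formatted as a code (e.g. #123,456,789 / 123-00005), but it is not appropriate if you want to pull specific numbers (as opposed to digits, i.e. just the numeric characters) out of the text. It also only handles digits, so it will not return negative signs (-) or periods (.).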
declare @table table (id bigint not null identity (1,1), data nvarchar(max))
insert @table (data)
values ('hello 123 its 45613 then') --outputs: 12345613
,('1 some other string 98 example 4') --outputs: 1984
,('AB ABCDE # 123') --outputs: 123
,('ABCDE# 123') --outputs: 123
,('AB: ABC# 123') --outputs: 123
; with NonNumerics as (
select id
, data original
--the below line replaces all digits with blanks
, replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(data,'0',''),'1',''),'2',''),'3',''),'4',''),'5',''),'6',''),'7',''),'8',''),'9','') nonNumeric
from @table
)
--each iteration of the below CTE removes another non-numeric character from the original string, putting the result into the numerics column
, Numerics as (
select id
, replace(original, substring(nonNumeric,1,1), '') numerics
, replace(nonNumeric, substring(nonNumeric,1,1), '') charsToreplace
, len(replace(nonNumeric, substring(nonNumeric,1,1), '')) charsRemaining
from NonNumerics
union all
select id
, replace(numerics, substring(charsToreplace,1,1), '') numerics
, replace(charsToreplace, substring(charsToreplace,1,1), '') charsToreplace
, len(replace(charsToreplace, substring(charsToreplace,1,1), '')) charsRemaining
from Numerics
where charsRemaining > 0
)
--we select only those strings with `charsRemaining=0`; i.e. the rows for which all non-numeric characters have been removed; there should be 1 row returned for every 1 row in the original data set.
select * from Numerics where charsRemaining = 0
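This code works by first replacing all the digits (i.e. the characters we want) in the given string with empty strings. It then goes through the original string (which still includes the digits) and removes all of the characters that were left over (i.e. the non-numeric characters), leaving only the digits.
The reason we do this in two steps, rather than removing all non-numeric characters in the first place, is that there are only 10 digits but a huge number of possible other characters. Replacing that small list is relatively fast, and it gives us a list of only those non-numeric characters that actually occur in the string, so we can then replace that small set.
The method uses recursive SQL by way of common table expressions (CTEs).
Answer 9 (score: 0)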
Create function fn_GetNumbersOnly(@pn varchar(100))
Returns varchar(max)
AS
BEGIN
Declare @r varchar(max) ='', @len int ,@c char(1), @x int = 0
Select @len = len(@pn)
while @x <= @len
begin
Select @c = SUBSTRING(@pn,@x,1)
if @c like '[0-9]' -- ISNUMERIC() also returns 1 for characters such as '.', ',', '+' and '$'
Select @r = @r + @c
Select @x = @x +1
end
return @r
End
Answer 10 (score: 0)
I created a function for this:
Create FUNCTION RemoveCharacters (@text varchar(30))
RETURNS VARCHAR(30)
AS
BEGIN
declare @index as int
declare @newtexval as varchar(30)
set @index = (select PATINDEX('%[A-Z.-/?]%', @text))
if (@index =0)
begin
return @text
end
else
begin
set @newtexval = (select STUFF ( @text , @index , 1 , '' ))
return dbo.RemoveCharacters(@newtexval)
end
return 0
END
GO
Answer 11 (score: 0)
Here is the answer:
Answer 12 (score: 0)
This worked for me:
I removed the single quotes, then used Replace to swap "," for ".".
This will surely help someone:
" & txtFinalscore.Text.Replace(",", ".") & "
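Answer 13 (score: 0)
In your case the number seems to always come after the # symbol, so using CHARINDEX() with LTRIM() and RTRIM() would probably perform best. But here is an interesting way of getting rid of ANY non-digit. It uses a tally table and a table of digits to limit which characters are accepted, then the XML technique to concatenate the result back into a single string without the non-numeric characters. The neat thing about this technique is that it can be extended to include any allowed characters and strip out everything that is not allowed.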
DECLARE @ExampleData AS TABLE (Col VARCHAR(100))
INSERT INTO @ExampleData (Col) VALUES ('AB ABCDE # 123'),('ABCDE# 123'),('AB: ABC# 123')
DECLARE @Digits AS TABLE (D CHAR(1))
INSERT INTO @Digits (D) VALUES ('0'),('1'),('2'),('3'),('4'),('5'),('6'),('7'),('8'),('9')
;WITH cteTally AS (
SELECT
I = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM
@Digits d10
CROSS APPLY @Digits d100
--add more cross applies to cover longer fields this handles 100
)
SELECT *
FROM
@ExampleData e
OUTER APPLY (
SELECT CleansedPhone = CAST((
SELECT TOP 100
SUBSTRING(e.Col,t.I,1)
FROM
cteTally t
INNER JOIN @Digits d
ON SUBSTRING(e.Col,t.I,1) = d.D
WHERE
I <= LEN(e.Col)
ORDER BY
t.I
FOR XML PATH('')) AS VARCHAR(100))) o
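For comparison, the simpler CHARINDEX() route mentioned at the start of this answer might look roughly like the sketch below. The table variable @Sample is a hypothetical stand-in for the real table, and the query assumes that everything after the first # is the number, apart from surrounding spaces.

```sql
-- Hypothetical sample data mirroring the question's examples
DECLARE @Sample AS TABLE (Col VARCHAR(100));
INSERT INTO @Sample (Col) VALUES ('AB ABCDE # 123'),('ABCDE# 123'),('AB: ABC# 123');

SELECT Col,
       -- take everything after the first '#', then trim surrounding spaces
       LTRIM(RTRIM(SUBSTRING(Col, CHARINDEX('#', Col) + 1, LEN(Col)))) AS AfterHash
FROM @Sample
WHERE CHARINDEX('#', Col) > 0;  -- skip rows with no '#' at all
```

This does not check that the remainder is actually numeric, so rows with trailing text after the number would still need one of the stripping techniques above.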
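Answer 14 (score: 0)
You can create a SQL CLR scalar function in order to be able to use regular expressions, such as replace patterns.
Here you can find an example of how to create such a function.
With such a function, the problem is solved in just the following lines: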
SELECT [dbo].[fn_Utils_RegexReplace] ('AB ABCDE # 123', '[^0-9]', '');
SELECT [dbo].[fn_Utils_RegexReplace] ('ABCDE# 123', '[^0-9]', '');
SELECT [dbo].[fn_Utils_RegexReplace] ('AB: ABC# 123', '[^0-9]', '');
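More importantly, you will be able to solve more complex problems, because regular expressions bring a whole new range of options directly into your T-SQL statements.
Answer 15 (score: 0)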
CREATE FUNCTION FN_RemoveNonNumeric (@Input NVARCHAR(512))
RETURNS NVARCHAR(512)
AS
BEGIN
DECLARE @Trimmed NVARCHAR(512)
SELECT @Trimmed = @Input
WHILE PATINDEX('%[^0-9]%', @Trimmed) > 0
SELECT @Trimmed = REPLACE(@Trimmed, SUBSTRING(@Trimmed, PATINDEX('%[^0-9]%', @Trimmed), 1), '')
RETURN @Trimmed
END
GO
SELECT dbo.FN_RemoveNonNumeric('ABCDE# 123')
Answer 16 (score: 0)
DECLARE @STR VARCHAR(400)
DECLARE @specialchars VARCHAR(50)='%[~,@,#,$,%,&,*,(,),!^ ?:]%'
SET @STR ='1,45 4,3 68.00-'
WHILE PATINDEX(@specialchars,@STR)> 0
--- Remove special characters using the Replace function
SET @STR = Replace(Replace(REPLACE(@STR,SUBSTRING(@STR,PATINDEX(@specialchars,@STR),1),''),'-',''),' ','' )
SELECT @STR