I have a database with names in English and Devanagari, such as
' PRABHU MATTHU RATHOD | प्रभुमथथूराठोड'
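I split these names into first name, middle name and last name. This works for the English names, but the Hindi names cause a problem.
I tried the following to find the index of the last space in the name: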
@MA_Name = प्रभु मथथू राठोड
REVERSE(SUBSTRING(REVERSE(@MA_Name), 1,CHARINDEX(' ', REVERSE(@MA_Name)) - 1));
It fails here: CHARINDEX(' ', REVERSE(@MA_Name)) - 1)
which returns -1.
I don't know why.
Answer 0 (score: 5)
Try using a case statement to handle the special case of names without a space. Something like:
(CASE WHEN @MA_NAME LIKE N'% %'
THEN REVERSE(SUBSTRING(REVERSE(@MA_Name), 1,CHARINDEX(N' ', REVERSE(@MA_Name)) - 1))
ELSE @MA_NAME
END)
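This assumes that a name without a space is just the last name.
Edit:
The name may look as if it contains a space, but the space may be a character other than ' '. You can figure out what it is with: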
select unicode(substring(@MA_NAME, 7, 1)) -- unicode() rather than ascii(), because the value is nvarchar
(Or whatever the correct index of that space is.)
Once you know what the character is, you can structure the query as:
(case when @MA_NAME like N'% %' then <what you have now>
when @MA_NAME like N'%OTHERCHAR%' then <similar but with different space>
else <whatever>
end)
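Answer 1 (score: 1)
This has been edited to incorporate what we have learned, so that it can be accepted as the answer and be useful to future visitors.
Consider the following code: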
declare @ma_name nvarchar(200)
declare @r nvarchar(200)
declare @i int
select @MA_Name = N'प्रभु मथथू राठोड' -- Thanks to Gordon Lindof for reminder to use N-prefix
set @r = reverse(@ma_name)
select @r
set @i = charindex(' ', @r )
select @i
The results are:
डोठार ूथथम ुभर्प
and
0
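What appears to be happening is that the REVERSE function reverses code points rather than characters. To explain with a substring of just 4 code points: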
\u0925-->थ \u0942-->ू \u0020--> \u0930-->र
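\u0942 is a combining character. The sequence \u0925 followed by \u0942 forms a single character. REVERSE does not understand this and naively reverses the code points. The result is:
\u0930 \u0020 \u0942 \u0925
Now the combining character is attached to the space. So it is no longer a plain space; it is a space modified by the combining mark. (Sorry, I know nothing about Hindi; no disrespect intended.)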
But CHARINDEX is not that naive. It sees that you are looking for a space, but it finds only the modified space character.
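The code-point reordering can be reproduced outside SQL Server. The following is a minimal Python sketch, not T-SQL and not the poster's code; the names `original`, `reversed_cps` and `describe` are made up for illustration, and slicing with `[::-1]` is only assumed to mirror what REVERSE does. It reverses the same four code points and checks that the combining vowel sign ends up directly after the space.

```python
import unicodedata

# The four code points from the explanation: U+0925, U+0942 (combining vowel sign), space, U+0930
original = "\u0925\u0942\u0020\u0930"

# Reverse code point by code point (assumed to mirror the behaviour of REVERSE)
reversed_cps = original[::-1]


def describe(s):
    """Return (code point, is_combining_mark) for every code point in s."""
    return [(f"U+{ord(ch):04X}", unicodedata.category(ch).startswith("M")) for ch in s]


print(describe(original))
print(describe(reversed_cps))

# Where the space sits after the reversal, and whether a combining mark follows it
space_at = reversed_cps.index(" ")
followed_by_mark = unicodedata.category(reversed_cps[space_at + 1]).startswith("M")
print(space_at, followed_by_mark)  # 1 True
```

Note that Python's `index` compares raw code points, so it still locates the space. The CHARINDEX behaviour described in this answer differs precisely because it treats the space plus combining mark as one unit.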
The poster solved his problem by searching for the space in a loop.
Here is some source code that explains the situation:
-- CharList generates a comma-separated list of decimal values representing the list of nchar's
-- in an nvarchar. In this context it's not important how it works.
if object_id('CharList')is not null drop function CharList
go
create function dbo.CharList(@c nvarchar(max))returns varchar(max)
as
begin
declare @x varbinary(max)
declare @h varchar(max)
declare @i int
set @x = cast ( @c as varbinary(max))
set @h = ''
set @i = 1
while @i <= len(@x)
begin
if @i > 1
set @h = @h + ','
set @h = @h + cast( cast(substring(@x,@i, 1)as int)
+ 256*cast(substring(@x,@i+1,1)as int) as varchar)
set @i=@i+2
end
return @h
end
go
-- For this code sample I'm going to use latin characters. (Sorry can't read Hindi.)
-- This string contains lowercase 'e' with an acute accent.
-- In Unicode this can be represented two different ways.
-- It can be represented as a single codepoint: decimal 233.
-- Or it can be built from the letter 'e', followed by
-- the combining character for the acute accent: decimal 769
-- The purpose of this source is to demonstrate combining characters, so I'll use the
-- two-codepoint version.
declare @m nvarchar(max)
set @m = N'Re' + nchar(769) + N'al'
select @m, dbo.CharList(@m) -- Réal 82,101,769,97,108
-- You see, the word 'Réal' consists of 4 characters, but is represented by 5 codepoints.
select charindex ( N'e', @m ) -- 0
select charindex ( N'e'+nchar(769), @m ) -- 2
select charindex ( N'é', @m ) -- 2
select charindex ( N'a', @m ) -- 4
-- CharIndex is smart enough to understand that. It understands that there is no letter 'e'
-- in this string of characters, even though the codepoint 101 appears in the string.
-- It does find the letter 'é' when expressed with the two-codepoint version.
-- It will even find it when expressed as the single-codepoint version, even though
-- the codepoint 233 appears nowhere in the string.
-- And finally, it has no problem finding the 'a', but note that it returns 4.
-- 'a' is the 3rd character of the string, but appears at the 4th codepoint in the list.
set @m = reverse ( @m )
select @m, dbo.CharList(@m ) -- láeR 108,97,769,101,82
-- Reverse is not as clever as CharIndex. It doesn't care about combining characters.
-- It just reverses the list of codepoints.
-- Now the acute accent combining character appears after the 'a', and so the string now
-- shows the 'a' with the acute accent, and the letter 'e' has lost its accent.
select charindex ( N'e', @m ) -- 4
select charindex ( N'e'+nchar(769), @m ) -- 0
select charindex ( N'é', @m ) -- 0
select charindex ( N'a', @m ) -- 0
-- Now, CharIndex will find a letter 'e' where there was none before.
-- It can't find 'é' in either the one-codepoint nor two-codepoint forms,
-- because it's not there anymore.
-- A search for 'a' fails, because the string doesn't contain a plain 'a' anymore.
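Answer 2 (score: -1)
This is the kind of problem (parsing and formatting) that is not always easy in SQL. In my experience, it is usually better to return the raw data and let the calling/client program do all the string manipulation. Such work is usually procedural by nature (and therefore poorly suited to set-based operations) and involves a lot of branching logic (which can be awkward in SQL).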
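As an illustration of the client-side approach, here is a minimal Python sketch. The function name `split_name` is hypothetical, and the rule that a single-word name counts as the last name is an assumption borrowed from the CASE-based answer, not something the poster specified.

```python
def split_name(full_name):
    """Split a full name into (first, middle, last) on whitespace.

    Assumptions: a single-word name is treated as the last name, and
    everything between the first and last word is the middle name.
    """
    # str.split() with no argument splits on any Unicode whitespace
    # (including a no-break space) and ignores leading/trailing whitespace.
    parts = full_name.split()
    if not parts:
        return ("", "", "")
    if len(parts) == 1:
        return ("", "", parts[0])
    return (parts[0], " ".join(parts[1:-1]), parts[-1])


print(split_name(" PRABHU MATTHU RATHOD "))
print(split_name("प्रभु मथथू राठोड"))
```

Because the split works on whole words and never reverses the string, combining characters stay attached to their base letters.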