I want to use this pattern to insert a word-break character (a soft hyphen) after every 5 characters of a string.
([^\s-]{5})([^\s-]{5})
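Unfortunately, it also breaks character entities (&#xxx;).
Can anyone give me an example that doesn't break entity codes?
The strings I want to break come from XML, so the actual entities are escaped one level further (&#xxx;).
Edit: code example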
preg_replace('/([^\s-]{5})([^\s-]{5})/', '$1­$2', $subject)
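Given the word "F&#xe5;revejle" (rendered: "Fårevejle")
Expect "F&#xe5;­revejle" as result (soft hyphen after the entity)
But it outputs "F&#xe­5;revejle" instead (soft hyphen inside the entity)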
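A minimal reproduction of the problem, as a sketch only. It assumes PHP 7 or later so the soft hyphen (U+00AD) can be written visibly as "\u{AD}", and the "[SHY]" marker is just a hypothetical way to make the invisible character show up when printing:

```php
<?php
// The pattern counts "&", "#", "x", "e", "5" and ";" as separate characters,
// so the first group of five ends in the middle of the entity.
$subject = 'F&#xe5;revejle';
$broken = preg_replace(
    '/([^\s-]{5})([^\s-]{5})/',
    '$1' . "\u{AD}" . '$2',   // "\u{AD}" is U+00AD SOFT HYPHEN
    $subject
);

// Make the soft hyphen visible for inspection.
echo str_replace("\u{AD}", '[SHY]', $broken), PHP_EOL;
// prints: F&#xe[SHY]5;revejle
```

The first capture group takes "F&#xe" and the second takes "5;rev", which is why the entity ends up split.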
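Answer 0 (score: 4)
Assuming you want to split each word after every five characters unless it is already separated by hyphens, and to treat an entity as a single character, try: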
$result = preg_replace(
    '/                  # Start the match
    (?:                 # at one of the following positions:
     (?<=               # Either right after...
      [\s-]             # a space or dash
     )                  # end of lookbehind
    |                   # or...
     \G                 # wherever the last match ended.
    )                   # End of start condition.
    (                   # Now match and capture the following:
     (?>                # Match the following in an atomic group:
      &\#\w+;           # an entity
     |                  # or
      [^\s-]            # a non-space, non-dash character
     ){5}               # exactly 5 times.
    )                   # End of capture
    (?=[^\s-])          # Assert that we\'re not at the end of a "word"/x',
'\1­', $subject);
This changes
supercalifragilisticexpidon'tremember!
alrea-dy se-parated
count entity as one character&#345;blahblah
F&#xe5;revejle
into
super­calif­ragil­istic­expid­on'tr­ememb­er!
alrea-dy se-parat­ed
count entit­y as one chara­cter&#345;­blahb­lah
F&#xe5;rev­ejle
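For reuse, the same pattern can be wrapped in a small function. This is only a sketch: the name soft_hyphenate() and the $chunk parameter are hypothetical and not part of the answer, the pattern is written on one line without the /x flag, and "\u{AD}" assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: inserts a soft hyphen after every $chunk characters of
// a word, counting a numeric entity such as &#345; or &#xe5; as one character.
function soft_hyphenate(string $subject, int $chunk = 5): string
{
    $pattern = '/(?:(?<=[\s-])|\G)'                  // after a space or dash, or where the last match ended
             . '((?>&#\w+;|[^\s-]){' . $chunk . '})' // $chunk entities or non-space, non-dash characters
             . '(?=[^\s-])/';                        // only if the word continues

    // preg_replace() returns null on error; fall back to the input in that case.
    return preg_replace($pattern, '${1}' . "\u{AD}", $subject) ?? $subject;
}

// Show the soft hyphens as "|" so the result can be inspected.
echo str_replace("\u{AD}", '|', soft_hyphenate('F&#xe5;revejle')), PHP_EOL;
// prints: F&#xe5;rev|ejle
```

Words that are already split by a dash into pieces of five characters or fewer, such as "alrea-dy", are left alone because the lookahead requires the word to continue after each chunk.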