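I'm having a problem with base64 images that sometimes fail to convert correctly. I need a way to test whether an image is in valid base64 format before converting it, so that I can investigate the problem further. I found some regular expressions online, but I think they only expect strings without a header. My strings have a header. I tried adding the header to the expression, but it keeps breaking.
Original: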
^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$
I added the header, but it doesn't work:
^([data:image/png;base64,][A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$
Thanks
Answer 0 (score: 3)
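You may notice the use of [square brackets] in the original regex. These create a character set that matches any single character listed inside it, so [data:image/png;base64,] matches exactly one of d, a, t, a, ..., 6, 4 or , rather than the whole header. Instead, you probably want a non-capturing group, and since I think you are trying to make the header optional, follow it with the ? quantifier, like this: (?:data:image/png;base64,)?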
^((?:data:image/png;base64,)?[A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$
^                                 # Anchors to the beginning of the string.
(                                 # Opens CG1 (capturing group 1)
    (?:                           # Opens NCG1 (non-capturing group 1)
        data:image/png;base64,    # Literal data:image/png;base64,
    )?                            # Closes NCG1
                                  # ? repeats zero or one times
    [A-Za-z0-9+/]                 # Character class (any of the characters within)
                                  # Anything between A and Z
                                  # Anything between a and z
                                  # Anything between 0 and 9
                                  # Any of: +/
    {4}                           # Repeats 4 times.
)*                                # Closes CG1
                                  # * repeats zero or more times
(                                 # Opens CG2 (capturing group 2)
    [A-Za-z0-9+/]                 # Character class (any of the characters within)
                                  # Anything between A and Z
                                  # Anything between a and z
                                  # Anything between 0 and 9
                                  # Any of: +/
    {4}                           # Repeats 4 times.
|                                 # Alt (CG2)
    [A-Za-z0-9+/]                 # Character class (any of the characters within)
                                  # Anything between A and Z
                                  # Anything between a and z
                                  # Anything between 0 and 9
                                  # Any of: +/
    {3}                           # Repeats 3 times.
    =                             # Literal =
|                                 # Alt (CG2)
    [A-Za-z0-9+/]                 # Character class (any of the characters within)
                                  # Anything between A and Z
                                  # Anything between a and z
                                  # Anything between 0 and 9
                                  # Any of: +/
    {2}                           # Repeats 2 times.
    ==                            # Literal ==
)                                 # Closes CG2
$                                 # Anchors to the end of the string.
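To make the difference between a character set and a group concrete, here is a minimal Python sketch. The function names first_match_with_class and first_match_with_group are made up for this illustration; only the standard re module is assumed.

```python
import re

HEADER = "data:image/png;base64,"

# Square brackets: a character set that matches exactly ONE of the listed characters.
CHAR_CLASS = re.compile(r"[data:image/png;base64,]")

# Non-capturing group with ?: matches the whole literal header, zero or one time.
OPTIONAL_GROUP = re.compile(r"(?:data:image/png;base64,)?")


def first_match_with_class(text: str) -> str:
    """Return what the character set consumes at the start of text."""
    m = CHAR_CLASS.match(text)
    return m.group(0) if m else ""


def first_match_with_group(text: str) -> str:
    """Return what the optional non-capturing group consumes at the start of text."""
    return OPTIONAL_GROUP.match(text).group(0)


if __name__ == "__main__":
    sample = HEADER + "AAAA"
    print(repr(first_match_with_class(sample)))
    print(repr(first_match_with_group(sample)))
    print(repr(first_match_with_group("AAAA")))
```

The character set consumes a single character of the header, while the group consumes the header as a unit and falls back to an empty match when the header is absent.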
However, if you want the header to be required, you can drop the non-capturing group and the ? quantifier entirely.
^(data:image/png;base64,[A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$
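One thing to watch for: in the patterns of this answer the header sits inside the group that * repeats, so the engine looks for it before every four-character block rather than once at the start of the string. The sketch below is one possible adjustment, not the answerer's own pattern: it moves the header in front of the repeated group and compares the two. The names AS_WRITTEN, HEADER_ONCE and the sample payload are assumptions chosen for this illustration.

```python
import re

# The required-header pattern exactly as given above: the header is inside the repeated group.
AS_WRITTEN = re.compile(
    r"^(data:image/png;base64,[A-Za-z0-9+/]{4})*"
    r"([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$"
)

# Variant with the header hoisted out, so it is required once at the start only.
HEADER_ONCE = re.compile(
    r"^data:image/png;base64,"
    r"(?:[A-Za-z0-9+/]{4})*"
    r"(?:[A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$"
)


def matches(pattern: re.Pattern, text: str) -> bool:
    """True when the pattern matches text from its beginning."""
    return pattern.match(text) is not None


if __name__ == "__main__":
    # Header followed by three four-character blocks.
    sample = "data:image/png;base64,AAAABBBBCCCC"
    print("as written: ", matches(AS_WRITTEN, sample))
    print("header once:", matches(HEADER_ONCE, sample))
```

With the header outside the repetition, a payload of any number of four-character blocks is accepted after a single header, and a string with no header is rejected.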
Answer 1 (score: 2)
The regular expression
^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$
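What all these characters mean:
^ ... matches at the beginning of a line or of the string buffer.
( ... ) ... defines a marking (capturing) group, used either to back-reference the string found by the expression inside the parentheses or, as here, to apply a multiplier to it. When an expression is grouped only to apply a multiplier, it is usually better to use a non-marking group, i.e. (?: ... ), where the question mark and colon after the opening parenthesis make the group non-marking.
[ ... ] ... defines a positive character class, meaning that exactly one of the characters inside the square brackets must be found for a positive match. [^ ... ] would be a negative character class, meaning that any one character except those inside the square brackets must be found.
[A-Za-z0-9+/] ... the character is an upper-case or lower-case ASCII letter, a digit, a plus sign or a slash.
{4} ... is a multiplier meaning the preceding expression or character exactly four times.
* ... is also a multiplier, meaning the preceding expression or character zero or more times.
| ... means OR.
$ ... matches the end of a line, without matching the line terminator itself, or the end of the string buffer.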
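As a quick way to see these pieces working together, the following Python sketch runs the original expression against a few hand-made strings. The helper name looks_like_base64 and the sample values are assumptions made for this illustration.

```python
import re

# The original expression, unchanged.
BASE64_RE = re.compile(
    r"^([A-Za-z0-9+/]{4})*"
    r"([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$"
)


def looks_like_base64(text: str) -> bool:
    """True when text consists of 4-character blocks with valid trailing padding."""
    return BASE64_RE.match(text) is not None


if __name__ == "__main__":
    samples = [
        "TWFu",      # one full block: first alternative of the last group
        "TWE=",      # three characters plus one =: second alternative
        "TQ==",      # two characters plus two =: third alternative
        "TWFuTWE=",  # one repeated block, then a padded block
        "TQ=",       # wrong length: no alternative fits
        "TW Fu",     # a space is not in the character class
    ]
    for s in samples:
        print(repr(s), looks_like_base64(s))
```

Note that the empty string is rejected as well, because the last group is mandatory and always consumes four characters.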
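So this expression means: starting at the beginning of the line or string buffer, zero or more blocks of four Base64 characters, followed by a final block that is either four Base64 characters, or three Base64 characters and one equals sign, or two Base64 characters and two equals signs, and then the end of the line or string buffer.
To additionally allow one optional header string at the beginning of the line or string buffer, the expression should be modified to: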
^(?:data:image/png;base64,)?(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$
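The question mark after the non-marking group (?:data:image/png;base64,) means here that the preceding expression (just a fixed string) occurs zero or one time.
As you can see, I also changed the two marking groups into non-marking groups by inserting ?: after each opening parenthesis.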
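Since the goal in the question is to test a string before converting it, here is a minimal Python sketch that applies the modified expression and then attempts the actual decode. The function name decode_if_valid is hypothetical. The sketch only checks the Base64 format; it does not verify that the decoded bytes are really a PNG image.

```python
import base64
import binascii
import re
from typing import Optional

HEADER = "data:image/png;base64,"

# The modified expression from this answer: optional header, then Base64 blocks.
DATA_URI_RE = re.compile(
    r"^(?:data:image/png;base64,)?"
    r"(?:[A-Za-z0-9+/]{4})*"
    r"(?:[A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$"
)


def decode_if_valid(text: str) -> Optional[bytes]:
    """Return the decoded bytes, or None when the regex check or the decode fails."""
    if DATA_URI_RE.match(text) is None:
        return None
    # Strip the header, if present, before handing the payload to the decoder.
    payload = text[len(HEADER):] if text.startswith(HEADER) else text
    try:
        # validate=True makes the decoder reject characters outside the Base64 alphabet.
        return base64.b64decode(payload, validate=True)
    except binascii.Error:
        return None


if __name__ == "__main__":
    for candidate in (HEADER + "TWFu", "TWFu", HEADER + "TWF"):
        print(repr(candidate), decode_if_valid(candidate))
```

Running the regex first gives a cheap yes/no answer for logging which inputs are malformed, and the decode step catches anything the pattern alone lets through.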