I currently have the following regular expression to validate the company name entered in a form:
$regexpRange = $min.','.$max;
$regexpPattern = '/^(?=[A-Za-z\d\'\s\,\.]{'.$regexpRange.'}$)(?=.*[a-z\d])[a-zA-Z\d]+[A-Za-z\d\'\s\,\.]+$/m';
I need to update it to allow international characters. I have no experience with this.
Can someone help me understand how to approach this?
Answer 0 (score: 2)
Here are the required steps:
Use the u pattern option. This turns on PCRE_UTF8 and PCRE_UCP (the PHP docs forget to mention that one):
PCRE_UTF8
This option causes PCRE to regard both the pattern and the subject as strings of UTF-8 characters instead of single-byte strings. However, it is available only when PCRE is built to include UTF support. If not, the use of this option provokes an error. Details of how this option changes the behaviour of PCRE are given in the pcreunicode page.
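As a small illustration of what UTF-8 mode changes, the sketch below compares how `.` treats a two-byte character with and without the u option. The variable names are only for illustration, and the character is written as explicit bytes so the result does not depend on how the source file is saved.

```php
<?php
// "é" written as its two UTF-8 bytes.
$subject = "\xC3\xA9";

// Without u, "." consumes a single byte, so a 2-byte character
// cannot be matched by exactly one ".".
$byteMode = preg_match('/^.$/', $subject);

// With u, "." consumes one whole code point.
$utf8Mode = preg_match('/^.$/u', $subject);

var_dump($byteMode, $utf8Mode);
```

The same applies to quantifiers such as {2,50}: in UTF-8 mode they count characters rather than bytes.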
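PCRE_UCP
This option changes the way PCRE processes \B, \b, \D, \d, \S, \s, \W, \w, and some of the POSIX character classes. By default, only ASCII characters are recognized, but if PCRE_UCP is set, Unicode properties are used instead to classify characters. More details are given in the section on generic character types in the pcrepattern page. If you set PCRE_UCP, matching one of the items it affects takes much longer. The option is available only if PCRE has been compiled with Unicode property support.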
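To see the effect of Unicode properties on the shorthand classes, here is a minimal sketch, assuming a PHP build whose PCRE library has Unicode property support (the usual case). The sample strings are arbitrary and written as explicit UTF-8 bytes.

```php
<?php
$word  = "Caf\xC3\xA9"; // "Café" as UTF-8 bytes
$digit = "\xD9\xA3";    // U+0663 ARABIC-INDIC DIGIT THREE as UTF-8 bytes

// With u, \w and \d classify characters by Unicode properties,
// so the accented letter and the non-ASCII digit both match.
$wordMatches  = preg_match('/^\w+$/u', $word);
$digitMatches = preg_match('/^\d$/u', $digit);

// Without u, \w only knows the single-byte character tables
// (ASCII in the default C locale), so this is expected to fail.
$wordAsciiOnly = preg_match('/^\w+$/', $word);

var_dump($wordMatches, $digitMatches, $wordAsciiOnly);
```

Note that \d also accepting non-ASCII digits may or may not be what you want for a company name; use [0-9] if only ASCII digits should pass.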
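\d can stay exactly as it is (with PCRE_UCP it is already equivalent to \p{Nd}), but you have to replace these a-z ranges to account for accented characters:
Replace [a-zA-Z] with \p{L}
Replace [a-z] with \p{Ll}
Replace [A-Z] with \p{Lu}
\p{X} means: a character from Unicode category X, where L means letter, Ll means lowercase letter and Lu means uppercase letter. You can get the list from the docs.
Note that you can use \p{X} inside a character class, for example [\p{L}\d\s].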
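Applied to the pattern from the question, the replacements could look roughly like this. It is a sketch rather than a definitive implementation: the helper name companyNamePattern and the sample inputs are made up for illustration, and the set of allowed punctuation is simply carried over from the original pattern.

```php
<?php
// Hypothetical helper: rebuilds the question's pattern with Unicode classes.
function companyNamePattern(int $min, int $max): string
{
    $range   = $min . ',' . $max;
    // Letters, digits, apostrophe, whitespace, comma and dot.
    $allowed = '\p{L}\d\'\s,.';

    return '/^(?=[' . $allowed . ']{' . $range . '}$)' // length check, counted in characters
         . '(?=.*[\p{Ll}\d])'                           // at least one lowercase letter or digit
         . '[\p{L}\d]+[' . $allowed . ']+$/mu';         // u turns on UTF-8 mode and Unicode properties
}

$pattern = companyNamePattern(2, 50);

// "Société Générale" written as UTF-8 bytes.
$accepted = preg_match($pattern, "Soci\xC3\xA9t\xC3\xA9 G\xC3\xA9n\xC3\xA9rale");

// "&" is not in the allowed set, so this name is rejected.
$rejected = preg_match($pattern, 'Acme & Sons');

var_dump($accepted, $rejected);
```

One thing to keep in mind: the (?=.*[\p{Ll}\d]) check is inherited from the original pattern, and scripts without letter case (for example CJK ideographs, which are in category Lo) contain no \p{Ll} characters, so names written only in such scripts would fail it unless that condition is relaxed.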
Also make sure your strings in PHP are UTF-8 encoded, and use Unicode-aware functions to handle them.
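As a rough sketch of what "Unicode-aware functions" means in practice, the example below contrasts byte-oriented and multibyte functions. It assumes the mbstring extension is enabled, which the answer above does not state; the sample strings are arbitrary.

```php
<?php
// Assumes the mbstring extension is available.
$name = "M\xC3\xBCller"; // "Müller" as UTF-8 bytes

$bytes      = strlen($name);                 // counts bytes
$characters = mb_strlen($name, 'UTF-8');     // counts characters
$upper      = mb_strtoupper($name, 'UTF-8'); // case mapping that understands "ü"

// A lone lead byte is not valid UTF-8.
$broken  = "\xC3";
$isValid = mb_check_encoding($broken, 'UTF-8');

// With the u option, preg_match() returns false for invalid UTF-8 input,
// so compare against false rather than treating the result as "no match".
$result = preg_match('/./u', $broken);

var_dump($bytes, $characters, $upper, $isValid, $result);
```

Checking the encoding before running the pattern makes it possible to tell "invalid input" apart from "valid input that does not match".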