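I want to use JavaScript to extract substrings that match a specific pattern from the following list of strings.
But I'm having trouble setting up the regular expression.
Input string list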
search?w=tot&DA=YZR&t__nil_searchbox=btn&sug=&o=&q=%EB%B9%84%EC%BD%98
search?q=%EB%B9%84%EC%BD%98&go=%EC%A0…4%EB%B9%84%EC%BD%98&sc=8-2&sp=-1&sk=&cvid=f05407c5bcb9496990d2874135aee8e9
where=nexearch&query=%EB%B9%84%EC%BD%98&sm=top_hty&fbm=0&ie=utf8
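Expected pattern matching result
%EB%B9%84%EC%BD%98
for all of the cases above.
Regex
/(query|q)=.* + ADDITIONAL REGEX HERE + /
where the match should end at $ or at the first & that appears.
Question
What should I write for ADDITIONAL REGEX?
You can test it here. Thanks.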
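Answer 0 (score: 2)
Turn the first capturing group into a non-capturing group, then use a negated character class instead of .* so that the match stops at the first & (or at the end of the line):
\b(?:query|q)=([^&\n]*)
> var s = "where=nexearch&query=%EB%B9%84%EC%BD%98&sm=top_hty&fbm=0&ie=utf8"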
undefined
> var pat = /\b(?:query|q)=([^&\n]*)/;
> pat.exec(s)[1]
'%EB%B9%84%EC%BD%98'
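As a quick check, the same pattern can be run over all three inputs from the question. This is a minimal sketch: `extractQuery` and `inputs` are hypothetical names introduced here, and the helper returns `null` instead of throwing when neither parameter is present.

```javascript
// Hypothetical helper (not part of the original answer): returns the raw,
// still percent-encoded value of the first `q` or `query` parameter,
// or null when neither is present.
function extractQuery(str) {
  var match = /\b(?:query|q)=([^&\n]*)/.exec(str);
  return match ? match[1] : null;
}

// The three input strings from the question (the "…" is elided in the source).
var inputs = [
  'search?w=tot&DA=YZR&t__nil_searchbox=btn&sug=&o=&q=%EB%B9%84%EC%BD%98',
  'search?q=%EB%B9%84%EC%BD%98&go=%EC%A0…4%EB%B9%84%EC%BD%98&sc=8-2&sp=-1&sk=&cvid=f05407c5bcb9496990d2874135aee8e9',
  'where=nexearch&query=%EB%B9%84%EC%BD%98&sm=top_hty&fbm=0&ie=utf8'
];

var results = inputs.map(extractQuery);
console.log(results); // each entry is '%EB%B9%84%EC%BD%98'
```

Note that `\b` only requires a word boundary, so a name such as `x-q` would also match. If that matters, anchoring on `(?:^|[?&])` instead of `\b` is one stricter option. `decodeURIComponent` can be applied to the result if the decoded text is needed.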
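Answer 1 (score: 1)
Personally, I'd suggest an alternative approach: use a more procedural function to find the required parameter value rather than a "simple" regular expression. While it may look more complicated at first, it is easy to extend if you later need to find different or additional parameter values.
That said: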
/* haystack:
     String, the string in which you're looking for the
     parameter-values,
   needles:
     Array, the parameters whose values you're looking for
*/
function queryGrab(haystack, needles) {
  // creating a regular expression from the array of needles;
  // given an array of ['q','query'], this will result in:
  //   /^(?:q|query)=/i
  // the whole alternation is anchored and followed by '=', so
  // 'q' cannot match a longer name such as 'qs'; the 'g' flag is
  // omitted because RegExp.prototype.test() on a global regex
  // keeps state in lastIndex between calls:
  var reg = new RegExp('^(?:' + needles.join('|') + ')=', 'i'),
    // finding the index of the '?' character in the haystack
    // (-1 if there is none, in which case the whole string is used):
    queryIndex = haystack.indexOf('?'),
    // getting the substring from the haystack, starting
    // after the '?' character:
    keyValues = haystack.substring(queryIndex + 1)
      // splitting that string on the '&' characters,
      // to form an array:
      .split('&')
      // filtering that array (with Array.prototype.filter()),
      // the 'keyValue' argument is the current array-element
      // from the array over which we're iterating:
      .filter(function(keyValue) {
        // if RegExp.prototype.test() returns true,
        // meaning the supplied string ('keyValue')
        // is matched by the created regular expression,
        // the current element is retained in the filtered
        // array:
        return reg.test(keyValue);
        // converting that filtered-array to a string
        // on the naive assumption each searched-string
        // should return only one match:
      }).toString();
  // returning a substring of the keyValue, from after
  // the position of the '=' character:
  return keyValues.substring(keyValues.indexOf('=') + 1);
}

// essentially irrelevant, just for the purposes of
// providing a demonstration; here we get all the
// elements of class="haystack":
var haystacks = document.querySelectorAll('.haystack'),
  // the parameters we're looking for:
  needles = ['q', 'query'],
  // an 'empty' variable for later use:
  retrieved;

// using Array.prototype.forEach() to iterate over, and
// perform a function on, each of the .haystack elements
// (using Function.prototype.call() to use the array-like
// NodeList instead of an array):
Array.prototype.forEach.call(haystacks, function(stack) {
  // like filter(), the variable is the current array-element;
  // retrieved caches the found parameter-value (using
  // a variable because we're using it twice):
  retrieved = queryGrab(stack.textContent, needles);
  // setting the next-sibling's text:
  stack.nextSibling.nodeValue = '(found: ' + retrieved + ')';
  // updating the HTML of the current node, to allow for
  // highlighting:
  stack.innerHTML = stack.textContent.replace(retrieved, '<span class="found">$&</span>');
});
ul {
  margin: 0;
  padding: 0;
}
li {
  margin: 0 0 0.5em 0;
  padding-bottom: 0.5em;
  border-bottom: 1px solid #ccc;
  list-style-type: none;
  width: 100%;
}
.haystack {
  display: block;
  color: #999;
}
.found {
  color: #f90;
}
<ul>
  <li><span class="haystack">search?w=tot&DA=YZR&t__nil_searchbox=btn&sug=&o=&q=%EB%B9%84%EC%BD%98</span>
  </li>
  <li><span class="haystack">search?q=%EB%B9%84%EC%BD%98&go=%EC%A0…4%EB%B9%84%EC%BD%98&sc=8-2&sp=-1&sk=&cvid=f05407c5bcb9496990d2874135aee8e9</span>
  </li>
  <li><span class="haystack">where=nexearch&query=%EB%B9%84%EC%BD%98&sm=top_hty&fbm=0&ie=utf8</span>
  </li>
</ul>
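The extensibility argument made in this answer can be illustrated without the DOM. The following is only a sketch of the same split-on-`&` idea, not the answer's own code: `grabValues` is a hypothetical name, and it returns an object holding every requested parameter that was found rather than a single string.

```javascript
// Hypothetical DOM-free variant of the split-and-filter idea: collects the
// raw (percent-encoded) values of all requested parameters in one pass.
function grabValues(haystack, needles) {
  // Lower-case the needles once so the key comparison is case-insensitive.
  var wanted = needles.map(function (n) { return n.toLowerCase(); });
  var result = {};
  haystack
    // indexOf returns -1 when there is no '?', so the whole string is used.
    .substring(haystack.indexOf('?') + 1)
    .split('&')
    .forEach(function (pair) {
      var eq = pair.indexOf('=');
      if (eq === -1) return; // skip fragments without a key=value shape
      var key = pair.substring(0, eq);
      var isWanted = wanted.indexOf(key.toLowerCase()) !== -1;
      var alreadySeen = Object.prototype.hasOwnProperty.call(result, key);
      // Keep only the first occurrence of each requested key.
      if (isWanted && !alreadySeen) {
        result[key] = pair.substring(eq + 1);
      }
    });
  return result;
}

var found = grabValues(
  'search?q=%EB%B9%84%EC%BD%98&sc=8-2&sp=-1',
  ['q', 'query', 'sc']
);
console.log(found); // { q: '%EB%B9%84%EC%BD%98', sc: '8-2' }
```

Asking for another parameter then only means adding its name to the array; no regular expression has to be edited.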
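Answer 2 (score: 1)
Regular expressions are not the best way to parse these query strings. There are libraries and tools for this, but if you want to do it yourself: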
function parseQueryString(url) {
  return _.object(url.       // build an object from pairs
    split('?').pop().        // take the part after the ? (the whole string if there is no ?)
    split('&').              // split it by &
    map(function(str) {      // turn parts into 2-elt array
      return str.split('='); // broken at =
    })
  );
}
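This uses Underscore's _.object, which creates an object from an array of key/value pairs; if you don't want to use it, you can write your own equivalent in a few lines.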
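A hand-written equivalent might look like the sketch below. `pairsToObject` is a hypothetical name, and it only covers the array-of-pairs form of `_.object` that is used here.

```javascript
// Hypothetical stand-in for _.object when it is given [key, value] pairs.
function pairsToObject(pairs) {
  var result = {};
  pairs.forEach(function (pair) {
    // pair[0] is the key, pair[1] the value; later duplicates overwrite earlier ones.
    result[pair[0]] = pair[1];
  });
  return result;
}

var pairs = 'where=nexearch&query=%EB%B9%84%EC%BD%98&ie=utf8'
  .split('&')
  .map(function (str) { return str.split('='); });

var parsed = pairsToObject(pairs);
console.log(parsed.query); // '%EB%B9%84%EC%BD%98'
```

In environments that support ES2019, the built-in `Object.fromEntries(pairs)` does the same job. Either way, splitting on `=` truncates any value that itself contains an `=`.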
Now the value you're looking for is just
params = parseQueryString(url);
return params.q || params.query;
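As for existing tools, one option in current browsers and in Node.js is the standard `URLSearchParams` class. The sketch below assumes it is available as a global; `getSearchTerm` is a hypothetical helper name. Unlike the approaches above, `get()` returns the percent-decoded value and treats `+` as a space.

```javascript
// Hypothetical helper built on the standard URLSearchParams API.
function getSearchTerm(str) {
  // Drop everything up to and including the first '?', if there is one
  // (indexOf returns -1 otherwise, so slice(0) keeps the whole string).
  var queryString = str.slice(str.indexOf('?') + 1);
  var params = new URLSearchParams(queryString);
  // get() returns null for a missing key, so fall back from q to query.
  var value = params.get('q');
  return value !== null ? value : params.get('query');
}

var term = getSearchTerm('where=nexearch&query=%EB%B9%84%EC%BD%98&sm=top_hty&fbm=0&ie=utf8');
console.log(term === decodeURIComponent('%EB%B9%84%EC%BD%98')); // true
```

If the still-encoded form is required, as in the question's expected output, `encodeURIComponent` can be applied to the returned value.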