I'm extracting text from a server. The data I extract isn't organized for further use. The extracted text looks like this:
>>[Extracted] id: 194805284, got 55 points from jones (252906152669) date: 15/04/19 08:44:40 you have 30 points remaining
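I don't want all of this text; I only want the id, points, number, and date.
Note: I may extract more than one message at a time.
So, to extract the id, points, number, and date, I wrapped every word in a span tag and then used this code: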
var getData = {
//gets the id, points, date and number respectively
number1 : $('span:contains("id:")').next().text(),
amount : $('span:contains("got")').next().text(),
time : $('span:contains("date:")').next().text(),
number : $('span:contains("date:")').prev().text()
}
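I use this code because I may extract more than one message automatically, and every extracted message contains the same words except for the id, points, date, and number.
The code above extracted the data I wanted, but this time there are two [Extracted] messages; see below.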
HTML:
<p>[Extracted] id: 194805284, got 55 points from jones (252906152669)
date: 15/04/19 08:44:40 you have 30 points remanining [Extracted] id: 193537533, got 3 points from Micheal (907794804)
date: 14/04/19 10:15:32, you have 100 points remaining</p>
<div class="processed-data">
</div>
CSS:
span {
border: 1px solid red;
}
JS:
// wrap every word with <span> tag
var words = $("p").text().split(" ");
$("p").empty();
$.each(words, function(i, v) {
$("p").append($("<span>").text(v));
});
//extract the id, points, time and number respectively
var getData = {
number1: $('span:contains("id:")').next().text(),
amount: $('span:contains("got")').next().text(),
//amount : $('span:contains("got")').next().text().substring(1),
time: $('span:contains("date:")').next().text(),
number: $('span:contains("date:")').prev().text()
}
// Output the extracted data to .processed-data div
$('.processed-data').append("thisTime = { [id: " + getData.number1 + " amount: " + getData.amount + ", time: " + getData.time + " number: " + getData.number + "]}'");
Here is a JSFiddle.
Output:
thisTime = {[id: 194805284,193537533, amount: 553, time: 15/04/1914/04/19 number: (252906152669) (907794804) ]}'
What I expect: each [Extracted] message should get its own array, whether by using a loop or some other approach.
Example:
What I get now:
thisTime = {
[id: 194805284,193537533, // All the ids are stored in 1 array data
amount: 553, // All the points are stored in 1 array data e.t.c
time: 15/04/1914/04/19
number: (252906152669) (907794804)]
}
What I want to get:
thisTime = {
[id: 194805284,
amount: 55,
time: 15/04/19
number: (252906152669)],
[id:193537533,
amount: 3,
time: 14/04/19
number: (907794804)]
}
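I just want each extracted message to have its own array.
Answer 0 (score: 1)
You can solve this easily with a regular expression (regex). Is there a particular reason for wrapping every word in a span?
The following regex should match all of the tokens in the string: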
id:\s+(\d+),\s+got\s+(\d+)\s+points\s+from\s+.+?\s+\((\d+)\)\s+date:\s+(\d+)\/(\d+)\/(\d+)\s+(\d+):(\d+):(\d+)
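I use \s+ here instead of a literal space because the spacing in the template above seems inconsistent, and to be safe I like to use \s+ for any amount of whitespace.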
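As a small illustration of that point, the sketch below compares a pattern with literal spaces against one using \s+. The two sample strings are made up for the comparison; the second one imitates the line break and uneven spacing seen in the question's HTML.

```javascript
// Hypothetical samples: the same fragment, once with single spaces and once
// with extra spaces, a newline, and a tab.
const samples = [
  "id: 194805284, got 55 points",
  "id:   194805284,\ngot\t55 points",
];

const literalSpace = /id: (\d+), got (\d+) points/;
const anyWhitespace = /id:\s+(\d+),\s+got\s+(\d+)\s+points/;

const literalResults = samples.map((s) => literalSpace.test(s));
const whitespaceResults = samples.map((s) => anyWhitespace.test(s));

console.log(literalResults);    // [ true, false ]
console.log(whitespaceResults); // [ true, true ]
```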
You can extract a single message like this:
const regex = /id:\s+(\d+),\s+got\s+(\d+)\s+points\s+from\s+.+?\s+\((\d+)\)\s+date:\s+(\d+)\/(\d+)\/(\d+)\s+(\d+):(\d+):(\d+)/; // construct the regex literal
const message = "[Extracted] id: 194805284, got 55 points from jones (252906152669) date: 15/04/19 08:44:40 you have 30 points remaining"; // any string matching your "extracted" template
const match = message.match(regex); // now your match contains all the data
const [fullMatch, idString, pointString, numberString, dayString, monthString, yearString, hourString, minuteString, secondString] = match; // you don't have to destructure, but this is the order of the nine capturing groups.
You should also be able to match multiple messages by doing the following:
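// exec only advances through the string when the regex has the "g" flag;
// without it, this loop would keep matching the first message forever.
const globalRegex = new RegExp(regex.source, "g");
let nextMatch;
while ((nextMatch = globalRegex.exec(message)) !== null) {
// nextMatch can now be handled the same way as above. Alternatively, you could push each match onto a list here.
}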
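As a rough sketch of the "push the matches to a list" idea, the snippet below uses `String.prototype.matchAll` (ES2020) instead of a manual `exec` loop and builds one object per message. The names `text`, `pattern`, and `records` are illustrative, and the date and time are each captured as one group here rather than split into parts, which is a simplification of the regex above.

```javascript
const text =
  "[Extracted] id: 194805284, got 55 points from jones (252906152669) " +
  "date: 15/04/19 08:44:40 you have 30 points remaining " +
  "[Extracted] id: 193537533, got 3 points from Micheal (907794804) " +
  "date: 14/04/19 10:15:32, you have 100 points remaining";

// matchAll requires the "g" flag; it yields one match per message.
const pattern =
  /id:\s+(\d+),\s+got\s+(\d+)\s+points\s+from\s+.+?\s+\((\d+)\)\s+date:\s+(\d+\/\d+\/\d+)\s+(\d+:\d+:\d+)/g;

// One plain object per [Extracted] message.
const records = Array.from(text.matchAll(pattern), (m) => ({
  id: m[1],
  amount: Number(m[2]),
  number: m[3],
  date: m[4],
  time: m[5],
}));

console.log(records);
```

Keeping `id` and `number` as strings avoids any surprises with long digit sequences; only `amount` is converted to a number in this sketch.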
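Answer 1 (score: 1)
I suggest solving this with a regular expression; I think it is a better fit than the jQuery approach you are using.
Here is a possible regex solution: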
var text = '[Extracted] id: 194805284, got 55 points from jones (252906152669) date: 15/04/19 08:44:40 you have 30 points remanining [Extracted] id: 193537533, got 3 points from Micheal (907794804) date: 14/04/19 10:15:32, you have 100 points remaining';
var textArray = text.split('[Extracted]');
var regularExpression = /id:\s+([0-9]+).+got\s+([0-9]+).+[^\(]+\(([0-9]+)\)\s+date:\s+([0-9\/\s:]+)/i;
var output = [];
var item;
for(var i = 1; i < textArray.length; i++){
item = textArray[i].match(regularExpression);
output.push({
id: item[1].trim(),
amount: item[2].trim(),
time: item[4].trim(),
number: item[3].trim()
});
}
console.log(output);
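One thing to watch with the loop above: `match` returns `null` for a segment that does not fit the pattern, so `item[1]` would throw. A minimal sketch of a more defensive variant follows; `parseExtracted` is a hypothetical helper name, its regex is a simplified one that captures only the date part, and the second segment of the sample input is made up to show a non-matching message being skipped.

```javascript
// Hypothetical helper: split on the "[Extracted]" marker, then keep only
// the segments that actually match the expected shape.
function parseExtracted(text) {
  const pattern =
    /id:\s+(\d+),\s+got\s+(\d+)\s+points.*?\((\d+)\)\s+date:\s+(\d+\/\d+\/\d+)/;
  return text
    .split("[Extracted]")
    .map((segment) => segment.match(pattern))
    .filter((m) => m !== null) // drops the empty leading segment and malformed ones
    .map((m) => ({ id: m[1], amount: m[2], time: m[4], number: m[3] }));
}

const parsed = parseExtracted(
  "[Extracted] id: 194805284, got 55 points from jones (252906152669) " +
  "date: 15/04/19 08:44:40 you have 30 points remaining " +
  "[Extracted] a made-up segment without the expected fields"
);

console.log(parsed);
```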
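Answer 2 (score: 1)
Your problem is in getData: each selector matches the spans of every message at once, so .text() returns the values of all messages concatenated together. I suggest splitting the string on [Extracted] first and then on spaces. After that, you can select the spans grouped by sentence and filter them to build an array containing one or more objects.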
var sentences = $("p").text().split("\[Extracted\]").slice(1);
$("p").empty();
$.each(sentences, function(i, v) {
var words = ['Extracted'].concat(v.trim().split(/ +/));
$.each(words, function(idx, word) {
$("p").append($("<span/>", {text: word.trim()}));
});
});
var result = {thisTime: $("p span:contains(Extracted)").map(function(idx, txt) {
var x = $(this).nextUntil('span:contains(Extracted)');
return {id: x.filter('span:contains("id:")').next().text(),
amount: x.filter('span:contains("got")').next().text(),
time: x.filter('span:contains("date:")').next().text(),
number: x.filter('span:contains("date:")').prev().text()};
}).get()};
$('.processed-data').append(JSON.stringify(result));
span {
border: 1px solid red;
}
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<p>[Extracted] id: 194805284, got 55 points from jones (252906152669)
date: 15/04/19 08:44:40 you have 30 points remanining [Extracted] id: 193537533, got 3 points from Micheal (907794804)
date: 14/04/19 10:15:32, you have 100 points remaining</p>
<div class="processed-data">
</div>