Return the sentence in which a clicked word appears

Date: 2013-04-22 15:24:13

Tags: javascript nlp

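Follow-up to an earlier question: Use Javascript to get the Sentence of a Clicked Word

I have been fumbling with this problem for a while. However, I woke up this morning and started reading: http://branch.com/b/nba-playoffs-round-1

Voila! Branch lets users select a sentence and then share it, save it, and so on, which is exactly what I want to do. It looks like they are wrapping each sentence in <span> tags.

Previously, people suggested finding each <p> tag and then splitting the text inside it into sentences. However, I am building a Chrome extension that should work on almost any website, so words may appear outside <p> tags, perhaps in <h1>-type tags or even in {{1} }.

Any insight into how Branch does this?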

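2 answers:

Answer 0 (score: 0)

They appear to wrap every sentence in a <span> and also add metadata about character offsets (data-start-char and data-end-char). From their source: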

<p><span class="highlight js-highlight-this" data-end-char="23"
data-highlight-count="0" data-start-char="0" id="highlight-86552-0">No
doubt they can lose.</span> <span class="highlight js-highlight-this"
data-end-char="132" data-highlight-count="0" data-start-char="24" id=
"highlight-86552-24">As Adi says, I don't think they will, but OKC - in
particular - still looms as a legit threat to the throne.</span>
<span class="highlight js-highlight-this" data-end-char="336"
data-highlight-count="0" data-start-char="133" id="highlight-86552-133">The
Thunder are better on both ends this year than last, have the experience of
having been there before, and you know Durant doesn't want to spend the
rest of his career playing second fiddle to LeBron.</span> <span class=
"highlight js-highlight-this" data-end-char="588" data-highlight-count="0"
data-start-char="337" id="highlight-86552-337">The problem, and I think the
reason so many assume the Heat will repeat, is that we haven't seen this
version of the Thunder (with Kevin Martin rather than James Harden in the
6th man role) in the playoffs before so the mystery factor comes into
play.</span></p>
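
Markup of that shape could be generated roughly as follows. This is only a sketch, not Branch's actual code: the function names (sentenceRanges, escapeHtml, toHighlightHtml) and the idPrefix parameter are made up for illustration, and the sentence rule is deliberately naive (it will split on abbreviations such as "e.g. "). The offsets follow what the markup above suggests: start is inclusive, end is exclusive.

```javascript
// Hypothetical helper: find sentences and record their character offsets,
// mirroring the data-start-char / data-end-char attributes shown above.
function sentenceRanges(text) {
  // Naive rule: a sentence starts at a non-space character and ends at
  // ., ! or ? followed by whitespace (or at the end of the text).
  var pattern = /\S[\s\S]*?(?:[.!?]+(?=\s|$)|$)/g;
  var ranges = [];
  var match;
  while ((match = pattern.exec(text)) !== null) {
    var sentence = match[0].replace(/\s+$/, ""); // drop trailing whitespace
    ranges.push({
      start: match.index,                  // inclusive
      end: match.index + sentence.length,  // exclusive
      text: sentence
    });
  }
  return ranges;
}

// Escape text so it can be placed safely inside HTML.
function escapeHtml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: build one <span> per sentence.
// idPrefix is an assumed naming scheme, e.g. "highlight-86552".
function toHighlightHtml(text, idPrefix) {
  return sentenceRanges(text).map(function (r) {
    return '<span class="highlight" data-start-char="' + r.start +
      '" data-end-char="' + r.end +
      '" id="' + idPrefix + "-" + r.start + '">' +
      escapeHtml(r.text) + "</span>";
  }).join(" ");
}
```

Calling toHighlightHtml(paragraphText, "highlight-86552") on the plain text of a paragraph would give one span per sentence, each carrying its own start and end offset.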

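However, another, more flexible approach is to simply use regular expression matching to extract sentences from the text of any element, whether it is a span, p, h1, and so on.

In that case, you find the sentences with a regular expression and then use JavaScript to dynamically wrap each one in a <span> element. You can then attach event listeners to those dynamically created tags to do the highlighting and whatever else you want on hover, click, etc.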
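
A minimal sketch of that idea, under a few assumptions: the names segment and wrapSentences and the "sentence" class are hypothetical, the regular expression is a naive sentence rule, and each text node is handled on its own, so a sentence that is interrupted by an inline tag (a link or <b>, say) ends up split across more than one span.

```javascript
// Split a string into alternating sentence / non-sentence pieces.
// Joining the pieces gives back the original string unchanged.
function segment(text) {
  var pattern = /\S[\s\S]*?(?:[.!?]+(?=\s|$)|$)/g;
  var pieces = [];
  var last = 0;
  var m;
  while ((m = pattern.exec(text)) !== null) {
    if (m.index > last) {
      pieces.push({ sentence: false, text: text.slice(last, m.index) });
    }
    pieces.push({ sentence: true, text: m[0] });
    last = m.index + m[0].length;
  }
  if (last < text.length) {
    pieces.push({ sentence: false, text: text.slice(last) });
  }
  return pieces;
}

// Wrap every sentence under `root` in <span class="sentence"> and call
// onSentenceClick(sentenceText) when one of those spans is clicked.
function wrapSentences(root, onSentenceClick) {
  var doc = root.ownerDocument;
  var walker = doc.createTreeWalker(root, NodeFilter.SHOW_TEXT);
  var textNodes = [];
  // Collect the text nodes first so the walk is not disturbed by edits.
  while (walker.nextNode()) {
    var parentTag = walker.currentNode.parentNode.nodeName;
    if (!/^(SCRIPT|STYLE|TEXTAREA)$/.test(parentTag)) {
      textNodes.push(walker.currentNode);
    }
  }
  textNodes.forEach(function (node) {
    var frag = doc.createDocumentFragment();
    segment(node.nodeValue).forEach(function (piece) {
      if (piece.sentence) {
        var span = doc.createElement("span");
        span.className = "sentence";
        span.textContent = piece.text;
        frag.appendChild(span);
      } else {
        frag.appendChild(doc.createTextNode(piece.text));
      }
    });
    node.parentNode.replaceChild(frag, node);
  });
  // One delegated listener instead of one listener per span.
  root.addEventListener("click", function (event) {
    var target = event.target;
    if (target.classList && target.classList.contains("sentence")) {
      onSentenceClick(target.textContent);
    }
  });
}

// Example use in a page or content script (skipped outside a browser).
if (typeof document !== "undefined") {
  wrapSentences(document.body, function (sentence) {
    console.log("Clicked sentence:", sentence);
  });
}
```

Because it walks text nodes rather than looking for particular tags, this works the same whether the text sits in a p, an h1, or anything else.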

Answer 1 (score: 0)

I think you could do something like this. It is not exactly what you are after, but it may give you some further ideas.

<div>In cryptography, a keyed-hash message authentication code (HMAC) is a specific construction for calculating a message authentication code (MAC) involving a cryptographic hash function in combination with a secret cryptographic key. As with any MAC, it may be used to simultaneously verify both the data integrity and the authentication of a message. Any cryptographic hash function, such as MD5 or SHA-1, may be used in the calculation of an HMAC; the resulting MAC algorithm is termed HMAC-MD5 or HMAC-SHA1 accordingly. The cryptographic strength of the HMAC depends upon the cryptographic strength of the underlying hash function, the size of its hash output, and on the size and quality of the key.</div>
<button id="get">Get Selected</button>

function getText() {
    var selectedText = "";

    // Prefer the standard Selection API; toString() yields the selected text.
    if (typeof window.getSelection === "function") {
        selectedText = window.getSelection().toString();
    } else if (typeof document.getSelection === "function") {
        selectedText = document.getSelection().toString();
    } else if (document.selection && document.selection.createRange) {
        // Legacy Internet Explorer: check that the method exists, then call it.
        selectedText = document.selection.createRange().text;
    } else {
        alert("No method to get selected text");
    }

    // Fall back to a selection inside a focused <input> or <textarea>.
    if (!selectedText) {
        var active = document.activeElement;
        // Test the type so that a selection starting at index 0 still counts.
        if (active && typeof active.selectionStart === "number") {
            selectedText = active.value.substring(
                active.selectionStart, active.selectionEnd);
        }
    }

    alert(selectedText);
}

document.getElementById("get").addEventListener("click", getText, false);

on jsfiddle

You can also see a further answer where I expanded on this idea, here on SO

The author asked it as a separate question, but here is the other jsfiddle

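window.getSelection

Summary

Returns a selection object representing the range of text selected by the user.

Specification

DOM Level 0. Not part of any standard.

Expected to be specified in the new DOM Range specification.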
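
The selection object can also lead you to the sentence around a click, which is closer to the original question. The following is a rough sketch, assuming that after a click on ordinary text the browser leaves a collapsed selection (a caret) at the click point, so that anchorNode is the clicked text node and anchorOffset is the position inside it. The names sentenceAt and sentenceFromSelection are hypothetical, the sentence rule is naive, and only the clicked text node is examined, so a sentence that continues past an inline tag is cut short.

```javascript
// Return the sentence in `text` that contains character position `offset`,
// or "" if there is none.
function sentenceAt(text, offset) {
  // Naive rule: a sentence ends at ., ! or ? followed by whitespace
  // (or at the end of the text).
  var pattern = /\S[\s\S]*?(?:[.!?]+(?=\s|$)|$)/g;
  var m;
  while ((m = pattern.exec(text)) !== null) {
    if (offset >= m.index && offset <= m.index + m[0].length) {
      return m[0].trim();
    }
  }
  return "";
}

// Read the caret position from the current selection and look up
// the sentence around it.
function sentenceFromSelection() {
  var sel = window.getSelection();
  // nodeType 3 is a text node.
  if (!sel || !sel.anchorNode || sel.anchorNode.nodeType !== 3) {
    return "";
  }
  return sentenceAt(sel.anchorNode.nodeValue, sel.anchorOffset);
}

// Example use (skipped outside a browser).
if (typeof document !== "undefined") {
  document.addEventListener("click", function () {
    var sentence = sentenceFromSelection();
    if (sentence) {
      console.log("Sentence of clicked word:", sentence);
    }
  }, false);
}
```

This avoids modifying the page at all, at the cost of relying on how the browser places the caret on click.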

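There is also a library called Rangy that is supposed to handle this kind of thing across browsers. I have never tried it, but you may want to take a look.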

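
A cross-browser JavaScript range and selection library. It provides a simple standards-based API for performing common DOM Range and Selection tasks in all major browsers, abstracting away the wildly different implementations of this functionality between Internet Explorer up to and including version 8 and DOM-compliant browsers.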