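The question is simple, but one additional requirement turned it into a major problem for me. I need all the highlighted "words" and "phrases" in a Word file. I wrote the following code: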
using System;
using System.Collections.Generic;
using System.Windows.Forms;
using Word = Microsoft.Office.Interop.Word;

private void button1_Click(object sender, EventArgs e)
{
    try
    {
        Word.ApplicationClass wordObject = new Word.ApplicationClass();
        wordObject.Visible = false;
        object file = "D:\\mywordfile.docx";
        object nullobject = System.Reflection.Missing.Value;
        Word.Document thisDoc = wordObject.Documents.Open(ref file, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject);
        List<string> wordHighlights = new List<string>();
        // Let myRange be some Range which has my text under consideration
        int prevStart = 0;
        int prevEnd = 0;
        int thisStart = 0;
        int thisEnd = 0;
        string tempStr = "";
        foreach (Word.Range cellWordRange in myRange.Words)
        {
            // Compare against the enum value instead of its string name
            if (cellWordRange.HighlightColorIndex == Word.WdColorIndex.wdNoHighlight)
            {
                continue;
            }

            thisStart = cellWordRange.Start;
            thisEnd = cellWordRange.End;
            string cellWordText = cellWordRange.Text.Trim();
            if (cellWordText.Length >= 1) // valid word length, non-whitespace
            {
                if (thisStart == prevEnd) // this word is contiguously highlighted with the previous highlighted word
                {
                    tempStr = String.Concat(tempStr, " " + cellWordText); // concatenate with the previous contiguously highlighted word
                }
                else
                {
                    if (tempStr.Length > 0) // something was concatenated in previous iterations
                    {
                        wordHighlights.Add(tempStr);
                    }
                    tempStr = cellWordText;
                }
            }
            prevStart = thisStart;
            prevEnd = thisEnd;
        }
        // Store the last phrase, which the loop above never flushes
        if (tempStr.Length > 0)
        {
            wordHighlights.Add(tempStr);
        }
        foreach (string highlightedString in wordHighlights)
        {
            MessageBox.Show(highlightedString);
        }
    }
    catch (Exception j)
    {
        MessageBox.Show(j.Message);
    }
}
Now consider the following text:
Le thé vert a un rôle dans la diminution du cholestérol, la combustion des graisses, la prévention du diabète et les AVC, et conjurer la démence.
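Now suppose someone highlights "du cholestérol". My code obviously picks up two words, "du" and "cholestérol". How can I get a contiguously highlighted area returned as a single item? I mean that "du cholestérol" should come back as one entry in the List. Is there some logic by which we scan the document char by char, marking the point where highlighting starts as the start of a selection and the point where it ends as the end of that selection?
P.S.: If there is a library with the required functionality in any other language, please let me know, since the scenario is not language specific. I just need to get the desired result somehow.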
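EDIT: Following Oliver Hanappi's suggestion, I modified the code to use Start and End. The problem remains, though: if two such highlighted phrases are separated only by whitespace, the program treats them as a single phrase, simply because it reads Words and not the whitespace between them. Does something need to change around if (thisStart == prevEnd)?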
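The char-by-char idea raised above can be sketched independently of Word. What follows is a minimal illustration and not Interop code: it assumes the highlight state of every character has already been collected into a bool array (for instance by inspecting each item of Range.Characters, which is only an assumption about how the flags would be obtained and is likely to be slow on large documents). The names HighlightRuns and Extract are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class HighlightRuns
{
    // Returns every maximal run of consecutive highlighted characters,
    // trimmed of surrounding whitespace. highlighted[i] is assumed to
    // describe text[i].
    public static List<string> Extract(string text, bool[] highlighted)
    {
        if (text.Length != highlighted.Length)
        {
            throw new ArgumentException("One flag per character is required.");
        }

        var runs = new List<string>();
        int runStart = -1; // -1 means "not inside a highlighted run"

        // Iterate one position past the end so a run that reaches the
        // last character is still closed and stored.
        for (int i = 0; i <= text.Length; i++)
        {
            bool on = i < text.Length && highlighted[i];
            if (on && runStart < 0)
            {
                runStart = i; // highlighting begins here
            }
            else if (!on && runStart >= 0)
            {
                string run = text.Substring(runStart, i - runStart).Trim();
                if (run.Length > 0)
                {
                    runs.Add(run);
                }
                runStart = -1; // highlighting ended
            }
        }
        return runs;
    }
}
```

Because the decision is made per character, an unhighlighted space between two highlighted phrases closes the first run, while a highlighted space inside "du cholestérol" keeps the phrase in one piece.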
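Answer 0 (score: 2)
You can do this more efficiently with Find, which searches faster and selects all of the contiguous text that matches. See the reference here: http://msdn.microsoft.com/en-us/library/office/bb258967%28v=office.12%29.aspx
Here is an example in VBA that prints all highlighted text: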
Sub TestFind()
    Dim myRange As Range
    Set myRange = ActiveDocument.Content ' search entire document
    With myRange.Find
        .Highlight = True
        Do While .Execute = True ' loop while highlighted text is found
            Debug.Print myRange.Text ' myRange is changed to contain the found text
        Loop
    End With
End Sub
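Hope this helps you understand it better.
Answer 1 (score: 1)
You can look at the Start and End properties of the ranges and check whether the end of the first range equals the start of the second.
As an alternative, you can move the range by one word (see WdUnits.wdWord) and then check whether the moved start and end equal the start and end of the second word.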
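The Start/End comparison can be sketched as a pure function over spans that have already been extracted. This is only an illustration under one loud assumption: every span covers highlighted characters only, so an unhighlighted space leaves a gap between one span's End and the next span's Start. Items of Range.Words typically include their trailing whitespace, so they do not meet that assumption on their own. Span and SpanMerger are hypothetical names, not part of the Word object model.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for the Start, End and Text of a highlighted range.
public sealed class Span
{
    public int Start { get; }
    public int End { get; }
    public string Text { get; }

    public Span(int start, int end, string text)
    {
        Start = start;
        End = end;
        Text = text;
    }
}

public static class SpanMerger
{
    // Joins spans whose Start equals the previous span's End; any gap
    // between two spans starts a new phrase.
    public static List<string> Merge(IEnumerable<Span> spans)
    {
        var result = new List<string>();
        var current = new StringBuilder();
        int prevEnd = -1;

        foreach (Span s in spans)
        {
            if (current.Length > 0 && s.Start != prevEnd)
            {
                Flush(current, result); // gap found: close the phrase
            }
            current.Append(s.Text);
            prevEnd = s.End;
        }
        Flush(current, result); // do not lose the final phrase
        return result;
    }

    private static void Flush(StringBuilder current, List<string> result)
    {
        string phrase = current.ToString().Trim();
        if (phrase.Length > 0)
        {
            result.Add(phrase);
        }
        current.Clear();
    }
}
```

The raw text is appended without inserting a separator, so a highlighted space that belongs to a span is preserved and nothing is invented between two adjacent spans.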
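Answer 2 (score: 0)
grahamj42's answer is fine; I have translated it to C#. If you want to find matches in the whole document, use:
Word.Range content = thisDoc.Content;
Keep in mind, however, that this is only the main story range. If you want to match text in, for example, the footnotes, you need:
Word.StoryRanges stories = thisDoc.StoryRanges;
Word.Range footnoteRange = stories[Word.WdStoryType.wdFootnotesStory];
My code: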
// Requires: using System.Runtime.InteropServices; (for Marshal)
// "range" is the Word.Range to search, e.g. content or footnoteRange above.
Word.Find find = null;
Word.Range duplicate = null;
try
{
    duplicate = range.Duplicate; // search on a copy so "range" itself is not moved
    find = duplicate.Find;
    find.Highlight = 1; // match highlighted text only
    object str = "";    // empty search text: match on formatting alone
    object missing = System.Type.Missing;
    object objTrue = true; // passed as the Format argument
    object replace = Word.WdReplace.wdReplaceNone;
    bool result = find.Execute(ref str, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref objTrue, ref str, ref replace, ref missing, ref missing, ref missing, ref missing);
    while (result)
    {
        // code to store range text
        // use duplicate.Text property; duplicate now covers the found text
        result = find.Execute(ref str, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref objTrue, ref str, ref replace, ref missing, ref missing, ref missing, ref missing);
    }
}
finally
{
    // Release the COM wrappers explicitly
    if (find != null) Marshal.ReleaseComObject(find);
    if (duplicate != null) Marshal.ReleaseComObject(duplicate);
}
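Answer 3 (score: -1)
I started with Oliver's logic and things looked fine, but testing showed that this approach does not take whitespace into account, so highlighted phrases separated only by whitespace were not split apart. I then took the VB code approach provided by grahamj42, added it as a class library, and referenced it from my C# Windows Forms project.
My C# Windows Forms project:
using Word = Microsoft.Office.Interop.Word;
I then changed the try block to: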
    Word.ApplicationClass wordObject = new Word.ApplicationClass();
    wordObject.Visible = false;
    object file = "D:\\mywordfile.docx";
    object nullobject = System.Reflection.Missing.Value;
    Word.Document thisDoc = wordObject.Documents.Open(ref file, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject);
    List<string> wordHighlights = new List<string>();
    // Let myRange be some Range, which has already been selected programmatically here
    WordMacroClasses.Highlighting macroObj = new WordMacroClasses.Highlighting();
    List<string> hiWords = macroObj.HighlightRange(myRange, myRange.End);
    foreach (string hitext in hiWords)
    {
        wordHighlights.Add(hitext);
    }
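Here is the Range.Find code in the VB class library. It simply takes a Range and that range's end position (the call above passes myRange.End) and returns a List(Of String):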
Public Class Highlighting
    Public Function HighlightRange(ByVal myRange As Microsoft.Office.Interop.Word.Range, ByVal rangeLimit As Integer) As List(Of String)
        Dim Highlights As New List(Of String)
        With myRange.Find
            .Highlight = True
            Do While .Execute = True ' loop while highlighted text is found
                ' myRange now covers the found text; ignore matches past the original end
                If (myRange.Start < rangeLimit) Then Highlights.Add(myRange.Text)
            Loop
        End With
        Return Highlights
    End Function
End Class