In a document like the one above, I can get all paragraphs with the following code:
var paras = body.getParagraphs();
Note that the code above returns not only the top-level paragraphs but also every child paragraph nested inside ListItem, Table, and so on.
How can I do the same within a selection? The following code returns only the top-level elements.
const selection = DocumentApp.getActiveDocument().getSelection();
var rangeElements = selection.getRangeElements();
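For example, the table above contains 9 non-empty paragraphs. If I select them, I want to process them one by one.
What I am trying to achieve is something like translating the text in the selection while preserving as much of the formatting, tables, list items, etc. as possible.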
Answer 0 (score: 2)
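.getRangeElements() returns an array of RangeElements. A RangeElement is a wrapper object that helps us deal with partial selections. We can call .getElement() on each item in this array to get an Element object, a very generic object that can represent almost any part of a Google Doc. Elements have a .getType() method, which returns an ElementType enum; there are a lot of them!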
Let's use what we have learned so far to see which types can appear in a Google Doc (using a document I created that is similar to yours as an example):
function selectionHasWhichTypes() {
  var doc = DocumentApp.getActiveDocument();
  var selection = doc.getSelection();
  var rangeElems = selection.getRangeElements();
  rangeElems.forEach(function(rangeElem){
    var elem = rangeElem.getElement(); //Unwrap the RangeElement to get the underlying Element.
    Logger.log(elem.getType());
  });
}
//Logger OUTPUT:
PARAGRAPH
PARAGRAPH
PARAGRAPH
PARAGRAPH
PARAGRAPH
LIST_ITEM
LIST_ITEM
LIST_ITEM
PARAGRAPH
PARAGRAPH
PARAGRAPH
TABLE
PARAGRAPH
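Aha! It looks like we only need to deal with the PARAGRAPH, LIST_ITEM, and TABLE ElementTypes for now, but we also have to keep their children in mind (we will find out that these are 3 of the 5 types that can have children). This sounds like a job for a recursive function that keeps digging into child elements until we have found and processed all of them.
So let's try it. The next part may look confusing, but in essence it takes an element, checks whether it has child elements, then looks at those children to see whether they have children, and so on. Along the way we also check whether we run into any new element types...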
function selectionHasWhichTypes() {
  var doc = DocumentApp.getActiveDocument();
  var selection = doc.getSelection();
  var rangeElems = selection.getRangeElements();
  rangeElems.forEach(function(rangeElem){
    var elem = rangeElem.getElement();
    elemsHaveWhatChildElems(elem, elem.getType());
  });
}
function elemsHaveWhatChildElems(elem, typeChain){
  var elemType = elem.getType();
  if(elemType == "TABLE" || elemType == "LIST_ITEM" || elemType == "PARAGRAPH"){ //Let's see if the element is one of our basic 3. If so, it could have children.
    var numChildren = elem.getNumChildren(); //How many children are there?
    if(numChildren > 0){
      for(var i = 0; i < numChildren; i++){ //Let's go through them.
        var child = elem.getChild(i);
        elemsHaveWhatChildElems(child, typeChain + "." + child.getType()); //Recursion step to look for more children.
      }
    }else{
      Logger.log(typeChain); //Let's log the chain of parent to child elements.
    }
  }else{
    Logger.log("*" + typeChain); //Let's mark the new element type chains we have not seen.
  }
}
//Logger OUTPUT:
*PARAGRAPH.TEXT
PARAGRAPH
*PARAGRAPH.HORIZONTAL_RULE
PARAGRAPH
*PARAGRAPH.TEXT
*LIST_ITEM.TEXT
*LIST_ITEM.TEXT
*LIST_ITEM.TEXT
PARAGRAPH
*PARAGRAPH.TEXT
PARAGRAPH
*TABLE.TABLE_ROW
*TABLE.TABLE_ROW
PARAGRAPH
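OK, so each line of the log is a chain of Elements and their children. We have some new ElementTypes (HORIZONTAL_RULE, TABLE_ROW, and TEXT). If a chain consists of just a Paragraph with no children, it is logged as 'PARAGRAPH'. We can ignore it, because it is a blank line. We can also ignore HORIZONTAL_RULE, since it obviously contains no text.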
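Once we have reached a TEXT element, as we already do under LIST_ITEM and PARAGRAPH, we can perform our operation on it (for the OP, that means translating). However, we still have to deal with the TableRow object (logged as: TABLE.TABLE_ROW). It behaves like our main 3 elements, so it can go through the same branch: if(elemType == "TABLE" || elemType == "LIST_ITEM" || elemType == "PARAGRAPH") changes to if(elemType == "TABLE" || elemType == "LIST_ITEM" || elemType == "PARAGRAPH" || elemType == "TABLE_ROW").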
This gives us another new element in the chain, TableCell (logged as: TABLE.TABLE_ROW.TABLE_CELL), which we can again add to the if statement, making it: if(elemType == "TABLE" || elemType == "LIST_ITEM" || elemType == "PARAGRAPH" || elemType == "TABLE_ROW" || elemType == "TABLE_CELL")
Time to see which element types the table contains.
function selectionHasWhichtypeChains() {
  var doc = DocumentApp.getActiveDocument();
  var selection = doc.getSelection();
  var rangeElems = selection.getRangeElements();
  rangeElems.forEach(function(rangeElem){
    var elem = rangeElem.getElement();
    elemsHaveWhatChildElems(elem, elem.getType());
  });
}
function elemsHaveWhatChildElems(elem, typeChain){
  var elemType = elem.getType();
  if(elemType == "TABLE" || elemType == "LIST_ITEM" || elemType == "PARAGRAPH" || elemType == "TABLE_ROW" || elemType == "TABLE_CELL"){ //Let's see if the element is one of our basic 5. If so, it could have children.
    var numChildren = elem.getNumChildren(); //How many children are there?
    if(numChildren > 0){
      for(var i = 0; i < numChildren; i++){ //Let's go through them.
        var child = elem.getChild(i);
        elemsHaveWhatChildElems(child, typeChain + "." + child.getType()); //Recursion step to look for more children.
      }
    }else{
      Logger.log(typeChain); //Let's log the chain of parent to child elements.
    }
  }else{
    Logger.log("*" + typeChain); //Let's mark the new element type chains we have not seen.
  }
}
//Logger OUTPUT:
*PARAGRAPH.TEXT
PARAGRAPH
*PARAGRAPH.HORIZONTAL_RULE
PARAGRAPH
*PARAGRAPH.TEXT
*LIST_ITEM.TEXT
*LIST_ITEM.TEXT
*LIST_ITEM.TEXT
PARAGRAPH
*PARAGRAPH.TEXT
PARAGRAPH
*TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
*TABLE.TABLE_ROW.TABLE_CELL.TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
*TABLE.TABLE_ROW.TABLE_CELL.TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
*TABLE.TABLE_ROW.TABLE_CELL.TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
*TABLE.TABLE_ROW.TABLE_CELL.TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH
*TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
*TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.HORIZONTAL_RULE
*TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
*TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
PARAGRAPH
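Awesome! We have drilled all the way down through every parent element and reached either a Text element or a blank paragraph. From here we can modify our code slightly to add the operation we want to perform while keeping the document's structure intact: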
function myFunction() {
  var doc = DocumentApp.getActiveDocument();
  var selection = doc.getSelection();
  var rangeElems = selection.getRangeElements(); //Get the top-level elements of the selection.
  rangeElems.forEach(function(rangeElem){ //Let's run through each to find ALL of their children.
    var elem = rangeElem.getElement(); //We have a RangeElement. Let's get the full Element.
    getNestedTextElements(elem, elem.getType()); //Time to go down the rabbit hole.
  });
}
function getNestedTextElements(elem, typeChain){
  var elemType = elem.getType();
  if(elemType == "TABLE" || elemType == "LIST_ITEM" || elemType == "PARAGRAPH" || elemType == "TABLE_ROW" || elemType == "TABLE_CELL"){ //Let's see if the element is one of our basic 5. If so, it could have children.
    var numChildren = elem.getNumChildren(); //How many children are there?
    if(numChildren > 0){
      for(var i = 0; i < numChildren; i++){ //Let's go through them.
        var child = elem.getChild(i);
        getNestedTextElements(child, typeChain + "." + child.getType()); //Recursion step to look for more children.
      }
    }
  }else if(elemType == "TEXT"){
    //THIS IS WHERE WE CAN PERFORM OUR OPERATIONS ON THE TEXT ELEMENT
    var text = elem.getText();
  }else{
    Logger.log("*" + typeChain); //Let's log the new element types we don't deal with yet, for future-proofing.
  }
}
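BOOM! Done. I know this is a long post, but I broke the solution into steps to help new Apps Script coders understand how a selection (and, I suppose, a document body) is structured, and how to modify it when that structure is quite complex (many nested elements). I really hope this helps. If anyone sees something that can be improved, please let me know.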
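One possible variation on the final function above, offered as a sketch rather than as the answer's own method: keep the five container types in a list instead of a long chain of || comparisons, collect the Text elements into an array, and pass the operation in as a callback. The names CONTAINER_TYPES, collectTextElements, and transformTexts are hypothetical. The sketch also assumes, as the answer's code already does, that an ElementType converts to its name as a string.

```javascript
// The five types the walkthrough above found to have children.
var CONTAINER_TYPES = ["TABLE", "TABLE_ROW", "TABLE_CELL", "LIST_ITEM", "PARAGRAPH"];

/**
 * Hypothetical helper: depth-first walk that appends every TEXT element
 * found at or below `elem` to the `found` array and returns that array.
 */
function collectTextElements(elem, found) {
  found = found || [];
  // Assumption: the ElementType enum stringifies to its name (e.g. "TABLE").
  var type = String(elem.getType());
  if (type === "TEXT") {
    found.push(elem);
  } else if (CONTAINER_TYPES.indexOf(type) !== -1) {
    for (var i = 0; i < elem.getNumChildren(); i++) {
      collectTextElements(elem.getChild(i), found);
    }
  }
  // Any other type (e.g. HORIZONTAL_RULE) is skipped silently.
  return found;
}

/**
 * Hypothetical helper: applies `transform` (string -> string) to every
 * non-blank Text element under the given top-level elements and returns
 * how many Text elements were rewritten.
 */
function transformTexts(elements, transform) {
  var count = 0;
  elements.forEach(function (elem) {
    collectTextElements(elem).forEach(function (textElem) {
      var original = textElem.getText();
      if (original.trim() !== "") {
        textElem.setText(transform(original));
        count++;
      }
    });
  });
  return count;
}
```

In a script bound to a document, the elements would come from calling getElement() on each item of selection.getRangeElements(), and the callback could wrap something like LanguageApp.translate(text, "en", "es"). Because setText() replaces the whole string of a Text element, formatting that varies within a single element may not survive, so that part is worth checking on a copy of the document first.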
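A note for the OP: this does not necessarily handle a partial selection of an Element, but that can easily be fixed by slightly modifying the first function to check isPartial() on the RangeElement.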
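The partial-selection check mentioned above could look roughly like the following sketch. It assumes that a partial RangeElement wraps a Text element, and it uses the RangeElement methods isPartial(), getStartOffset(), and getEndOffsetInclusive(). The helper name getSelectedPortion is hypothetical.

```javascript
/**
 * Hypothetical helper: for a partially selected Text element, returns the
 * selected character range and substring. Returns null when the whole
 * element is selected (or the element is not a Text element), so the
 * caller can fall back to the recursive walk.
 */
function getSelectedPortion(rangeElem) {
  if (!rangeElem.isPartial()) {
    return null; // Whole element selected: nothing to slice.
  }
  var elem = rangeElem.getElement();
  // Assumption: the ElementType enum stringifies to its name.
  if (String(elem.getType()) !== "TEXT") {
    return null;
  }
  var start = rangeElem.getStartOffset();
  var end = rangeElem.getEndOffsetInclusive(); // Inclusive, hence the +1 below.
  return {
    start: start,
    end: end,
    text: elem.getText().substring(start, end + 1)
  };
}
```

In the first function, each RangeElement could be passed through this helper before recursing: if it returns an object, only the characters from start to end are operated on; if it returns null, the element is handled as a whole.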