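I have rows that are about 100 columns wide, and I only need to write about 8,000 of them. When I write 3,000 of these rows (in batches of 500), each batch of 500 consistently takes about 2-3 seconds.
However, when I try to write a larger dataset of 8,000 rows of similar data (with more columns), it performs fine for roughly the first 3,000 rows (about 3-4 seconds per 500 rows), but somewhere around rows 2,500-3,000 it gets slower and slower and Excel slows to a crawl. For example: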
write rows address: Sheet1!A3:DC502
batch write time: 3.0766400244386167 seconds
write rows address: Sheet1!A503:DC1002
batch write time: 3.3348399202363796 seconds
write rows address: Sheet1!A1003:DC1502
batch write time: 3.7307800745354034 seconds
write rows address: Sheet1!A1503:DC2002
batch write time: 4.149179874582915 seconds
write rows address: Sheet1!A2003:DC2502
batch write time: 3.8166401331085944 seconds
write rows address: Sheet1!A2503:DC3002
batch write time: 7.215600102149649 seconds
write rows address: Sheet1!A3003:DC3502
batch write time: 31.93173993128445 seconds
write rows address: Sheet1!A3503:DC4002
batch write time: 95.68281983804563 seconds
write rows address: Sheet1!A4003:DC4502
batch write time: 148.84947986377625 seconds
write rows address: Sheet1!A4503:DC5002
batch write time: 203.41412001861877 seconds
write rows address: Sheet1!A5003:DC5502
batch write time: 270.2974798251381 seconds
write rows address: Sheet1!A5503:DC6002
...
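The first 30 or so columns contain formulas, and those cells have conditional formatting. The rest is just data written to plain white cells. I am simply handing the data to Excel through range.values, and that is the step that takes so long. What can I do to get consistent performance?
Here is my code: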
async writeRows(data, formulas, sheetName, startCol, startRow) {
  return await Excel.run(async (ctx) => {
    let sheet = ctx.workbook.worksheets.getItem(sheetName);
    let endRow = startRow;
    let startRowOffset = startRow;
    let batchSize = 500;
    for (let i = 0; i < data.length; i = i + batchSize) {
      let t0 = performance.now();
      let min = Math.min(batchSize, data.length - i);
      let endCol = intToColumn(data[0].length);
      startRow = startRowOffset + i;
      endRow = startRow + min - 1;
      let address = sheetName + "!" + startCol + startRow + ":" + endCol + endRow;
      console.log("write rows address: ", address);
      let range = sheet.getRange(address);
      ctx.application.suspendApiCalculationUntilNextSync();
      range.values = data.slice(i, i + min);
      range.formulas = formulas.slice(i, i + min);
      await ctx.sync();
      let t1 = performance.now();
      console.log("batch write time: ", (t1 - t0) / 1000, ' seconds');
    }
    return endRow;
  });
}
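The code above calls an intToColumn helper that is not shown in the post. Below is a minimal sketch of what such a helper might look like, assuming it takes a 1-based column index and returns A1-style letters. batchAddress is a hypothetical name that simply mirrors the address arithmetic in the loop.

```typescript
// Hypothetical stand-in for the question's intToColumn helper (not shown in the post).
// Converts a 1-based column index to A1-style letters: 1 -> "A", 27 -> "AA".
function intToColumn(index: number): string {
  let letters = "";
  let n = index;
  while (n > 0) {
    const rem = (n - 1) % 26; // 0..25 for the current letter
    letters = String.fromCharCode(65 + rem) + letters;
    n = Math.floor((n - 1) / 26);
  }
  return letters;
}

// Hypothetical helper: builds the address for one batch the same way writeRows does.
function batchAddress(
  sheetName: string,
  startCol: string,
  firstRow: number,
  rowCount: number,
  colCount: number
): string {
  const lastRow = firstRow + rowCount - 1;
  return `${sheetName}!${startCol}${firstRow}:${intToColumn(colCount)}${lastRow}`;
}
```

As in the original loop, the end column is derived from the row width alone, so this only lines up when startCol is "A".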
In case you think it is just the heavy formulas, here are the timings for writing the same rows without assigning anything to range.formulas:
batch write time: 2.072960040280297 seconds
batch write time: 1.893160016976646 seconds
batch write time: 2.239300093637197 seconds
batch write time: 2.4051598865728154 seconds
batch write time: 2.4535400113378855 seconds
batch write time: 4.228719875053808 seconds
batch write time: 21.932359953223656 seconds
batch write time: 65.58508005044697 seconds
batch write time: 99.76420028338683 seconds
batch write time: 133.58046007197566 seconds
batch write time: 181.46535997193905 seconds
...
Here is a screenshot of Task Manager:
Any ideas?
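Answer 0 (score: 1)
You have a ctx.sync inside a loop. That can be a performance killer. Try refactoring the method so that everything is written with a single sync. The pattern in my answer to this question may help: Document not in sync after replace text.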
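A rough sketch of that suggestion: queue every batch's assignment first, then call sync once at the end. The RangeLike, SheetLike and ContextLike interfaces are hypothetical. They declare only the members the sketch uses so that it stands alone. In an add-in, the worksheet and the context passed to Excel.run would be supplied instead. Whether a single sync can carry this much data is not something the sketch establishes.

```typescript
// Hypothetical minimal shapes, declaring only the members used below.
interface RangeLike { values: unknown[][]; }
interface SheetLike { getRange(address: string): RangeLike; }
interface ContextLike { sync(): Promise<unknown>; }

// Queues one write per batch and synchronizes a single time at the end.
// Returns the number of ranges that were queued.
async function writeAllWithOneSync(
  ctx: ContextLike,
  sheet: SheetLike,
  data: unknown[][],
  addressFor: (firstIndex: number, rowCount: number) => string,
  batchSize: number = 500
): Promise<number> {
  let queued = 0;
  for (let i = 0; i < data.length; i += batchSize) {
    const rows = data.slice(i, i + batchSize);
    // Assigning to the range only queues the write; nothing is sent yet.
    sheet.getRange(addressFor(i, rows.length)).values = rows;
    queued++;
  }
  await ctx.sync(); // the only round trip to the host
  return queued;
}
```

Here addressFor is an assumed callback that maps a batch's first row index and row count to an address string.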
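Answer 1 (score: 1)
I use a similar batching approach to load about 35,000 rows in officejs, and performance does not degrade nearly as much between batches. 35 batches of 1,000 rows take 7 seconds in total:
Here is my code: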
getWorksheetDataInChunks() {
  return ready.then(() => {
    return Excel.run(async (context) => {
      const sheet = context.workbook.worksheets.getActiveWorksheet();
      const dataRange = sheet
        .getUsedRange()
        .load('columnIndex, rowIndex, columnCount, rowCount, address');
      await context.sync();
      const rowsTotal = dataRange.rowCount;
      const batchSize = config.batchSize;
      const data = [];
      for (let i = 0; i < rowsTotal; i += batchSize) {
        const chunk = `chunk${i / batchSize}`;
        const chunkStart = `${chunk}-start`;
        const chunkEnd = `${chunk}-end`;
        performance.mark(chunkStart);
        const rowsRemaining = rowsTotal - i;
        const rowOffset = rowsRemaining >= batchSize ? batchSize : rowsRemaining;
        const currentRange = sheet
          .getCell(dataRange.rowIndex + i, dataRange.columnIndex)
          .getResizedRange(rowOffset - 1, dataRange.columnCount - 1)
          .load('values, columnIndex, rowIndex, columnCount, rowCount, address');
        await context.sync();
        data.push(...currentRange.values);
        performance.mark(chunkEnd);
        performance.measure(chunk, chunkStart, chunkEnd);
      }
      return data;
    }).catch(buildErrorHandler('getWorksheetDataInChunks'));
  });
},
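Can you test whether reading the data is as slow as writing it?
You won't get by without context.sync() inside the loop, because the data is far more than Excel will process in a single sync. That is the only reason to batch in the first place. Try creating a new context for each loop iteration with Excel.run(); maybe that lets you "clean up" after the previous batch.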
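A sketch of that last idea, with each batch handed to its own callback so that every iteration can open a fresh context. forEachBatch and runBatch are hypothetical names. In an add-in, runBatch would wrap its body in Excel.run(async (context) => { ... }) and end with await context.sync(). Whether a new context actually releases anything between batches is untested here.

```typescript
// Runs runBatch once per slice of rows, strictly one after another.
// Returns the number of batches processed.
async function forEachBatch<T>(
  rows: T[][],
  batchSize: number,
  runBatch: (batch: T[][], firstIndex: number) => Promise<void>
): Promise<number> {
  let batches = 0;
  for (let i = 0; i < rows.length; i += batchSize) {
    // Await each batch before starting the next, so only one context is live at a time.
    await runBatch(rows.slice(i, i + batchSize), i);
    batches++;
  }
  return batches;
}
```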
Answer 2 (score: 1)
Biggest issue:
The column formats (range.numberFormat) were not set. Setting them before assigning the data to range.values made all the difference.
// Initial pass over every column to set types
if (type === "Date") {
  range.numberFormat = <any>'m/d/yyyy';
} else if (type === "Double") {
  range.numberFormat = <any>"#,##0.00";
} else {
  range.numberFormat = <any>"#";
}
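The fragment above picks one format per column. Below is a small sketch of how that choice could be pulled into a helper and expanded into a two-dimensional array with one entry per cell. ColumnType, formatFor and formatGrid are hypothetical names, and the three format strings are the ones from the fragment.

```typescript
// Hypothetical column type tags; "Other" covers everything that is not a date or a double.
type ColumnType = "Date" | "Double" | "Other";

// Maps a column type to the number format string used in the answer.
function formatFor(type: ColumnType): string {
  switch (type) {
    case "Date":
      return "m/d/yyyy";
    case "Double":
      return "#,##0.00";
    default:
      return "#";
  }
}

// Builds a rowCount x types.length grid of format strings, one per cell.
function formatGrid(types: ColumnType[], rowCount: number): string[][] {
  const row = types.map((t) => formatFor(t));
  const grid: string[][] = [];
  for (let r = 0; r < rowCount; r++) {
    grid.push(row.slice()); // copy so rows do not share one array
  }
  return grid;
}
```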
Final timings:
Writing in batches of 1,000 rows now gives the consistent times you would expect:
batch write time: 3.0076802402327596 seconds
batch write time: 3.0477398637461594 seconds
batch write time: 2.9507200432747487 seconds
batch write time: 3.0690198313384025 seconds
batch write time: 2.988500015243975 seconds
batch write time: 3.048739928042458 seconds
batch write time: 3.0736401102757082 seconds
batch write time: 3.097120038203488 seconds
batch write time: 2.068400111446943 seconds
Other useful changes:
Initially, I built my formulas cell by cell in a two-dimensional array. For example:
// explicitly writing out each formula
range.formulas = [
  ['=A1+B1', '=B1+C1'],
  ['=A2+B2', '=B2+C2'],
  ...
  ['=A100+B100', '=B100+C100']
];
The right way to do it is:
// writing out the formula for the first cell and letting Excel expand it to the range
let range = sheet.getRange('A1:A100');
range.formulas = '=A1+B1' as any;
let range2 = sheet.getRange('B1:B100');
range2.formulas = '=B1+C1' as any;
Source: thanks to @Slai for sharing this link: https://github.com/OfficeDev/office-js-docs-pr/blob/master/docs/excel/performance.md
Other:
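Since I only write formulas in the first 30 or so columns, I initially wrote out the entire row from column A to column DC, with null in the first 30 columns. I figured this was needlessly writing 30 columns of nulls and might slow things down. In the end, I decided to leave out those empty columns and write only the columns that contain data.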
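A minimal sketch of that trimming step, assuming the formula columns come first and the data columns follow, as described above. dropLeadingColumns is a hypothetical helper. The returned firstColumn is the 1-based column index at which the remaining block would start.

```typescript
// Removes the first `skip` columns from every row and reports where the rest begins.
function dropLeadingColumns<T>(
  rows: T[][],
  skip: number
): { firstColumn: number; rows: T[][] } {
  return {
    firstColumn: skip + 1, // 1-based index of the first remaining column
    rows: rows.map((row) => row.slice(skip)),
  };
}
```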
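I also played around with ctx.sync(), but on the desktop version of Office 2016 I personally did not notice much difference one way or the other (inside the loop, outside it, nested, etc.). However, if I were doing anything in Office Online, as @Rick Kirkham mentions in his link, I would pay more attention to it.
After reading this: https://github.com/OfficeDev/office-js/issues/12#issuecomment-374741210, I also updated my desktop Excel to a build later than 9021.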
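Answer 3 (score: 0)
I won't comment on the rest of your code, only on the Excel part. As far as I can tell, you are trying to push data into Excel. So my suggestion is to assemble everything into a single CSV file and export that to Excel, or import it in bulk through Excel.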
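For the CSV route, the rows would first have to be serialized. A minimal sketch, assuming RFC 4180-style quoting: fields containing commas, double quotes, or line breaks are wrapped in double quotes, with embedded quotes doubled. Cell, csvField and toCsv are hypothetical names, and how the file then reaches Excel is left open, as in the answer.

```typescript
// Hypothetical cell type for this sketch; null becomes an empty field.
type Cell = string | number | boolean | null;

// Quotes a single field only when it contains a comma, quote, or line break.
function csvField(value: Cell): string {
  const text = value === null ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Joins fields with commas and rows with CRLF.
function toCsv(rows: Cell[][]): string {
  return rows.map((row) => row.map((cell) => csvField(cell)).join(",")).join("\r\n");
}
```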