I have the coordinates of every paragraph in a PDF, for example: [[16.88, 767.724, 530.464, 816.638], [27.093, 681.063, 124.125, 709.911], [30.806, 648.372, 124.712, 670.008], [30.806, 620.256, 124.712, 641.892], [30.806, 592.176, 124.712, 613.812], [30.806, 564.096, 124.712, 585.732], [30.806, 536.016, 124.712, 557.652], [30.813, 340.983, 156.693, 369.831], [52.244, 211.098, 129.372, 331.59], [279.891, 397.454, 572.939, 716.682], [302.43, 277.287, 337.13, 311.143], [367.775, 336.464, 402.475, 370.319], [367.803, 278.696, 402.504, 312.551], [438.246, 336.464, 472.947, 370.319], [438.246, 277.287, 472.947, 311.143], [372.366, 214.928, 474.946, 248.783], [517.9, 278.696, 552.6, 312.551], [342.311, 182.842, 535.856, 200.872], [24.657, 44.612, 574.588, 162.24]]
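The coordinates are drawn on an image of the PDF page, as shown below.
I managed to write an algorithm that returns the content regions (the coordinates) together with their corresponding tags and reading order.
What I want to do now is apply those tags to the source PDF, that is, tag the PDF according to that order.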
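To make the input concrete, here is a minimal sketch of the data I end up with. It only pairs each box with its tag and sorts by reading order; it does not write anything into the PDF. The names `TaggedBox` and `build_tag_plan` are hypothetical, the tag names `H1` and `P` are just example values, and the `[x0, y0, x1, y1]` box layout is an assumption.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class TaggedBox:
    order: int                                # reading-order index, 0-based
    label: str                                # tag name, e.g. "H1" or "P" (example values)
    bbox: Tuple[float, float, float, float]   # assumed layout: (x0, y0, x1, y1)


def build_tag_plan(
    boxes: Sequence[Sequence[float]],
    labels: Sequence[str],
    order: Sequence[int],
) -> List[TaggedBox]:
    """Pair each box with its label and return the boxes sorted by reading order."""
    if not (len(boxes) == len(labels) == len(order)):
        raise ValueError("boxes, labels and order must have the same length")
    # The order values must be a permutation of 0..n-1, otherwise the plan is ambiguous.
    if sorted(order) != list(range(len(boxes))):
        raise ValueError("order must be a permutation of 0..n-1")
    plan = [
        TaggedBox(o, lab, (b[0], b[1], b[2], b[3]))
        for b, lab, o in zip(boxes, labels, order)
    ]
    return sorted(plan, key=lambda t: t.order)


# Two boxes from the list above, with made-up labels and order.
example_plan = build_tag_plan(
    [[16.88, 767.724, 530.464, 816.638], [27.093, 681.063, 124.125, 709.911]],
    ["H1", "P"],
    [0, 1],
)
```

The missing piece is the step that takes a plan like `example_plan` and writes the tags into the PDF itself.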
Could anyone suggest how to achieve this? Please.