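BOWLING O M R W ECON 0s 45 6 WD NB Always losing Dhoni is tough for us - Raina TABoult 4 0 3 0 925 M 2 3 1 0 Chennai Super Kings after the off-season, JETED 6 0 = 4 O 0 0 praises Dhoni's support CHMorris 4 0 4 ns o9 8 1 1 against Delhi Capitals AR Patel 3 o 3 1 1033 6 3 2 o o "Watch the ball, hit the ball" - Dhoni's ultimate formula Asking for 10 seconds CSK captain has hit 554 runs in the 20th over in the IPL, from PR El 227 balls match. That is 13% of all his runs got me into this match. Delhi Capitals innings (target: 180 runs from 20 overs) Talking points - Dhoni is ber @EEIEER-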
This is my string. I want to get it into Excel.
Answer 0: (score: 0)
Based on the brief description of what you want to do, I suggest:
String csvContent = imgData.replaceAll(" ",";");
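The following example assumes you have managed to retrieve the data, and then post-processes it into CSV format. The content is written to a file that you can double-click to check whether the data is split into columns the way you want.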
// needs: java.io.BufferedWriter, java.io.FileWriter (both throw java.io.IOException)
String[] data = new String[] {
    "BOWLING O M R W ECON 0s 45 6", // note: your OCR software does not recognise this header line properly
    "TABoult 4 0 3 0 925 M 2 3",
    "JETED 6 0 = 4 O 0 0"
};
String path = System.getProperty("user.home") + System.getProperty("file.separator") + "data.csv";
// try-with-resources closes the writer even if a write fails
try (BufferedWriter writer = new BufferedWriter(new FileWriter(path))) {
    for (String record : data) {
        // every single space becomes a column separator
        writer.write(record.replaceAll(" ", ";"));
        writer.write("\n");
    }
}
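As I noted in the code comment above, your OCR is not working properly. I suggest you look into the jsoup HTML parser to get the information from the page directly and go from there. Otherwise you will not be happy with the results.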
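If you stay with the OCR text anyway, be aware that replacing every single space with ";" produces empty columns whenever the OCR emits two or more spaces in a row, and a token that already contains ";" would shift the columns. Here is a minimal sketch of a more tolerant conversion using only the standard library. The class name `OcrCsv` and the helpers `toCsvLine` and `quoteIfNeeded` are made up for illustration, and the sketch assumes that every whitespace-separated token is one column, which will not hold for multi-word names.

```java
import java.util.ArrayList;
import java.util.List;

public class OcrCsv {

    // Hypothetical helper: converts one whitespace-separated OCR line
    // into one semicolon-separated CSV line.
    public static String toCsvLine(String ocrLine) {
        String trimmed = ocrLine.trim();
        if (trimmed.isEmpty()) {
            return ""; // blank OCR lines become blank CSV lines
        }
        List<String> fields = new ArrayList<>();
        // "\\s+" treats any run of spaces or tabs as a single separator
        for (String token : trimmed.split("\\s+")) {
            fields.add(quoteIfNeeded(token));
        }
        return String.join(";", fields);
    }

    // Wraps a field in double quotes if it contains the separator or a quote,
    // doubling any embedded quotes (the usual CSV escaping convention).
    public static String quoteIfNeeded(String field) {
        if (field.contains(";") || field.contains("\"")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    public static void main(String[] args) {
        // prints TABoult;4;0;3;0;925
        System.out.println(toCsvLine("TABoult   4 0  3 0 925"));
    }
}
```

You could call `toCsvLine(record)` in place of `record.replaceAll(" ", ";")` in the loop above. It does not repair misread characters, so the quality of the columns still depends on the OCR output.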
Answer 1: (score: 0)
driver.get("https://www.espncricinfo.com/series/8048/scorecard/1178425/chennai-super-kings-vs-delhi-capitals-50th-match-indian-premier-league-2019");
WebElement element = driver.findElement(By.xpath("//article[@class='sub-module scorecard'][1]"));
JavascriptExecutor js = (JavascriptExecutor) driver;
js.executeScript("arguments[0].scrollIntoView(true);", element);
File screen = ((TakesScreenshot)driver).getScreenshotAs(OutputType.FILE);
File file = new File("C:\\Users\\user\\Desktop\\screenshot1\\screenshotOfElement2.png");
FileHandler.copy(screen, file);
ITesseract instance = new Tesseract();
instance.setDatapath("C:\\selenium_work\\ScrappingText.PDF\\tessdata");
String result = instance.doOCR(file);
//System.out.println(result);
String[] lines = result.split("\\n");
This is what I am trying.