I'm a college student working on a semester-long project, and I have hit a wall with my program. Before I go any further: I have looked through the similar threads on Stack Overflow, and none of them seems to match my situation.
I have a string generated from a PDF that contains a large amount of table data. The problem is that some entries in the department column are split across two rows because of the formatting, and I have been unable to handle that. For example:
PS 253 (handled fine by my algorithm)
MA
243HON (breaks everything)
Ultimately, I need to put them on the same row by removing the " \n" after MA before passing the string along to the rest of the program. I tried checking for \n one or two index positions after the department code (MA) and changing the index from which I read 243HON, but that did not work.
I have also tried string = string.replaceAll("MA \n", "MA "), as seen in the code. Removing the space between MA and \n makes no difference. Here is the relevant part of my code. Thank you!
public static String[] departments = {"\nAS","\nSF","\nAE","\nAF","\nAT","\nLAR","\nAMS","\nBIO","\nBA","\nCHM","\nLCH","\nCIV","\nCSO",
"\nCOM","\nCEC","\nCS","\nCYB","\nEC","\nEE","\nEGR","\nEP","\nES","\nFA","\nGCS","\nHS","\nHON","\nHF","\nHU","\nMA","\nME","\nWX",
"\nMSL","\nNSC","\nPE","\nPS","\nPSY","\nSIM","\nSS","\nSE","\nSP","\nSYS","\nUNIV","\nUA"};
public static String[] departmentsFix = {"\nAS \n","\nSF \n","\nAE \n","\nAF \n","\nAT \n","\nLAR \n","\nAMS \n","\nBIO \n","\nBA \n","\nCHM \n","\nLCH \n","\nCIV \n","\nCSO \n",
"\nCOM \n","\nCEC \n","\nCS \n","\nCYB \n","\nEC \n","\nEE \n","\nEGR \n","\nEP \n","\nES \n","\nFA \n","\nGCS \n","\nHS \n","\nHON \n","\nHF \n","\nHU \n","\nMA \n","\nME \n","\nWX \n",
"\nMSL \n","\nNSC \n","\nPE \n","\nPS \n","\nPSY \n","\nSIM \n","\nSS \n","\nSE \n","\nSP \n","\nSYS \n","\nUNIV \n","\nUA \n"};
public static void main(String[] args) {
// TODO Auto-generated method stub
Loader loader = new Loader();
try {
File file = new File("C:\\Users\\User\\Desktop\\EclipseWorkspace\\SE 300\\ER_SCHED_PRT.pdf");
PDDocument document = PDDocument.load(file);
PDFTextStripper s = new PDFTextStripper();
loader.content = s.getText(document);
String[] splitString = loader.content.split("Instructor", 2);
loader.content = splitString[1];
int index = 0;
for (String y : departmentsFix) {
//find any departments with a \n after them and replace it with a space
loader.content = loader.content.replaceAll(y, departments[index] + " ");
index++;
}
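Answer 0 (score: -1)
I just fixed it. Using the find function, I discovered that the format was not \nMA \n but \nMA \r\n. Changing the patterns to match solves the problem, apart from a small, inconsequential bug that can be compensated for with an extra space. Thanks for the help nonetheless.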
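For anyone hitting the same thing, here is a minimal sketch of one way to make the replacement tolerant of both line-ending styles, so it does not matter whether the extracted text uses \n or \r\n. It is not the code from the question: the class name DepartmentLineFix, the method joinSplitRows, and the shortened CODES list are hypothetical names chosen for illustration, and the sample string is made up to mirror the MA / 243HON example above.

```java
import java.util.regex.Pattern;

class DepartmentLineFix {
    // Hypothetical subset of the department codes listed in the question.
    static final String[] CODES = {"MA", "PS", "CS", "HON"};

    // Joins a department code that sits alone on its own line with the line that follows it.
    static String joinSplitRows(String text) {
        String alternation = String.join("|", CODES);
        // (?m)    : ^ matches at the start of every line, not only at the start of the input
        // [ \t]*  : optional trailing spaces or tabs after the code
        // \r?\n   : matches both "\n" and "\r\n" line endings
        Pattern splitRow = Pattern.compile("(?m)^(" + alternation + ")[ \\t]*\\r?\\n");
        // Keep the code (group 1) and replace the line break with a single space.
        return splitRow.matcher(text).replaceAll("$1 ");
    }

    public static void main(String[] args) {
        // Made-up sample mirroring the example rows in the question.
        String sample = "PS 253\r\nMA \r\n243HON\r\n";
        String fixed = joinSplitRows(sample);
        // Print with the line endings made visible, which is also a handy way
        // to check what the extracted text really contains.
        System.out.println(fixed.replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```

Because the code must be followed only by optional whitespace and a line break, a normal row such as "PS 253" is left untouched, and a single pattern replaces the two parallel arrays.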