I want to read the text from a PDF file stored on the SD card. How can I extract text from a PDF file on the SD card?
This is what I tried:
import android.os.Bundle;
import android.os.Environment;
import android.speech.tts.TextToSpeech;
import android.support.v7.app.ActionBarActivity;
import android.util.Log;
import android.view.View;
import android.view.View.OnClickListener;
import android.widget.TextView;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class MainActivity extends ActionBarActivity implements TextToSpeech.OnInitListener {
    private TextToSpeech tts;
    private String line = null;

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        tts = new TextToSpeech(getApplicationContext(), this);
        final TextView text1 = (TextView) findViewById(R.id.textView1);
        findViewById(R.id.button1).setOnClickListener(new OnClickListener() {
            private String[] arr;

            @Override
            public void onClick(View v) {
                File sdcard = Environment.getExternalStorageDirectory();
                // Get the file
                File file = new File(sdcard, "test.pdf");
                // ob.pathh
                // Read text from file
                StringBuilder text = new StringBuilder();
                try {
                    BufferedReader br = new BufferedReader(new FileReader(file));
                    // int i=0;
                    List<String> lines = new ArrayList<String>();
                    while ((line = br.readLine()) != null) {
                        lines.add(line);
                        // arr[i]=line;
                        // i++;
                        text.append(line);
                        text.append('\n');
                    }
                    for (String string : lines) {
                        // The second argument is a queue mode, not a status code;
                        // QUEUE_ADD queues each line instead of cutting off the previous one.
                        tts.speak(string, TextToSpeech.QUEUE_ADD, null);
                    }
                    arr = lines.toArray(new String[lines.size()]);
                    System.out.println(arr.length);
                    text1.setText(text);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        });
    }

    @Override
    public void onInit(int status) {
        if (status == TextToSpeech.SUCCESS) {
            int result = tts.setLanguage(Locale.US);
            if (result == TextToSpeech.LANG_MISSING_DATA || result == TextToSpeech.LANG_NOT_SUPPORTED) {
                Log.e("TTS", "This Language is not supported");
            } else {
                // speakOut();
            }
        } else {
            Log.e("TTS", "Initialization Failed!");
        }
    }
}
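Note: this works fine if the file is a plain text file (test.txt), but not if it is a PDF (test.pdf).
With the PDF, what I get is not the document's text but something that looks like bytecode. How can I get the actual text?
Thanks in advance.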
Answer 0: (score: 16)
I have a solution using iText.
Gradle:
compile 'com.itextpdf:itextg:5.5.10'
Java:
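// Requires: import com.itextpdf.text.pdf.PdfReader;
//           import com.itextpdf.text.pdf.parser.PdfTextExtractor;
try {
    // yourPdfPath is the path to the PDF file, e.g. a file on the SD card
    StringBuilder parsedText = new StringBuilder();
    PdfReader reader = new PdfReader(yourPdfPath);
    try {
        int n = reader.getNumberOfPages();
        // iText page numbers are 1-based
        for (int i = 1; i <= n; i++) {
            // Extract the text of each page in turn
            parsedText.append(PdfTextExtractor.getTextFromPage(reader, i).trim()).append("\n");
        }
    } finally {
        // Release the reader even if extraction fails part-way through
        reader.close();
    }
    System.out.println(parsedText);
} catch (Exception e) {
    System.out.println(e);
}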
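Since the question needs to handle both test.txt and test.pdf, it may help to check what kind of file you have before choosing between a plain BufferedReader and a PDF parser. Below is a minimal sketch using only the standard library. It assumes the "%PDF-" marker sits at the very start of the file, which is the usual case. The class name PdfSniffer and the method looksLikePdf are made up for this illustration and are not part of iText or Android.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of iText or the Android SDK.
class PdfSniffer {
    // PDF files begin with the ASCII marker "%PDF-" followed by a version number.
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the stream starts with the "%PDF-" marker. */
    static boolean looksLikePdf(InputStream in) throws IOException {
        byte[] head = new byte[PDF_MAGIC.length];
        int read = 0;
        // read() may deliver fewer bytes than requested, so loop until full or EOF.
        while (read < head.length) {
            int n = in.read(head, read, head.length - read);
            if (n < 0) {
                return false; // input is shorter than the marker
            }
            read += n;
        }
        return Arrays.equals(head, PDF_MAGIC);
    }

    public static void main(String[] args) throws IOException {
        InputStream pdf = new ByteArrayInputStream(
                "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII));
        InputStream txt = new ByteArrayInputStream(
                "hello world".getBytes(StandardCharsets.US_ASCII));
        System.out.println("pdf sample: " + looksLikePdf(pdf));
        System.out.println("txt sample: " + looksLikePdf(txt));
    }
}
```

In the app you would pass a FileInputStream for the file on the SD card, then hand the path to PdfReader only when the check succeeds and fall back to reading lines otherwise.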
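Answer 1: (score: 2)
A PDF is not an ordinary text file, so reading it line by line will not give you its text. You will need to look into the PDF format in more detail. The best answer you are likely to find is this one: How to read pdf in my android application?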
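To illustrate why readLine() returns what looks like bytecode: page content in a PDF is commonly stored in streams compressed with the FlateDecode filter (zlib/deflate), so the visible text is not present as plain characters in the file. The sketch below is not a PDF parser. It only compresses a sample string with java.util.zip and shows that the raw bytes must be inflated before the text is readable again. The class name FlateDemo and the sample string are assumptions made for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo class; it mimics how a Flate-compressed stream hides its text.
class FlateDemo {
    /** Compresses the input with zlib/deflate. */
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Reverses deflate(); stops early on truncated input instead of looping forever. */
    static byte[] inflate(byte[] input) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!inflater.finished()) {
            int n = inflater.inflate(buf);
            if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                break;
            }
            out.write(buf, 0, n);
        }
        inflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) throws DataFormatException {
        String text = "Hello from a PDF page. Hello from a PDF page.";
        byte[] packed = deflate(text.getBytes(StandardCharsets.UTF_8));
        // Reading the compressed bytes as characters is what the question's code does.
        String asRead = new String(packed, StandardCharsets.ISO_8859_1);
        System.out.println("raw bytes contain the text: " + asRead.contains("Hello"));
        System.out.println("after inflating: "
                + new String(inflate(packed), StandardCharsets.UTF_8));
    }
}
```

A library such as iText does this decompression for you and also interprets the drawing operators and font encodings inside the stream, which is why a parser is needed rather than a BufferedReader.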