How to extract the text contents of files stored as binary in a database

Asked: 2018-03-25 19:09:11

Tags: c# sql-server asp.net-mvc

Is there a way to get the contents of files stored as binary in a database? I want to extract the text of a PDF file so that I can search it.

For example, I want to search for a particular word. I'm using ASP.NET MVC with EF6 and SQL Server.

This is the code that stores files in the database:

[HttpPost]
public ActionResult FileUpload(FileDetail Fd, HttpPostedFileBase files)
{
    String FileExt = Path.GetExtension(files.FileName).ToUpper();

    if (FileExt == ".PDF")
    {
        Stream str = files.InputStream;
        BinaryReader Br = new BinaryReader(str);
        Byte[] FileDet = Br.ReadBytes((Int32)str.Length);

        Fd.FileName = files.FileName;
        Fd.FileContent = FileDet;
        db.FileDetails.Add(Fd);
        db.SaveChanges();
        //other code
    }
    else
    {
        //other code
    }
}

Edit: I will use iTextSharp. Thanks.

1 answer:

Answer 0 (score: 3)

You should be able to load the relevant item from db.FileDetails to get a FileDetail instance and then read its FileContent property, which is essentially the reverse of how you stored it.
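
As a minimal sketch of that lookup, assuming the entity has an integer key named Id (the question only shows FileName and FileContent, so Id, the FileStore class, and LoadContent are hypothetical names), something like this could work against any IQueryable<FileDetail>, including db.FileDetails:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the entity. Only FileName and FileContent appear in the
// question; the Id key is a hypothetical addition for this sketch.
public class FileDetail
{
    public int Id { get; set; }
    public string FileName { get; set; }
    public byte[] FileContent { get; set; }
}

public static class FileStore
{
    // Returns the stored bytes for the row with the given id,
    // or null when no such row exists.
    public static byte[] LoadContent(IQueryable<FileDetail> files, int id)
    {
        FileDetail fd = files.SingleOrDefault(f => f.Id == id);
        return fd == null ? null : fd.FileContent;
    }
}
```

In the controller this would be called as FileStore.LoadContent(db.FileDetails, id). The returned byte array is the same data that was read from files.InputStream at upload time.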

If you mean that you are struggling to parse the text out of the PDF, that is an entirely separate matter.
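
Since the edit to the question mentions iTextSharp, here is a rough sketch of how the search might look with iTextSharp 5.x, using its PdfReader and PdfTextExtractor classes. The PdfSearch class and FindPages method are hypothetical names chosen for illustration, and the code assumes the PDF has a real text layer; scanned pages that contain only images will yield no text.

```csharp
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

public static class PdfSearch
{
    // Returns the 1-based numbers of the pages whose extracted text
    // contains the given word (case-insensitive substring match).
    public static List<int> FindPages(byte[] pdfBytes, string word)
    {
        var hits = new List<int>();
        var reader = new PdfReader(pdfBytes); // parses the PDF from memory
        try
        {
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                string text = PdfTextExtractor.GetTextFromPage(reader, page);
                if (text.IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0)
                {
                    hits.Add(page);
                }
            }
        }
        finally
        {
            reader.Close(); // release the reader's resources
        }
        return hits;
    }
}
```

The byte array passed in would be the FileContent value loaded from the database. Extracting the text on every search can be slow for many or large files, so one option is to extract it once at upload time and store it in a separate text column that SQL Server can search.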