How to boost::serialize into a sqlite::blob?

Date: 2013-12-05 20:21:24

Tags: c++ sqlite serialization boost blob

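I am working on a scientific project that requires a range of programming capabilities. After looking at the available tools, I decided to use the Boost library, which gives me the features I need that the C++ standard library does not provide, such as date/time management.

My project is a set of command-line tools that process a large amount of data from an old, homemade, plain-text file database: import, conversion, analysis, reporting.

I have now reached the point where I need persistence, so I included boost::serialization, which I found really useful. I am able to store and restore "medium" datasets (not that big, but not that small either); they are roughly a (7000,48,15,10)-dataset.

I also use the SQLite C API to store and manage command defaults, output settings and variable meta information (units, scale, limits).

Something crossed my mind: serializing into a blob field instead of into separate files. There may be some drawback I have not seen yet (there always is one), but I think it could be a good solution for my needs.

I am able to text-serialize into a std::string, so I could do it that way without difficulty, because it only uses normal characters. But I would like to binary-serialize into a blob.

How should I proceed so that I can keep using standard streams when filling in my INSERT query?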

1 answer:

Answer 0: (score: 10)

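Hah. I had never used the sqlite3 C API before, and I had never written an output streambuf implementation. But seeing that I will probably use sqlite3 in a C++ codebase in the future, I thought I would spend some time on it.

It turns out you can open a blob field for incremental I/O. However, although you can read from and write to the BLOB, you cannot change its size (except via a separate UPDATE statement).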

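So the steps for my demonstration become:

  1. insert a record into a table, binding a "zero-blob" of a certain (fixed) size
  2. open the blob field in the newly inserted record
  3. wrap the blob handle in a custom `blob_buf` object that derives from `std::basic_streambuf<>` and can be used with `std::ostream` to write to that blob
  4. serialize some data into the `ostream`
  5. flush
  6. destruct/cleanup
  7. profit :)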

The code in main:

    int main()
    {
        sqlite3 *db = NULL;
        int rc = sqlite3_open_v2("test.sqlite3", &db, SQLITE_OPEN_READWRITE, NULL);
        if (rc != SQLITE_OK) {
            std::cerr << "database open failed: " << sqlite3_errmsg(db) << "\n";
            exit(255);
        }
    
        // 1. insert a record into a table, binding a "zero-blob" of a certain (fixed) size
        sqlite3_int64 inserted = InsertRecord(db);
    
        {
            // 2. open the blob field in the newly inserted record
            // 3. wrap the blob handle in a custom `blob_buf` object that derives from `std::basic_streambuf<>` and can be used with `std::ostream` to write to that blob
            blob_buf buf(OpenBlobByRowId(db, inserted));
            std::ostream writer(&buf); // this stream now writes to the blob!
    
            // 4. serialize some data into the `ostream`
            auto payload = CanBeSerialized { "hello world", { 1, 2, 3.4, 1e7, -42.42 } };
    
            boost::archive::text_oarchive oa(writer);
            oa << payload;
    
    #if 0   // used for testing with larger data
            std::ifstream ifs("test.cpp");
            writer << ifs.rdbuf();
    #endif
    
            // 5. flush
            writer.flush();
    
            // 6. destruct/cleanup 
        }
    
        sqlite3_close(db);
        // ==7653== HEAP SUMMARY:
        // ==7653==     in use at exit: 0 bytes in 0 blocks
        // ==7653==   total heap usage: 227 allocs, 227 frees, 123,540 bytes allocated
        // ==7653== 
        // ==7653== All heap blocks were freed -- no leaks are possible
    }
    

You will recognize the steps outlined above.

To test it, assume you create a new sqlite database:

    sqlite3 test.sqlite3 <<< "CREATE TABLE DEMO(ID INTEGER PRIMARY KEY AUTOINCREMENT, FILE BLOB);"
    

Now, once you have run the program, you can query it:

    sqlite3 test.sqlite3 <<< "SELECT * FROM DEMO;"
    1|22 serialization::archive 10 0 0 11 hello world 5 0 1 2 3.3999999999999999 10000000 -42.420000000000002
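
Reading the data back from C++ rather than from the shell would be the mirror image of the write side: an input streambuf whose `underflow()` refills a small buffer chunk by chunk. The following is only a sketch of that idea and deliberately avoids SQLite: the hypothetical `chunked_source_buf` reads from a `std::string`, and its `fetch()` stands in for a call such as `sqlite3_blob_read`. All names are illustrative and not part of the program in this answer.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <iterator>
#include <streambuf>
#include <string>

// Hypothetical read-side counterpart of blob_buf. The "blob" is a std::string;
// fetch() plays the role that sqlite3_blob_read would play for a real blob.
class chunked_source_buf : public std::streambuf {
    const std::string& source_;  // must outlive this object
    std::size_t offset_ = 0;     // next unread position in the source
    char buf_[8];                // small get area, refilled chunk by chunk

    // Copy up to n bytes starting at offset_ into dest; returns bytes copied.
    std::size_t fetch(char* dest, std::size_t n) {
        assert(offset_ <= source_.size());
        std::size_t got = source_.copy(dest, n, offset_);
        offset_ += got;
        return got;
    }

protected:
    int_type underflow() override {
        if (gptr() < egptr())  // get area still holds unread characters
            return traits_type::to_int_type(*gptr());
        std::size_t n = fetch(buf_, sizeof buf_);
        if (n == 0) return traits_type::eof();  // source exhausted
        setg(buf_, buf_, buf_ + n);
        return traits_type::to_int_type(*gptr());
    }

public:
    explicit chunked_source_buf(const std::string& source) : source_(source) {}
};

// Drain the whole source through a std::istream, as an archive reader would.
std::string read_all(const std::string& stored) {
    chunked_source_buf buf(stored);
    std::istream reader(&buf);
    return std::string(std::istreambuf_iterator<char>(reader),
                       std::istreambuf_iterator<char>());
}
```

Because the blob in this answer is created with a fixed size, a real reader would also see the unused zero bytes that follow the serialized data.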
    

If you enable the test code (which writes more data than `MAX_BLOB_SIZE` allows), you will see the blob being truncated:

    contents truncated at 256 bytes
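
The mechanism behind that truncation can be tried without SQLite. The sketch below assumes a hypothetical `fixed_sink_buf` that writes into a pre-sized `std::string` instead of a blob; `store()` stands in for `sqlite3_blob_write`. It is an illustration of the technique, not the class used in this answer. In this sketch, data that does not fit is dropped and the stream ends up in a failed state after `flush()`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical stand-in for blob_buf: the target is a std::string whose size
// never changes, just like a zero-blob of fixed size.
class fixed_sink_buf : public std::streambuf {
    std::string& sink_;       // pre-sized target
    std::size_t offset_ = 0;  // next write position in the sink
    char buf_[8];             // small put area to force frequent overflows

    // Move the pending put area into the sink; false if it did not all fit.
    bool store() {
        assert(offset_ <= sink_.size());
        std::size_t pending = static_cast<std::size_t>(pptr() - pbase());
        std::size_t n = std::min(pending, sink_.size() - offset_);
        sink_.replace(offset_, n, pbase(), n);  // same length in, same length out
        offset_ += n;
        setp(buf_, buf_ + sizeof buf_);         // reset the put area
        return n == pending;
    }

protected:
    int_type overflow(int_type c = traits_type::eof()) override {
        if (!store()) return traits_type::eof();  // signal failure
        if (!traits_type::eq_int_type(c, traits_type::eof())) {
            *pptr() = traits_type::to_char_type(c);
            pbump(1);
        }
        return traits_type::not_eof(c);
    }

    // 0 on success, -1 on failure, as std::ostream::flush expects.
    int sync() override { return store() ? 0 : -1; }

public:
    explicit fixed_sink_buf(std::string& sink) : sink_(sink) {
        setp(buf_, buf_ + sizeof buf_);
    }
};

// Write text through the buffer; returns whether the stream is still good.
bool write_through_fixed_sink(std::string& sink, const std::string& text) {
    fixed_sink_buf buf(sink);
    std::ostream out(&buf);
    out << text;
    out.flush();
    return out.good();
}
```

Calling `write_through_fixed_sink` with a 16-byte sink and a 19-character string keeps the first 16 characters and returns `false`, which is the in-memory analogue of the truncation message above.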
    

Full program

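    #include <sqlite3.h>
    #include <cstdlib>   // exit
    #include <iterator>  // std::distance
    #include <string>
    #include <vector>
    #include <iostream>
    #include <ostream>
    #include <fstream>
    #include <boost/serialization/nvp.hpp>
    #include <boost/serialization/vector.hpp>
    #include <boost/archive/text_oarchive.hpp>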
    
    template<typename CharT, typename TraitsT = std::char_traits<CharT> >
    class basic_blob_buf : public std::basic_streambuf<CharT, TraitsT> 
    {
        sqlite3_blob* _blob; // owned
        int max_blob_size;
    
        typedef std::basic_streambuf<CharT, TraitsT> base_type;
        enum { BUFSIZE = 10 }; // Block size - tuning?
        char buf[BUFSIZE+1/*for the overflow character*/];
    
        size_t cur_offset;
        std::ostream debug;
    
        // no copying
        basic_blob_buf(basic_blob_buf const&)             = delete;
        basic_blob_buf& operator= (basic_blob_buf const&) = delete;
    public:
        basic_blob_buf(sqlite3_blob* blob, int max_size = -1) 
            : _blob(blob), 
            max_blob_size(max_size), 
            buf {0}, 
            cur_offset(0),
            // debug(std::cerr.rdbuf()) // or just use `nullptr` to suppress debug output
            debug(nullptr)
        {
            debug.setf(std::ios::unitbuf);
            if (max_blob_size == -1) {
                max_blob_size = sqlite3_blob_bytes(_blob);
                debug << "max_blob_size detected: " << max_blob_size << "\n";
            }
            this->setp(buf, buf + BUFSIZE);
        }
    
        int overflow (int c = base_type::traits_type::eof())
        {
            auto putpointer = this->pptr();
            if (c!=base_type::traits_type::eof())
            {
                // add the character - even though pptr might be epptr
                *putpointer++ = c;
            }
    
            if (cur_offset >= size_t(max_blob_size))
                return base_type::traits_type::eof(); // signal failure
    
            size_t n = std::distance(this->pbase(), putpointer);
            debug << "Overflow " << n << " bytes at " << cur_offset << "\n";
            if (cur_offset+n > size_t(max_blob_size))
            {
                std::cerr << "contents truncated at " << max_blob_size << " bytes\n";
                n = size_t(max_blob_size) - cur_offset;
            }
    
            if (SQLITE_OK != sqlite3_blob_write(_blob, this->pbase(), n, cur_offset))
            {
                debug << "sqlite3_blob_write reported an error\n";
                return base_type::traits_type::eof(); // signal failure
            }
    
            cur_offset += n;
    
            if (this->pptr() > (this->pbase() + n))
            {
                debug << "pending data has not been written";
                return base_type::traits_type::eof(); // signal failure
            }
    
            // reset buffer
            this->setp(buf, buf + BUFSIZE);
    
            return base_type::traits_type::not_eof(c);
        }
    
        int sync()
        {
            // sync() must return 0 on success and -1 on failure
            return base_type::traits_type::eof() == overflow() ? -1 : 0;
        }
    
        ~basic_blob_buf() { 
            sqlite3_blob_close(_blob);
        }
    };
    
    typedef basic_blob_buf<char> blob_buf;
    
    struct CanBeSerialized
    {
        std::string sometext;
        std::vector<double> a_vector;
    
        template<class Archive>
        void serialize(Archive & ar, const unsigned int version)
        {
            ar & boost::serialization::make_nvp("sometext", sometext);
            ar & boost::serialization::make_nvp("a_vector", a_vector);
        }
    };
    
    #define MAX_BLOB_SIZE 256
    
    sqlite3_int64 InsertRecord(sqlite3* db)
    {
        sqlite3_stmt *stmt = NULL;
        int rc = sqlite3_prepare_v2(db, "INSERT INTO DEMO(ID, FILE) VALUES(NULL, ?)", -1, &stmt, NULL);
    
        if (rc != SQLITE_OK) {
            std::cerr << "prepare failed: " << sqlite3_errmsg(db) << "\n";
            exit(255);
        } else {
            rc = sqlite3_bind_zeroblob(stmt, 1, MAX_BLOB_SIZE);
            if (rc != SQLITE_OK) {
                std::cerr << "bind_zeroblob failed: " << sqlite3_errmsg(db) << "\n";
                exit(255);
            }
            rc = sqlite3_step(stmt);
            if (rc != SQLITE_DONE)
            {
                std::cerr << "execution failed: " << sqlite3_errmsg(db) << "\n";
                exit(255);
            }
        }
        rc = sqlite3_finalize(stmt);
        if (rc != SQLITE_OK)
        {
            std::cerr << "finalize stmt failed: " << sqlite3_errmsg(db) << "\n";
            exit(255);
        }
    
        return sqlite3_last_insert_rowid(db);
    }
    
    sqlite3_blob* OpenBlobByRowId(sqlite3* db, sqlite3_int64 rowid)
    {
        sqlite3_blob* pBlob = NULL;
        int rc = sqlite3_blob_open(db, "main", "DEMO", "FILE", rowid, 1/*rw*/, &pBlob);
    
        if (rc != SQLITE_OK) {
            std::cerr << "blob_open failed: " << sqlite3_errmsg(db) << "\n";
            exit(255);
        }
        return pBlob;
    }
    
    int main()
    {
        sqlite3 *db = NULL;
        int rc = sqlite3_open_v2("test.sqlite3", &db, SQLITE_OPEN_READWRITE, NULL);
        if (rc != SQLITE_OK) {
            std::cerr << "database open failed: " << sqlite3_errmsg(db) << "\n";
            exit(255);
        }
    
        // 1. insert a record into a table, binding a "zero-blob" of a certain (fixed) size
        sqlite3_int64 inserted = InsertRecord(db);
    
        {
            // 2. open the blob field in the newly inserted record
            // 3. wrap the blob handle in a custom `blob_buf` object that derives from `std::basic_streambuf<>` and can be used with `std::ostream` to write to that blob
            blob_buf buf(OpenBlobByRowId(db, inserted));
            std::ostream writer(&buf); // this stream now writes to the blob!
    
            // 4. serialize some data into the `ostream`
            auto payload = CanBeSerialized { "hello world", { 1, 2, 3.4, 1e7, -42.42 } };
    
            boost::archive::text_oarchive oa(writer);
            oa << payload;
    
    #if 0   // used for testing with larger data
            std::ifstream ifs("test.cpp");
            writer << ifs.rdbuf();
    #endif
    
            // 5. flush
            writer.flush();
    
            // 6. destruct/cleanup 
        }
    
        sqlite3_close(db);
    }
    

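PS. I have kept the error handling... very crude. You will want to introduce a helper function that checks sqlite3 error codes and translates them into exceptions. :)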
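
One possible shape for such a helper is sketched below. The names `sqlite_error` and `check_sqlite` are hypothetical. To keep the sketch compilable without the SQLite headers, the error text is supplied by a callable; in the real program that callable would return `sqlite3_errmsg(db)`, and the expected code would be `SQLITE_OK` or `SQLITE_DONE`. The callable is only invoked on failure, after the SQLite call itself has finished.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical exception type that keeps the raw result code around.
class sqlite_error : public std::runtime_error {
    int code_;
public:
    sqlite_error(int code, const std::string& what)
        : std::runtime_error(what), code_(code) {}
    int code() const noexcept { return code_; }
};

// Throw sqlite_error unless rc equals the expected result code.
// `context` names the failing call; `message` is a callable that yields the
// error text (for SQLite: a lambda returning sqlite3_errmsg(db)).
template <typename MessageFn>
void check_sqlite(int rc, int expected, const char* context, MessageFn message) {
    assert(context != nullptr);
    if (rc != expected) {
        throw sqlite_error(rc, std::string(context) + " failed (" +
                                   std::to_string(rc) + "): " + message());
    }
}
```

With this in place, a call in `InsertRecord` could read `check_sqlite(sqlite3_step(stmt), SQLITE_DONE, "step", [&] { return sqlite3_errmsg(db); });` instead of the `if`/`exit(255)` pairs.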