Tombstoned cells without DELETE

Time: 2014-08-22 09:40:01

Tags: cassandra cql

I am running a Cassandra cluster:

Software version: 2.0.9
Nodes: 3
Replication factor: 2

I have a very simple table that I insert into and update.

CREATE TABLE link_list (
      url text,
      visited boolean,
      PRIMARY KEY ((url))
    );

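The rows have no TTL, and I never issue a DELETE. Yet as soon as I run my application, it slows down quickly because of the growing number of tombstoned cells: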

Read 3 live and 535 tombstoned cells

Within a few minutes the count reaches the thousands.

My question is: if I am not doing any deletes, what is generating these tombstoned cells?

// Update

This is the implementation I use to talk to Cassandra through com.datastax.driver.

public class LinkListDAOCassandra implements DAO {


    public void save(Link link) {
        save(new VisitedLink(link.getUrl(), false));
    }

    @Override
    public void save(Model model) {
        save((Link) model);
    }

    public void update(VisitedLink link) {
        String cql = "UPDATE link_list SET visited = ? WHERE url = ?";
        Cassandra.DB.execute(cql, ConsistencyLevel.QUORUM, link.getVisited(), link.getUrl());
    }

    public void save(VisitedLink link) {
        String cql = "SELECT url FROM link_list_inserted WHERE url = ?";

        if(Cassandra.DB.execute(cql, ConsistencyLevel.QUORUM, link.getUrl()).all().size() == 0) {
            cql = "INSERT INTO link_list_inserted (url) VALUES (?)";
            Cassandra.DB.execute(cql, ConsistencyLevel.QUORUM, link.getUrl());

            cql = "INSERT INTO link_list (url, visited) VALUES (?,?)";
            Cassandra.DB.execute(cql, ConsistencyLevel.QUORUM, link.getUrl(), link.getVisited());
        }
    }

    public VisitedLink getByUrl(String url) {
        String cql = "SELECT * FROM link_list WHERE url = ?";

        for(Row row : Cassandra.DB.execute(cql, url)) {
            return new VisitedLink(row.getString("url"), row.getBool("visited"));
        }

        return null;
    }

    public List<Link> getLinks(int limit) {
        List<Link> links = new ArrayList<>();

        String cql = "SELECT * FROM link_list WHERE visited = False LIMIT ?";

        for(Row row : Cassandra.DB.execute(cql, ConsistencyLevel.QUORUM, limit)) {
            try {
                links.add(new Link(new URL(row.getString("url"))));
            }
            catch(MalformedURLException e) { /* skip rows whose URL cannot be parsed */ }
        }

        return links;
    }
}

This is the execute implementation:

public ResultSet execute(String cql, ConsistencyLevel cl, Object... values) {
    PreparedStatement statement = getSession().prepare(cql).setConsistencyLevel(cl);
    BoundStatement boundStatement = new BoundStatement(statement);
    boundStatement.bind(values);

    return session.execute(boundStatement);
}

// Update 2

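An interesting finding from cfstats: only one table has tombstones, and it is link_list_visited. Does this mean that updating a column that has a secondary index creates tombstones?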

Table (index): link_list.link_list_visited
                SSTable count: 2
                Space used (live), bytes: 5055920
                Space used (total), bytes: 5055991
                SSTable Compression Ratio: 0.3491883995187955
                Number of keys (estimate): 256
                Memtable cell count: 15799
                Memtable data size, bytes: 1771427
                Memtable switch count: 1
                Local read count: 85703
                Local read latency: 2.805 ms
                Local write count: 484690
                Local write latency: 0.028 ms
                Pending tasks: 0
                Bloom filter false positives: 0
                Bloom filter false ratio: 0.00000
                Bloom filter space used, bytes: 32
                Compacted partition minimum bytes: 8240
                Compacted partition maximum bytes: 7007506
                Compacted partition mean bytes: 3703162
                Average live cells per slice (last five minutes): 3.0
                Average tombstones per slice (last five minutes): 674.0

1 Answer:

Answer 0 (score: 1)

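There are only two major differences between a secondary index and an extra column family in which you maintain the index by hand. First, a secondary index only holds information about the current node (that is, it contains nothing about data stored on other nodes). Second, the operations applied to the secondary index as a result of updates to the main table are atomic.

Apart from that, you can think of it as a regular column family with the same weaknesses: a large number of updates on the main column family causes a large number of deletes on the index table, because each update on the main table is translated into a delete/insert operation on the index table.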
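To make the delete/insert translation concrete, here is a minimal in-memory sketch in Java. It is a toy model and not Cassandra's actual storage engine: the class IndexTombstoneSketch and its methods upsert and readIndex are hypothetical names invented for illustration. It assumes the index is laid out as one partition per indexed value (visited = false / true), with one cell per url, and that changing the value leaves a tombstone in the old partition.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only (hypothetical names): shows how updating an indexed column
// can leave tombstones in the index even though no DELETE is ever issued.
class IndexTombstoneSketch {
    // Base table: url -> visited
    private final Map<String, Boolean> base = new LinkedHashMap<>();
    // Index table: indexed value -> (url -> isLive); isLive == false marks a tombstone
    private final Map<Boolean, Map<String, Boolean>> index = new LinkedHashMap<>();

    // Models both INSERT and UPDATE on the base table.
    public void upsert(String url, boolean visited) {
        Boolean old = base.put(url, visited);
        if (old != null && old != visited) {
            // The old index entry is no longer valid: mark it as a tombstone.
            index.get(old).put(url, false);
        }
        // Insert the index entry for the new value.
        index.computeIfAbsent(visited, k -> new LinkedHashMap<>()).put(url, true);
    }

    // Models reading one index partition; returns {live, tombstoned}.
    public int[] readIndex(boolean visited) {
        Map<String, Boolean> partition = index.get(visited);
        if (partition == null) {
            return new int[] {0, 0};
        }
        int live = 0;
        int tombstoned = 0;
        for (boolean isLive : partition.values()) {
            if (isLive) {
                live++;
            } else {
                tombstoned++;
            }
        }
        return new int[] {live, tombstoned};
    }

    public static void main(String[] args) {
        IndexTombstoneSketch sketch = new IndexTombstoneSketch();
        // Save five links as not visited, then mark three of them visited.
        for (int i = 0; i < 5; i++) {
            sketch.upsert("http://example.com/" + i, false);
        }
        for (int i = 0; i < 3; i++) {
            sketch.upsert("http://example.com/" + i, true);
        }
        int[] counts = sketch.readIndex(false);
        System.out.println("Read " + counts[0] + " live and " + counts[1] + " tombstoned cells");
    }
}
```

Under these assumptions, a query such as SELECT * FROM link_list WHERE visited = False has to walk the "false" partition of the index, and every link that has since been marked visited is a tombstone it must skip. That matches the pattern in the question: a few live cells and a fast-growing number of tombstoned ones.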

Hope it helps!