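How to perform a range query in Cassandra using Astyanax and composite columns

Time: 2013-03-15 01:05:55

Tags: cassandra composite-key astyanax

I am developing a blog using Cassandra and Astyanax. It is, of course, just an exercise.

I have modeled the CF_POST_INFO column family this way: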

private static class PostAttribute {

    @Component(ordinal = 0)
    UUID postId;

    @Component(ordinal = 1)
    String category;

    @Component(ordinal = 2)
    String name;

    public PostAttribute() {}

    private PostAttribute(UUID postId, String category, String name) {
        this.postId = postId;
        this.category = category;
        this.name = name;
    }

    public static PostAttribute of(UUID postId, String category, String name) {
        return new PostAttribute(postId, category, name);
    }
}

private static AnnotatedCompositeSerializer<PostAttribute> postSerializer = new AnnotatedCompositeSerializer<>(PostAttribute.class);

private static final ColumnFamily<String, PostAttribute> CF_POST_INFO =
        ColumnFamily.newColumnFamily("post_info", StringSerializer.get(), postSerializer);

Posts are saved this way:

    MutationBatch m = keyspace().prepareMutationBatch();

    ColumnListMutation<PostAttribute> clm = m.withRow(CF_POST_INFO, "posts")
            .putColumn(PostAttribute.of(post.getId(), "author", "id"), post.getAuthor().getId().get())
            .putColumn(PostAttribute.of(post.getId(), "author", "name"), post.getAuthor().getName())
            .putColumn(PostAttribute.of(post.getId(), "meta", "title"), post.getTitle())
            .putColumn(PostAttribute.of(post.getId(), "meta", "pubDate"), post.getPublishingDate().toDate());

    for(String tag : post.getTags()) {
        clm.putColumn(PostAttribute.of(post.getId(), "tags", tag), (String) null);
    }

    for(String category : post.getCategories()) {
        clm.putColumn(PostAttribute.of(post.getId(), "categories", category), (String)null);
    }

    // Send the batch; without this call nothing is written.
    m.execute();

The idea is to have rows that act as time buckets (for example, one row per month or per year).
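
To make the bucketing idea concrete, here is a minimal sketch of how a row key could be derived from a post's publishing date. The class name `PostBuckets`, the `posts:` prefix, and the key format are assumptions for illustration only; they are not part of my current code, which uses the single row key "posts". It also assumes Java 8 or later for `java.time`.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: maps a publishing date to the row key of its bucket.
final class PostBuckets {
    private static final DateTimeFormatter MONTH = DateTimeFormatter.ofPattern("yyyy-MM");

    // One row per month, e.g. all posts from March 2013 share one row.
    static String monthlyRowKey(LocalDate publishedOn) {
        return "posts:" + MONTH.format(publishedOn);
    }

    // One row per year, for blogs with a low posting rate.
    static String yearlyRowKey(LocalDate publishedOn) {
        return "posts:" + publishedOn.getYear();
    }
}
```

With this sketch, a post published on 2013-03-15 would land in the row "posts:2013-03" (monthly) or "posts:2013" (yearly).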

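Now, if I want to get, say, the last 5 posts, how can I do a range query for that? I can run a range query based on the post ID (a UUID), but I don't know which post IDs exist without running another query to fetch them first. What is the Cassandra best practice here?

Any suggestions about the data model are of course welcome; I am new to Cassandra.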

1 Answer:

Answer 0: (score: 2)

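If your use case works the way I think it does, you could modify PostAttribute so that the first component is a TimeUUID. That way the data is stored as a time series, and you can easily pull the oldest 5 or the newest 5 posts using standard techniques. Anyway, here is a sample of what it would look like to me, since you don't really need to write multiple columns if you are already using composites.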

public class PostInfo {
    @Component(ordinal = 0)
    protected UUID timeUuid;

    @Component(ordinal = 1)
    protected UUID postId;

    @Component(ordinal = 2)
    protected String category;

    @Component(ordinal = 3)
    protected String name;

    @Component(ordinal = 4)
    protected UUID authorId;

    @Component(ordinal = 5)
    protected String authorName;

    @Component(ordinal = 6)
    protected String title;

    @Component(ordinal = 7)
    protected Date published;

    public PostInfo() {}

    private PostInfo(final UUID postId, final String category, final String name, final UUID authorId, final String authorName, final String title, final Date published) {
        this.timeUuid = TimeUUIDUtils.getUniqueTimeUUIDinMillis();
        this.postId = postId;
        this.category = category;
        this.name = name;
        this.authorId = authorId;
        this.authorName = authorName;
        this.title = title;
        this.published = published;
    }

    public static PostInfo of(final UUID postId, final String category, final String name, final UUID authorId, final String authorName, final String title, final Date published) {
        return new PostInfo(postId, category, name, authorId, authorName, title, published);
    }
}

private static AnnotatedCompositeSerializer<PostInfo> postInfoSerializer = new AnnotatedCompositeSerializer<>(PostInfo.class);

private static final ColumnFamily<String, PostInfo> CF_POSTS_TIMELINE =
        ColumnFamily.newColumnFamily("post_info", StringSerializer.get(), postInfoSerializer);
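
The reason the TimeUUID goes first is that composite column names are compared component by component, so the first component dominates the sort order within a row. The sketch below only models that ordering in plain Java; it does not use Astyanax, the `long` timestamp merely stands in for the TimeUUID component, and `CompositeOrderSketch` and `Name` are made-up names. It also assumes the column family's comparator is actually declared as a composite with a time-ordered first component.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// In-memory model of how composite column names are ordered within a row.
final class CompositeOrderSketch {
    static final class Name {
        final long timestamp; // stands in for the TimeUUID component
        final String postId;  // stands in for the remaining components

        Name(long timestamp, String postId) {
            this.timestamp = timestamp;
            this.postId = postId;
        }
    }

    // Compare the first component, and fall back to the second only on ties.
    static final Comparator<Name> BY_COMPONENTS =
            Comparator.comparingLong((Name n) -> n.timestamp)
                      .thenComparing(n -> n.postId);

    // Returns the post IDs in the order the columns would be stored.
    static List<String> sortedPostIds(Name... names) {
        List<Name> copy = new ArrayList<>(Arrays.asList(names));
        copy.sort(BY_COMPONENTS);
        List<String> ids = new ArrayList<>();
        for (Name n : copy) {
            ids.add(n.postId);
        }
        return ids;
    }
}
```

Because the post ID only breaks ties, the columns end up in publishing order no matter what the IDs look like.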

You would save it like this:

MutationBatch m = keyspace().prepareMutationBatch();

ColumnListMutation<PostInfo> clm = m.withRow(CF_POSTS_TIMELINE, "all" /* or whatever makes sense for you such as year or month or whatever */)
        // All the data lives in the composite column name, so the column value is left empty.
        .putEmptyColumn(PostInfo.of(post.getId(), post.getCategory(), post.getName(), post.getAuthor().getId(), post.getAuthor().getName(), post.getTitle(), post.getPublishedOn()));
m.execute();

Then you can query like this:

OperationResult<ColumnList<PostInfo>> result = getKeyspace()
    .prepareQuery(CF_POSTS_TIMELINE)
    .getKey("all" /* or whatever makes sense like month, year, etc */)
    .withColumnRange(new RangeBuilder()
        .setLimit(5)
        .setReversed(true)
        .build())
    .execute();
ColumnList<PostInfo> columns = result.getResult();
for (Column<PostInfo> column : columns) {
    // do what you need here
}
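
To see why a reversed range with a limit returns the newest posts, here is a small in-memory model of a single row. It is only a sketch and does not talk to Cassandra: a `TreeMap` stands in for the sorted columns of the row, a `long` timestamp stands in for the TimeUUID component, and `TimelineSketch` is a made-up name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// In-memory model of one time-ordered row and a reversed, limited range read.
final class TimelineSketch {
    // Columns within a row are kept sorted by name; the TreeMap models that.
    private final NavigableMap<Long, String> columns = new TreeMap<>();

    void put(long publishedMillis, String title) {
        columns.put(publishedMillis, title);
    }

    // Walk the columns from the end and stop after `limit` entries,
    // which mirrors setReversed(true) combined with setLimit(limit).
    List<String> latest(int limit) {
        List<String> titles = new ArrayList<>();
        for (Map.Entry<Long, String> e : columns.descendingMap().entrySet()) {
            if (titles.size() == limit) {
                break;
            }
            titles.add(e.getValue());
        }
        return titles;
    }
}
```

If you bucket rows by month or year, keep in mind that the newest bucket may hold fewer than 5 posts, in which case you would also have to read the previous bucket.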