Question

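What is the most efficient way, performance-wise, to load a single (or a few) wide rows from Cassandra into C#? My wide rows have 10,000-100,000 columns. The primary key consists of several values, but the column key is a single string and the column value is a single counter (see the schema below).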

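Using "tracing" in cqlsh I can see that Cassandra can select a wide row with 17,000 columns in 44 ms, but loading this data all the way into C# using the Datastax driver takes 700 ms. Is there a faster way? I need to load the full wide row in 50-100 ms. (Is there a more native way? A way to minimize network traffic? A faster driver? Another configuration of the driver? Or something else?)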

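I don't actually need all 17,000 columns. I only need the columns where 'support' >= 2, or the top 1000 columns sorted by 'support'. But since 'support' is my column value, I don't know how to express a query like that in CQL.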

This is my table:

CREATE TABLE real_time.grouped_feature_support (
    algorithm_id int,
    group_by_feature_id int,
    select_feature_id int,
    group_by_feature_value text,
    select_feature_value text,
    support counter,
    PRIMARY KEY ((algorithm_id, group_by_feature_id, select_feature_id, group_by_feature_value), select_feature_value)
);

This is how I access the data using the Datastax driver:

var table = session.GetTable<GroupedFeatureSupportDataEntry>();
var query = table.Where(x => x.CustomerAlgorithmId == customerAlgorithmId
    && x.GroupByFeatureId == groupedFeatureId
    && myGroupedFeatureValues.Contains(x.GroupByFeatureValue)
    && x.GroupByFeatureValue == groupedFeatureValue
    && x.SelectFeatureId == selectFeatureId)
    .Select(x => new
    {
        x.GroupByFeatureValue,
        x.SelectFeatureValue,
        x.Support,
    })
    .Take(1000000);
var result = query.Execute();

Answer 1

If you are looking for the best performance when retrieving large result sets, you should not use Linq-to-cql or any other mapping component.

You can retrieve the rows using the technique documented on the driver readme; in your case it would look something like:

var query = "SELECT * from grouped_feature_support WHERE" + 
            " algorithm_id = ? AND group_by_feature_id = ? " +
            " AND select_feature_id = ? AND group_by_feature_value = ?";
//Prepare the query once in your application lifetime
var ps = session.Prepare(query);
//Reuse the prepared statement by binding different parameters to it
var rs = session.Execute(ps.Bind(parameters));
foreach (var row in rs)
{
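  //The enumerator yields all the rows from Cassandra,
  //retrieving them in the background in blocks of 5000 (determined by the page size).
}
//You can also use the IEnumerable<T> Linq extensions to filter,
//e.g. to keep only the rows with support >= 2, as the question asks.
//Note: a RowSet is typically consumed as it is enumerated, so use this
//instead of the foreach above rather than after it.
var filteredRows = rs.Where(r => r.GetValue<long>("support") >= 2);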
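
The question also asks for the top 1000 columns by 'support'. Since CQL cannot filter or sort on a counter value here, one option is to do both steps on the client once the rows have arrived. The following is only a minimal sketch of that idea: `SupportFilter`, `TopBySupport` and the sample data are hypothetical names and values made up for illustration, and the code runs on in-memory pairs rather than against a real cluster.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SupportFilter
{
    // Keeps the entries whose support is at least minSupport, orders them by
    // support (highest first) and returns at most maxCount of them.
    public static List<KeyValuePair<string, long>> TopBySupport(
        IEnumerable<KeyValuePair<string, long>> columns, long minSupport, int maxCount)
    {
        return columns
            .Where(c => c.Value >= minSupport)
            .OrderByDescending(c => c.Value)
            .Take(maxCount)
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample data standing in for (select_feature_value, support) pairs.
        var columns = new[]
        {
            new KeyValuePair<string, long>("red", 1),
            new KeyValuePair<string, long>("green", 7),
            new KeyValuePair<string, long>("blue", 2),
            new KeyValuePair<string, long>("black", 4),
        };

        foreach (var c in TopBySupport(columns, 2, 1000))
        {
            Console.WriteLine($"{c.Key}: {c.Value}");
        }
    }
}
```

With the driver, the input could presumably be built from the row set, for example `rs.Select(r => new KeyValuePair<string, long>(r.GetValue<string>("select_feature_value"), r.GetValue<long>("support")))`. Note that this only trims the result after it has been fetched; every column of the wide row still travels over the network, so it does not by itself reduce the load time.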

What is the fastest way to load data from Cassandra into C#?