I'm trying to do a select where using the DataStax Cassandra Connector, but I get the following error:

java.io.IOException: Exception during preparation of SELECT "path" FROM "tracking"."user_page_action" WHERE token("user_id") > ? AND token("user_id") <= ? AND user_id = ? ALLOW FILTERING: user_id cannot be restricted by more than one relation if it includes an Equal

I really don't understand why the connector adds the other restrictions.

This is what I'm trying to read, just like in their documentation:

spark.cassandraTable(keySpace, table).select(column).where(whereColumn + " = ?", whereColumnValue).collect()

user_id is the primary key of the table. I also tried the same select where in the terminal, and it does work.
I looked at a similar question, but it didn't help:
Dataframe where clause doesn't work when use spark cassandra connector
Answer 0 (score: 0)
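As you've noticed, the spark-cassandra-connector adds a range restriction on the token. Normally the connector splits your query into multiple queries, one per token range, so that each query runs against a replica that owns that range, which ensures data locality. In your case you are supplying the full partition key with user_id = value (arguably Spark is not the right tool for this, but I don't know what your application is doing). There was some discussion in the Spark-Cassandra-Connector project about handling this case; I don't know whether it was ever implemented.

However, if you switch to Cassandra 2.2 or 3 (I assume you are running Cassandra 2.1), Cassandra will accept the generated query, that is, a query in which the partition key is restricted by both an equality and a range. I tested it on 2.2.6 and 3.0.5.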
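
To make the token-range splitting described above more concrete, here is a minimal sketch in plain Scala (no Spark or Cassandra needed). It is not the connector's actual code: the object TokenRangeSketch and the helper buildQuery are hypothetical names made up for illustration. It only shows how prepending a token-range restriction to a user-supplied where clause yields the CQL text seen in the error.

```scala
// Illustrative only: NOT the connector's real implementation.
// Mimics how a per-token-range restriction is combined with the user's WHERE clause.
object TokenRangeSketch {

  // Hypothetical helper: builds the CQL text for one token-range split.
  def buildQuery(keyspace: String,
                 table: String,
                 column: String,
                 partitionKey: String,
                 userWhere: Option[String]): String = {
    // Wrap an identifier in double quotes, as in the error message.
    def q(name: String): String = "\"" + name + "\""

    // Restriction limiting the query to a single token range (bounds bound later).
    val tokenRange =
      s"token(${q(partitionKey)}) > ? AND token(${q(partitionKey)}) <= ?"

    // The user's predicate, if any, is appended after the token-range restriction.
    val where = (tokenRange +: userWhere.toSeq).mkString(" AND ")

    s"SELECT ${q(column)} FROM ${q(keyspace)}.${q(table)} WHERE $where ALLOW FILTERING"
  }

  def main(args: Array[String]): Unit = {
    // Same keyspace, table, and columns as in the question.
    println(buildQuery("tracking", "user_page_action", "path", "user_id", Some("user_id = ?")))
  }
}
```

Running main prints the same statement that appears in the exception: user_id ends up restricted both through token("user_id") and through user_id = ?, which is the combination Cassandra 2.1 rejects.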