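I want to use Cassandra for web analytics, specifically for customer segmentation. The analysis will collect page-view data for every customer/visitor, with the following fields: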
userName,country,city,gender,timeStamp,source,campaign,pageUrl,timeOnPage
This data has to be sliceable by every dimension to support customer segmentation. For example:
SELECT * FROM Users WHERE country = 'USA' AND pageUrl = 'http://mystore.com/bestsettingproduct' AND timeStamp < DateTime.Now AND timeStamp > DateTime.Now - Days.30
OR
SELECT * FROM Users WHERE campaign = 'Last email campaign' AND timeStamp < DateTime.Now AND timeStamp > DateTime.Now - Days.30
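As far as I know, in Cassandra you can only query by key. Given these dynamic queries, where one or more dimensions may appear in the WHERE clause, what would a good data model look like?
I am considering creating several tables with the following keys: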
Table1 (userName, country, city, gender, timeStampWeek, timeStamp, source, campaign, pageUrl, timeOnPage,
PRIMARY KEY ((timeStampWeek, country), city, gender, timeStamp, source, campaign, pageUrl, timeOnPage));
Table2 (userName, country, city, gender, timeStampWeek, timeStamp, source, campaign, pageUrl, timeOnPage,
PRIMARY KEY ((timeStampWeek, city), country, gender, timeStamp, source, campaign, pageUrl, timeOnPage));
Table3 (userName, country, city, gender, timeStampWeek, timeStamp, source, campaign, pageUrl, timeOnPage,
PRIMARY KEY ((timeStampWeek, campaign), country, city, gender, timeStamp, source, pageUrl, timeOnPage));
And so on for every combination of dimensions? That seems crazy. Is there a smarter way to model these queries?
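To give a sense of scale, here is a rough Python sketch that counts how many tables the one-table-per-combination approach would need. It assumes the filterable dimensions are the six listed in DIMENSIONS (an illustrative choice from the fields above), that every table is bucketed by timeStampWeek, and that one table is created per non-empty subset of dimensions. The names partition_keys and count_tables are made up for this illustration.

```python
from itertools import combinations

# Assumed filterable dimensions, picked from the field list for illustration only.
DIMENSIONS = ["country", "city", "gender", "source", "campaign", "pageUrl"]


def partition_keys(dimensions):
    """Yield one hypothetical partition key per non-empty subset of dimensions.

    Each key is prefixed with the timeStampWeek bucket, mirroring the
    table sketches above.
    """
    for size in range(1, len(dimensions) + 1):
        for subset in combinations(dimensions, size):
            yield ("timeStampWeek",) + subset


def count_tables(dimensions):
    """Count how many tables would be needed, one per partition key."""
    return sum(1 for _ in partition_keys(dimensions))


if __name__ == "__main__":
    print(count_tables(DIMENSIONS))  # 2**6 - 1 = 63
```

Under these assumptions that is 63 tables, and every page view would have to be written to each of them, which is why this approach feels unworkable to me.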