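I have two datasets: a collection of attributes for entity A and a collection of attributes for entity B. For simplicity, assume entity A is a user and entity B is a restaurant bill. A user's attributes might be name, age, country, state, city, and so on, while a bill's attributes might be restaurant name, country, state, city, food items, and so on. What statistical model can I use to assess eating habits, i.e., whether bills occur more often when the country, state, city, or some other attribute matches between the two sets? In other words, which attribute match carries the most significance? How can I find this in R?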
Sample collections:
collection1 = [
{UserId:x, Country:US, State:New York, City: New York},
{UserId:y, Country:US, State:Florida, City:Orlando},
{UserId:z, Country:US, State:Florida, City:FortMyers}
]
collection2 = [
{UserId:x, RestaurantName:a, Country:US, State:New York, City: New York},
{UserId:y, RestaurantName:b, Country:US, State:Florida, City:Orlando},
{UserId:z, RestaurantName:c, Country:Canada, State:Ontario, City:Toronto}
]
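
To make the question concrete, here is a minimal sketch in base R of the quantity I am after: the share of bills on which each attribute matches the user's. The data frame names `users` and `bills` and the `.user`/`.bill` suffixes are my own placeholders, and the rows simply mirror the sample collections above. This is only a descriptive starting point under those assumptions, not a statistical model.

```r
# Hypothetical data frames mirroring the sample collections above.
users <- data.frame(
  UserId  = c("x", "y", "z"),
  Country = c("US", "US", "US"),
  State   = c("New York", "Florida", "Florida"),
  City    = c("New York", "Orlando", "FortMyers"),
  stringsAsFactors = FALSE
)
bills <- data.frame(
  UserId         = c("x", "y", "z"),
  RestaurantName = c("a", "b", "c"),
  Country        = c("US", "US", "Canada"),
  State          = c("New York", "Florida", "Ontario"),
  City           = c("New York", "Orlando", "Toronto"),
  stringsAsFactors = FALSE
)

# Join each bill to its user; shared column names get the given suffixes.
joined <- merge(users, bills, by = "UserId", suffixes = c(".user", ".bill"))

# For each shared attribute, flag whether the user's value equals the bill's.
attrs <- c("Country", "State", "City")
matches <- sapply(attrs, function(a) {
  joined[[paste0(a, ".user")]] == joined[[paste0(a, ".bill")]]
})

# Share of bills on which each attribute matches.
match_rate <- colMeans(matches)
print(match_rate)
```

With only three rows the rates say nothing about significance. On real data, the per-attribute match indicators in `matches` could presumably feed a formal test or model, which is the part I am unsure how to choose.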