SQLAlchemy: ntiles for all permutations of column combinations

Time: 2014-03-05 20:27:34

Tags: python sql-server sqlalchemy quantile

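We have a SQL Server query in which we need to generate ntiles for a growing number of variables, with the variables combined with one another in various permutations. Here is an excerpt that illustrates what I mean:

Statement 1: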

ntile(10) over (partition by  MAUorALL, User_Type, fsi.Month_ID 
                    order by Objects_Created) AS Ntile_Mon_Objects_Created,

Statement 2:

ntile(10) over (partition by  MAUorALL, User_Type, fsi.Month_ID, *Country*
          order by Objects_Created) AS Ntile_Country_Objects_Created

Statement 3:

ntile(10) over (partition by  MAUorALL, User_Type, fsi.Month_ID, *User_Type*
                 order by Objects_Created) AS Ntile_UT_Objects_Created

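You can see that the statements are identical except that the second and third add the emphasized columns (shown between asterisks), Country and User_Type. So we are taking ntiles of the same variable, Objects_Created, at different levels of specificity, and we also need ntiles for the various possible permutations of these variables, for example:

Statement 4: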

ntile(10) over (partition by  MAUorALL, User_Type, fsi.Month_ID, *Country, User_Type*
            order by Objects_Created) AS Ntile_Country_UT_Objects_Created

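We can code these permutations by hand up to a point, but if we could use SQLAlchemy to produce all permutations of these variables, it would probably make things easier. Does anyone have an example I could repurpose?

Thanks for your help!

1 Answer:

Answer 0 (score: 0)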

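I don't know how fsi relates to the other columns, but assume all the data lives in a single model like the one below (the approach extends easily to other SQLAlchemy queries):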

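from sqlalchemy import Column, Integer, String, func, over
from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4+

Base = declarative_base()

class User(Base):
    __tablename__ = 't_users'
    id = Column(Integer, primary_key=True)
    MAUorALL = Column(String)
    User_Type = Column(String)
    Country = Column(String)
    Month_ID = Column(Integer)
    Objects_Created = Column(Integer)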

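With that model, you can build the query simply by using itertools.permutations (or itertools.combinations, depending on what you want to achieve). The code below generates a query on the User table that includes the various ntiles. I assume reading the code is enough to understand what is going on: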

# configuration: {label: Column}
column_labels = {
        'Country': User.Country,
        'UT': User.User_Type,
        }

def get_ntile(additional_columns=None):
    """Return a labeled ntile(10) window expression partitioned by the
    fixed columns plus any *additional_columns* (keys of column_labels).
    """
    # Columns that every ntile is partitioned by.
    partition_by = [
        User.MAUorALL,
        User.User_Type,
        User.Month_ID,
    ]
    label = "Ntile_Objects_Created"
    if additional_columns:
        # Extend the partition and encode the extra columns in the label.
        partition_by.extend(column_labels[name] for name in additional_columns)
        label = "Ntile_{}_Objects_Created".format("_".join(additional_columns))
    return over(
        func.ntile(10),
        partition_by=partition_by,
        order_by=User.Objects_Created,
    ).label(label)

def get_query(additional_columns=('UT', 'Country')):
    """Return a query that selects each User together with ntiles for the
    fixed columns and for every permutation of *additional_columns*.
    """
    from itertools import permutations  # or combinations
    tiles = [
        get_ntile(comb)
        for r in range(len(additional_columns) + 1)
        for comb in permutations(additional_columns, r)
    ]
    # `session` is assumed to be an already configured SQLAlchemy Session.
    return session.query(User, *tiles)

q = get_query()
print([c["name"] for c in q.column_descriptions])
# >>> ['User', 'Ntile_Objects_Created', 'Ntile_UT_Objects_Created', 'Ntile_Country_Objects_Created', 'Ntile_UT_Country_Objects_Created', 'Ntile_Country_UT_Objects_Created']

for row in q.all():
    print(row)
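
On the permutations-versus-combinations choice: the order of columns inside PARTITION BY does not change which rows end up in the same partition, so Ntile_UT_Country_Objects_Created and Ntile_Country_UT_Objects_Created above should hold identical values. If that duplication is not wanted, itertools.combinations is probably the better fit. The sketch below is dependency-free and only builds the SQL text, to make the difference in the number of generated expressions visible. The helper names ntile_sql and all_ntiles, and the BASE_COLUMNS / EXTRA_COLUMNS constants, are hypothetical and not part of the code above.

```python
from itertools import combinations, permutations

# Fixed partition columns and optional extras, named as in the question.
BASE_COLUMNS = ["MAUorALL", "User_Type", "fsi.Month_ID"]
EXTRA_COLUMNS = {"UT": "User_Type", "Country": "Country"}  # label -> column


def ntile_sql(extra_labels, buckets=10, order_by="Objects_Created"):
    """Build one ntile() window expression as SQL text (hypothetical helper)."""
    columns = BASE_COLUMNS + [EXTRA_COLUMNS[label] for label in extra_labels]
    alias = "_".join(["Ntile", *extra_labels, order_by])
    return "ntile({}) over (partition by {} order by {}) AS {}".format(
        buckets, ", ".join(columns), order_by, alias)


def all_ntiles(labels, chooser=combinations):
    """One expression per subset (or per ordering, if chooser=permutations)."""
    return [
        ntile_sql(subset)
        for r in range(len(labels) + 1)
        for subset in chooser(labels, r)
    ]


expressions = all_ntiles(["UT", "Country"])
for expression in expressions:
    print(expression)

# Compare how many expressions each strategy produces.
print(len(expressions), len(all_ntiles(["UT", "Country"], chooser=permutations)))
```

The same swap should work in get_query by replacing permutations with combinations. With n optional columns that yields 2**n expressions, whereas permutations grows much faster.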