我正在SQL语言之上构建一个dsl,其目标是完善一些粗略的边缘,键入安全性和次要查询优化。
在我的第一次传递中,我使用GADT来强制执行目标语言将执行的几个限制,但输出在大多数情况下是垃圾。即选择语句中的混合长度列,如
select mean x, x by date, time from tbl
我通过将列标记为Selections(向量)或Reductions(标量)来阻止这些混合选择,因此编译器可以静态地阻止这些操作。但是,我从未弄清楚如何阻止从群组中进行选择。
要刮掉的下一个功能是将类型检查添加到列算术运算中,我的第一个坏主意是添加另一个幻像类型来携带列类型。这不会工作,因为它会破坏我的Schema类型,我不能再选择混合类型的列。这使我认为无标记嵌入Columns可能是合适的,因为我可以根据上下文获得适当的表示(类型,大小等)。缺点是这似乎推迟了我想编码的所有不变量直到运行时。为了说明这一点,这是GADT中add
的粗略伪代码和无标记方法。
-- | GADT
add :: Column ty s1 -> Column ty s2
add a b = BinExpr Add a b
-- | Tagless
add :: Column c => c -> c -> c
add a b | (a :: ColType) == (b :: ColType) = binexpr Add a b
| TypeMismatch a b
GADT方法还具有能够指定Num ty => ...
以防止在字符串列上添加的优点。这种类型的问题似乎会传播到我的所有代码中,因为我现在必须遵循运行时检查以确保防止混合大小的Select
或Group
仅在Query Selection
上工作。所以我的问题是:在使用Tagless风格的灵活性时,我能以某种方式保留GADT的静态保证吗?
我目前的GADT
{-# LANGUAGE
GADTs
, DataKinds
, TypeFamilies #-}
-- Kinds
-- | State Tags
-- These constructors are promoted to the kind level to be used as
-- phantom types to enforce static query invariants. For instance
-- these are strategically used to prevent selecting from a Groupby
-- or preventing selection of mixlengthed columns
data Sz = Selection | Reduction deriving Show
--------------------------------------------------------------------
-- Schema
data Schema = Schema {
name :: Maybe String,
spec :: [Column Selection]
} deriving Eq
--------------------------------------------------------------------
-- Column Datatype
data Column (a :: Sz) where
Column :: String -> Column Selection
Assign :: String -> Column a -> Column a
FKey :: String -> Schema -> Column Selection
BinExpr :: BinOp -> Column a -> Column b -> Column (ColOp a b)
LogExpr :: LogOp -> Column a -> Column b -> Column Selection
AggExpr :: AggOp -> Column Selection -> Column Reduction
type family ColOp (a :: Sz) (b :: Sz) where
-- | If both sides a scalar then output
-- is still a scalar, otherwise return
-- a vector, as we can perform the op
-- elemwise or brodcast a scalar across
-- the vector
ColOp Reduction Reduction = Reduction
ColOp a b = Selection
--------------------------------------------------------------------
-- Query types
class Tabular repr where
meta :: repr (a :: Sz) -> Schema
data Query (a :: Sz) where
Table :: Schema -> Query Selection
Select :: (Tabular t, Show (t b)) => [Column a] -> t b -> Query Selection
Where :: (Tabular t, Show (t Selection)) => [Column Selection] -> t Selection -> Query Selection
Group :: (Tabular t, Show (t Selection)) => [Column Selection] -> t Selection -> Query Reduction
无标记列原型
{-# language FlexibleInstances #-}
data Sz = Selection | Reduction
deriving Show
data ColType
= Int | Float | String | Bool
| TypeMismatch ColType ColType
deriving (Show, Eq)
data BinOp = Add | Mul
data AggOp = Mean | Count
-- Tagless AST
class Show repr => Column repr where
column :: String -> ColType -> repr
assign :: String -> repr -> repr
binExpr :: BinOp -> repr -> repr -> repr
aggExpr :: AggOp -> repr -> repr
-- Show Instances
instance Show BinOp where
show Add = " + "
show Mul = " * "
instance Show AggOp where
show Mean = "mean "
show Count = "count "
-- I Would like to bring these into the type level (ie performed statically)
opSz :: Sz -> Sz -> Sz
opSz Reduction Reduction = Reduction
opSz a b = Selection
typeCheck :: ColType -> ColType -> ColType
typeCheck a b | a == b = a
| otherwise = TypeMismatch a b
-- Tagless Interpreters
-- | Stringify
instance Column ([Char]) where
column s t = s
assign s c = s ++ ": " ++ show c
binExpr o l r = l ++ show o ++ r
aggExpr op l = show op ++ l
-- | Column Size
instance Column Sz where
column s t = Selection
assign s c = c :: Sz
binExpr o l r = opSz l r
aggExpr op l = Reduction
-- | Column Type
instance Column ColType where
column s t = t
assign s c = c :: ColType
binExpr o l r = typeCheck l r
aggExpr o l = l :: ColType
add :: Column c => c -> c -> c
add l r = binExpr Add l r
columns :: Column c => [c]
columns = [ column "a" Float
, column "b" Int
, column "c" String ]
a :: Column c => c
a = columns !! 0
b :: Column c => c
b = columns !! 1
-- add a b :: ColType => TypeMismatch Float Int
编辑:我可以通过向类
添加类型参数来获得类似的行为class Column repr where
column :: String -> repr Selection
assign :: String -> repr (a :: Sz) -> repr (a :: Sz)
binExpr :: BinOp -> repr (a :: Sz) -> repr (b :: Sz) -> repr (OpSz a b)
aggExpr :: AggOp -> repr Selection -> repr Reduction
但我仍然很好奇我如何定义一个解释器,它将ColType
带到类型级别,以便我可以定义add :: C t -> C t -> C t
,因此add a :: (C String) b :: (C Int)
不会进行类型检查