Question

Suppose I have a module with a bunch of recursive type declarations, and there are already many consumers that do open M and then use type1, type2 and type3:

module M = struct
  type type1 = 
    | T1_Case1 of string * type2
    | T1_Case2 of type3
  and type2 = type3 list
  and type3 = 
    | T3_Case1 of type1
    | T3_Case2 of int
end

In one of the processing steps, one of these types needs to be extended into a type that carries some extra data, something like:

  type type1 = 
    | T1_Case1 of string * type2
    | T1_Case2 of type3
  and type2 = type3 list
  and type3_ = 
    | T3_Case1 of type1
    | T3_Case2 of int
  and type3 = extra_data * type3_

Is it possible to achieve this without involving external codegen tools or breaking existing code?

The latter requirement rules out turning M into a functor parameterized by the annotation type:

(* won't work since all places that used to deal with type3 should be updated *)
module type AnnotationT = sig type t end

module M_F(Annotation: AnnotationT) = struct
  type type1 = 
    | T1_Case1 of string * type2
    | T1_Case2 of type3
  and type2 = type3 list
  and type3_ = 
    | T3_Case1 of type1
    | T3_Case2 of int
  and type3 = Annotation.t * type3_
end

module M = struct 
 include M_F(struct type t = unit end)
end

I guess I need something like the following (it does not work, because I can't use functor application in a type declaration):

module type EnvelopeType = sig type t end

module type AnnotatorType = functor(Envelope: EnvelopeType) -> sig
  type t
end

module Annotated_M(Annotator: AnnotatorType) = struct
  type tt = T: Annotator().t
  type type1 = 
    | T1_Case1 of string * type2
    | T1_Case2 of type3
  and type2 = type3 list
  and type3_ = 
    | T3_Case1 of type1
    | T3_Case2 of int
  (* does not work *)
  and type3 = Annotator(struct type t = type3_ end).t
end

module M = struct
  include Annotated_M(functor (Envelope: EnvelopeType) -> struct
    type t = Envelope.t
  end)
end

module M2 = struct
  include Annotated_M(functor (Envelope: EnvelopeType) -> struct
    type t = extra_data * Envelope.t
  end)
end

Answer 1

Your attempt works fine if you switch from functors to parameterized types:

module type AnnotatorType = sig

  type 'a annotated

end

module Annotated_M(Annotator: AnnotatorType) = struct

  type type1 = 
    | T1_Case1 of string * type2
    | T1_Case2 of type3
  and type2 = type3 list
  and type3_ = 
    | T3_Case1 of type1
    | T3_Case2 of int
  and type3 = type3_ Annotator.annotated

end

module M = struct
  include Annotated_M(struct
    type 'a annotated = 'a
  end)
end

module M2 = struct
  include Annotated_M(struct
    type 'a annotated = extra_data * 'a
  end)
end
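To check that this keeps existing consumers compiling, here is a minimal, self-contained sketch. The record extra_data (with a line field) and the functions count_ints_1, count_ints_3 and line_of are hypothetical names made up for illustration; the question leaves extra_data undefined.

```ocaml
(* Hypothetical payload: the question does not say what extra_data is. *)
type extra_data = { line : int }

module type AnnotatorType = sig
  type 'a annotated
end

module Annotated_M (Annotator : AnnotatorType) = struct
  type type1 =
    | T1_Case1 of string * type2
    | T1_Case2 of type3
  and type2 = type3 list
  and type3_ =
    | T3_Case1 of type1
    | T3_Case2 of int
  and type3 = type3_ Annotator.annotated
end

(* Identity annotation: M.type3 is just M.type3_, as before the change. *)
module M = Annotated_M (struct type 'a annotated = 'a end)

(* Annotated variant: M2.type3 is a pair of extra_data and M2.type3_. *)
module M2 = Annotated_M (struct type 'a annotated = extra_data * 'a end)

(* Old-style consumer: matches on type3 constructors directly. *)
let rec count_ints_1 (t : M.type1) : int =
  match t with
  | M.T1_Case1 (_, items) ->
    List.fold_left (fun acc x -> acc + count_ints_3 x) 0 items
  | M.T1_Case2 x -> count_ints_3 x

and count_ints_3 (t : M.type3) : int =
  match t with
  | M.T3_Case1 t1 -> count_ints_1 t1
  | M.T3_Case2 _ -> 1

let sample : M.type1 =
  M.T1_Case1 ("x", [ M.T3_Case2 1; M.T3_Case1 (M.T1_Case2 (M.T3_Case2 2)) ])

(* New-style value: every type3 node carries its extra_data. *)
let annotated : M2.type3 = ({ line = 7 }, M2.T3_Case2 42)

let line_of (t : M2.type3) : int = (fst t).line
```

Note that M.type1 and M2.type1 are distinct types, so moving a value from one representation to the other still needs an explicit conversion function.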

Answer 2

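This sounds like an instance of the X-Y problem: problem Y has technical workarounds, but problem X would still remain. I will elaborate on that later, but first let me suggest a few technical solutions.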

You can parameterize the functor that defines the type by the payload type of each (or some) of the branches of the defined type, e.g.,

module type Variants = sig 
  type t1
  type t2
  ...
end

module Define(V : Variants) = struct
  type t = V1 of V.t1 | V2 of V.t2 ...
end
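As a rough, compilable instance of this functor idea, assuming just two branches: the constructor names V1 and V2, the modules Plain and Annotated, and the record extra_data are all illustrative choices, not part of the question.

```ocaml
(* Hypothetical payload type, for illustration only. *)
type extra_data = { line : int }

module type Variants = sig
  type t1
  type t2
end

(* The functor fixes the shape of the variant; the argument picks payloads. *)
module Define (V : Variants) = struct
  type t = V1 of V.t1 | V2 of V.t2
end

(* Plain payloads. *)
module Plain = Define (struct
  type t1 = int
  type t2 = string
end)

(* The same branches, each payload paired with extra_data. *)
module Annotated = Define (struct
  type t1 = extra_data * int
  type t2 = extra_data * string
end)

let plain : Plain.t = Plain.V1 3
let annotated : Annotated.t = Annotated.V1 ({ line = 7 }, 3)
```

Plain.t and Annotated.t are unrelated types here, so code written against one does not automatically accept the other.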

Alternatively, you can use parametric types:
```
type ('a,'b,..) t = A of 'a | B of 'b
```
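A small sketch of the parametric variant with two parameters. The aliases plain and annotated, the function describe, and the record extra_data are hypothetical names used only for this example.

```ocaml
(* Hypothetical payload type, for illustration only. *)
type extra_data = { line : int }

(* One definition; the payload of each branch is a type parameter. *)
type ('a, 'b) t = A of 'a | B of 'b

(* Two instantiations of the same shape. *)
type plain = (int, string) t
type annotated = (extra_data * int, extra_data * string) t

(* Code that is generic in the payloads works for both instantiations. *)
let describe (show_a : 'a -> string) (show_b : 'b -> string)
    (v : ('a, 'b) t) : string =
  match v with
  | A a -> "A " ^ show_a a
  | B b -> "B " ^ show_b b

let p : plain = A 3
let q : annotated = B ({ line = 7 }, "x")

let show_plain : plain -> string = describe string_of_int (fun s -> s)

let show_annotated : annotated -> string =
  describe
    (fun (d, n) -> Printf.sprintf "%d@%d" n d.line)
    (fun (d, s) -> Printf.sprintf "%s@%d" s d.line)
```

Unlike the functor version, both instantiations share the constructors A and B, so functions such as describe can be reused across them.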

Both solutions are, to some extent, an abuse of data constructors, since they trade efficiency for flexibility.

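Now let me elaborate on the X problem. My guess is that you are trying to represent some language transformation (i.e., some intermediate representations, DSLs, etc.). You have a pair of representations that are very much isomorphic to each other, yet they are defined as different types, and the problem you perceive is the code duplication involved in defining these two types. Code duplication is a common indicator of a missing abstraction.

Type definitions in OCaml do not introduce abstractions; they define the representation of a type. Abstractions are introduced by module types and implemented with modules, and a module uses a particular representation of a type. So what you really need in order to solve the X problem is a proper abstraction. In our case, relying on a tagged embedding forces us to disclose our representation and to make it concrete rather than abstract. This is sometimes convenient, but it is rather fragile and quickly leads to code duplication. One possible solution is the tagless final style, which lets us define extensible abstractions without committing to a particular representation.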
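To make the tagless final suggestion a little more concrete, here is a minimal sketch under the assumption that the question's types describe some small term language. The signature SYM, its operation names, and the two interpreters Sum and Show are invented for illustration and are not from the question.

```ocaml
(* The abstraction: how to build terms, with no representation fixed. *)
module type SYM = sig
  type t1
  type t3
  val case1 : string -> t3 list -> t1  (* plays the role of T1_Case1 *)
  val case2 : t3 -> t1                 (* plays the role of T1_Case2 *)
  val of_t1 : t1 -> t3                 (* plays the role of T3_Case1 *)
  val of_int : int -> t3               (* plays the role of T3_Case2 *)
end

(* A term is written once, against the signature only. *)
module Example (S : SYM) = struct
  let term = S.case1 "root" [ S.of_int 1; S.of_t1 (S.case2 (S.of_int 2)) ]
end

(* Interpreter 1: a term is represented by the sum of its integers. *)
module Sum = struct
  type t1 = int
  type t3 = int
  let case1 _ items = List.fold_left ( + ) 0 items
  let case2 x = x
  let of_t1 x = x
  let of_int n = n
end

(* Interpreter 2: a term is represented by its printed form. *)
module Show = struct
  type t1 = string
  type t3 = string
  let case1 name items =
    Printf.sprintf "%s[%s]" name (String.concat "; " items)
  let case2 x = x
  let of_t1 x = "(" ^ x ^ ")"
  let of_int = string_of_int
end

module Summed = Example (Sum)
module Shown = Example (Show)
```

In this style, a processing step that needs extra data would be one more module implementing SYM, with t3 chosen as whatever annotated representation that step requires, while terms written against SYM stay unchanged.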

(Optionally) attaching extra data to types
