Question

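I am trying to write a data module in Scala.

While loading all of the data in parallel, some data depends on other data, so the execution order has to be managed efficiently.

For example, in code, I keep a map from data names to manifests: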

val dataManifestMap = Map(
  "foo" -> manifest[String],
  "bar" -> manifest[Int],
  "baz" -> manifest[Int],
  "foobar" -> manifest[Set[String]], // must be loaded after "foo" and "bar" are ready
  "foobarbaz" -> manifest[String] // must be loaded after "foobar" and "baz" are ready
)

The data will be stored in a mutable hash map:

private var dataStorage = new mutable.HashMap[String, Future[Any]]()

Some code loads the data:

def loadAllData(): Future[Unit] = {
  Future.join(
    (dataManifestMap map {
      case (data, m) => loadData(data, m) } // function has all the string matching and loading stuff
    ).toSeq
  )    
}

def loadData[T](data: String, m: Manifest[T]): Future[Unit] = {
  val d = data match {
    case "foo" => Future.value("foo")
    case "bar" => Future.value(3)
    case "foobar" => // do something with dataStorage("foo") and dataStorage("bar")
    ... // and so forth (in a real example it would be much more complicated for sure)
  }

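  // Store the loaded value under its key; the assignment yields Unit,
  // so the result is a Future[Unit] (Future.value(Unit) would wrap the Unit companion object instead)
  d map { dVal =>
    this.synchronized { dataStorage(data) = Future.value(dVal) }
  }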
}

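With this approach, I cannot make sure that "foobar" is loaded only once "foo" and "bar" are ready, and so on.

How can I manage this in a "cool" way, given that I might have hundreds of different pieces of data?

It would be "awesome" if I could have some kind of data structure that records what must be loaded after what, so that the sequential execution could be handled neatly with flatMap.

Thanks for your help.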

Answer 1

All things being equal, I would lean towards using for comprehensions. For example:

def findBucket: Future[Bucket[Empty]] = ???
def fillBucket(bucket: Bucket[Empty]): Future[Bucket[Water]] = ???
def extinguishOvenFire(waterBucket: Bucket[Water]): Future[Oven] = ???
def makeBread(oven: Oven): Future[Bread] = ???
def makeSoup(oven: Oven): Future[Soup] = ???
def eatSoup(soup: Soup, bread: Bread): Unit = ???


def doLunch = {
  for (bucket <- findBucket;
       filledBucket <- fillBucket(bucket);
       oven <- extinguishOvenFire(filledBucket);
       soupFuture = makeSoup(oven);
       breadFuture = makeBread(oven);
       soup <- soupFuture;
       bread <- breadFuture) {
    eatSoup(soup, bread)
  }
}

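This chains the futures together and calls the relevant methods once their dependencies are satisfied. Note that we use = inside the for comprehension, which lets us start two Futures at the same time. As it stands, doLunch returns Unit, but if you replace the last few lines with: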

// ..snip..
       bread <- breadFuture) yield {
    eatSoup(soup, bread)
    oven
  }
}

then it will return Future[Oven] instead, which might be useful if you want to use the oven for something else after lunch.
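To make the effect of = concrete, here is a minimal sketch, assuming scala.concurrent.Future and the global execution context. The names slow, sequential, concurrent and timed are hypothetical helpers invented for this illustration, not part of the code above. In sequential, the bread future is only created after the soup future has completed; in concurrent, both are created before either is awaited, so they can overlap.

```scala
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ForComprehensionDemo {
  // Hypothetical stand-in for a slow task: sleeps, then returns its label.
  def slow(label: String, millis: Long): Future[String] = Future {
    blocking { Thread.sleep(millis) }
    label
  }

  // Sequential: each generator creates its future only after the previous one completes.
  def sequential(): Future[String] =
    for {
      oven  <- slow("oven", 100)
      soup  <- slow("soup", 300)
      bread <- slow("bread", 300) // not started until soup is done
    } yield s"$oven:$soup+$bread"

  // Concurrent: the = lines create both futures immediately, so they can run side by side.
  def concurrent(): Future[String] =
    for {
      oven   <- slow("oven", 100)
      soupF  = slow("soup", 300)  // started here, not awaited yet
      breadF = slow("bread", 300) // started right after soupF
      soup   <- soupF
      bread  <- breadF
    } yield s"$oven:$soup+$bread"

  // Blocks for the result and reports the elapsed wall-clock time in milliseconds.
  def timed[A](f: => Future[A]): (A, Long) = {
    val start = System.nanoTime()
    val result = Await.result(f, 5.seconds)
    (result, (System.nanoTime() - start) / 1000000)
  }
}
```

Calling ForComprehensionDemo.timed on each variant should show the concurrent version finishing sooner when the execution context has more than one thread available; the exact timings depend on the machine.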

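As for your code, my first question is whether you should consider Spray cache, since it looks like it might fit your requirements. If not, my next thought would be to replace the stringly typed interface you currently have and go for something based on typed method calls: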

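// Assumption: this code uses scala.concurrent.Future (Future.apply, Future.fold, mapTo),
// so dataStorage is taken to hold that Future type.
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global // needed by Future(...) and the for comprehension

// Record the future under its key and hand the same future back to the caller
private def save[T](key: String)(value: Future[T]): Future[T] = this.synchronized {
  dataStorage(key) = value
  value
}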

def loadFoo = save("foo"){Future("foo")}
def loadBar = save("bar"){Future(3)}
def loadFooBar = save("foobar"){
  for (foo <- loadFoo;
       bar <- loadBar) yield foo + bar // Or whatever
}
def loadBaz = save("baz"){Future(200L)}
def loadAll = {
  val topLevelFutures = Seq(loadFooBar, loadBaz)
  // Use standard library function to combine futures
  Future.fold(topLevelFutures)(())((u,f) => ())
}

// I don't consider this method necessary, but if you've got a legacy API to support...
def loadData[T](key: String)(implicit manifest: Manifest[T]) = {
  val future = key match {
      case "foo" => loadFoo
      case "bar" => loadBar
      case "foobar" => loadFooBar
      case "baz" => loadBaz
      case "all" => loadAll
  }
  future.mapTo[T]
}
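The question also asks for a data structure that records what must be loaded after what. The following is only a sketch of one way to do that, assuming scala.concurrent.Future and an acyclic dependency graph (a cycle would recurse forever, and an unknown key throws NoSuchElementException). DataSpec, DataLoader and DataLoaderDemo are hypothetical names invented for this illustration. Each entry lists its dependencies and a function that computes it from their values; the loader caches one future per key, so every piece of data is loaded at most once and independent entries can run concurrently.

```scala
import scala.collection.mutable
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.reflect.ClassTag

// Hypothetical description of one piece of data: the keys it depends on,
// and how to compute it once those dependencies' values are available.
final case class DataSpec(deps: Seq[String], compute: Map[String, Any] => Future[Any])

final class DataLoader(specs: Map[String, DataSpec])(implicit ec: ExecutionContext) {
  // One cached future per key, so each piece of data is computed at most once.
  private val cache = mutable.HashMap.empty[String, Future[Any]]

  def load(key: String): Future[Any] = synchronized {
    cache.get(key) match {
      case Some(existing) => existing
      case None =>
        val spec = specs(key)
        // Start (or reuse) every dependency first; independent ones run concurrently.
        val depFutures = spec.deps.map(dep => load(dep).map(value => dep -> value))
        val result = Future.sequence(depFutures).flatMap(pairs => spec.compute(pairs.toMap))
        cache(key) = result
        result
    }
  }

  // Typed access, in the spirit of the mapTo call above.
  def get[T: ClassTag](key: String): Future[T] = load(key).mapTo[T]

  def loadAll(): Future[Unit] =
    Future.sequence(specs.keys.toSeq.map(load)).map(_ => ())
}

object DataLoaderDemo {
  import ExecutionContext.Implicits.global

  val specs: Map[String, DataSpec] = Map(
    "foo"       -> DataSpec(Nil, _ => Future("foo")),
    "bar"       -> DataSpec(Nil, _ => Future(3)),
    "baz"       -> DataSpec(Nil, _ => Future(200)),
    "foobar"    -> DataSpec(Seq("foo", "bar"),
                     in => Future(Set(in("foo").toString, in("bar").toString))),
    "foobarbaz" -> DataSpec(Seq("foobar", "baz"),
                     in => Future(in("foobar").toString + in("baz")))
  )

  val loader = new DataLoader(specs)

  def main(args: Array[String]): Unit = {
    val all = loader.loadAll().flatMap(_ => loader.get[String]("foobarbaz"))
    println(Await.result(all, 5.seconds))
  }
}
```

The trade-off compared with the typed methods above is that values travel as Any and are cast at the edges, which brings back some of the stringly typed risk in exchange for a declarative dependency table that scales to hundreds of entries.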

Scala: Good way to manage sequential execution of Futures?
