Treating an SQL ResultSet like a Scala Stream

Asked: 2012-03-09 15:25:01

Tags: scala stream resultset

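When I query a database and receive a (forward-only, read-only) ResultSet back, the ResultSet acts like a list of database rows.

I am trying to find a way to treat this ResultSet like a Scala Stream. That would allow operations such as filter and map without consuming large amounts of RAM.

I implemented a tail-recursive method to extract the individual items, but it requires all items to be in memory at the same time, which is a problem if the ResultSet is very large: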

// Iterate through the result set and gather all of the String values into a list
// then return that list
@tailrec
def loop(resultSet: ResultSet,
         accumulator: List[String] = List()): List[String] = {
  if (!resultSet.next) accumulator.reverse
  else {
    val value = resultSet.getString(1)
    loop(resultSet, value +: accumulator)
  }
}

10 answers:

Answer 0: (score: 69)

I haven't tested it, but why wouldn't this work?

new Iterator[String] {
  def hasNext = resultSet.next()
  def next() = resultSet.getString(1)
}.toStream

Answer 1: (score: 10)

A utility function for @elbowich's answer:

This allows you to use type inference, because T is inferred from the function you pass in:

def results[T](resultSet: ResultSet)(f: ResultSet => T) = {
  new Iterator[T] {
    def hasNext = resultSet.next()
    def next() = f(resultSet)
  }
}
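As a rough usage sketch (not taken from the answer itself), the call below shows the element type being inferred from the function in the second parameter list. `fakeResultSet` is a hypothetical in-memory stand-in built with `java.lang.reflect.Proxy` so that the example runs without a database; it supports only `next`, `getString(Int)` and `getInt(Int)`. The object name `ResultsUsage` is likewise just for illustration.

```scala
import java.lang.reflect.{InvocationHandler, Method, Proxy}
import java.sql.ResultSet

object ResultsUsage {
  // Hypothetical stand-in for a forward-only ResultSet with two columns (String, Int).
  // Only next(), getString(Int) and getInt(Int) are implemented.
  def fakeResultSet(rows: Vector[(String, Int)]): ResultSet = {
    var pos = -1 // the cursor starts before the first row, as in JDBC
    val handler = new InvocationHandler {
      override def invoke(proxy: AnyRef, method: Method, args: Array[AnyRef]): AnyRef =
        method.getName match {
          case "next" =>
            pos += 1
            java.lang.Boolean.valueOf(pos < rows.length)
          case "getString" => rows(pos)._1
          case "getInt"    => java.lang.Integer.valueOf(rows(pos)._2)
          case other       => throw new UnsupportedOperationException(other)
        }
    }
    Proxy
      .newProxyInstance(classOf[ResultSet].getClassLoader, Array[Class[_]](classOf[ResultSet]), handler)
      .asInstanceOf[ResultSet]
  }

  // The utility function from the answer, unchanged
  def results[T](resultSet: ResultSet)(f: ResultSet => T) = {
    new Iterator[T] {
      def hasNext = resultSet.next()
      def next() = f(resultSet)
    }
  }

  // T is inferred as (String, Int) from the function literal; no annotation needed
  val it = results(fakeResultSet(Vector("a" -> 1, "b" -> 2))) { rs =>
    rs.getString(1) -> 100 * rs.getInt(2)
  }

  // Consume the iterator in the usual hasNext/next alternation
  val pairs = it.toList
}
```

Because `hasNext` advances the cursor here, the iterator should only be consumed by code that calls `hasNext` exactly once per element, as `toList` does.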

Answer 2: (score: 8)

This sounds like a great opportunity for an implicit class. First, define the implicit class somewhere:

import java.sql.ResultSet

object Implicits {

    implicit class ResultSetStream(resultSet: ResultSet) {

        def toStream: Stream[ResultSet] = {
            new Iterator[ResultSet] {
                def hasNext = resultSet.next()

                def next() = resultSet
            }.toStream
        }
    }
}

Next, import this implicit class wherever you execute your query and obtain the ResultSet object:

import com.company.Implicits._

Finally, get the data out with the toStream method. For example, to get all the ids:

val allIds = resultSet.toStream.map(result => result.getInt("id"))

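Answer 3: (score: 3)

I needed something similar. Building on elbowich's very cool answer, I wrapped it up a bit, and instead of a String I return the result (so you can get any column):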

def resultSetItr(resultSet: ResultSet): Stream[ResultSet] = {
  new Iterator[ResultSet] {
    def hasNext = resultSet.next()
    def next() = resultSet
  }.toStream
}

I needed to access table metadata, but this works for table rows too (you could call stmt.executeQuery(sql) instead of md.getColumns):

val md = connection.getMetaData()
val columnItr = resultSetItr(md.getColumns(null, null, "MyTable", null))
val columns = columnItr.map(col => {
  val columnType = col.getString("TYPE_NAME")
  val columnName = col.getString("COLUMN_NAME")
  val columnSize = col.getString("COLUMN_SIZE")
  new Column(columnName, columnType, columnSize.toInt, false)
})

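Answer 4: (score: 2)

Because ResultSet is just a mutable object that is navigated with next, we need to define our own notion of the next row. We can do that with an input function, as follows: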

class ResultSetIterator[T](rs: ResultSet, nextRowFunc: ResultSet => T)
extends Iterator[T] {

  // Row read ahead by hasNext and not yet returned by next()
  private var nextVal: Option[T] = None

  override def hasNext: Boolean = {
    // Advance the cursor only when nothing is buffered, so repeated calls do not skip rows
    if (nextVal.isEmpty && rs.next()) {
      nextVal = Some(nextRowFunc(rs))
    }
    nextVal.isDefined
  }

  override def next(): T = {
    if (!hasNext) throw new ResultSetIteratorOutOfBoundsException
    val ret = nextVal.get
    nextVal = None // consume the buffered row so the following call fetches a new one
    ret
  }

  class ResultSetIteratorOutOfBoundsException extends Exception("ResultSetIterator reached end of list and next can no longer be called. hasNext should return false.")
}

Edit: convert to a stream or anything else, as above.

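Answer 5: (score: 0)

This implementation, although longer and clumsier, corresponds better to the ResultSet contract. The side effect has been removed from hasNext(...) and moved into next().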

new Iterator[String] {
  private var available = resultSet.next()
  override def hasNext: Boolean = available
  override def next(): String = {
    val string = resultSet.getString(1)
    available = resultSet.next()
    string
  }
}
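To see what moving the side effect buys, the sketch below runs the same four rows through a `hasNext` that advances the cursor and through the variant above, deliberately calling `hasNext` twice before each `next()`. `fakeResultSet` is a hypothetical in-memory stand-in (only `next` and `getString(Int)` work), not a real JDBC driver, and the helper names are made up for the example.

```scala
import java.lang.reflect.{InvocationHandler, Method, Proxy}
import java.sql.ResultSet

object HasNextComparison {
  // Hypothetical single-column stand-in for a forward-only ResultSet
  def fakeResultSet(rows: Vector[String]): ResultSet = {
    var pos = -1 // the cursor starts before the first row
    val handler = new InvocationHandler {
      override def invoke(proxy: AnyRef, method: Method, args: Array[AnyRef]): AnyRef =
        method.getName match {
          case "next" =>
            pos += 1
            java.lang.Boolean.valueOf(pos < rows.length)
          case "getString" => rows(pos)
          case other       => throw new UnsupportedOperationException(other)
        }
    }
    Proxy
      .newProxyInstance(classOf[ResultSet].getClassLoader, Array[Class[_]](classOf[ResultSet]), handler)
      .asInstanceOf[ResultSet]
  }

  // hasNext advances the cursor, as in the shorter answers
  def sideEffecting(resultSet: ResultSet): Iterator[String] = new Iterator[String] {
    def hasNext = resultSet.next()
    def next() = resultSet.getString(1)
  }

  // hasNext only reads a flag; next() advances the cursor, as in this answer
  def lookahead(resultSet: ResultSet): Iterator[String] = new Iterator[String] {
    private var available = resultSet.next()
    override def hasNext: Boolean = available
    override def next(): String = {
      val string = resultSet.getString(1)
      available = resultSet.next()
      string
    }
  }

  // Drain an iterator while calling hasNext twice per element
  def drainWithDoubleCheck(it: Iterator[String]): List[String] = {
    val out = List.newBuilder[String]
    while (it.hasNext && it.hasNext) out += it.next()
    out.result()
  }

  val data = Vector("r1", "r2", "r3", "r4")
  val fromSideEffecting = drainWithDoubleCheck(sideEffecting(fakeResultSet(data))) // every other row is skipped
  val fromLookahead = drainWithDoubleCheck(lookahead(fakeResultSet(data)))         // all rows, in order
}
```

One trade-off of the look-ahead variant is that it reads the first row as soon as the iterator is constructed, and always stays one row ahead of the caller.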

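Answer 6: (score: 0)

I think most of the implementations above have a non-deterministic hasNext method: calling it twice moves the cursor to the second row. I would suggest using something like this instead: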

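  new Iterator[ResultSet] {
    // Caveat: support for isLast is optional on TYPE_FORWARD_ONLY result sets,
    // and drivers generally report false for an empty result set, in which
    // case hasNext would wrongly announce a row
    def hasNext = {
      !resultSet.isLast
    }
    def next() = {
      resultSet.next()
      resultSet
    }
  }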

Answer 7: (score: 0)

Iterator.continually(rs.next())
  .takeWhile(identity)
  .map(_ => Model(
      id = rs.getInt("id"),
      text = rs.getString("text")
   ))

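Answer 8: (score: 0)

Here is an alternative, similar to Sergey Alaev's and thoredge's solutions, for when we need a solution that honors the Iterator contract, in which hasNext has no side effects.

Assume a function f: ResultSet => T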

Iterator.unfold(resultSet.next()) { hasNext =>
  // Explicit tuple: (value for the current row, whether another row follows)
  Option.when(hasNext)((f(resultSet), resultSet.next()))
}

I find it useful to expose this as a map "extension method" on ResultSet.

implicit class ResultSetOps(resultSet: ResultSet) {
  def map[T](f: ResultSet => T): Iterator[T] = {
    Iterator.unfold(resultSet.next()) { hasNext =>
      // Explicit tuple: (value for the current row, whether another row follows)
      Option.when(hasNext)((f(resultSet), resultSet.next()))
    }
  }
}
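A possible usage sketch for this extension method, assuming Scala 2.13 or later (where `Iterator.unfold` and `Option.when` are available). `fakeResultSet` is a hypothetical in-memory stand-in supporting only `next` and `getString(Int)`, used so the example runs without a database; the pair passed to `Option.when` is written as an explicit tuple.

```scala
import java.lang.reflect.{InvocationHandler, Method, Proxy}
import java.sql.ResultSet

object UnfoldUsage {
  // Hypothetical single-column stand-in for a forward-only ResultSet
  def fakeResultSet(rows: Vector[String]): ResultSet = {
    var pos = -1 // the cursor starts before the first row
    val handler = new InvocationHandler {
      override def invoke(proxy: AnyRef, method: Method, args: Array[AnyRef]): AnyRef =
        method.getName match {
          case "next" =>
            pos += 1
            java.lang.Boolean.valueOf(pos < rows.length)
          case "getString" => rows(pos)
          case other       => throw new UnsupportedOperationException(other)
        }
    }
    Proxy
      .newProxyInstance(classOf[ResultSet].getClassLoader, Array[Class[_]](classOf[ResultSet]), handler)
      .asInstanceOf[ResultSet]
  }

  // Same shape as the extension method in the answer
  implicit class ResultSetOps(resultSet: ResultSet) {
    def map[T](f: ResultSet => T): Iterator[T] =
      Iterator.unfold(resultSet.next()) { hasNext =>
        Option.when(hasNext)((f(resultSet), resultSet.next()))
      }
  }

  val rs = fakeResultSet(Vector("x", "y"))

  // ResultSet has no map of its own, so the implicit class supplies it
  val it = rs.map(_.getString(1))

  // Calling hasNext repeatedly does not skip rows
  val stillThere = it.hasNext && it.hasNext

  val names = it.toList
}
```

Note that the value for a row is computed when that row is first inspected, so `f` should read everything it needs from the ResultSet immediately rather than holding on to the ResultSet itself.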

Answer 9: (score: 0)

Another variation on the above, which works with Scala 2.12:

implicit class ResultSetOps(resultSet: ResultSet) {
 def map[T](f: ResultSet => T): Iterator[T] =
  Iterator.continually(resultSet).takeWhile(_.next()).map(f)
}