Using Blazegraph's `NanoSparqlServer`, how do I clear the graph between tests?

Asked: 2016-08-09 18:17:57

Tags: java testing junit blazegraph

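I'm using version 1.5.3 of Bigdata DB (now renamed Blazegraph). I have a service that acts as a gateway and implements a bunch of persistence-layer methods, and I'm now writing unit tests for those methods. I'm using the embedded setup with Jetty. My setup code is as follows: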

    int port = 0; // random port
    String namespace = "kb";
    int queryThreadPoolSize = ConfigParams.DEFAULT_QUERY_THREAD_POOL_SIZE;
    boolean forceOverflow = false;

    String servletContextListenerClass = ConfigParams.DEFAULT_SERVLET_CONTEXT_LISTENER_CLASS;
    System.setProperty(SystemProperties.JETTY_XML, "jetty.xml");
    String propertyFile = "RWStore.properties";
    System.setProperty(SystemProperties.BIGDATA_PROPERTY_FILE, propertyFile);

    final Map<String, String> initParams = new LinkedHashMap<>();
    initParams.put("propertyFile", propertyFile);
    initParams.put("namespace", namespace);
    initParams.put("queryThreadPoolSize", Integer.toString(queryThreadPoolSize));
    initParams.put("forceOverflow", Boolean.toString(forceOverflow));
    initParams.put("servletContextListenerClass", servletContextListenerClass);

    sparqlServer = NanoSparqlServer.newInstance(port, journal, initParams);

    LOGGER.info("Waiting for NanoSparqlServer to start...");
    NanoSparqlServer.awaitServerStart(sparqlServer);
    serverUrl = sparqlServer.getURI().toString();
    LOGGER.info("NanoSparqlServer started on: " + serverUrl + '\n');

I'm using the default jetty.xml configuration from the com.bigdata 1.5.3 jar:

<?xml version="1.0"?>
<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure.dtd">
<!-- See http://www.eclipse.org/jetty/documentation/current/ -->
<!-- See http://wiki.eclipse.org/Jetty/Reference/jetty.xml_syntax -->
<Configure id="Server" class="org.eclipse.jetty.server.Server">

    <!-- =========================================================== -->
    <!-- Configure the Server Thread Pool.                           -->
    <!-- The server holds a common thread pool which is used by      -->
    <!-- default as the executor used by all connectors and servlet  -->
    <!-- dispatches.                                                 -->
    <!--                                                             -->
    <!-- Configuring a fixed thread pool is vital to controlling the -->
    <!-- maximal memory footprint of the server and is a key tuning  -->
    <!-- parameter for tuning.  In an application that rarely blocks -->
    <!-- then maximal threads may be close to the number of 5*CPUs.  -->
    <!-- In an application that frequently blocks, then maximal      -->
    <!-- threads should be set as high as possible given the memory  -->
    <!-- available.                                                  -->
    <!--                                                             -->
    <!-- Consult the javadoc of o.e.j.util.thread.QueuedThreadPool   -->
    <!-- for all configuration that may be set here.                 -->
    <!-- =========================================================== -->
    <Arg name="threadpool"><New id="threadpool" class="org.eclipse.jetty.util.thread.QueuedThreadPool"/></Arg>
    <Get name="ThreadPool">
        <Set name="minThreads" type="int"><Property name="jetty.threads.min" default="10"/></Set>
        <Set name="maxThreads" type="int"><Property name="jetty.threads.max" default="64"/></Set>
        <Set name="idleTimeout" type="int"><Property name="jetty.threads.timeout" default="60000"/></Set>
        <Set name="detailedDump">false</Set>
    </Get>

    <!-- =========================================================== -->
    <!-- Get the platform mbean server                               -->
    <!-- =========================================================== -->
    <Call id="MBeanServer" class="java.lang.management.ManagementFactory"
          name="getPlatformMBeanServer" />

    <!-- =========================================================== -->
    <!-- Initialize the Jetty MBean container                        -->
    <!-- =========================================================== -->
    <!-- Note: This breaks CI if it is enabled
    <Call name="addBean">
      <Arg>
        <New id="MBeanContainer" class="org.eclipse.jetty.jmx.MBeanContainer">
          <Arg>
            <Ref refid="MBeanServer" />
          </Arg>
        </New>
      </Arg>
    </Call>-->

    <!-- Add the static log to the MBean server.
    <Call name="addBean">
      <Arg>
        <New class="org.eclipse.jetty.util.log.Log" />
      </Arg>
    </Call>-->

    <!-- For remote MBean access (optional)
    <New id="ConnectorServer" class="org.eclipse.jetty.jmx.ConnectorServer">
      <Arg>
        <New class="javax.management.remote.JMXServiceURL">
          <Arg type="java.lang.String">rmi</Arg>
          <Arg type="java.lang.String" />
          <Arg type="java.lang.Integer"><SystemProperty name="jetty.jmxrmiport" default="1090"/></Arg>
          <Arg type="java.lang.String">/jndi/rmi://<SystemProperty name="jetty.jmxrmihost" default="localhost"/>:<SystemProperty name="jetty.jmxrmiport" default="1099"/>/jmxrmi</Arg>
        </New>
      </Arg>
      <Arg>org.eclipse.jetty.jmx:name=rmiconnectorserver</Arg>
      <Call name="start" />
    </New>-->

    <!-- =========================================================== -->
    <!-- Http Configuration.                                         -->
    <!-- This is a common configuration instance used by all         -->
    <!-- connectors that can carry HTTP semantics (HTTP, HTTPS, SPDY)-->
    <!-- It configures the non wire protocol aspects of the HTTP     -->
    <!-- semantic.                                                   -->
    <!--                                                             -->
    <!-- Consult the javadoc of o.e.j.server.HttpConfiguration       -->
    <!-- for all configuration that may be set here.                 -->
    <!-- =========================================================== -->
    <New id="httpConfig" class="org.eclipse.jetty.server.HttpConfiguration">
        <Set name="secureScheme">https</Set>
        <Set name="securePort"><Property name="jetty.secure.port" default="8443" /></Set>
        <Set name="outputBufferSize"><Property name="jetty.output.buffer.size" default="32768" /></Set>
        <Set name="requestHeaderSize"><Property name="jetty.request.header.size" default="8192" /></Set>
        <Set name="responseHeaderSize"><Property name="jetty.response.header.size" default="8192" /></Set>
        <Set name="sendServerVersion"><Property name="jetty.send.server.version" default="true" /></Set>
        <Set name="sendDateHeader"><Property name="jetty.send.date.header" default="false" /></Set>
        <Set name="headerCacheSize">512</Set>
        <!-- Uncomment to enable handling of X-Forwarded- style headers
        <Call name="addCustomizer">
          <Arg><New class="org.eclipse.jetty.server.ForwardedRequestCustomizer"/></Arg>
        </Call>
        -->
    </New>

    <!-- Configure the HTTP endpoint.                                -->
    <Call name="addConnector">
        <Arg>
            <New class="org.eclipse.jetty.server.ServerConnector">
                <Arg name="server"><Ref refid="Server" /></Arg>
                <Arg name="factories">
                    <Array type="org.eclipse.jetty.server.ConnectionFactory">
                        <Item>
                            <New class="org.eclipse.jetty.server.HttpConnectionFactory">
                                <Arg name="config"><Ref refid="httpConfig" /></Arg>
                            </New>
                        </Item>
                    </Array>
                </Arg>
                <Set name="host"><SystemProperty name="jetty.host" /></Set>
                <Set name="port"><SystemProperty name="jetty.port" default="9999" /></Set>
                <Set name="idleTimeout"><SystemProperty name="http.timeout" default="30000"/></Set>
            </New>
        </Arg>
    </Call>

    <!-- =========================================================== -->
    <!-- Set handler Collection Structure                            -->
    <!-- =========================================================== -->
    <Set name="handler">
        <New id="Handlers" class="org.eclipse.jetty.server.handler.HandlerCollection">
            <Set name="handlers">
                <Array type="org.eclipse.jetty.server.Handler">
                    <Item>
                        <New id="Contexts" class="org.eclipse.jetty.server.handler.ContextHandlerCollection">
                            <Call name="addHandler">
                                <Arg>
                                    <!-- This is the redirect from root to /bigdata -->
                                    <New id="moved" class="org.eclipse.jetty.server.handler.MovedContextHandler">
                                        <Set name="contextPath">/</Set>
                                        <Set name="newContextURL">/bigdata</Set>
                                        <Set name="permanent">true</Set>
                                        <Set name="discardPathInfo">false</Set>
                                        <Set name="discardQuery">false</Set>
                                    </New>
                                </Arg>
                            </Call>
                            <Call name="addHandler">
                                <Arg>
                                    <!-- This is the bigdata web application. -->
                                    <New id="WebAppContext" class="org.eclipse.jetty.webapp.WebAppContext">
                                        <Set name="war"><SystemProperty name="jetty.resourceBase" default="bigdata-war/src"/></Set>
                                        <Set name="contextPath">/bigdata</Set>
                                        <Set name="descriptor">WEB-INF/web.xml</Set>
                                        <Set name="parentLoaderPriority">true</Set>
                                        <Set name="extractWAR">false</Set>
                                        <Set name="overrideDescriptor"><SystemProperty name="jetty.overrideWebXml" default="bigdata-war/src/WEB-INF/override-web.xml"/></Set>
                                        <Set name="maxFormContentSize">10485760</Set>
                                    </New>
                                </Arg>
                            </Call>
                        </New>
                    </Item>
                </Array>
            </Set>
        </New>
    </Set>

    <!-- =========================================================== -->
    <!-- extra server options                                        -->
    <!-- =========================================================== -->
    <Set name="stopAtShutdown">true</Set>
    <Set name="stopTimeout">5000</Set>
    <Set name="dumpAfterStart"><Property name="jetty.dump.start" default="false"/></Set>
    <Set name="dumpBeforeStop"><Property name="jetty.dump.stop" default="false"/></Set>

</Configure>

...and I'm using the default RWStore.properties from the same jar:

#
# Note: These options are applied when the journal and the triple store are
# first created.

##
## Journal options.
##

# The backing file. This contains all your data.  You want to put this someplace
# safe.  The default locator will wind up in the directory from which you start
# your servlet container.
com.bigdata.journal.AbstractJournal.createTempFile=true

# The persistence engine.  Use 'Disk' for the WORM or 'DiskRW' for the RWStore.
com.bigdata.journal.AbstractJournal.bufferMode=DiskRW

# Setup for the RWStore recycler rather than session protection.
com.bigdata.service.AbstractTransactionService.minReleaseAge=1

# Enable group commit. See http://wiki.blazegraph.com/wiki/index.php/GroupCommit and BLZG-192.
#com.bigdata.journal.Journal.groupCommit=false

com.bigdata.btree.writeRetentionQueue.capacity=4000
com.bigdata.btree.BTree.branchingFactor=128

# 200M initial extent.
com.bigdata.journal.AbstractJournal.initialExtent=209715200
com.bigdata.journal.AbstractJournal.maximumExtent=209715200

##
## Setup for QUADS mode without the full text index.
##
com.bigdata.rdf.sail.truthMaintenance=false
com.bigdata.rdf.store.AbstractTripleStore.quads=true
com.bigdata.rdf.store.AbstractTripleStore.statementIdentifiers=false
com.bigdata.rdf.store.AbstractTripleStore.textIndex=false
com.bigdata.rdf.store.AbstractTripleStore.axiomsClass=com.bigdata.rdf.axioms.NoAxioms

# Bump up the branching factor for the lexicon indices on the default kb.
com.bigdata.namespace.kb.lex.com.bigdata.btree.BTree.branchingFactor=400

# Bump up the branching factor for the statement indices on the default kb.
com.bigdata.namespace.kb.spo.com.bigdata.btree.BTree.branchingFactor=1024

# Uncomment to enable collection of OS level performance counters.  When
# collected they will be self-reported through the /counters servlet and
# the workbench "Performance" tab.
#
# com.bigdata.journal.Journal.collectPlatformStatistics=true

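With these configurations the server starts fine, and I can reach the web console, run queries, and interact with the graph from Java through the BigdataGraphClient API. Now I just want to figure out how to clear the graph so that data doesn't leak between unit tests. I've tried the following: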

  1. Removing all edges and vertices with the BigdataGraphClient Java API. For reasons I don't understand, this leaves some edges and vertices in place. `graph.getEdges().forEach(Edge::remove); graph.getVertices().forEach(Vertex::remove);`

  2. Stopping and destroying the server. This leaves the journal file behind.

    sparqlServer.stop(); sparqlServer.destroy();

  3. Using a temporary journal file by setting com.bigdata.journal.AbstractJournal.createTempFile=true and commenting out com.bigdata.journal.AbstractJournal.file=bigdata.jnl. This cleans up the journal file, but a DatasetNotFoundException is thrown after the first test.

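  4. Putting the journal file in a temporary directory at /tmp/bigdata-test/bigdata.jnl and deleting and recreating that directory between tests. This has the same problem as #2.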

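  5. Creating my own Journal object and passing it as the IndexManager argument of NanoSparqlServer.newInstance. This fails because of a known issue with old Lucene dependencies. I can't include those in my project because I depend on a newer version of Lucene that conflicts with them. The error thrown is the same as the one recorded in the referenced Jira ticket.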

Does anyone know a clean, reliable way to clear the graph between tests (in a tearDown method that runs after each test)?

1 Answer:

Answer 0 (score: 0)

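It turns out I was running into a different problem, which made me think my first approach wasn't working. That approach works fine. I'm leaving this question up in case anyone else wants to know how to do this. I'm also open to cleaner or faster ways: if a test inserts a lot of data, iterating over all the triples/quads and deleting them one by one could be slow. I'd prefer to simply unlink the journal file.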
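
For anyone after a faster reset than removing elements one at a time, one option that may be worth trying is the standard SPARQL 1.1 Update statement `DROP ALL`, sent to the namespace's SPARQL endpoint over HTTP. Below is a minimal sketch using only the JDK. `GraphCleaner` and its methods are hypothetical names, not part of Blazegraph, and the endpoint URL is an assumption: going by the `/bigdata` context path in jetty.xml and the `kb` namespace above, it is probably something like `http://localhost:<port>/bigdata/namespace/kb/sparql`, but check it against the running server. This has not been verified against 1.5.3.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical test helper; not part of the Blazegraph API. */
final class GraphCleaner {

    private GraphCleaner() {}

    /** Builds the form-encoded request body for a SPARQL 1.1 Update. */
    static String buildUpdateBody(String sparqlUpdate) throws UnsupportedEncodingException {
        return "update=" + URLEncoder.encode(sparqlUpdate, "UTF-8");
    }

    /**
     * POSTs "DROP ALL" to the given SPARQL endpoint and returns the HTTP status code.
     * The endpoint URL is an assumption (see the note above the code).
     */
    static int dropAll(String sparqlEndpoint) throws IOException {
        byte[] body = buildUpdateBody("DROP ALL").getBytes(StandardCharsets.UTF_8);
        HttpURLConnection conn = (HttpURLConnection) new URL(sparqlEndpoint).openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(body);
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

In a JUnit tearDown this could be called as `GraphCleaner.dropAll(endpoint)`, failing the test on any non-2xx status so that a reset that did not happen is noticed immediately.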