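I am using Neo4j Enterprise 2.2.0 with a server extension written in Java. My machine has an SSD, 32 GB of RAM and an Intel Core i7 CPU, and runs Windows 8. I run a standalone server, which I start by running Neo4j.bat in the bin folder.
Right now, inserting 10,000 nodes without any relationships takes about 25 seconds (I will need to add the relationships later, but one problem at a time).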
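for (Thing t : things) {
    List<ValuePair> properties = parseThing(t);
    String uid = createUid(t);
    // A new transaction is opened and committed for every single node.
    try (Transaction tx = graphDb.beginTx()) {
        Node node = graphDb.createNode();
        node.setProperty("uid", uid);
        for (ValuePair vp : properties) {
            node.setProperty(vp.getName(), vp.getValue());
        }
        // Without success() the transaction is rolled back on close.
        tx.success();
    }
}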
# Neo4j
# neo4j.properties - database tuning parameters
# Enable this to be able to upgrade a store from an older version.
# The amount of memory to use for mapping the store files, in bytes (or
# kilobytes with the 'k' suffix, megabytes with 'm' and gigabytes with 'g').
# If Neo4j is running on a dedicated server, then it is generally recommended
# to leave about 2-4 gigabytes for the operating system, give the JVM enough
# heap to hold all your transaction state and query context, and then leave the
# rest for the page cache.
# The default page cache memory assumes the machine is dedicated to running
# Neo4j, and is heuristically set to 75% of RAM minus the max Java heap size.
# Enable this to specify a parser other than the default one.
# Keep logical logs, helps debugging but uses more disk space, enabled for
# legacy reasons. To limit space needed to store historical logs use values such
# as: "7 days" or "100M size" instead of "true".
#keep_logical_logs=7 days
# Autoindexing
# Enable auto-indexing for nodes, default is false.
# The node property keys to be auto-indexed, if enabled.
# Enable auto-indexing for relationships, default is false.
# The relationship property keys to be auto-indexed, if enabled.
# Enable shell server so that remote clients can connect via Neo4j shell.
# The network interface IP the shell will listen on (use 0.0.0.0 for all interfaces).
# The port the shell will listen on, default is 1337.
# The type of cache to use for nodes and relationships.
# Maximum size of the heap memory to dedicate to the cached nodes.
# Maximum size of the heap memory to dedicate to the cached relationships.
# Enable online backups to be taken from this database.
# Port to listen to for incoming backup requests.
# Uncomment and specify these lines for running Neo4j in High Availability mode.
# See the High availability setup tutorial for more details on these settings
# http://neo4j.com/docs/2.2.0/ha-setup-tutorial.html
# ha.server_id is the number of each instance in the HA cluster. It should be
# an integer (e.g. 1), and should be unique for each cluster instance.
# ha.initial_hosts is a comma-separated list (without spaces) of the host:port
# where the ha.cluster_server of all instances will be listening. Typically
# this will be the same for all cluster instances.
# IP and port for this instance to listen on, for communicating cluster status
# information with other instances (also see ha.initial_hosts). The IP
# must be the configured IP address for one of the local interfaces.
# IP and port for this instance to listen on, for communicating transaction
# data with other instances (also see ha.initial_hosts). The IP
# must be the configured IP address for one of the local interfaces.
# The interval at which slaves will pull updates from the master. Comment out
# the option to disable periodic pulling of updates. Unit is seconds.
# Amount of slaves the master will try to push a transaction to upon commit
# (default is 1). The master will optimistically continue and not fail the
# transaction even if it fails to reach the push factor. Setting this to 0 will
# increase write performance when writing through master but could potentially
# lead to branched data (or loss of transaction) if the master goes down.
# Strategy the master will use when pushing data to slaves (if the push factor
# is greater than 0). There are two options available "fixed" (default) or
# "round_robin". Fixed will start by pushing to slaves ordered by server id
# (highest first) improving performance since the slaves only have to cache up
# one transaction at a time.
# Policy for how to handle branched data.
# Clustering timeouts
# Default timeout.
# How often heartbeat messages should be sent. Defaults to ha.default_timeout.
# Timeout for heartbeats between cluster members. Should be at least twice that of ha.heartbeat_interval.
# Neo4j
# neo4j-server.properties - runtime operational settings
# Server configuration
# location of the database directory
# Low-level graph engine tuning file
# Database mode
# Allowed values:
# HA - High Availability
# SINGLE - Single mode, default.
# To run in High Availability mode, configure the neo4j.properties config file, then uncomment this line:
# Let the webserver only listen on the specified IP. Default is localhost (only
# accept local connections). Uncomment to allow any connection. Please see the
# security section in the neo4j manual before modifying this.
# Require (or disable the requirement of) auth to access Neo4j
# HTTP Connector
# http port (for all data, administrative, and UI access)
# HTTPS Connector
# Turn https-support on/off
# https port (for all data, administrative, and UI access)
# Certificate location (auto generated if the file does not exist)
# Private key location (auto generated if the file does not exist)
# Internally generated keystore (don't try to put your own
# keystore there, it will get deleted when the server starts)
# Comma separated list of JAX-RS packages containing JAX-RS resources, one
# package name for each mountpoint. The listed package names will be loaded
# under the mountpoints specified. Uncomment this line to mount the
# org.neo4j.examples.server.unmanaged.HelloWorldResource.java from
# neo4j-server-examples under /examples/unmanaged, resulting in a final URL of
# http://localhost:7474/examples/unmanaged/helloworld/{nodeId}
# HTTP logging configuration
# HTTP logging is disabled. HTTP logging can be enabled by setting this
# property to 'true'.
# Logging policy file that governs how HTTP log output is presented and
# archived. Note: changing the rollover and retention policy is sensible, but
# changing the output format is less so, since it is configured to use the
# ubiquitous common log format
# Administration client configuration
# location of the servers round-robin database directory. possible values:
# - absolute path like /var/rrd
# - path relative to the server working directory like data/rrd
# - commented out, will default to the database data directory.
# Property file references
# JVM Parameters
# Remote JMX monitoring, uncomment and adjust the following lines as needed.
# Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords,
# the shipped configuration contains only a read only role called 'monitor' with password 'Neo4j'.
# For more details, see: http://download.oracle.com/javase/7/docs/technotes/guides/management/agent.html
# On Unix based systems the jmx.password file needs to be owned by the user that will run the server,
# and have permissions set to 0600.
# For details on setting these file permissions on Windows see:
# http://docs.oracle.com/javase/7/docs/technotes/guides/management/security-windows.html
# Some systems cannot discover host name automatically, and need this line configured:
# Uncomment the following lines to enable garbage collection logging
# Java Heap Size: by default the Java heap size is dynamically
# calculated based on available system resources.
# Uncomment these lines to set specific initial and maximum
# heap size in MB.
# Wrapper settings
# path is relative to the bin dir
# Wrapper Windows NT/2000/XP Service Properties
# WARNING - Do not modify any of these properties when an application
# using this configuration file has been installed as a service.
# Please uninstall the service before modifying this section. The
# service can then be reinstalled.
# Name of the service
# User account to be used for linux installs. Will default to current
# user if not set.
# Other Neo4j system properties
wrapper.java.additional=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005 -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005
Answer 0 (score: 2)
// Open one transaction around the whole loop instead of one per node.
try (Transaction tx = graphDb.beginTx()) {
    for (Thing t : things) {
        List<ValuePair> properties = parseThing(t);
        String uid = createUid(t);
        Node node = graphDb.createNode();
        node.setProperty("uid", uid);
        for (ValuePair vp : properties) {
            node.setProperty(vp.getName(), vp.getValue());
        }
    }
    tx.success();
}
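One caveat with a single transaction: its state is held on the heap until commit, so for much larger imports it may be worth committing in fixed-size chunks instead. Below is a minimal sketch of the chunking step only, with no Neo4j dependency. `Batches` and `partition` are hypothetical names used for illustration, not part of the Neo4j API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class Batches {
    /**
     * Splits items into consecutive sublist views of at most batchSize elements.
     * The views are backed by the original list, so nothing is copied.
     */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            batches.add(items.subList(i, end));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> things = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        for (List<Integer> batch : partition(things, 3)) {
            // In the extension, each batch would be wrapped in its own
            // beginTx() ... success() block (assumption, not shown here).
            System.out.println(batch);
        }
    }
}
```

With a batch size of a few thousand nodes, the number of commits drops sharply while each transaction stays small enough to fit comfortably in the heap; the right size depends on the data and is something to measure rather than assume.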
Answer 1 (score: 2)
Many thanks to Christian Morgner and Michael Hunger for pointing me in the right direction!
public static final int CPU = Runtime.getRuntime().availableProcessors() * 2;
public static final int BATCH_NODES = 100_000;
public static final int BATCH_RELATIONS = 50_000;

ExecutorService pool = createPool(CPU, CPU * 25);
// Phase 1: create the nodes, one task (and one transaction) per batch.
for (int i = 0; i < things.size(); i = i + BATCH_NODES) {
    CreateNodeRunner nodeRunner;
    if (i + BATCH_NODES < things.size()) {
        nodeRunner = new CreateNodeRunner(graphDb, things.subList(i, i + BATCH_NODES));
    } else {
        nodeRunner = new CreateNodeRunner(graphDb, things.subList(i, things.size()));
    }
    pool.execute(nodeRunner);
}
// awaitTermination() only returns true once shutdown() has been requested.
pool.shutdown();
boolean nodesCreated = false;
try {
    nodesCreated = pool.awaitTermination(1, TimeUnit.DAYS);
} catch (InterruptedException e) {
    logger.debug("CreateNodeThread was interrupted");
}
// Phase 2: create the relationships only after every node exists.
if (nodesCreated) {
    pool = createPool(CPU, CPU * 25);
    for (int i = 0; i < things.size(); i = i + BATCH_RELATIONS) {
        CreateRelationsRunner relationsRunner;
        if (i + BATCH_RELATIONS < things.size()) {
            relationsRunner = new CreateRelationsRunner(graphDb, things.subList(i, i + BATCH_RELATIONS));
        } else {
            relationsRunner = new CreateRelationsRunner(graphDb, things.subList(i, things.size()));
        }
        pool.execute(relationsRunner);
    }
    pool.shutdown();
}
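The `createPool(CPU, CPU * 25)` helper is not shown in the post. One plausible shape, offered purely as an assumption, is a fixed-size `ThreadPoolExecutor` with a bounded queue and `CallerRunsPolicy`, so that a full queue slows the submitting thread down instead of rejecting work. The class name `Pools` and the `runAll` demo method are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class Pools {
    /**
     * Hypothetical stand-in for the post's createPool(threads, queueSize):
     * a fixed number of workers, a bounded queue, and CallerRunsPolicy so
     * the submitter runs the task itself when the queue is full.
     */
    static ExecutorService createPool(int threads, int queueSize) {
        return new ThreadPoolExecutor(
                threads, threads,
                30, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>(queueSize),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    /** Submits the given number of counting tasks and returns how many completed. */
    static int runAll(int tasks) {
        ExecutorService pool = createPool(4, 8);
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.execute(done::incrementAndGet);
        }
        // Request shutdown first; awaitTermination() then waits for queued tasks.
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("pool did not terminate in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println(runAll(100));
    }
}
```

The bounded queue matters when batches are large: an unbounded queue would let the submitting loop race ahead and keep every pending batch reachable in memory at once.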
CreateNodeRunner.java
public class CreateNodeRunner implements Runnable {
    private List<Thing> things;
    private GraphDatabaseService graphDb;

    public CreateNodeRunner(GraphDatabaseService graphDb, List<Thing> things) {
        this.things = things;
        this.graphDb = graphDb;
    }

    @Override
    public void run() {
        // One transaction for the whole batch of nodes.
        try (Transaction tx = graphDb.beginTx()) {
            for (Thing t : things) {
                Node node = graphDb.createNode(t.getLabel());
                node.setProperty("uid", t.getUid());
                for (ValuePair vp : t.getProperties()) {
                    node.setProperty(vp.getName(), vp.getValue());
                }
            }
            tx.success();
        }
    }
}
CreateRelationsRunner.java
public class CreateRelationsRunner implements Runnable {
    private GraphDatabaseService graphDb;
    private List<Thing> things;

    public CreateRelationsRunner(GraphDatabaseService graphDb, List<Thing> things) {
        this.graphDb = graphDb;
        this.things = things;
    }

    @Override
    public void run() {
        // One transaction for the whole batch of relationships.
        try (Transaction tx = graphDb.beginTx()) {
            for (Thing tFrom : things) {
                List<ValuePair> relations = tFrom.getRelations();
                Label label = tFrom.getLabel();
                // Look up the start node by label and uid.
                Node firstNode = graphDb.findNode(label, "uid", tFrom.getUid());
                for (ValuePair vp : relations) {
                    Thing tTo = (Thing) vp.getValue();
                    label = tTo.getLabel();
                    Node secondNode = graphDb.findNode(label, "uid", tTo.getUid());
                    RelationshipType relType = vp.getRelationshipType();
                    firstNode.createRelationshipTo(secondNode, relType);
                }
            }
            tx.success();
        }
    }
}
If you spot a mistake or see a possible improvement, please let me know. :)