database - Bad performance when inserting nodes in Neo4j with the Java API


I am trying to insert 2 million nodes into Neo4j, and I am having trouble with performance.

I am using Neo4j Enterprise 2.2.0 with a server extension written in Java. My computer has an SSD, 32 GB of RAM, and an Intel Core i7 CPU, and it is running Windows 8. I run the standalone version of the server and start it by running Neo4j.bat in the bin folder.

It takes 25 seconds to insert 10,000 nodes with no relationships right now (I will need to add relationships later, but one problem at a time).

I think this is a matter of configuration, so I have played around with the settings a bit, but there was no change in performance. One thing I find weird is that if I set the initmemory and maxmemory settings to 15000 in neo4j-wrapper.conf, the Java process still allocates 3 GB at most.
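One way to see which heap limit the JVM actually received is to ask the runtime from inside the running process. This is only a minimal sketch: the class name HeapCheck is hypothetical, and it assumes the methods are called from code running inside the server (for example from the server extension). If the reported maximum is far below 15000 MB, the wrapper settings are probably not reaching the JVM that Neo4j.bat starts.

```java
// Hypothetical diagnostic helper; it only uses java.lang.Runtime.
class HeapCheck {

    // Maximum heap the JVM will attempt to use, in megabytes.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    // Heap currently reserved by the JVM, in megabytes.
    static long committedHeapMb() {
        return Runtime.getRuntime().totalMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("max heap (MB): " + maxHeapMb());
        System.out.println("committed heap (MB): " + committedHeapMb());
    }
}
```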

I have attached my code and configurations below. Does anyone have a clue what I am doing wrong? What performance should I expect when inserting a large graph?

Code for inserting

for (Thing t : things) {
    List<ValuePair> properties = parseThing(t);
    String uid = createUid(t);

    // One transaction is opened and committed per node.
    try (Transaction tx = graphDb.beginTx()) {

        Node node = graphDb.createNode();
        node.setProperty("uid", uid);

        for (ValuePair vp : properties) {
            node.setProperty(vp.getName(), vp.getValue());
        }

        tx.success();
    }
}

(At first I was adding a DynamicLabel when creating the nodes, but that was even slower. Is it possible to use labels if I want good performance when inserting nodes?)

Configurations

neo4j.properties

################################################################
# Neo4j
#
# neo4j.properties - database tuning parameters
#
################################################################

# Enable this to be able to upgrade a store from an older version.
#allow_store_upgrade=true

# The amount of memory to use for mapping the store files, in bytes (or
# kilobytes with the 'k' suffix, megabytes with 'm' and gigabytes with 'g').
# If Neo4j is running on a dedicated server, then it is generally recommended
# to leave about 2-4 gigabytes for the operating system, give the JVM enough
# heap to hold all your transaction state and query context, and then leave the
# rest for the page cache.
# The default page cache memory assumes the machine is dedicated to running
# Neo4j, and is heuristically set to 75% of RAM minus the max Java heap size.
dbms.pagecache.memory=4g

# Enable this to specify a parser other than the default one.
#cypher_parser_version=2.0

# Keep logical logs, helps debugging but uses more disk space, enabled for
# legacy reasons. To limit space needed to store historical logs use values such
# as: "7 days" or "100M size" instead of "true".
#keep_logical_logs=7 days

# Autoindexing

# Enable auto-indexing for nodes, default is false.
#node_auto_indexing=true

# The node property keys to be auto-indexed, if enabled.
#node_keys_indexable=name,age

# Enable auto-indexing for relationships, default is false.
#relationship_auto_indexing=true

# The relationship property keys to be auto-indexed, if enabled.
#relationship_keys_indexable=name,age

# Enable shell server so that remote clients can connect via Neo4j shell.
#remote_shell_enabled=true
# The network interface IP the shell will listen on (use 0.0.0 for all interfaces).
#remote_shell_host=127.0.0.1
# The port the shell will listen on, default is 1337.
#remote_shell_port=1337

# The type of cache to use for nodes and relationships.
cache_type=hpc

cache.memory_ratio=70

# Maximum size of the heap memory to dedicate to the cached nodes.
node_cache_size=2g
#relationship_cache_size=6g

# Maximum size of the heap memory to dedicate to the cached relationships.
#relationship_cache_size=

# Enable online backups to be taken from this database.
online_backup_enabled=true

# Port to listen to for incoming backup requests.
online_backup_server=127.0.0.1:6362

# Uncomment and specify these lines for running Neo4j in High Availability mode.
# See the High Availability setup tutorial for more details on these settings
# http://neo4j.com/docs/2.2.0/ha-setup-tutorial.html

# ha.server_id is the number of each instance in the HA cluster. It should be
# an integer (e.g. 1), and should be unique for each cluster instance.
#ha.server_id=

# ha.initial_hosts is a comma-separated list (without spaces) of the host:port
# where the ha.cluster_server of all instances will be listening. Typically
# this will be the same for all cluster instances.
#ha.initial_hosts=192.168.0.1:5001,192.168.0.2:5001,192.168.0.3:5001

# IP and port for this instance to listen on, for communicating cluster status
# information with other instances (also see ha.initial_hosts). The IP
# must be the configured IP address for one of the local interfaces.
#ha.cluster_server=192.168.0.1:5001

# IP and port for this instance to listen on, for communicating transaction
# data with other instances (also see ha.initial_hosts). The IP
# must be the configured IP address for one of the local interfaces.
#ha.server=192.168.0.1:6001

# The interval at which slaves will pull updates from the master. Comment out
# the option to disable periodic pulling of updates. Unit is seconds.
ha.pull_interval=10

# Amount of slaves the master will try to push a transaction to upon commit
# (default is 1). The master will optimistically continue and not fail the
# transaction even if it fails to reach the push factor. Setting this to 0 will
# increase write performance when writing through master but could potentially
# lead to branched data (or loss of transaction) if the master goes down.
#ha.tx_push_factor=1

# Strategy the master will use when pushing data to slaves (if the push factor
# is greater than 0). There are two options available "fixed" (default) or
# "round_robin". Fixed will start by pushing to slaves ordered by server id
# (highest first) improving performance since the slaves only have to cache up
# one transaction at a time.
#ha.tx_push_strategy=fixed

# Policy for how to handle branched data.
#branched_data_policy=keep_all

# Clustering timeouts
# Default timeout.
#ha.default_timeout=5s

# How often heartbeat messages should be sent. Defaults to ha.default_timeout.
#ha.heartbeat_interval=5s

# Timeout for heartbeats between cluster members. Should be at least twice that of ha.heartbeat_interval.
#heartbeat_timeout=11s

neo4j-server.properties

################################################################
# Neo4j
#
# neo4j-server.properties - runtime operational settings
#
################################################################

#***************************************************************
# Server configuration
#***************************************************************

# Location of the database directory
org.neo4j.server.database.location=data/graph.db

# Low-level graph engine tuning file
org.neo4j.server.db.tuning.properties=conf/neo4j.properties

# Database mode
# Allowed values:
# HA - High Availability
# SINGLE - Single mode, default.
# To run in High Availability mode, configure the neo4j.properties config file, then uncomment this line:
#org.neo4j.server.database.mode=HA

# Let the webserver only listen on the specified IP. Default is localhost (only
# accept local connections). Uncomment to allow any connection. Please see the
# security section in the Neo4j manual before modifying this.
#org.neo4j.server.webserver.address=0.0.0.0

# Require (or disable the requirement of) auth to access Neo4j
dbms.security.auth_enabled=true

#
# HTTP connector
#

# HTTP port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7474

#
# HTTPS connector
#

# Turn HTTPS support on/off
org.neo4j.server.webserver.https.enabled=true

# HTTPS port (for all data, administrative, and UI access)
org.neo4j.server.webserver.https.port=7473

# Certificate location (auto generated if the file does not exist)
org.neo4j.server.webserver.https.cert.location=conf/ssl/snakeoil.cert

# Private key location (auto generated if the file does not exist)
org.neo4j.server.webserver.https.key.location=conf/ssl/snakeoil.key

# Internally generated keystore (don't try to put your own
# keystore there, it will get deleted when the server starts)
org.neo4j.server.webserver.https.keystore.location=data/keystore

# Comma separated list of JAX-RS packages containing JAX-RS resources, one
# package name for each mountpoint. The listed package names will be loaded
# under the mountpoints specified. Uncomment this line to mount the
# org.neo4j.examples.server.unmanaged.HelloWorldResource.java from
# neo4j-server-examples under /examples/unmanaged, resulting in a final URL of
# http://localhost:7474/examples/unmanaged/helloworld/{nodeId}
#org.neo4j.server.thirdparty_jaxrs_classes=org.neo4j.examples.server.unmanaged=/examples/unmanaged

org.neo4j.server.thirdparty_jaxrs_classes=my.project.package=/mypath

#*****************************************************************
# HTTP logging configuration
#*****************************************************************

# HTTP logging is disabled. HTTP logging can be enabled by setting this
# property to 'true'.
org.neo4j.server.http.log.enabled=false

# Logging policy file that governs how HTTP log output is presented and
# archived. Note: changing the rollover and retention policy is sensible, but
# changing the output format is less so, since it is configured to use the
# ubiquitous common log format
org.neo4j.server.http.log.config=conf/neo4j-http-logging.xml

#*****************************************************************
# Administration client configuration
#*****************************************************************

# Location of the server's round-robin database directory. Possible values:
# - absolute path like /var/rrd
# - path relative to the server working directory like data/rrd
# - commented out, will default to the database data directory.
org.neo4j.server.webadmin.rrdb.location=data/rrd

neo4j-wrapper.conf

#********************************************************************
# Property file references
#********************************************************************

wrapper.java.additional=-Dorg.neo4j.server.properties=conf/neo4j-server.properties
wrapper.java.additional=-Djava.util.logging.config.file=conf/logging.properties
wrapper.java.additional=-Dlog4j.configuration=file:conf/log4j.properties

#********************************************************************
# JVM parameters
#********************************************************************

wrapper.java.additional.1=-XX:+UseConcMarkSweepGC
wrapper.java.additional.2=-XX:+CMSClassUnloadingEnabled
wrapper.java.additional.3=-XX:-OmitStackTraceInFastThrow
wrapper.java.additional.4=-XX:hashCode=5

# Remote JMX monitoring, uncomment and adjust the following lines as needed.
# Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords,
# the shipped configuration contains only a read only role called 'monitor' with password 'neo4j'.
# For more details, see: http://download.oracle.com/javase/7/docs/technotes/guides/management/agent.html
# On Unix based systems the jmx.password file needs to be owned by the user that will run the server,
# and have permissions set to 0600.
# For details on setting these file permissions on Windows see:
#     http://docs.oracle.com/javase/7/docs/technotes/guides/management/security-windows.html
#wrapper.java.additional=-Dcom.sun.management.jmxremote.port=3637
#wrapper.java.additional=-Dcom.sun.management.jmxremote.authenticate=true
#wrapper.java.additional=-Dcom.sun.management.jmxremote.ssl=false
#wrapper.java.additional=-Dcom.sun.management.jmxremote.password.file=conf/jmx.password
#wrapper.java.additional=-Dcom.sun.management.jmxremote.access.file=conf/jmx.access

# Some systems cannot discover the host name automatically, and need this line configured:
#wrapper.java.additional=-Djava.rmi.server.hostname=$THE_NEO4J_SERVER_HOSTNAME

# Uncomment the following lines to enable garbage collection logging
#wrapper.java.additional=-Xloggc:data/log/neo4j-gc.log
#wrapper.java.additional=-XX:+PrintGCDetails
#wrapper.java.additional=-XX:+PrintGCDateStamps
#wrapper.java.additional=-XX:+PrintGCApplicationStoppedTime
#wrapper.java.additional=-XX:+PrintPromotionFailure
#wrapper.java.additional=-XX:+PrintTenuringDistribution

# Java heap size: by default the Java heap size is dynamically
# calculated based on available system resources.
# Uncomment these lines to set specific initial and maximum
# heap size in MB.
wrapper.java.initmemory=15000
wrapper.java.maxmemory=15000

#********************************************************************
# Wrapper settings
#********************************************************************
# Path is relative to the bin dir
wrapper.pidfile=../data/neo4j-server.pid

#********************************************************************
# Wrapper Windows NT/2000/XP service properties
#********************************************************************
# WARNING - Do not modify any of these properties when an application
#  using this configuration file has been installed as a service.
#  Please uninstall the service before modifying this section.  The
#  service can then be reinstalled.

# Name of the service
wrapper.name=neo4j

# User account to be used for Linux installs. Will default to current
# user if not set.
wrapper.user=

#********************************************************************
# Other Neo4j system properties
#********************************************************************
wrapper.java.additional=-Dneo4j.ext.udc.source=zip

wrapper.java.additional=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005 -Xdebug-Xnoagent-Djava.compiler=NONE-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005

You would make me very happy if you could help me solve this!

You need to create more than one node per transaction; otherwise the transaction overhead consumes most of the time.

Please try it this way:

// A single transaction wraps the whole loop, so the commit cost is paid once.
try (Transaction tx = graphDb.beginTx()) {

    for (Thing t : things) {

        List<ValuePair> properties = parseThing(t);
        String uid = createUid(t);

        Node node = graphDb.createNode();
        node.setProperty("uid", uid);

        for (ValuePair vp : properties) {
            node.setProperty(vp.getName(), vp.getValue());
        }
    }

    tx.success();
}
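With 2 million nodes, one possible middle ground is to commit in fixed-size batches rather than in one very large transaction, since a transaction holds its uncommitted state in memory. The sketch below only shows the batching logic with plain Java collections. BatchInsertSketch, partition and insertInBatches are hypothetical names, not part of the Neo4j API, and a suitable batch size is an assumption you would have to measure for your own data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of the Neo4j API: splits the work into
// fixed-size batches so that each batch can be written in its own transaction.
class BatchInsertSketch {

    // Returns consecutive sublists of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(items.subList(start, end));
        }
        return batches;
    }

    // Calls batchWriter once per batch and returns the number of batches.
    static <T> int insertInBatches(List<T> items, int batchSize, Consumer<List<T>> batchWriter) {
        int batchCount = 0;
        for (List<T> batch : partition(items, batchSize)) {
            // With Neo4j, one transaction per batch would be opened here:
            //   try (Transaction tx = graphDb.beginTx()) { /* create nodes */ tx.success(); }
            batchWriter.accept(batch);
            batchCount++;
        }
        return batchCount;
    }

    public static void main(String[] args) {
        List<Integer> things = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        int batches = insertInBatches(things, 3,
                batch -> System.out.println("would commit " + batch.size() + " nodes"));
        System.out.println(batches + " transactions");
    }
}
```

In the real extension, the batchWriter would contain the node-creation loop from the answer above, wrapped in its own try-with-resources transaction.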
