Amazon EC2 - SparkR on EC2: ensure that workers are registered and have sufficient memory


I set up a Spark (spark-1.4.0) cluster on EC2 using the spark-ec2 script that comes with the release. It starts fine with a master and 1 slave, and I am able to check the status on http://:8080

Now I'd like to run SparkR on the cluster. It runs fine in local mode on both the master and the slave:

Rscript myscript.r local[2]

In myscript.r I have the following lines:

library(SparkR)
# Initialize the Spark context; args[[1]] is the master URL passed on the command line
sc <- sparkR.init(args[[1]], "bartcv")

But when I try to run it on the cluster:

[ec2-user@ip-10-234-176-66 ~]$ Rscript myscript.r spark://ec2-ww-xx-yy-zz.something.amazonaws.com:7077
Loading required package: methods
[SparkR] Initializing with classpath /usr/lib64/R/library/SparkR/sparkr-assembly-0.1.jar
Launching java with command  /usr/lib/jvm/java/bin/java   -Xmx512m -cp '/usr/lib64/R/library/SparkR/sparkr-assembly-0.1.jar:' edu.berkeley.cs.amplab.sparkr.SparkRBackend /tmp/Rtmp2kylxz/backend_portb3c54b28b03
15/07/16 13:47:16 INFO Slf4jLogger: Slf4jLogger started
15/07/16 13:47:37 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
15/07/16 13:47:52 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
15/07/16 13:48:07 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
15/07/16 13:48:19 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
collect on 5 failed with java.lang.reflect.InvocationTargetException
java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at edu.berkeley.cs.amplab.sparkr.SparkRBackendHandler.handleMethodCall(SparkRBackendHandler.scala:111)
        at edu.berkeley.cs.amplab.sparkr.SparkRBackendHandler.channelRead0(SparkRBackendHandler.scala:58)
        at edu.berkeley.cs.amplab.sparkr.SparkRBackendHandler.channelRead0(SparkRBackendHandler.scala:19)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: All masters are unresponsive! Giving up.
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
        at akka.actor.ActorCell.invoke(ActorCell.scala:456)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
        at akka.dispatch.Mailbox.run(Mailbox.scala:219)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Error: returnStatus == 0 is not TRUE
Execution halted

I added the following lines to /root/spark/conf/spark-env.sh:

export SPARK_WORKER_MEMORY=2g
export SPARK_EXECUTOR_MEMORY=2g
export SPARK_DRIVER_MEMORY=2g

and copied the configuration to the slaves:

~/spark-ec2/copy-dir /root/spark/conf/ 

But I'm still getting the same error.

I figured it out: without extra work, Rscript won't load the necessary EC2 Spark configuration, but sparkR will. The sparkR bash script loads the configuration needed for the EC2 cluster, i.e. sparkR (in /root/spark/bin/sparkR) sources the Spark configuration files (/root/spark/bin/load-spark-env.sh, which in turn calls /root/spark/conf/spark-env.sh) and then executes spark-submit sparkr-shell-main.
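
To make the mechanism concrete, here is a minimal sketch of why a wrapper that sources an environment file behaves differently from a direct launch. It is only an illustration under assumptions: the temporary file stands in for /root/spark/conf/spark-env.sh, and a plain `sh` child stands in for the R process. Neither is the real Spark script, and the `report` helper is hypothetical.

```shell
#!/bin/sh
# Throwaway stand-in for /root/spark/conf/spark-env.sh (assumption, not the real file).
workdir=$(mktemp -d)
cat > "$workdir/spark-env.sh" <<'EOF'
export SPARK_WORKER_MEMORY=2g
export SPARK_EXECUTOR_MEMORY=2g
EOF

# Hypothetical child process that reports what it inherits; stands in for R.
report() {
  sh -c 'echo "executor memory: ${SPARK_EXECUTOR_MEMORY:-unset}"'
}

# Direct launch (like calling Rscript yourself): the file is never sourced.
direct=$(unset SPARK_EXECUTOR_MEMORY; report)

# Wrapper-style launch (like the sparkR script): source the file, then start the child.
wrapped=$(. "$workdir/spark-env.sh"; report)

echo "direct:  $direct"
echo "wrapped: $wrapped"
rm -rf "$workdir"
```

The child only sees the settings when the launcher sources the file first, which matches the behavior described above: editing spark-env.sh has no effect on a process that never reads it.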

