python - Spark worker keeps removing and adding executors
I tried to build a Spark cluster using a local Ubuntu virtual machine as the master and a remote Ubuntu virtual machine as the worker. The local virtual machine runs in VirtualBox; to make it accessible to the remote guest, I forwarded the virtual machine's port 7077 to the host's port 7077.
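For reference, the forwarding is a plain VirtualBox NAT rule; I set it up with something like the following (the VM name "ubuntu-master" stands in for my actual VM name):

VBoxManage modifyvm "ubuntu-master" --natpf1 "spark-master,tcp,,7077,,7077"

I start the master by: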
./sbin/start-master.sh -h 0.0.0.0 -p 7077
I made it listen on 0.0.0.0 because, if I use the default 127.0.1.1, the remote guest isn't able to connect to it. (The same binding can also be set persistently in spark-env.sh; a sketch of that is below.)
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://129.22.151.82:7077
The worker is able to connect to the master, as I can see on the master's web UI. Then I tried to run the "Pi" example Python code:
from pyspark import SparkContext, SparkConf
conf = SparkConf().setAppName("Pi").setMaster("spark://0.0.0.0:7077")
sc = SparkContext(conf=conf)
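The rest of the script is essentially the stock Pi computation from the Spark examples; for reference, it goes roughly like this (the sample count is arbitrary):

from random import random
from operator import add

def inside(_):
    # Sample a random point in the 2x2 square and test whether it
    # falls inside the unit circle.
    x = random() * 2 - 1
    y = random() * 2 - 1
    return 1 if x * x + y * y <= 1 else 0

n = 100000
count = sc.parallelize(range(n)).map(inside).reduce(add)
print("Pi is roughly %f" % (4.0 * count / n))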
.... Once I run it, the program never stops. I noticed that it keeps removing and adding executors, because the executors keep exiting with error code 1. Here is an executor's stderr:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/02/25 13:22:22 INFO CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
16/02/25 13:22:22 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/02/25 13:22:23 INFO SecurityManager: Changing view acls to: kxz138,adminuser
16/02/25 13:22:23 INFO SecurityManager: Changing modify acls to: kxz138,adminuser
16/02/25 13:22:23 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(kxz138, adminuser); users with modify permissions: Set(kxz138, adminuser)
16/02/25 13:22:23 ERROR UserGroupInformation: PriviledgedActionException as:adminuser (auth:SIMPLE) cause:java.io.IOException: Failed to connect to /10.0.2.15:34935
Exception in thread "main" java.io.IOException: Failed to connect to /10.0.2.15:34935
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:216)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:167)
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:200)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
I notice that the error here is a network problem. The worker is trying to access 10.0.2.15, which is the local NAT IP address of my virtual machine, and failing. The error never occurs when I deploy the worker on my local computer. Does anyone have an idea why this error occurs? Why is the worker trying to access the IP address 10.0.2.15 instead of the public IP?
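My current guess is that the driver advertises the VM's NAT address to the executors, so I was going to try pinning the advertised address explicitly. I haven't verified that this is the fix; spark.driver.host and spark.driver.port are the properties I was planning to set, where the values below are just my host's IP and an arbitrarily chosen free port:

from pyspark import SparkContext, SparkConf

conf = (SparkConf()
        .setAppName("Pi")
        .setMaster("spark://129.22.151.82:7077")
        # Advertise an address the worker can actually route to,
        # instead of the VM's NAT address 10.0.2.15.
        .set("spark.driver.host", "129.22.151.82")
        # Pin the driver's callback port so it can also be forwarded
        # through VirtualBox, like port 7077.
        .set("spark.driver.port", "51000"))
sc = SparkContext(conf=conf)

If that is the right direction, I suppose the fixed driver port would also need its own VirtualBox forwarding rule, like the one for 7077.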
BTW, I've already set up key-less SSH access from the master to the slave.