Setting Spark master IP
I have a Spark worker which can't connect to its master because of an IP issue. When I run start-all.sh on the master (whose name is 'pl'), I get the following in the slave log:
16/02/12 21:28:35 INFO WorkerWebUI: Started WorkerWebUI at http://192.168.0.38:8081
16/02/12 21:28:35 INFO Worker: Connecting to master pl:7077...
16/02/12 21:28:35 WARN Worker: Failed to connect to master pl:7077
java.io.IOException: Failed to connect to pl/192.168.0.39:7077
Here is my /etc/hosts file:
$ cat /etc/hosts
127.0.0.1    localhost
127.0.1.1    wk
192.168.0.39 pl

# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
It seems like the Spark worker is confused between the master's name and its IP address. How should I set this up?
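For reference, the master log below explicitly suggests "Set SPARK_LOCAL_IP if you need to bind to another address". In Spark's standalone mode that is done in conf/spark-env.sh; a sketch (addresses taken from the logs above, not verified on this setup):

```shell
# conf/spark-env.sh on the master (pl) -- assumes 192.168.0.39 is its LAN address
SPARK_MASTER_IP=192.168.0.39   # address the master binds to and advertises
SPARK_LOCAL_IP=192.168.0.39    # avoids the "resolves to a loopback address" warning

# conf/spark-env.sh on the worker (wk) -- assumes 192.168.0.38 is its LAN address
SPARK_LOCAL_IP=192.168.0.38
```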
Another question: looking at the master's logs, it seems that the master is listening on a different port (7078) from the one the worker is trying to reach (7077), because it failed to start on the first port it tried.
romain@pl:~/spark-1.6.0-bin-hadoop2.6/logs$ cat spark-romain-org.apache.spark.deploy.master.Master-1-pl.out
Spark Command: /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java -cp /home/romain/spark-1.6.0-bin-hadoop2.6/conf/:/home/romain/spark-1.6.0-bin-hadoop2.6/lib/spark-assembly-1.6.0-hadoop2.6.0.jar:/home/romain/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/home/romain/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/home/romain/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar -Xms1g -Xmx1g -XX:MaxPermSize=256m org.apache.spark.deploy.master.Master --ip pl --port 7077 --webui-port 8080
========================================
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/02/12 21:28:35 INFO Master: Registered signal handlers for [TERM, HUP, INT]
16/02/12 21:28:35 WARN Utils: Your hostname, pl resolves to a loopback address: 127.0.1.1; using 192.168.0.39 instead (on interface eth0)
16/02/12 21:28:35 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
16/02/12 21:28:35 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/02/12 21:28:35 INFO SecurityManager: Changing view acls to: romain
16/02/12 21:28:35 INFO SecurityManager: Changing modify acls to: romain
16/02/12 21:28:35 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(romain); users with modify permissions: Set(romain)
16/02/12 21:28:36 WARN Utils: Service 'sparkMaster' could not bind on port 7077. Attempting port 7078.
16/02/12 21:28:36 INFO Utils: Successfully started service 'sparkMaster' on port 7078.
16/02/12 21:28:36 INFO Master: Starting Spark master at spark://pl:7078
16/02/12 21:28:36 INFO Master: Running Spark version 1.6.0
16/02/12 21:28:36 WARN Utils: Service 'MasterUI' could not bind on port 8080. Attempting port 8081.
16/02/12 21:28:36 WARN Utils: Service 'MasterUI' could not bind on port 8081. Attempting port 8082.
16/02/12 21:28:36 INFO Utils: Successfully started service 'MasterUI' on port 8082.
16/02/12 21:28:36 INFO MasterWebUI: Started MasterWebUI at http://192.168.0.39:8082
16/02/12 21:28:36 WARN Utils: Service could not bind on port 6066. Attempting port 6067.
16/02/12 21:28:36 INFO Utils: Successfully started service on port 6067.
16/02/12 21:28:36 INFO StandaloneRestServer: Started REST server for submitting applications on port 6067
16/02/12 21:28:36 INFO Master: I have been elected leader! New state: ALIVE
But what is strange is that the local worker's log shows a successful connection to the local master on port 7077:
16/02/12 21:28:38 INFO Worker: Connecting to master pl:7077...
16/02/12 21:28:38 INFO Worker: Successfully registered with master spark://pl:7077
You can try running netstat -pna | grep 7077 (needs root privileges) on the master to see which process is blocking the port.
Maybe you have another driver instance running. If a Java process is blocking the port, you can use jps to find out more about it.
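To illustrate, here is a sketch of that check; the netstat output line and the PID 12345 below are hypothetical, not taken from your machine:

```shell
# On the master, list what is bound to port 7077 (needs root):
#   sudo netstat -pna | grep 7077
# A matching line ends with a "PID/program" field, e.g. (hypothetical):
sample='tcp6       0      0 :::7077       :::*        LISTEN      12345/java'

# Extract the PID from that last field:
pid=$(echo "$sample" | awk '{print $NF}' | cut -d/ -f1)
echo "$pid"   # prints 12345

# If the program is "java", jps -l maps the PID to its main class
# (e.g. org.apache.spark.deploy.master.Master):
#   jps -l
```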