A Spark application can run on various cluster managers: Spark's built-in standalone cluster manager, or one of two popular open-source cluster managers, Hadoop YARN and Apache Mesos. Along with these, a Spark application can also be deployed on EC2 (Amazon's cloud infrastructure).
The main agenda of this post is to set up a 3-node cluster (1 master and 3 workers, with the master machine also acting as a worker) and launch it using Spark's built-in standalone cluster manager.
3-node cluster details and cluster architecture :-
|IP address|Status (Master/Worker)|
|---|---|
|192.168.213.133|Acts as both Master and Worker|
|192.168.213.130|Acts as Worker|
|192.168.213.134|Acts as Worker|
Apache Spark should be installed on each machine; refer to How to Set-up Apache spark in Linux System. All of the above nodes are Unix-based systems (Ubuntu 13.04 and CentOS).
Note:- It is recommended to have the same user account on each machine (so that SSH login works smoothly) and to place the compiled version of Spark at the same location on each node.
1. SSH password-less access to workers :- In order to start worker services and interact with workers, the master node needs password-less login access to the worker nodes. Generate an SSH key pair on the master node and add the public key to each worker node. The key pair can be generated using ssh-keygen. Execute the following command to generate the key files.
Note:- Just press the Enter key at each prompt; do not supply a passphrase or a different file name. Since we intend to set up password-less access, no passphrase is used.
zytham@ubuntu:~$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/zytham/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/zytham/.ssh/id_rsa.
Your public key has been saved in /home/zytham/.ssh/id_rsa.pub.
The key fingerprint is:
3a:17:48:3e:01:23:5c:32:0f:51:b5:0a:0d:15:49:78 zytham@ubuntu
The key's randomart image is:
+--[ RSA 2048]----+
|        .BBO+.   |
|         oOEo .  |
|        ..o +    |
|       . + o     |
|      . + S      |
|       o .       |
|      o .        |
|       o         |
|                 |
+-----------------+
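The interactive prompts above can also be answered up front on the command line; a small sketch (the key is written to a demo path under /tmp so your real ~/.ssh/id_rsa is never touched):

```shell
# Non-interactive key generation: -N "" sets an empty passphrase, -q silences
# the banner. /tmp/demo_id_rsa is a throwaway demo path, not a real key location.
rm -f /tmp/demo_id_rsa /tmp/demo_id_rsa.pub
ssh-keygen -t rsa -b 2048 -N "" -q -f /tmp/demo_id_rsa
ls -l /tmp/demo_id_rsa /tmp/demo_id_rsa.pub
```

For the real cluster you would of course keep the default path (~/.ssh/id_rsa) so that ssh picks the key up automatically.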
2. Add the generated public key to each worker machine. The following commands do the same.
zytham@ubuntu:~$ cd ~/.ssh/
zytham@ubuntu:~/.ssh$ ssh-copy-id -i ./id_rsa.pub 192.168.213.134
email@example.com's password:
Now try logging into the machine, with "ssh '192.168.213.134'", and check in:
~/.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
zytham@ubuntu:~/.ssh$ ssh-copy-id -i ./id_rsa.pub 192.168.213.130
firstname.lastname@example.org's password:
Now try logging into the machine, with "ssh '192.168.213.130'", and check in:
~/.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
Since we are making the master machine act as a worker too, we need an "authorized_keys" file in the master's ~/.ssh directory as well. If we do not add it, password-less login to localhost will not be possible and we would have to provide a password manually when starting and stopping the worker service on the master node. Execute the following command to create a file named "authorized_keys" in the ~/.ssh directory.
zytham@ubuntu:~/.ssh$ cat ./id_rsa.pub >> ./authorized_keys
zytham@ubuntu:~/.ssh$ ls
authorized_keys  id_dsa  id_dsa.pub  id_rsa  id_rsa.pub  known_hosts
Test worker access using ssh :- Log in to any of the worker machines using the following command. If login succeeds without a password prompt, we are through with this step and can proceed to the next one.
zytham@ubuntu:~/.ssh$ ssh 192.168.213.134
Welcome to Ubuntu 13.04 (GNU/Linux 3.8.0-19-generic x86_64)
 * Documentation: https://help.ubuntu.com/
Your Ubuntu release is not supported anymore.
For upgrade information, please visit:
http://www.ubuntu.com/releaseendoflife
New release '13.10' available.
Run 'do-release-upgrade' to upgrade to it.
Last login: Sun Feb 7 22:03:15 2016 from ubuntu.local
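All workers can be checked in one go with a small loop; a sketch (the host list is taken from the table above, and BatchMode makes ssh fail instead of prompting, so a missing key shows up immediately):

```shell
# Verify password-less SSH to every worker. BatchMode=yes forbids password
# prompts; ConnectTimeout keeps unreachable hosts from hanging the loop.
hosts="192.168.213.130 192.168.213.134"   # worker IPs from the cluster table
for host in $hosts; do
  if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
    echo "$host: password-less SSH OK"
  else
    echo "$host: NOT reachable without a password"
  fi
done
```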
Launch cluster using standalone cluster manager:-
1. Update conf/slaves on the master node :- Create a file named slaves under <SPARK_HOME>/conf/ (if it does not exist) and add all worker IP addresses, one entry per line. Here, along with the other two machines' IP addresses, localhost is also added so that the master machine acts as a worker node too. After the update, conf/slaves looks like:-
# A Spark Worker will be started on each of the machines listed below.
localhost
192.168.213.130
192.168.213.134
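The file can also be written from the master's shell with a here-document; a sketch that uses a temp directory (substitute <SPARK_HOME>/conf in a real setup):

```shell
# Create a demo copy of conf/slaves and count the worker entries.
# The temp dir stands in for <SPARK_HOME>/conf so nothing real is modified.
cd "$(mktemp -d)"
cat > slaves <<'EOF'
# A Spark Worker will be started on each of the machines listed below.
localhost
192.168.213.130
192.168.213.134
EOF
grep -v '^#' slaves | wc -l   # number of worker hosts listed
```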
2. Update conf/spark-env.sh :- Make a copy of spark-env.sh.template, name it spark-env.sh, and add the following entries to spark-env.sh:
SPARK_WORKER_INSTANCES=2 # Optional: start two worker instances on this machine
SPARK_MASTER_IP=192.168.213.133 # Mandatory: tells workers where the master runs
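The copy-and-edit step can be scripted; a sketch run in a temp directory (a stand-in for <SPARK_HOME>/conf, where the real spark-env.sh.template ships with Spark):

```shell
# Demo of the copy-and-append step. The touch merely simulates the template
# file that Spark ships; in <SPARK_HOME>/conf it already exists.
cd "$(mktemp -d)"
touch spark-env.sh.template
cp spark-env.sh.template spark-env.sh
cat >> spark-env.sh <<'EOF'
SPARK_WORKER_INSTANCES=2
SPARK_MASTER_IP=192.168.213.133
EOF
grep -c '^SPARK_' spark-env.sh   # prints 2: both entries are in place
```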
3. Update spark-env.sh on worker nodes (Optional) :- In order to start N worker instances on a node, update that node's spark-env.sh as done above. In our case, I am updating spark-env.sh on only one worker machine (192.168.213.130), adding the entry below so that three worker instances start on it:
SPARK_WORKER_INSTANCES=3
4. Start cluster using standalone cluster manager :- Spark ships with launch scripts in its sbin directory. In order to start the cluster we need to start the master and the slaves; this can be achieved in either of two ways :-
1. Both master and slaves can be started using : ./start-all.sh
2. Start master using ./start-master.sh followed by ./start-slaves.sh
Execute the following scripts from the master machine to start the cluster. If everything starts correctly, you should get no password prompts, and the cluster manager's web UI should appear at http://192.168.213.133:8080 showing all your workers.
zytham@ubuntu:/opt/spark-1.5.2$ ./sbin/start-master.sh
starting org.apache.spark.deploy.master.Master, logging to /opt/spark-1.5.2/sbin/../logs/spark-zytham-org.apache.spark.deploy.master.Master-1-ubuntu.out
Start workers/slaves:
zytham@ubuntu:/opt/spark-1.5.2$ ./sbin/start-slaves.sh
localhost: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark-1.5.2/sbin/../logs/spark-zytham-org.apache.spark.deploy.worker.Worker-1-ubuntu.out
192.168.213.134: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark-1.5.2/sbin/../logs/spark-zytham-org.apache.spark.deploy.worker.Worker-1-ubuntu.out
localhost: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark-1.5.2/sbin/../logs/spark-zytham-org.apache.spark.deploy.worker.Worker-2-ubuntu.out
192.168.213.130: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark-1.5.2/sbin/../logs/spark-zytham-org.apache.spark.deploy.worker.Worker-1-s158519-vm.localvm.com.out
192.168.213.130: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark-1.5.2/sbin/../logs/spark-zytham-org.apache.spark.deploy.worker.Worker-2-s158519-vm.localvm.com.out
192.168.213.130: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark-1.5.2/sbin/../logs/spark-zytham-org.apache.spark.deploy.worker.Worker-3-s158519-vm.localvm.com.out
Note:-
1. Since SPARK_WORKER_INSTANCES=2 for localhost (the machine acting as both master and worker), two lines appear for localhost; similarly, three lines appear for 192.168.213.130. By default only one worker instance is started, as for 192.168.213.134.
2. Running instances of the worker and master can be seen using the following command.
zytham@ubuntu:/opt/spark-1.5.2$ jps
7954 Master
8852 Jps
8198 Worker
8266 Worker
Submit spark application on cluster :-
In order to submit an application to the standalone cluster manager, pass spark://masternode:7077 as the master argument to spark-submit. An application can be submitted to the master as follows :-
zytham@ubuntu:~/WordCountExample$ spark-submit --class "WordCount" --master spark://192.168.213.133:7077 target/scala-2.10/wordcount-spark-application_2.10-1.0.jar
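In practice you often add resource flags to the same invocation. A hedged sketch: the --executor-memory and --total-executor-cores flags are real spark-submit options (the latter is specific to standalone mode), but the values below are illustrative additions, not part of the original submission. The snippet only builds and prints the command so it can be inspected before running:

```shell
# Build the spark-submit command line. Resource values (512M, 4 cores) are
# example numbers; tune them to your cluster's actual capacity.
MASTER_URL="spark://192.168.213.133:7077"
APP_JAR="target/scala-2.10/wordcount-spark-application_2.10-1.0.jar"
SUBMIT_CMD="spark-submit --class WordCount --master $MASTER_URL \
 --executor-memory 512M --total-executor-cores 4 $APP_JAR"
echo "$SUBMIT_CMD"
```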
Spark-shell can also be launched in standalone mode as follows:-
zytham@ubuntu:/opt/spark-1.5.2$ spark-shell --master spark://192.168.213.133:7077
Standalone cluster Web UI:-
The standalone cluster manager’s web UI is available at http://<masternode>:8080. The default port is 8080; it can be modified in conf/spark-env.sh. The following diagram shows the UI with the workers and spark-shell as a running application.
|Spark standalone cluster web UI|
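To move the UI off port 8080, set SPARK_MASTER_WEBUI_PORT in spark-env.sh (this is a documented Spark setting; 8090 below is just an example value). A sketch in a scratch directory so the real conf/spark-env.sh stays untouched:

```shell
# Append the web UI port override to a demo spark-env.sh; in a real setup this
# line goes into <SPARK_HOME>/conf/spark-env.sh before the master is started.
cd "$(mktemp -d)"
echo 'SPARK_MASTER_WEBUI_PORT=8090' >> spark-env.sh
cat spark-env.sh
```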
A cluster managed by the standalone cluster manager can be stopped in either of the following ways.
1. Stop master and all workers (slaves) using ./stop-all.sh
2. Stop master and worker(s) using ./stop-master.sh and ./stop-slaves.sh respectively.
Stop master and workers using ./stop-all.sh
zytham@ubuntu:/opt/spark-1.5.2/sbin$ ./stop-all.sh
localhost: stopping org.apache.spark.deploy.worker.Worker
localhost: stopping org.apache.spark.deploy.worker.Worker
192.168.213.134: stopping org.apache.spark.deploy.worker.Worker
192.168.213.130: stopping org.apache.spark.deploy.worker.Worker
192.168.213.130: stopping org.apache.spark.deploy.worker.Worker
192.168.213.130: stopping org.apache.spark.deploy.worker.Worker
stopping org.apache.spark.deploy.master.Master