Spark installation on ubuntu

spark logo

Spark installation on ubuntu

Spark can be deployed in a variety of ways, provides native bindings for the Java, Scala, Python, and R programming languages, and supports SQL, streaming data, machine learning, and graph processing. Out of the box, Spark can run in a standalone cluster mode that simply requires the Apache Spark framework and a JVM on each machine in the cluster.

 In this tutorial we will see how to install spark on ubuntu 16.04 by doing the following steps

Step 1: Before installing Spark, you need to First ensure that java8 is installed:

Verify that java is correctly installed:

Configuring Java Environment

Step 2: Ensure that you successfully installed hadoop on your machine
Check this link if you need to know how to install it. 

Step 3: Download Apache Spark

  • Go to downloads page
  • Choose a Spark release: 2.1.1 (May 02 2017)
  • Choose a package type: Pre-built for Hadoop 2.6

Step 4: Complete the installation process

  • Move the downloaded file “spark-2.1.1-bin-hadoop2.6.tgz” to your home (~)
  • Compress it :
  • Create a link to spark installation:
  • Edit bashrc using this command line:
  • Add this lines:
  • Execute bashrc:

Step 5: Test the installation

  • Execute this command line:

While Structured Streaming provides high-level improvements to Spark Streaming, it currently relies on the same microbatching scheme of handling streaming data. However, the Apache Spark team is working to bring continuous streaming without microbatching to the platform, which should solve many of the problems with handling low-latency responses

Avatar for Nizar Ellouze

Author: Nizar Ellouze

No Comments

Post a Comment