
Analytics And Big Data


last update : 22/12/2015

Apache Spark & Scala

Event Date Country City Days Price  
    No upcoming event date found      


About The Course

The Apache Spark & Scala course helps participants understand how Spark uses in-memory distributed datasets to optimize iterative workloads as well as interactive queries. This course is part of the Developer's learning path.

Course Objectives

After completing the 'Apache Spark & Scala' course, you will be able to:

1) Understand Scala and its implementation
2) Apply lazy values, control structures, loops, collections, etc.
3) Learn the concepts of traits and OOP in Scala
4) Understand functional programming in Scala
5) Get an insight into Big Data challenges
6) Understand how Spark acts as a solution to these challenges
7) Install Spark and implement Spark operations in the Spark shell
8) Understand what RDDs are in Spark
9) Implement a Spark application on YARN (Hadoop)
10) Analyze Hive and Spark SQL architecture

Who should go for this course?

This course is a foundation for anyone who aspires to enter the field of Big Data and keep up with the latest developments in fast processing of ever-growing data using Spark and related projects. The following professionals can take this course:

1. Big Data Enthusiasts
2. Software Architects, Engineers and Developers
3. Data Scientists and Analytics Professionals

What are the prerequisites for this course?

A basic understanding of functional programming and object oriented programming will help. Knowledge of Scala will definitely be a plus, but is not mandatory.

Why learn Apache Spark and Scala?

In this era of ever-growing data, analyzing it for meaningful business insights becomes more and more significant. There are several Big Data processing alternatives, such as Hadoop, Spark and Storm. Spark, however, is unique in providing batch as well as streaming capabilities, which makes it a preferred choice for lightning-fast Big Data analysis platforms.


Module 1: Why Spark? Explain Spark and Hadoop Distributed File System
● What is Spark
● Comparison with Hadoop
● Components of Spark

Module 2: Spark Components, Common Spark Algorithms: Iterative Algorithms, Graph Analysis, Machine Learning
● Apache Spark: Introduction, Consistency, Availability, Partition
● Unified Stack Spark
● Spark Components
● Comparison with Hadoop – Scalding example, Mahout, Storm, graph

Module 3: Running Spark on a Cluster, Writing Spark Applications using Python, Java, Scala
● Explain Python example
● Show installing Spark
● Explain driver program
● Explain Spark context with example
● Define weakly typed variable
● Combine Scala and Java seamlessly
● Explain concurrency and distribution
● Explain what a trait is
● Explain higher-order function with example
● Define OFI scheduler
● Advantages of Spark
● Example of lambda using Spark
● Explain MapReduce with example

Module 4: RDD and its operations
● Difference between RISC and CISC
● Define Apache Mesos
● Cartesian product between two RDDs
● Define count
● Define Filter
● Define Fold
● Define API Operations
● Define Factors

Module 5: Spark, Hadoop, and the Enterprise Data Centre, Common Spark Algorithms
● How a Hadoop cluster differs from Spark
● Define writing data
● Explain sequence files and their usefulness
● Define protocol buffers
● Define text file, CSV, Object Files and File System
● Define sparse metrics
● Explain RDD and Compression
● Explain data stores and their usefulness

Module 6: Spark Streaming
● Define Elasticsearch
● Explain Streaming and its usefulness
● Apache BookKeeper
● Define DStream
● Define MapReduce word count
● Explain Parquet
● Scala ORM
● Define MLlib
● Explain multi graphix and its usefulness
● Define property graph

Module 7: Persistence in Spark
● Persistence
● Motivation
● Example
● Transformation
● Scala and Python
● Examples – K-means
● Latent Dirichlet Allocation (LDA)

Module 8: Broadcast and accumulator
● Motivation
● Broadcast Variables
● Example: Join
● Alternative if one table is small
● Better version with broadcast
● How to create a Broadcast
● Accumulators motivation
● Example: Join
● Accumulator Rules
● Custom accumulators
● Creating an accumulator using Spark context object

Module 9: Spark SQL and RDD
● Introduction
● Spark SQL main capabilities
● Spark SQL usage diagram
● Spark SQL
● Important topics in Spark SQL: DataFrames
● Twitter language analysis




Who are the Instructors?

All our instructors are working professionals from the industry with at least 10 to 12 years of relevant experience in various domains. They are subject matter experts and are trained by Edureka to deliver online training, so that participants get a great learning experience.

How can I request a support session?

Requesting a support session is simple. As soon as you join the course, the contact number and email ID of the support team will be available in your LMS. A phone call or an email is all it takes.

How can I make payment?

Payment can be made via Cheque / DD / Online Funds transfer / Cash Payment.

Cheques should be drawn in favour of "Unicom training and Seminars Pvt Ltd", payable at Bangalore.

NEFT Payment:

Account Name: UNICOM Training & Seminars Pvt Ltd
Bank Name: State Bank of India
Bank Address: Ground Floor, K V Plaza, Green Glen Layout, Outer Ring Road, Bangalore.
A/c Number: 31729010535
IFSC: SBIN0012706
A/c Type: Current

What is the course timing?

0900 – 1700 each day

What is the course fee?

2-day course: Rs 15,000 + 14% (Service Tax)

Whom do I contact for more details?

+91-9538878798 or



Contact Us (India)

Alankar Plaza,

Bk circle, Nayak Layout

8th Phase,

JP Nagar

Bengaluru - 560076,

Karnataka, India.

Telephone: +91-9538878795, +91-9538878799, +91-8025257962


Contact Us (UK)

OptiRisk R&D House

One Oxford Road






© 2019 All Rights Reserved