Last update: 22/12/2015
Apache Spark & Scala
About The Course
The Apache Spark & Scala course will enable participants to understand how Spark uses in-memory distributed datasets to optimize iterative workloads as well as interactive queries. This course is part of the Developer's learning path.
After the completion of 'Apache Spark & Scala' course, you will be able to:
1) Understand Scala and its implementation
2) Apply lazy values, control structures, loops, collections, etc.
3) Learn the concepts of traits and OOP in Scala
4) Understand functional programming in Scala
5) Get an insight into the Big Data challenges
6) See how Spark acts as a solution to these challenges
7) Install Spark and perform operations on the Spark shell
8) Understand what RDDs are in Spark
9) Implement a Spark application on YARN (Hadoop)
10) Analyze the Hive and Spark SQL architectures
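Several of the outcomes above are plain Scala language features. A minimal sketch of three of them, lazy values, traits, and higher-order functions (all names here are illustrative, not from the course material):

```scala
// A lazy val is evaluated only on first access, not at definition time.
lazy val config: String = { println("loading config..."); "loaded" }

// A trait bundles behaviour that classes can mix in (OOP in Scala).
trait Greeter {
  def name: String
  def greet: String = s"Hello, $name"
}
case class User(name: String) extends Greeter

// A higher-order function takes another function as an argument.
def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

println(User("Spark").greet)    // Hello, Spark
println(applyTwice(_ + 3, 10))  // 16
```

Note that "loading config..." prints only when `config` is first read, which is the point of laziness for expensive initialisation.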
Who should go for this course?
This course is a foundation for anyone who aspires to enter the field of Big Data and keep up with the latest developments in fast processing of ever-growing data using Spark and related projects. The following professionals can go for this course:
1. Big Data enthusiasts
2. Software architects, engineers and developers
3. Data scientists and analytics professionals
What are the pre-requisites for this Course?
A basic understanding of functional programming and object-oriented programming will help. Knowledge of Scala will definitely be a plus, but is not mandatory.
Why learn Apache Spark and Scala?
In this era of ever-growing data, analyzing it for meaningful business insights becomes more and more significant. There are several Big Data processing alternatives such as Hadoop, Spark and Storm. Spark, however, is unique in providing both batch and streaming capabilities, making it a preferred choice for lightning-fast Big Data analysis platforms.
Module 1: Why Spark? Explain Spark and Hadoop Distributed File System
● What is Spark
● Comparison with Hadoop
● Components of Spark
Module 2: Spark Components, Common Spark Algorithms – Iterative Algorithms, Graph Analysis, Machine Learning
● Apache Spark introduction; Consistency, Availability, Partition
● Unified stack Spark
● Spark components
● Comparison with Hadoop – Scalding example, Mahout, Storm, Graph
Module 3: Running Spark on a Cluster, Writing Spark Applications using Python, Java, Scala
● Explain a Python example
● Demonstrate installing Spark
● Explain the driver program
● Explain SparkContext with an example
● Define weakly typed variables
● Combine Scala and Java seamlessly
● Explain concurrency and distribution
● Explain what a trait is
● Explain higher-order functions with an example
● Define the OFI scheduler
● Advantages of Spark
● Example of a lambda using Spark
● Explain MapReduce with an example
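Module 3's "combine Scala and Java seamlessly" topic can be sketched without a cluster: Scala code can instantiate Java classes directly and then bring them back into the Scala collection world. A minimal sketch, assuming Scala 2.13+ for `scala.jdk.CollectionConverters`:

```scala
import java.util.{ArrayList => JList}          // a plain Java collection
import scala.jdk.CollectionConverters._        // Java <-> Scala converters

// Build a Java ArrayList from Scala code.
val jlist = new JList[Int]()
jlist.add(1); jlist.add(2); jlist.add(3)

// Convert it to a Scala List and apply a higher-order function (map).
val doubled = jlist.asScala.toList.map(_ * 2)
println(doubled)  // List(2, 4, 6)
```

The same interoperability is what lets Spark's Scala API be called from Java and vice versa.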
Module 4: RDD and its Operations
● Difference between RISC and CISC
● Define Apache Mesos
● Cartesian product between two RDDs
● Define count
● Define filter
● Define fold
● Define API operations
● Define factors
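The RDD operations listed in Module 4 (count, filter, fold, cartesian product) deliberately mirror Scala's local collection API, so their semantics can be sketched in plain Scala without a cluster. In real Spark code you would call the same-named methods on an RDD created with `sc.parallelize(...)`; this is only a local-collection analogy:

```scala
val nums = List(1, 2, 3, 4, 5)

val evens = nums.filter(_ % 2 == 0)  // filter: keep only matching elements
val total = nums.fold(0)(_ + _)      // fold: combine elements with a zero value
val count = nums.size                // count: number of elements

// Cartesian product of two collections, like rdd1.cartesian(rdd2).
val pairs = for (a <- List(1, 2); b <- List("x", "y")) yield (a, b)

println(evens)  // List(2, 4)
println(total)  // 15
println(count)  // 5
println(pairs)  // List((1,x), (1,y), (2,x), (2,y))
```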
Module 5: Spark, Hadoop, and the Enterprise Data Centre; Common Spark Algorithms
● How a Hadoop cluster differs from Spark
● Define writing data
● Explain sequence files and their usefulness
● Define protocol buffers
● Define text files, CSV, object files and file systems
● Define sparse matrices
● Explain RDDs and compression
● Explain data stores and their usefulness
Module 6: Spark Streaming
● Define Elasticsearch
● Explain streaming and its usefulness
● Apache BookKeeper
● Define DStream
● Define MapReduce word count
● Explain Parquet
● Scala ORM
● Define MLlib
● Explain GraphX and its usefulness
● Define property graphs
Module 7: Persistence in Spark
● Persistence
● Motivation
● Example
● Transformation
● Scala and Python
● Examples – K-means
● Latent Dirichlet Allocation (LDA)
Module 8: Broadcast and Accumulator
● Motivation
● Broadcast variables
● Example: join
● Alternative if one table is small
● Better version with broadcast
● How to create a broadcast variable
● Accumulators: motivation
● Example: join
● Accumulator rules
● Custom accumulators
● Creating an accumulator using the SparkContext object
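The broadcast-join idea in Module 8, the "alternative if one table is small", can be shown in plain Scala: ship the small table to every task as a read-only map instead of shuffling both sides. In real Spark code the small map would be wrapped with `sc.broadcast(...)` and read via `.value`; the data below is made up for illustration:

```scala
// Small lookup table: would be broadcast to every executor in Spark.
val small: Map[Int, String] = Map(1 -> "alice", 2 -> "bob")

// Large side of the join: would be a partitioned RDD in Spark.
val large: List[(Int, Double)] = List((1, 9.5), (2, 7.0), (3, 4.2))

// Map-side join: look each key up in the broadcast copy; drop misses.
val joined = large.flatMap { case (id, score) =>
  small.get(id).map(name => (id, name, score))
}
println(joined)  // List((1,alice,9.5), (2,bob,7.0))
```

Because no key from the large side has to be shuffled to meet its match, this pattern avoids the network cost of a full join when one side fits in memory.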
Module 9: Spark SQL and RDD
● Introduction
● Spark SQL main capabilities
● Spark SQL usage diagram
● Spark SQL
● Important topics in Spark SQL: DataFrames
● Twitter language analysis
Submit your details to download the brochure:
Who are the Instructors?
All our instructors are working professionals from the industry with at least 10-12 years of relevant experience in various domains. They are subject matter experts and are trained by Edureka in delivering online training, so that participants get a great learning experience.
How can I request a support session?
Requesting a support session is very simple. As soon as you join the course, the contact number and email ID of the support team will be available in your LMS. A phone call or email will serve the purpose.
How can I make payment?
Payment can be made via Cheque / DD / Online Funds transfer / Cash Payment.
Cheques should be drawn in favour of "Unicom Training and Seminars Pvt Ltd", payable at Bangalore.
Account Name: UNICOM Training & Seminars Pvt Ltd
Bank Name: State Bank of India
Bank Address: Ground Floor, K V Plaza, Green Glen Layout, Outer Ring Road, Bangalore
A/c Number: 31729010535
IFSC: SBIN0012706
A/c Type: Current
What is the course timing?
0900 – 1700 each day
What is the course Fee?
2 Days Course - Rs 15,000 + 14% Service Tax
Whom do I contact for more details?
+91-9538878798 or firstname.lastname@example.org