Certified Big Data Science Analyst (CBDSA) Course

The Certified Big Data Science Analyst (CBDSA) course trains professionals to apply advanced analytic techniques to large data sets, ranging from terabytes to zettabytes, that may be structured or unstructured and arrive in a variety of formats from different sources. The program helps participants build analytical skills that span basic reporting through advanced data analytics. You will also learn the concepts behind big data technologies, supported by hands-on practice with Apache Hadoop, Apache Hive, Apache HBase, and RapidMiner.


256 Ratings

589 Participants

Group Discount

Up to 15% off

40 hours (5 days) Training

GICT Authorized Training Partner

Intermediate-level Big Data Course

Hands-on Labs

Certified Big Data Science Analyst (CBDSA) Course Overview

Big data has a major impact on how businesses operate. Through big data analytics, professionals learn how to use advanced analytic techniques and how analytics is applied across different industries and business domains. This Certified Big Data Science Analyst (CBDSA) course is designed for candidates who want to gain technical know-how in the fundamental concepts of business analytics and in data mining techniques and tools (RapidMiner).

In this CBDSA training, you will learn data analytics with RapidMiner. You will also practise end-to-end data analytics skills, such as building classification and regression models with linear regression, logistic regression, decision trees, and neural networks. Finally, you will gain an in-depth understanding of big data challenges and learn to use the existing Hadoop ecosystem, including HDFS, MapReduce, HBase, and Hive, to accelerate data processing.

Participants will also learn about Hadoop as a big data solution and about the components that run on top of it (HBase and Hive).

Course Curriculum


This course is intended for anyone who wishes to acquire technical skills in big data technologies. The Certified Big Data Science Analyst (CBDSA) course is of particular value to candidates in the following roles:

  • Data Analyst - Statistics and Mining
  • Big Data Analyst
  • Operations Research Analyst
  • Data Scientist
  • IHL students

Course Objectives

In this course, you will gain insight into:

  • End-to-end concepts for big data technology
  • Big data challenges and various principles, concepts, techniques, and tools used in the Hadoop big data technology
  • Different types of big data real-life business analytics use cases
  • Big data technology as a tool for addressing real-world problems caused by the data flood
  • The Hadoop big data stack: HDFS for distributed file storage, MapReduce for batch processing, HBase for data storage and manipulation, and Hive for queries
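
As a rough illustration of the MapReduce batch-processing model listed above, the following sketch counts words in plain Python. It does not use Hadoop itself. The function names (map_phase, shuffle, reduce_phase, word_count) and the sample sentences are hypothetical and chosen only for this example; a real Hadoop job would distribute the same three steps across a cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key (the word)."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce step: sum the grouped values to get one total per word."""
    return word, sum(counts)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on a single machine."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for lines read from HDFS.
    sample = ["big data needs big tools", "hadoop stores big data"]
    print(word_count(sample))
```

In Hadoop, the map and reduce steps run in parallel on different nodes and the framework performs the shuffle, which is why the model suits very large batch jobs.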

Eligibility Criteria

Participants are advised to have a basic working knowledge of UNIX commands.

About The Examination

Exam Pattern: Multiple choice questions

No. of questions: 40

Passing score: 70%

The Certified Big Data Science Analyst program is a 5-day intensive training program with the following assessment components:

Component 1. Written Examination

Component 2. Project Work Component (PWC)

Both components are assessed individually. Participants must score at least 70% in each component to qualify for the certification. A participant who fails one component does not pass the course and must re-take that component. A participant who fails both components must re-take the entire assessment.

Course Benefits

The creation and consumption of data continue to grow by leaps and bounds, and with them the investment in big data analytics hardware, software, and services, as well as in data scientists and their continuing education. The availability of very large data sets is one of the reasons deep learning, a subset of artificial intelligence (AI), has emerged as one of the hottest tech trends, with deep-pocketed companies such as Google, Facebook, Baidu, Amazon, IBM, Intel, and Microsoft investing in acquiring talent and releasing open AI hardware and software.

A Statista revenue forecast for the global big data industry covers the period from 2011 to 2026. For 2017, the source projected the global big data market to grow to just under 34 billion U.S. dollars in revenue (https://www.statista.com/statistics/254266/global-big-data-market-forecast/).

After this training, you will be able to:

  • Understand the complete big data technology stack, from data storage and data processing to data visualisation and data analytics
  • Manage and analyse big data
  • Implement key predictive modelling algorithms in RapidMiner
  • Perform exploratory data analysis and apply data pre-processing techniques
  • Identify the right tool for solving real-life big data problems
  • Work hands-on with the big data technologies Hadoop, HBase, Hive, and RapidMiner

Training Options


Instructor-led Online Training

  • 40 hrs (5 days) inclusive of training and exam
  • Experienced Subject Matter Experts
  • Approved, quality-assured training material
  • 24x7 learner assistance and support


Customized to your team's needs

  • Blended Learning Delivery Model (Self-Paced E-Learning And/Or Instructor-Led Options)
  • Course, Category, And All-Access Pricing
  • Enterprise-Class Learning Management System (LMS)
  • Enhanced Reporting For Individuals And Teams
  • 24x7 Teaching Assistance And Support

Course Outline

  • The concept of Business Analytics
  • Data, Information, Knowledge and Wisdom
  • Data as Unique Enterprise Asset
  • Data, Information and Analytics Lifecycle
  • Business Analytics – Current Context
  • Types of Analytics
    • Descriptive Analytics
    • Predictive Analytics
    • Prescriptive Analytics
  • Data/Information Architecture
  • Concept of Data Warehouse/Enterprise Data Warehouse (EDW)
  • ETL – Key Process
  • Concept of Data Mart
  • Business Intelligence
  • Data Mining
  • Understand the open-source DM tool RapidMiner
  • Explore the various features of RapidMiner
  • Walk through a RapidMiner demo with different scenarios
  • Understand the various data mining techniques
  • Understand how a correlation matrix works
  • Understand how association rule mining works
  • Understand the predictive analytics technique
  • Understand the forecasting technique
  • What is Big Data? Why Big Data?
  • 3V’s of Big Data
  • The Rapid Growth of Unstructured Data
  • Big Data Market Forecast
  • Big Data Analytics
  • Big Data in Business
  • Big Data Types & Architecture
  • Big Data – Current Industry Trends
  • Why Process Big Data?
  • Challenges in Data Processing
  • Why Hadoop?
  • What is Hadoop offering?
  • Hadoop Network Structure
  • Hadoop Eco-System
  • Hadoop Core Components
  • Hadoop – Features
  • Hadoop – Relevance
  • Hadoop in Action
  • Sqoop import and export
  • Hadoop HDFS
  • What does HDFS Facilitate?
  • HDFS Architecture
  • Hadoop Network and Server Infrastructure
  • NameNode, Secondary NameNode and DataNode
  • Ensuring Data Correctness
  • Data Pipelining while Loading Data
  • fs Operations
  • Hadoop MapReduce
  • MapReduce Conceptualization
  • MapReduce – Overview
  • MapReduce – Programming Model
  • MapReduce – Execution Overview
  • Hadoop – Application Examples
  • Word Count – Example
  • What is HBase?
  • HBase Architecture
  • ZooKeeper
  • HBase Data model
  • HBase Deployment
  • HBase Cluster Architecture
  • Indexes in HBase
  • Scaling HBase
  • Data Locality, Coherence and Concurrency, Fault Tolerance
  • Hadoop Integration
  • High-Level Architecture
  • Replication of Data Across Data Centres
  • HBase Applications
  • Advantages and Disadvantages
  • What is Hive?
  • Why Hive?
  • Where to use Hive?
  • Hive Architecture
  • Hive: Benefits
  • Hive: Tradeoffs
  • Hive: Real-world Examples

Course Reviews


Vinsys trainers deliver course concepts clearly and efficiently. Through real examples, case studies, and numerous practice tests, they make the subject matter easy to follow. With 21 years of training excellence, you can trust Vinsys for the Certified Big Data Science Analyst course.

The Certified Big Data Science Analyst course holds great value in today’s data-driven tech world. The revenue forecast for the global big data industry from 2011 to 2026 is encouraging, so this course has the potential to accelerate your career in the coming years.