CS702(D) Big Data

RGPV Computer Science and Engineering VII Semester | Complete Unit-wise Notes, Important Questions & PYQ Resources

📘 Subject Overview

Big Data is a departmental elective in the 7th semester of RGPV CSE. The subject covers Big Data concepts, the Hadoop ecosystem, Hive, Pig, NoSQL databases, MongoDB and social network graph mining.

🎯 Course Outcomes

📚 Unit-wise Big Data Notes

Unit 1

Introduction to Big Data

Big Data characteristics, types, traditional data vs Big Data, evolution, challenges, technologies, infrastructure and data analytics.

Unit 2

Hadoop Ecosystem

Hadoop introduction, core components, HDFS, YARN, MapReduce, Hadoop limitations and RDBMS vs Hadoop.

Unit 3

Hive and Pig

Hive architecture, HiveQL, Hive data types, Pig architecture, Pig Latin, ETL processing, operators and functions.

Unit 4

NoSQL and MongoDB

NoSQL introduction, business drivers, architectural patterns, managing Big Data using NoSQL and MongoDB basics.

Unit 5

Social Network Graph Mining

Social network mining, social networks as graphs, clustering, community discovery and recommender systems.

šŸ“ Complete Syllabus

Unit | Topics
Unit 1 | Introduction to Big Data, Big Data characteristics, types, traditional vs Big Data, evolution, challenges, technologies, infrastructure, analytics and desired properties.
Unit 2 | Introduction to Hadoop, core Hadoop components, Hadoop ecosystem, Hive physical architecture, limitations, RDBMS vs Hadoop, HDFS, YARN and MapReduce programming.
Unit 3 | Hive architecture, Hive data types, Hive Query Language, Pig, Pig on Hadoop, Pig use cases, Pig Latin, ETL processing, operators, functions and Pig data types.
Unit 4 | NoSQL introduction, business drivers, data architectural patterns, NoSQL architectural patterns, managing Big Data with NoSQL and MongoDB.
Unit 5 | Mining social network graphs, applications, social networks as graph, types of social networks, clustering, community discovery and recommender system.

⭐ Most Important Topics for Exam

📌 PYQ Analysis

To score well in the RGPV exam, focus on Hadoop, HDFS, MapReduce, Hive, Pig, NoSQL, MongoDB and social network graphs. These topics are the most likely to appear as 7-mark and 14-mark questions.


ā“ Frequently Asked Questions

Is Big Data easy for the RGPV exam?

Yes, Big Data is a scoring subject if you prepare Hadoop, Hive, Pig, NoSQL and MongoDB thoroughly.

Which unit is most important?

Units 2, 3 and 4 are the most important because questions on Hadoop, Hive, Pig and NoSQL are asked frequently.

What should I study first?

Start with the Unit 1 basics, then the Hadoop ecosystem, then Hive and Pig, and finally NoSQL with MongoDB.

Can I score well by studying important questions?

Yes, but also read the unit-wise concepts so that you can write long answers properly.

🔗 Related Subjects