RGPV Computer Science and Engineering VII Semester | Complete Unit-wise Notes, Important Questions & PYQ Resources
Big Data is a departmental elective in the RGPV CSE 7th semester. It covers Big Data concepts, the Hadoop ecosystem, Hive, Pig, NoSQL databases, MongoDB, and social network graph mining.
Unit 1: Big Data characteristics, types, traditional data vs Big Data, evolution, challenges, technologies, infrastructure, and data analytics.
Unit 2: Introduction to Hadoop, core components, HDFS, YARN, MapReduce, Hadoop limitations, and RDBMS vs Hadoop.
Unit 3: Hive architecture, HiveQL, Hive data types, Pig architecture, Pig Latin, ETL processing, operators, and functions.
Unit 4: Introduction to NoSQL, business drivers, architectural patterns, managing Big Data using NoSQL, and MongoDB basics.
Unit 5: Social network mining, social networks as graphs, clustering, community discovery, and recommender systems.
| Unit | Topics |
|---|---|
| Unit 1 | Introduction to Big Data, Big Data characteristics, types, traditional vs Big Data, evolution, challenges, technologies, infrastructure, analytics and desired properties. |
| Unit 2 | Introduction to Hadoop, core Hadoop components, Hadoop ecosystem, Hive physical architecture, Hadoop limitations, RDBMS vs Hadoop, HDFS, YARN and MapReduce programming. |
| Unit 3 | Hive architecture, Hive data types, Hive Query Language, Pig, Pig on Hadoop, Pig use cases, Pig Latin, ETL processing, operators, functions and Pig data types. |
| Unit 4 | NoSQL introduction, business drivers, data architectural patterns, NoSQL architectural patterns, managing Big Data with NoSQL and MongoDB. |
| Unit 5 | Mining social network graphs, applications, social networks as graphs, types of social networks, clustering, community discovery and recommender systems. |
To score well in the RGPV exam, focus on Hadoop, HDFS, MapReduce, Hive, Pig, NoSQL, MongoDB, and social network graphs. These topics are commonly expected in 7-mark and 14-mark questions.
Yes, Big Data is a scoring subject if you prepare Hadoop, Hive, Pig, NoSQL, and MongoDB properly.
Units 2, 3, and 4 are the most important because questions on Hadoop, Hive, Pig, and NoSQL are asked frequently.
Start with the Unit 1 basics, then the Hadoop ecosystem, then Hive and Pig, and finally NoSQL with MongoDB.
Yes, but also read the unit-wise concepts so that you can write long answers properly.