What Is Hadoop In Big Data?

By Author: tibacademy

What is Hadoop
Hadoop is an open source framework from Apache that is used to store, process, and analyze very large volumes of data. Hadoop is written in Java and is not an OLAP (online analytical processing) system; it is used for batch/offline processing. It is used by Facebook, Yahoo, Google, Twitter, LinkedIn, and many others. Moreover, it can be scaled simply by adding nodes to the cluster.
Modules of Hadoop
1. HDFS: Hadoop Distributed File System. Files are broken into blocks and stored on nodes across the distributed architecture.
2. YARN: Yet Another Resource Negotiator is used for job scheduling and for managing the cluster.
3. MapReduce: This is a framework that helps Java programs perform parallel computation on data using key-value pairs. The Map task takes the input data and converts it into a data set that can be computed as key-value pairs. The output of the Map task is consumed by the Reduce task, and the output of the reducer gives the desired result.
4. Hadoop Common: These Java libraries are used to start Hadoop and are used by the other Hadoop modules.
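The Map and Reduce steps described above can be illustrated with a small single-process sketch. This is not Hadoop's API; it is plain Python that only mimics the data flow, and the function names (map_task, shuffle, reduce_task, word_count) are hypothetical names chosen for this example. It counts words: the map step emits (word, 1) pairs, the pairs are grouped by key, and the reduce step sums each group.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, Iterator


def map_task(line: str) -> Iterator[tuple[str, int]]:
    """Map: turn one input line into (word, 1) key-value pairs."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Group values by key, standing in for the step between map and reduce."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_task(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce in sequence on in-memory lines."""
    mapped = (pair for line in lines for pair in map_task(line))
    grouped = shuffle(mapped)
    return dict(reduce_task(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data needs big clusters", "hadoop stores big data"]
    print(word_count(sample))
```

In a real cluster the map tasks run in parallel on the nodes that hold the data blocks, and the grouping step moves data across the network; here everything runs in one process purely to show the key-value flow.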
Advantages of Hadoop
• Fast: In HDFS the data is distributed over the cluster and mapped, which helps in faster retrieval. The tools that process the data are usually on the same servers, which reduces processing time. It is able to process terabytes of data in minutes and petabytes in hours.
• Scalable: A cluster can be extended just by adding nodes to it.
• Cost Effective: Hadoop is open source and uses commodity hardware to store data, so it is very cost effective compared to a traditional relational database management system.
• Resilient to failure: HDFS can replicate data over the network, so if one node is down or some other network failure happens, Hadoop uses another copy of the data. Normally, data is replicated three times, but the replication factor is configurable.
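The block storage and replication ideas above can be sketched as a small simulation. This is not HDFS code: the 128 MB block size is an assumption (block size is configurable in a real cluster), the round-robin placement is a hypothetical simplification of HDFS's actual placement policy, and the function and node names are made up for illustration. The replication factor of 3 matches the "replicated three times" default mentioned above.

```python
from __future__ import annotations

import math

BLOCK_SIZE_MB = 128  # assumed block size for this sketch; configurable in practice
REPLICATION = 3      # data is normally replicated three times


def split_into_blocks(file_size_mb: int, block_size_mb: int = BLOCK_SIZE_MB) -> int:
    """Return how many blocks a file of the given size occupies."""
    return math.ceil(file_size_mb / block_size_mb)


def place_replicas(
    num_blocks: int, nodes: list[str], replication: int = REPLICATION
) -> dict[int, list[str]]:
    """Assign each block to `replication` distinct nodes (simple round-robin)."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(block + i) % len(nodes)] for i in range(replication)]
        for block in range(num_blocks)
    }


def read_block(block: int, placement: dict[int, list[str]], failed: set[str]) -> str:
    """Return the first healthy node holding the block, skipping failed nodes."""
    for node in placement[block]:
        if node not in failed:
            return node
    raise RuntimeError(f"all replicas of block {block} are unavailable")


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3", "node4"]
    placement = place_replicas(split_into_blocks(300), nodes)
    print(placement)
    # node1 is down, so the read falls back to another replica of block 0
    print(read_block(0, placement, failed={"node1"}))
```

With four nodes and a 300 MB file, the sketch produces three blocks, each stored on three different nodes, so losing any single node still leaves two readable copies of every block.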
History of Hadoop
Hadoop was started by Doug Cutting and Mike Cafarella. Its origin was the Google File System paper, published by Google.
Let's trace the history of Hadoop in the following steps:
• In 2002, Doug Cutting and Mike Cafarella started working on a project called Apache Nutch, an open source web crawler software project.
• While working on Apache Nutch, they were dealing with big data. Storing that data was very costly, which became a problem for the project. This problem became one of the important reasons for the emergence of Hadoop.
• In 2003, Google introduced a file system called GFS (Google File System). It is a proprietary distributed file system developed to provide efficient access to data.
• In 2004, Google released a white paper on MapReduce. This technique simplifies data processing on large clusters.
• In 2005, Doug Cutting and Mike Cafarella introduced a new file system called NDFS (Nutch Distributed File System). This file system also included MapReduce.
• In 2006, Doug Cutting joined Yahoo. Building on the Nutch project, he introduced a new project, Hadoop, with a file system known as HDFS (Hadoop Distributed File System). The first version of Hadoop, 0.1.0, was released in this year.
• Doug Cutting named his project Hadoop after his child's toy elephant.
• In 2007, Yahoo ran two clusters of 1,000 machines.
• In 2008, Hadoop became the fastest system to sort one terabyte of data, doing so on a 900-node cluster within 209 seconds.
• In 2013, Hadoop 2.2 was released.
• In 2017, Hadoop 3.0 was released.
