Liknande böcker
Data Analysis on Multinode Cluster Using Hadoop
Bok av Singh Kamalpreet
CLOUD AND HADOOP : THE PERFECT MATCH Hadoop is an open source cloud computing platform of the Apache Foundation that provides a software programming framework called MapReduce and distributed file system, HDFS. Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort Hadoop has the capability of processing large amount of unstructured or semi-structured data in a distributed fashion over large number of clusters. Cloud Computing provides the necessary flexibility and elasticity to Hadoop whenever needed on pay per use basis. In this work,Hadoop,an open source framework,is explained and used to analyze traffic files captured by packet sniffing tool in pcap format using a proposed analyzer, HPA (Hadoop Pcap Analyzer).