How is ETL different from Big Data?

ETL and big data platforms such as Hadoop work quite differently. A traditional ETL tool typically processes the whole delta (the changed data) in one place, whereas Hadoop distributes the processing across the nodes of a cluster. Storage differs as well. In Hadoop, data is stored in HDFS as files. The files are not stored whole: each one is split into blocks, with a default block size of 128 MB. The blocks are replicated across multiple DataNodes according to rack awareness, so that data is not lost if a node or rack fails. The metadata for these blocks is kept on the NameNode.
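To make the block splitting concrete, here is a minimal Python sketch, not HDFS code. It assumes a replication factor of 3 (the usual HDFS default, not stated above) and uses a simple round-robin placement as a stand-in for the real rack-aware policy. The function names and the DataNode names are hypothetical.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the default block size mentioned above
REPLICATION = 3                 # assumption: the common HDFS default


def split_into_blocks(file_size_bytes, block_size=BLOCK_SIZE):
    """Return the size in bytes of each block a file of this size would occupy."""
    if file_size_bytes < 0:
        raise ValueError("file size must be non-negative")
    full_blocks, remainder = divmod(file_size_bytes, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The last block only holds the leftover bytes.
        sizes.append(remainder)
    return sizes


def place_replicas(num_blocks, datanodes, replication=REPLICATION):
    """Map each block index to the DataNodes holding a copy.

    Simplification: round-robin placement. Real HDFS picks nodes
    using rack awareness, which this sketch does not model.
    """
    if replication > len(datanodes):
        raise ValueError("not enough DataNodes for the replication factor")
    return {
        block: [datanodes[(block + k) % len(datanodes)] for k in range(replication)]
        for block in range(num_blocks)
    }


if __name__ == "__main__":
    mb = 1024 * 1024
    blocks = split_into_blocks(300 * mb)
    print([size // mb for size in blocks])
    print(place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"]))
```

A 300 MB file ends up as two full 128 MB blocks plus one 44 MB block, and every block has copies on three different nodes, so losing a single node does not lose data.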

Moreover, updates are discouraged in Hadoop, because HDFS files are meant to be written once and then read many times. This makes slowly changing dimensions difficult to implement. Hive may look like SQL, but it does not behave like a relational database. In the background, it translates each query into MapReduce jobs, shipped to the cluster as Java jar files, and runs them to load the data.
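One common way around the lack of updates is to rewrite the dimension instead of changing rows in place. The Python sketch below illustrates that idea for a Type 2 slowly changing dimension: it builds a new snapshot from the old one plus the delta. It is only an illustration of the pattern, not Hive or Hadoop code, and the function name and the fields `key`, `value`, `valid_from`, `valid_to` and `is_current` are hypothetical.

```python
def apply_scd2(dimension, delta, load_date):
    """Return a new dimension snapshot; the input lists are not modified.

    dimension: list of dicts with key, value, valid_from, valid_to, is_current
    delta:     list of dicts with key, value (the changed or new records)
    """
    changes = {row["key"]: row["value"] for row in delta}
    result = []
    current_keys = set()

    for row in dimension:
        key = row["key"]
        if row["is_current"]:
            current_keys.add(key)
        if row["is_current"] and key in changes and changes[key] != row["value"]:
            # Close the old version and append the new one after it.
            result.append({**row, "valid_to": load_date, "is_current": False})
            result.append({
                "key": key,
                "value": changes[key],
                "valid_from": load_date,
                "valid_to": None,
                "is_current": True,
            })
        else:
            # Unchanged or historical rows are copied as they are.
            result.append(dict(row))

    for key, value in changes.items():
        if key not in current_keys:
            # Keys without a current row are inserted as new records.
            result.append({
                "key": key,
                "value": value,
                "valid_from": load_date,
                "valid_to": None,
                "is_current": True,
            })
    return result


if __name__ == "__main__":
    old = [{"key": 1, "value": "Pune", "valid_from": "2020-01-01",
            "valid_to": None, "is_current": True}]
    new = apply_scd2(old, [{"key": 1, "value": "Delhi"},
                           {"key": 2, "value": "Goa"}], "2021-06-01")
    for row in new:
        print(row)
```

The whole snapshot is written out again on every load. That fits the write-once model, but it costs much more than a single UPDATE statement in a relational database.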