Working with huge data volumes is hard, no doubt: to move a mountain, you have to deal with a lot of small stones. But why strain yourself? MapReduce and Spark tackle the problem only partially, leaving room for higher-level tools. Stop struggling to make your big data workflow productive and efficient; make use of the tools we are offering you.
This course will teach you how to:
– Warehouse your data efficiently using Hive, Spark SQL and Spark DataFrames.
– Work with large graphs, such as social graphs or networks.
– Optimize your Spark applications for maximum performance.
More precisely, you will master:
– Writing and executing Hive & Spark SQL queries;
– Reasoning about how queries are translated into actual execution primitives (be it MapReduce jobs or Spark transformations);
– Organizing your data in Hive to optimize disk space usage and execution times;
– Constructing Spark DataFrames and using them to write ad-hoc analytical jobs easily;
– Processing large graphs with Spark GraphFrames;
– Debugging, profiling and optimizing Spark application performance.
Still in doubt? Take this course and become a data ninja!
Course 2 of 5 in the Big Data for Data Engineers Specialization