FlexiToner uses AWS to query 10 years' worth of historical data and get results in moments, with the flexibility to explore data for deeper insights.
FlexiToner provides real-time personalization of marketing emails based on a wide range of user, device, and contextual data, driving higher response rates and better customer experiences.
FlexiToner also hosts log files captured from web servers running on different EC2 instances. It has many data assets in structured, semi-structured, and unstructured forms, including emails, logs, and structured data exported from databases, stored as CSV, LOG, and JSON files and in binary formats such as Parquet and ORC.
FlexiToner is interested in building a data lake out of all the files stored on S3 and providing Data Lake as a service to users from different departments on a pay-per-query basis.
FlexiToner understands that Athena provides this facility out of the box. FlexiToner has a group of big data professionals who specialize in Hadoop, and they want to understand the Hadoop components under the hood so that the team can easily understand the platform and use it for further analytics.
Please advise. Select 2 options.
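A. Athena uses Hive to execute DML statements.
B. Athena uses Hive to execute the DDL statements that create and modify schema.
C. Athena uses Presto to execute the DDL statements that create and modify schema.
D. Athena uses Presto to execute DML statements.
Answer: B, D.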
Option A is incorrect - Athena uses Presto to execute DML statements.
https://docs.aws.amazon.com/athena/latest/ug/glue-best-practices.html
Option B is correct - Athena uses Hive to execute the DDL statements that create and modify schema.
https://docs.aws.amazon.com/athena/latest/ug/glue-best-practices.html
Option C is incorrect - Athena uses Hive to execute the DDL statements that create and modify schema.
https://docs.aws.amazon.com/athena/latest/ug/glue-best-practices.html
Option D is correct - Athena uses Presto to execute DML statements.
https://docs.aws.amazon.com/athena/latest/ug/glue-best-practices.html
FlexiToner is interested in building a data lake on S3 and using Athena to provide Data Lake as a service to different departments on a pay-per-query basis. FlexiToner has a group of big data professionals specialized in Hadoop and wants to understand the underlying Hadoop components under the hood so that the team can easily understand the platform and use it for further analytics.
Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using SQL. Athena is built on top of Presto, an open-source distributed SQL query engine optimized for running ad-hoc interactive queries at massive scale. Athena uses Presto to execute DML statements, such as SELECT queries against data stored in Amazon S3, and uses Hive to execute the DDL statements that create and modify schema.
Evaluating each option:
A. Athena uses Hive to execute DML statements - This statement is incorrect. Athena does not use Hive to execute DML (Data Manipulation Language) statements. Instead, Athena uses Presto as the query engine.
B. Athena uses Hive to execute the DDL statements that create and modify schema - This statement is correct. Athena uses Hive to execute DDL (Data Definition Language) statements, such as those that create, alter, and drop tables and partitions.
C. Athena uses Presto to execute the DDL statements that create and modify schema - This statement is incorrect. Athena uses Hive, not Presto, to execute DDL statements. The resulting schema information is stored in a metadata catalog such as the AWS Glue Data Catalog, and tables can be created and modified through the AWS Management Console, the AWS SDKs, or the Athena API.
D. Athena uses Presto to execute DML statements - This statement is correct. Athena uses Presto as the query engine to execute DML (Data Manipulation Language) statements, such as SELECT queries against data in S3. Athena supports a wide range of data formats, including CSV, JSON, Parquet, ORC, and more, making it easy to query structured, semi-structured, and unstructured data stored in S3.
Therefore, the correct options are B) Athena uses Hive to execute the DDL statements that create and modify schema and D) Athena uses Presto to execute DML statements.
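To make the split between Hive DDL and Presto DML concrete, here is a minimal Python sketch that labels a statement with the engine the answer key associates with it. It is a simplification for study purposes, not a description of Athena's internals: the function `engine_for`, the table `web_logs`, and the bucket `s3://example-bucket/logs/` are hypothetical names, and statements such as CREATE TABLE AS SELECT or CREATE VIEW are deliberately left out.

```python
# Illustrative sketch only: a prefix-based classifier that mirrors the rule
# "Hive runs DDL, Presto runs DML". It does not reflect how Athena actually
# routes statements internally.

# Statement prefixes treated as Hive DDL in this sketch (assumption).
HIVE_DDL_PREFIXES = (
    "CREATE EXTERNAL TABLE",
    "ALTER TABLE",
    "DROP TABLE",
    "MSCK REPAIR TABLE",
)

# Statement prefixes treated as Presto DML in this sketch (assumption).
PRESTO_DML_PREFIXES = ("SELECT", "WITH", "INSERT INTO")


def engine_for(statement: str) -> str:
    """Return 'Hive', 'Presto', or 'unknown' for a single SQL statement."""
    # Collapse whitespace and upper-case so multi-line statements match.
    normalized = " ".join(statement.split()).upper()
    if normalized.startswith(HIVE_DDL_PREFIXES):
        return "Hive"
    if normalized.startswith(PRESTO_DML_PREFIXES):
        return "Presto"
    return "unknown"


# Hypothetical DDL: defines a schema over CSV log files already in S3.
CREATE_TABLE_DDL = """
CREATE EXTERNAL TABLE web_logs (
  request_time string,
  status int
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 's3://example-bucket/logs/'
"""

# Hypothetical DML: an ad-hoc query against the table defined above.
COUNT_BY_STATUS_DML = """
SELECT status, COUNT(*) AS hits
FROM web_logs
GROUP BY status
"""

if __name__ == "__main__":
    for sql in (CREATE_TABLE_DDL, COUNT_BY_STATUS_DML):
        first_line = sql.strip().splitlines()[0]
        print(f"{first_line} ... -> {engine_for(sql)}")
```

The first statement only defines schema over existing files, which is the DDL case in option B. The second reads the data, which is the DML case in option D.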