Parson Fortunes Ltd is an Asia-based department store operator with an extensive network of 131 stores, spanning approximately 4.1 million square meters of retail space across cities in India, China, Vietnam, Indonesia and Myanmar. Parson built a VPC to host its entire enterprise infrastructure in the cloud.
Parson holds large data assets, around 20 TB of structured data and 45 TB of unstructured data, and is planning to host its data warehouse on AWS and its unstructured data on S3.
The files sent from its on-premises data center are also stored in S3 buckets.
Parson's IT team is well aware of the scalability and performance capabilities of AWS services.
Parson hosts its web applications, databases and the data warehouse built on Redshift in the VPC. Data in structured, semi-structured and unstructured formats is stored in S3 in various buckets.
This data can be joined and queried along with data in Redshift using Redshift Spectrum.
Which of the following features are supported by Redshift Spectrum? Select 2 options.
A. B. C. D.
Answer: A, B.
Option A is correct - If the files are in a format that Redshift Spectrum supports and are located in an Amazon S3 bucket that your cluster can access, you can query the data in its original format directly from Amazon S3.
The Amazon S3 bucket with the data files and the Amazon Redshift cluster must be in the same AWS Region.
https://docs.aws.amazon.com/redshift/latest/dg/c-spectrum-data-files.html
Option B is correct - Redshift Spectrum supports the following structured and semistructured data formats:
AVRO.
PARQUET.
TEXTFILE.
SEQUENCEFILE.
RCFILE.
RegexSerDe.
Optimized row columnar (ORC)
Grok.
OpenCSV.
Ion.
JSON.
https://docs.aws.amazon.com/redshift/latest/dg/c-spectrum-data-files.html
Option C is incorrect - Redshift Spectrum is not limited to CSV and TXT files; it supports all of the structured and semistructured data formats listed above for option B.
https://docs.aws.amazon.com/redshift/latest/dg/c-spectrum-data-files.html
Option D is incorrect - Redshift Spectrum supports the following compression types and extensions:
gzip - .gz.
Snappy - .snappy.
bzip2 - .bz2
https://docs.aws.amazon.com/redshift/latest/dg/c-spectrum-data-files.html
Redshift Spectrum is a feature of Amazon Redshift that enables users to query data stored in Amazon S3 directly from their Redshift cluster. It allows users to run SQL queries against data stored in S3 without the need to load the data into Redshift first. This feature supports the following structured and semi-structured data formats: AVRO, PARQUET, TEXTFILE, SEQUENCEFILE, RCFILE, and others. This means that users can store their data in various formats in S3 and still query it using Redshift Spectrum.
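As a rough illustration of querying S3 data without loading it first, the sketch below assembles the kind of SQL a Redshift client might send: an external schema, an external table over Parquet files, and a join against a local table. Every name in it (the schema `spectrum`, the catalog database `spectrum_db`, the tables `sales_raw` and `stores`, the columns, the bucket path and the IAM role ARN) is a made-up placeholder, and the Python code only builds strings; it does not connect to a cluster.

```python
from typing import List, Tuple


def external_schema_ddl(schema: str, catalog_db: str, iam_role_arn: str) -> str:
    """Build a CREATE EXTERNAL SCHEMA statement that points at a data catalog database."""
    return (
        f"CREATE EXTERNAL SCHEMA {schema}\n"
        "FROM DATA CATALOG\n"
        f"DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "CREATE EXTERNAL DATABASE IF NOT EXISTS;"
    )


def external_table_ddl(schema: str, table: str, columns: List[Tuple[str, str]],
                       stored_as: str, s3_location: str) -> str:
    """Build a CREATE EXTERNAL TABLE statement over files that stay in S3."""
    column_lines = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {schema}.{table} (\n"
        f"{column_lines}\n"
        ")\n"
        f"STORED AS {stored_as}\n"
        f"LOCATION '{s3_location}';"
    )


# All identifiers below are placeholders, not Parson's real objects.
schema_ddl = external_schema_ddl(
    "spectrum", "spectrum_db", "arn:aws:iam::123456789012:role/ExampleSpectrumRole"
)
table_ddl = external_table_ddl(
    "spectrum", "sales_raw",
    [("store_id", "integer"), ("sale_date", "date"), ("amount", "decimal(12,2)")],
    "PARQUET", "s3://example-bucket/sales/",
)
# Join the external (S3-backed) table with a hypothetical local Redshift table.
join_query = (
    "SELECT st.city, SUM(s.amount) AS total_sales\n"
    "FROM spectrum.sales_raw AS s\n"
    "JOIN public.stores AS st ON st.store_id = s.store_id\n"
    "GROUP BY st.city;"
)

print(schema_ddl, table_ddl, join_query, sep="\n\n")
```

The `STORED AS` value is where the file format enters the table definition; formats that rely on a SerDe (for example JSON or OpenCSV) need a `ROW FORMAT SERDE` clause instead, which this sketch leaves out.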
Redshift Spectrum allows users to query the data in its original format directly from Amazon S3, provided the bucket is in the same AWS Region as the cluster. This is useful because it eliminates the need to copy or move data into Redshift before running queries. With Redshift Spectrum, users can keep data in S3 and query it on demand, which avoids separate load jobs and can reduce the amount of storage needed on the cluster.
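The two conditions for querying in place, a supported file format and a bucket in the cluster's Region, can be written as a small pre-flight check. This is a minimal sketch, not an AWS API: the function name `can_query_in_place` and the Region strings are illustrative assumptions, and the format set is simply copied from the list given for option B. It does not call AWS or verify IAM permissions.

```python
# Formats named in this explanation as supported by Redshift Spectrum.
SUPPORTED_FORMATS = {
    "AVRO", "PARQUET", "TEXTFILE", "SEQUENCEFILE", "RCFILE",
    "REGEXSERDE", "ORC", "GROK", "OPENCSV", "ION", "JSON",
}


def can_query_in_place(file_format: str, bucket_region: str, cluster_region: str) -> bool:
    """Return True when the format is in the supported set and the Regions match.

    Hypothetical helper: it only mirrors the rules stated in the text.
    """
    format_ok = file_format.strip().upper() in SUPPORTED_FORMATS
    region_ok = bucket_region == cluster_region
    return format_ok and region_ok


print(can_query_in_place("parquet", "ap-south-1", "ap-south-1"))      # True
print(can_query_in_place("parquet", "ap-south-1", "ap-southeast-1"))  # False: Regions differ
print(can_query_in_place("xlsx", "ap-south-1", "ap-south-1"))         # False: format not in the set
```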
Option C's claim that Redshift Spectrum supports only CSV and TXT formats is incorrect. Redshift Spectrum supports a variety of formats, including the ones listed above.
Finally, files in S3 can be compressed and still be read by Redshift Spectrum. This means that users can compress their data in S3 to save storage space and still query it using Redshift Spectrum. Redshift Spectrum supports several compression formats, including GZIP, BZIP2, and SNAPPY.
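The extension list given for option D can be turned into a simple lookup, which may help when auditing which objects in a bucket are compressed. The helper name `detect_compression` and the sample object keys are hypothetical; the mapping contains only the three extension pairs listed in this explanation.

```python
from typing import Optional

# Extension-to-codec pairs taken from the list in this explanation.
COMPRESSION_BY_EXTENSION = {
    ".gz": "gzip",
    ".snappy": "snappy",
    ".bz2": "bzip2",
}


def detect_compression(s3_key: str) -> Optional[str]:
    """Return the codec implied by the key's extension, or None if none matches."""
    lowered = s3_key.lower()
    for extension, codec in COMPRESSION_BY_EXTENSION.items():
        if lowered.endswith(extension):
            return codec
    return None


print(detect_compression("sales/2023/part-0001.csv.gz"))  # gzip
print(detect_compression("logs/events.json.bz2"))         # bzip2
print(detect_compression("sales/2023/part-0002.csv"))     # None
```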
In conclusion, the two features supported by Redshift Spectrum are: