AWS Certified Data Engineer Associate DEA-C01 Q31-Q40

  1. AWS Certified Data Engineer Associate DEA-C01 Q1-Q10
  2. AWS Certified Data Engineer Associate DEA-C01 Q11-Q20
  3. AWS Certified Data Engineer Associate DEA-C01 Q21-Q30
  4. AWS Certified Data Engineer Associate DEA-C01 Q31-Q40
  5. AWS Certified Data Engineer Associate DEA-C01 Q41-Q50
  6. AWS Certified Data Engineer Associate DEA-C01 Q51-Q60
  7. AWS Certified Data Engineer Associate DEA-C01 Q61-Q70
  8. AWS Certified Data Engineer Associate DEA-C01 Q71-Q80
  9. AWS Certified Data Engineer Associate DEA-C01 Q81-Q90
  10. AWS Certified Data Engineer Associate DEA-C01 Q91-Q100
  11. AWS Certified Data Engineer Associate DEA-C01 Q101-Q110
  12. AWS Certified Data Engineer Associate DEA-C01 Q111-Q120
  13. AWS Certified Data Engineer Associate DEA-C01 Q121-Q130
  14. AWS Certified Data Engineer Associate DEA-C01 Q131-Q140
  15. AWS Certified Data Engineer Associate DEA-C01 Q141-Q150
  16. AWS Certified Data Engineer Associate DEA-C01 Q151-Q160
  17. AWS Certified Data Engineer Associate DEA-C01 Q161-Q170
  18. AWS Certified Data Engineer Associate DEA-C01 Q171-Q179


31. A retail company is using an Amazon Redshift cluster to support real-time inventory management. The company has deployed an ML model on a real-time endpoint in Amazon SageMaker.

The company wants to make real-time inventory recommendations. The company also wants to make predictions about future inventory needs.

Which solutions will meet these requirements? (Choose two.)

A. Use Amazon Redshift ML to generate inventory recommendations.
B. Use SQL to invoke a remote SageMaker endpoint for prediction.
C. Use Amazon Redshift ML to schedule regular data exports for offline model training.
D. Use SageMaker Autopilot to create inventory management dashboards in Amazon Redshift.
E. Use Amazon Redshift as a file storage system to archive old inventory management reports.

Answer

A, B
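
For reference, Amazon Redshift ML can both build models and invoke an existing SageMaker real-time endpoint directly from SQL, which is what options A and B rely on. The sketch below runs the SQL through the Redshift Data API with boto3; the cluster, database, IAM role, endpoint name, and column types are assumptions for illustration, not values from the question.

import boto3

redshift_data = boto3.client("redshift-data")

# Register the existing SageMaker real-time endpoint as a SQL function
# (Redshift ML remote inference). All identifiers below are hypothetical.
create_model_sql = """
CREATE MODEL inventory_forecast
FUNCTION predict_inventory_need (int, int, float)
RETURNS float
SAGEMAKER 'inventory-realtime-endpoint'
IAM_ROLE 'arn:aws:iam::111122223333:role/RedshiftMLRole';
"""

redshift_data.execute_statement(
    ClusterIdentifier="inventory-cluster",
    Database="dev",
    DbUser="awsuser",
    Sql=create_model_sql,
)

# Once the model exists, predictions are plain SQL, for example:
# SELECT sku, predict_inventory_need(store_id, sku, trailing_30_day_sales) FROM inventory_stats;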


32. A marketing company collects clickstream data. The company sends the clickstream data to Amazon Kinesis Data Firehose and stores the clickstream data in Amazon S3. The company wants to build a series of dashboards that hundreds of users from multiple departments will use.

The company will use Amazon QuickSight to develop the dashboards. The company wants a solution that can scale and provide daily updates about clickstream activity.

Which combination of steps will meet these requirements MOST cost-effectively? (Choose two.)

A. Use Amazon Redshift to store and query the clickstream data.
B. Use Amazon Athena to query the clickstream data.
C. Use Amazon S3 analytics to query the clickstream data.
D. Access the query data through a QuickSight direct SQL query.
E. Access the query data through QuickSight SPICE (Super-fast, Parallel, In-memory Calculation Engine). Configure a daily refresh for the dataset.

Answer

B, E
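
As a rough sketch of the Athena side of this answer, a query like the one below could feed the QuickSight SPICE dataset that is refreshed once a day. The database, table, and results location are hypothetical.

import boto3

athena = boto3.client("athena")

# Daily clickstream aggregate that a SPICE dataset can import on its scheduled refresh.
# Database, table, and S3 output location are assumptions.
athena.start_query_execution(
    QueryString="""
        SELECT event_date, page, COUNT(*) AS clicks
        FROM clickstream_events
        WHERE event_date = CURRENT_DATE - INTERVAL '1' DAY
        GROUP BY event_date, page
    """,
    QueryExecutionContext={"Database": "clickstream_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)

Because SPICE serves the hundreds of dashboard readers from its in-memory copy, Athena is only queried once per daily refresh rather than per viewer, which is what keeps the cost down.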


33. A company plans to provision a log delivery stream within a VPC. The company has configured VPC Flow Logs to publish to Amazon CloudWatch Logs. The company needs to send the flow logs to Splunk in near real time for further analysis.

Which solution will meet these requirements with the LEAST operational overhead?

A. Configure an Amazon Kinesis Data Streams data stream to use Splunk as the destination. Create a CloudWatch Logs subscription filter to send log events to the data stream.
B. Create an Amazon Kinesis Data Firehose delivery stream to use Splunk as the destination. Create a CloudWatch Logs subscription filter to send log events to the delivery stream.
C. Create an Amazon Kinesis Data Firehose delivery stream to use Splunk as the destination. Create an AWS Lambda function to send the flow logs from CloudWatch Logs to the delivery stream.
D. Configure an Amazon Kinesis Data Streams data stream to use Splunk as the destination. Create an AWS Lambda function to send the flow logs from CloudWatch Logs to the data stream.

Answer

B
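
A minimal sketch of option B with boto3; the Splunk HEC endpoint, token, IAM roles, bucket, and log group name are placeholders.

import boto3

firehose = boto3.client("firehose")
logs = boto3.client("logs")

# 1. Firehose delivery stream that writes directly to Splunk (all values are placeholders).
firehose.create_delivery_stream(
    DeliveryStreamName="vpc-flow-logs-to-splunk",
    DeliveryStreamType="DirectPut",
    SplunkDestinationConfiguration={
        "HECEndpoint": "https://splunk.example.com:8088",
        "HECEndpointType": "Raw",
        "HECToken": "REPLACE_WITH_HEC_TOKEN",
        "S3Configuration": {  # backup location for events that fail delivery
            "RoleARN": "arn:aws:iam::111122223333:role/FirehoseS3Role",
            "BucketARN": "arn:aws:s3:::example-failed-events",
        },
    },
)

# 2. CloudWatch Logs subscription filter that streams the flow logs to the delivery stream.
logs.put_subscription_filter(
    logGroupName="/vpc/flow-logs",
    filterName="to-splunk",
    filterPattern="",
    destinationArn="arn:aws:firehose:us-east-1:111122223333:deliverystream/vpc-flow-logs-to-splunk",
    roleArn="arn:aws:iam::111122223333:role/CWLtoFirehoseRole",
)

Firehose delivers to Splunk natively, so no Lambda function or consumer application has to be written or operated, which is why this option has the least operational overhead.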


34. A company stores CSV files in an Amazon S3 bucket. A data engineer needs to process the data in the CSV files and store the processed data in a new S3 bucket.

The process needs to rename a column, remove specific columns, ignore the second row of each file, create a new column based on the values of the first row of the data, and filter the results by a numeric value of a column.

Which solution will meet these requirements with the LEAST development effort?

A. Use AWS Glue Python jobs to read and transform the CSV files.
B. Use an AWS Glue custom crawler to read and transform the CSV files.
C. Use an AWS Glue workflow to build a set of jobs to crawl and transform the CSV files.
D. Use AWS Glue DataBrew recipes to read and transform the CSV files.

Answer

D
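
DataBrew recipes are usually authored visually, so there is very little code to write; if the job itself is created programmatically, a hedged boto3 sketch might look like the following. The dataset, recipe, role, and bucket names are assumptions.

import boto3

databrew = boto3.client("databrew")

# Run a recipe (authored interactively in the DataBrew console) that renames and removes
# columns, skips the extra header row, derives a new column, and filters on a numeric value.
databrew.create_recipe_job(
    Name="clean-csv-files",
    DatasetName="raw-csv-dataset",
    RecipeReference={"Name": "csv-cleanup-recipe", "RecipeVersion": "1.0"},
    RoleArn="arn:aws:iam::111122223333:role/DataBrewJobRole",
    Outputs=[{"Location": {"Bucket": "example-processed-bucket", "Key": "cleaned/"}}],
)

databrew.start_job_run(Name="clean-csv-files")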


35. A company uses Apache Airflow to orchestrate the company’s current on-premises data pipelines. The company runs SQL data quality check tasks as part of the pipelines. The company wants to migrate the pipelines to AWS and to use AWS managed services.

Which solution will meet these requirements with the LEAST amount of refactoring?

A. Set up AWS Outposts in the AWS Region that is nearest to the location where the company uses Airflow. Migrate the servers into Outposts-hosted Amazon EC2 instances. Update the pipelines to interact with the Outposts-hosted EC2 instances instead of the on-premises pipelines.
B. Create a custom Amazon Machine Image (AMI) that contains the Airflow application and the code that the company needs to migrate. Use the custom AMI to deploy Amazon EC2 instances. Update the network connections to interact with the newly deployed EC2 instances.
C. Migrate the existing Airflow orchestration configuration into Amazon Managed Workflows for Apache Airflow (Amazon MWAA). Run the data quality checks during ingestion by using SQL tasks in Airflow.
D. Convert the pipelines to AWS Step Functions workflows. Recreate the SQL data quality checks as Python-based AWS Lambda functions.

Answer

C
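
Because Amazon MWAA runs standard Apache Airflow, the existing DAGs and SQL quality checks can usually be lifted over with minimal changes. A minimal sketch, assuming the common SQL provider is installed and an Airflow connection named warehouse_db is configured:

from datetime import datetime

from airflow import DAG
from airflow.providers.common.sql.operators.sql import SQLCheckOperator

# The connection id, table, and schedule are assumptions for illustration.
with DAG(
    dag_id="pipeline_with_quality_checks",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Fails the task (and therefore the run) if the query returns a falsy value.
    orders_not_empty = SQLCheckOperator(
        task_id="orders_not_empty",
        conn_id="warehouse_db",
        sql="SELECT COUNT(*) FROM orders WHERE load_date = '{{ ds }}'",
    )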


36. A company stores a large volume of customer records in Amazon S3. To comply with regulations, the company must be able to access new customer records immediately for the first 30 days after the records are created. The company accesses records that are older than 30 days infrequently.

The company needs to cost-optimize its Amazon S3 storage.

Which solution will meet these requirements MOST cost-effectively?

A. Apply a lifecycle policy to transition records to S3 Standard-Infrequent Access (S3 Standard-IA) storage after 30 days.
B. Use S3 Intelligent-Tiering storage.
C. Transition records to S3 Glacier Deep Archive storage after 30 days.
D. Use S3 Standard-Infrequent Access (S3 Standard-IA) storage for all customer records.

Answer

A
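
A minimal boto3 sketch of the lifecycle rule in option A; the bucket name is a placeholder.

import boto3

s3 = boto3.client("s3")

# Transition every object to S3 Standard-IA 30 days after creation (bucket name is hypothetical).
s3.put_bucket_lifecycle_configuration(
    Bucket="example-customer-records",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "standard-ia-after-30-days",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},
                "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
            }
        ]
    },
)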


37. A company uses a data lake that is based on an Amazon S3 bucket. To comply with regulations, the company must apply two layers of server-side encryption to files that are uploaded to the S3 bucket. The company wants to use an AWS Lambda function to apply the necessary encryption.

Which solution will meet these requirements?

A. Use both server-side encryption with AWS KMS keys (SSE-KMS) and the Amazon S3 Encryption Client.
B. Use dual-layer server-side encryption with AWS KMS keys (DSSE-KMS).
C. Use server-side encryption with customer-provided keys (SSE-C) before files are uploaded.
D. Use server-side encryption with AWS KMS keys (SSE-KMS).

Answer

B
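
A hedged sketch of a Lambda handler that applies DSSE-KMS by copying newly uploaded objects in place; the KMS key ARN is a placeholder, and the function is assumed to be triggered by S3 object-created events.

import boto3

s3 = boto3.client("s3")
KMS_KEY_ARN = "arn:aws:kms:us-east-1:111122223333:key/REPLACE_ME"  # placeholder


def handler(event, context):
    """Re-encrypt each uploaded object with dual-layer server-side encryption (DSSE-KMS)."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        s3.copy_object(
            Bucket=bucket,
            Key=key,
            CopySource={"Bucket": bucket, "Key": key},
            ServerSideEncryption="aws:kms:dsse",
            SSEKMSKeyId=KMS_KEY_ARN,
        )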


38. A data engineer needs to create an empty copy of an existing table in Amazon Athena to perform data processing tasks. The existing table in Athena contains 1,000 rows.

Which query will meet this requirement?

A. CREATE TABLE new_table
LIKE old_table;

B. CREATE TABLE new_table
AS SELECT *
FROM old_table
WITH NO DATA;

C. CREATE TABLE new_table
AS SELECT *
FROM old_table;

D. CREATE TABLE new_table
AS SELECT *
FROM old_table
WHERE 1=1;

Answer

B
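
To sanity-check the result, the CTAS from option B can be run programmatically and the row count verified. This sketch uses the AWS SDK for pandas (awswrangler); the database name is an assumption.

import awswrangler as wr  # AWS SDK for pandas

# Create the empty copy (option B), then confirm it contains no rows.
wr.athena.start_query_execution(
    sql="CREATE TABLE new_table AS SELECT * FROM old_table WITH NO DATA",
    database="analytics_db",  # hypothetical database
    wait=True,
)
row_count = wr.athena.read_sql_query("SELECT COUNT(*) AS n FROM new_table", database="analytics_db")
print(row_count)  # expect n == 0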


39. A company has a data lake in Amazon S3. The company collects AWS CloudTrail logs for multiple applications. The company stores the logs in the data lake, catalogs the logs in AWS Glue, and partitions the logs based on the year. The company uses Amazon Athena to analyze the logs.

Recently, customers reported that a query on one of the Athena tables did not return any data. A data engineer must resolve the issue.

Which combination of troubleshooting steps should the data engineer take? (Choose two.)

A. Confirm that Athena is pointing to the correct Amazon S3 location.
B. Increase the query timeout duration.
C. Use the MSCK REPAIR TABLE command.
D. Restart Athena.
E. Delete and recreate the problematic Athena table.

Answer

A, C
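
Both checks can be scripted. The sketch below reads the table's registered S3 location from the AWS Glue Data Catalog (step A) and then loads any missing year partitions with MSCK REPAIR TABLE (step C); the database, table, and results location are assumptions.

import boto3

glue = boto3.client("glue")
athena = boto3.client("athena")

# Step A: confirm the catalog entry points at the expected S3 location.
table = glue.get_table(DatabaseName="cloudtrail_db", Name="cloudtrail_logs")
print(table["Table"]["StorageDescriptor"]["Location"])

# Step C: register any partitions (for example, a new year=2024/ prefix) that
# Athena does not yet know about.
athena.start_query_execution(
    QueryString="MSCK REPAIR TABLE cloudtrail_logs",
    QueryExecutionContext={"Database": "cloudtrail_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)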


40. A data engineer wants to orchestrate a set of extract, transform, and load (ETL) jobs that run on AWS. The ETL jobs contain tasks that must run Apache Spark jobs on Amazon EMR, make API calls to Salesforce, and load data into Amazon Redshift.

The ETL jobs need to handle failures and retries automatically. The data engineer needs to use Python to orchestrate the jobs.

Which service will meet these requirements?

A. Amazon Managed Workflows for Apache Airflow (Amazon MWAA)
B. AWS Step Functions
C. AWS Glue
D. Amazon EventBridge

Answer

A
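
A minimal MWAA DAG sketch showing the three task types with automatic retries; the EMR cluster id, connection ids, bucket, and Salesforce call are placeholders, not part of the question.

from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.amazon.aws.operators.emr import EmrAddStepsOperator
from airflow.providers.amazon.aws.transfers.s3_to_redshift import S3ToRedshiftOperator


def pull_from_salesforce(**_):
    # Placeholder for the Salesforce API call (for example, via simple_salesforce).
    pass


default_args = {"retries": 3, "retry_delay": timedelta(minutes=5)}  # automatic retry handling

with DAG(
    dag_id="etl_orchestration",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    spark_step = EmrAddStepsOperator(
        task_id="run_spark_job",
        job_flow_id="j-EXAMPLE",  # existing EMR cluster id (placeholder)
        steps=[{
            "Name": "transform",
            "ActionOnFailure": "CONTINUE",
            "HadoopJarStep": {"Jar": "command-runner.jar",
                              "Args": ["spark-submit", "s3://example-code/transform.py"]},
        }],
    )

    salesforce_extract = PythonOperator(task_id="call_salesforce", python_callable=pull_from_salesforce)

    load_redshift = S3ToRedshiftOperator(
        task_id="load_to_redshift",
        schema="analytics",
        table="sales_facts",
        s3_bucket="example-staging-bucket",
        s3_key="curated/",
        redshift_conn_id="redshift_default",
        copy_options=["FORMAT AS PARQUET"],
    )

    [spark_step, salesforce_extract] >> load_redshift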

