AWS Interview Questions

Last Updated : 3 Jun, 2026

Amazon Web Services (AWS) is one of the world’s leading cloud computing platforms that provides a wide range of services such as computing power, storage, networking, databases, and security. It helps businesses and developers build applications quickly and efficiently without managing physical hardware. AWS is widely popular because of its scalability, reliability, and global infrastructure.

1. What is IAM, and why is it critical for security? Differentiate between an IAM User, Group, and Role.

IAM (Identity and Access Management) is an AWS service used to securely manage access to AWS resources. It helps enforce the principle of least privilege by controlling who can access specific services and actions.
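Least privilege is easiest to see in a concrete policy document. The sketch below builds a read-only policy as a Python dict and checks requests against it. The bucket name `example-reports-bucket` and the helper `is_allowed` are assumptions made for illustration. The checker covers only three rules (default deny, explicit allow, explicit deny wins) and is not the full IAM evaluation logic.

```python
import json
from fnmatch import fnmatchcase

# Hypothetical read-only policy; the bucket name is an assumption.
POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-reports-bucket",
                "arn:aws:s3:::example-reports-bucket/*",
            ],
        },
        {
            "Effect": "Deny",
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::example-reports-bucket/secret/*",
        },
    ],
}


def _as_list(value):
    return value if isinstance(value, list) else [value]


def is_allowed(policy, action, resource):
    """Simplified check: an explicit Deny wins, otherwise an Allow is required."""
    allowed = False
    for stmt in policy["Statement"]:
        action_match = any(fnmatchcase(action, a) for a in _as_list(stmt["Action"]))
        resource_match = any(fnmatchcase(resource, r) for r in _as_list(stmt["Resource"]))
        if action_match and resource_match:
            if stmt["Effect"] == "Deny":
                return False
            allowed = True
    return allowed  # stays False when nothing matched (default deny)


print(json.dumps(POLICY, indent=2))
```

A write request such as `s3:PutObject` is rejected because no statement allows it, which is the default-deny behavior that least privilege relies on.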

2. What is the difference between an AWS Region and an Availability Zone (AZ), and how do they work together?

| AWS Region | Availability Zone (AZ) |
| --- | --- |
| A geographical area where AWS data centers are located. | A separate data center or group of data centers within a Region. |
| Contains multiple Availability Zones. | Exists inside a single AWS Region. |
| Used to deploy resources closer to users worldwide. | Used to provide high availability and fault tolerance. |
| Example: Mumbai (ap-south-1), Ireland (eu-west-1), N. Virginia (us-east-1). | Example: ap-south-1a, ap-south-1b, ap-south-1c. |

How They Work Together:

3. Explain the AWS Shared Responsibility Model.

The AWS Shared Responsibility Model divides security responsibilities between AWS and the customer. AWS manages the cloud infrastructure, including hardware, software, and data centers, while customers are responsible for securing their data, applications, and user access.

Responsibilities vary by service type

4. Differentiate between horizontal and vertical scaling.

Horizontal and Vertical scaling are two fundamental strategies for increasing the capacity of a system to handle load, but they operate on different principles.

5. Explain the AWS Well-Architected Framework. What are its pillars?

The AWS Well-Architected Framework is a set of best practices that helps design secure, reliable, efficient, and cost-effective cloud architectures on AWS.

Pillars of the AWS Well-Architected Framework

6. Compare and contrast Security Groups and Network ACLs (NACLs).

Security Groups and Network ACLs (NACLs) are virtual firewalls used to control traffic inside a VPC, but they work at different levels.

| Feature | Security Group | Network ACL (NACL) |
| --- | --- | --- |
| Level | Operates at the instance level. | Operates at the subnet level. |
| State | Stateful (return traffic is automatically allowed). | Stateless (return traffic must be explicitly allowed). |
| Rules | Supports only allow rules. | Supports both allow and deny rules. |
| Application | Applied to EC2 instances and other AWS resources. | Applied to all resources within a subnet. |
| Use Case | Control traffic for specific instances. | Provide an additional layer of security for an entire subnet. |
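The stateful versus stateless distinction can be made concrete with a small in-memory model. This is a sketch under simplifying assumptions: the classes `NaclRule` and `SecurityGroup` are hypothetical, rules match on port only, and protocols and CIDR ranges are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NaclRule:
    number: int      # lower rule numbers are evaluated first
    port_from: int
    port_to: int
    allow: bool


def nacl_permits(rules, port):
    """Stateless: the first matching rule decides; no match means deny."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.port_from <= port <= rule.port_to:
            return rule.allow
    return False


class SecurityGroup:
    """Stateful: replies to an accepted inbound connection are allowed automatically."""

    def __init__(self, inbound_ports):
        self.inbound_ports = set(inbound_ports)
        self.tracked = set()  # (client_ip, client_port) of accepted connections

    def inbound(self, client_ip, client_port, dest_port):
        if dest_port in self.inbound_ports:
            self.tracked.add((client_ip, client_port))
            return True
        return False

    def outbound_reply(self, client_ip, client_port):
        return (client_ip, client_port) in self.tracked


# Security group: allowing inbound 443 is enough for the reply to flow back.
sg = SecurityGroup([443])
sg.inbound("203.0.113.10", 50000, 443)
print(sg.outbound_reply("203.0.113.10", 50000))   # reply allowed without an outbound rule

# NACL: the reply to ephemeral port 50000 needs its own outbound rule.
outbound_rules = []
print(nacl_permits(outbound_rules, 50000))        # denied until a rule is added
outbound_rules.append(NaclRule(100, 1024, 65535, True))
print(nacl_permits(outbound_rules, 50000))
```

Because NACL rules are evaluated in ascending rule-number order, a low-numbered deny placed before a broad allow blocks that traffic for the whole subnet.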

7. Explain how AWS Key Management Service (KMS) works and define Envelope Encryption.

AWS Key Management Service (KMS) is a managed service used to create, manage, and control encryption keys for securing data in AWS. It uses secure hardware security modules (HSMs) to protect cryptographic keys.

**Envelope Encryption:** Envelope encryption is a method in which data is encrypted with a data key, and the data key is itself encrypted with a master key held in KMS.

**Process**
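The envelope flow can be sketched with the standard library alone. Everything here is a stand-in: `ToyKms` is a hypothetical in-memory object whose two methods loosely mirror the KMS GenerateDataKey and Decrypt operations, and `_toy_cipher` is an XOR keystream used only so the example runs without extra packages. It is not secure; a real implementation would use an authenticated cipher such as AES-GCM.

```python
import hashlib
import secrets


def _toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with a SHA-256 counter keystream. Toy stand-in for a real cipher; NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class ToyKms:
    """Hypothetical stand-in for KMS: the master key never leaves this object."""

    def __init__(self):
        self._master_key = secrets.token_bytes(32)

    def generate_data_key(self):
        # Returns the data key twice: in plaintext and encrypted under the master key.
        plaintext_key = secrets.token_bytes(32)
        return plaintext_key, _toy_cipher(self._master_key, plaintext_key)

    def decrypt(self, encrypted_key: bytes) -> bytes:
        return _toy_cipher(self._master_key, encrypted_key)


def envelope_encrypt(kms, plaintext: bytes):
    data_key, encrypted_key = kms.generate_data_key()
    ciphertext = _toy_cipher(data_key, plaintext)
    del data_key  # the plaintext data key should not be kept after use
    # Only the ciphertext and the encrypted data key are stored together.
    return {"ciphertext": ciphertext, "encrypted_key": encrypted_key}


def envelope_decrypt(kms, envelope) -> bytes:
    data_key = kms.decrypt(envelope["encrypted_key"])
    return _toy_cipher(data_key, envelope["ciphertext"])


kms = ToyKms()
envelope = envelope_encrypt(kms, b"customer record 42")
print(envelope_decrypt(kms, envelope))
```

The design point is that bulk data never travels to the key service; only the small data key does, which keeps encryption of large objects local and fast.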

8. What is the purpose of S3 Object Lock and MFA Delete?

9. Explain the purpose of a Virtual Private Cloud (VPC). What are its core components?

A Virtual Private Cloud (VPC) allows you to create a secure and isolated network within AWS, similar to a traditional on-premises network. It helps control networking, connectivity, and security for AWS resources.

Core components of VPC

10. What is the difference between an Elastic IP and a Public IP address?

| Feature | Public IP Address | Elastic IP Address |
| --- | --- | --- |
| Definition | A public IP automatically assigned to an EC2 instance. | A static public IPv4 address allocated to your AWS account. |
| Persistence | Changes when the instance is stopped and started. | Remains the same until you release it. |
| Assignment | Automatically assigned by AWS. | Manually allocated and associated with resources. |
| Flexibility | Cannot be easily moved between instances. | Can be quickly reassigned to another instance. |
| Use Case | Suitable for temporary internet access. | Suitable for applications that require a fixed public IP address. |

11. What is Amazon Route 53, and what routing policies does it support?

Amazon Route 53 is AWS’s scalable Domain Name System (DNS) service used to route user traffic to applications and AWS resources. It provides high availability, domain registration, health checks, and traffic routing capabilities.

Routing Policies Supported by Route 53

12. Explain the difference between VPC Peering and AWS Transit Gateway for connecting multiple VPCs.

| Feature | VPC Peering | AWS Transit Gateway |
| --- | --- | --- |
| Connection Type | Direct one-to-one connection between two VPCs. | Hub-and-spoke model connecting multiple VPCs. |
| Routing | Non-transitive routing. | Supports transitive routing. |
| Management | Requires manual route management. | Centralized route management. |
| Scalability | Management complexity increases as more VPCs are added. | Easily scales to many VPCs and networks. |
| Use Case | Connecting a few VPCs directly. | Large multi-VPC or hybrid cloud environments. |
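The scalability row follows from simple counting. Because peering is non-transitive, connecting every VPC to every other VPC needs one peering connection per pair, while a Transit Gateway needs one attachment per VPC. A short sketch (the function names are illustrative):

```python
def peering_connections_for_full_mesh(vpc_count: int) -> int:
    """Number of VPC pairs: n * (n - 1) / 2."""
    return vpc_count * (vpc_count - 1) // 2


def transit_gateway_attachments(vpc_count: int) -> int:
    """One attachment per VPC to the central hub."""
    return vpc_count


for n in (3, 10, 50):
    print(n, peering_connections_for_full_mesh(n), transit_gateway_attachments(n))
```

At 3 VPCs both approaches need 3 links, but at 50 VPCs a full mesh needs 1225 peering connections against 50 attachments.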

**When to Choose**

13. What is an EC2 instance, and what are the factors you consider when choosing an instance type?

An EC2 instance is a virtual server in AWS that provides scalable computing power with full control over the operating system and installed software. It is commonly used to host applications, websites, and databases.

Factors to consider when choosing an instance type

14. What is the difference between stopping and terminating an EC2 instance?

| Stopping an EC2 Instance | Terminating an EC2 Instance |
| --- | --- |
| Temporarily shuts down the instance. | Permanently deletes the instance. |
| The instance can be started again later. | The instance cannot be recovered after termination. |
| Instance ID remains the same when restarted. | Instance ID is lost permanently. |
| Useful for saving costs when the instance is not in use. | Useful when the instance is no longer needed. |
| Example: Stop a development server during weekends and start it again on Monday. | Example: Terminate a temporary testing server after project completion. |

15. What is an Amazon Machine Image (AMI)?

An Amazon Machine Image (AMI) is a pre-configured template that provides the information required to launch a virtual server (an EC2 instance) in the cloud. An AMI is the fundamental unit of deployment for EC2.

An AMI includes several key components

16. Explain the concept of an Auto Scaling Group (ASG). What components are needed to configure one?

An Auto Scaling Group (ASG) automatically manages the number of EC2 instances based on traffic and workload demand. It helps maintain performance, high availability, and cost optimization by scaling instances in or out automatically.
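The idea of scaling toward a target metric can be sketched as a proportional calculation clamped to the group's minimum and maximum size. This is a simplified model under stated assumptions, not the exact algorithm AWS uses: real scaling policies also apply cooldowns and instance warm-up, and the function name and parameters are hypothetical.

```python
import math


def desired_capacity(current, metric_value, target_value, min_size, max_size):
    """Proportional estimate of the instance count needed to bring the metric back to target."""
    if current == 0:
        return min_size
    raw = math.ceil(current * metric_value / target_value)
    return max(min_size, min(max_size, raw))


# 4 instances at 80% average CPU with a 50% target: scale out.
print(desired_capacity(4, 80, 50, min_size=2, max_size=10))
# 4 instances at 20% average CPU: scale in, but never below min_size.
print(desired_capacity(4, 20, 50, min_size=2, max_size=10))
```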

Components needed to configure an ASG

17. What is an Elastic Load Balancer (ELB)? Describe the different types.

Elastic Load Balancer (ELB) is an AWS service that distributes incoming traffic across multiple targets (like EC2, containers, Lambda), improving application availability and fault tolerance by avoiding overload and routing around unhealthy targets.

Types of ELB

18. Explain EC2 Placement Groups and their use cases.

EC2 Placement Groups are used to control how EC2 instances are placed on AWS infrastructure for better performance, fault tolerance, or low latency.

**Types of Placement Groups**

**When to Use Placement Groups**

19. What is the purpose of AWS Lambda? Compare it to EC2.

AWS Lambda is a serverless compute service that runs code without managing servers. It automatically handles scaling, patching, and infrastructure management, making it ideal for event-driven applications like API requests, file uploads, and automation tasks.

| Feature | EC2 | Lambda |
| --- | --- | --- |
| Management | You manage the servers and OS. | AWS manages everything. |
| Execution | Long-running, stateful applications. | Short-lived, stateless functions. |
| Scaling | Manual or Auto Scaling. | Automatic scaling. |
| Pricing | Pay for uptime. | Pay per request and execution time. |
| Best For | Web servers, databases. | Event-driven and serverless applications. |

20. Explain AWS Lambda Cold Starts and methods to reduce them.

A “Cold Start” in AWS Lambda is the delay that occurs when a Lambda function is invoked without an already running execution environment. AWS needs to create a new environment, load the code, and initialize the runtime before executing the function.

**Note:** Cold starts are more noticeable in Java and .NET runtimes compared to Node.js and Python.

**Strategies to reduce cold starts**
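One widely used code-level technique is to perform expensive setup once, outside the handler, so that warm invocations reuse it instead of repeating it. The sketch below simulates this; `_create_connection` is a hypothetical stand-in for slow work such as opening a database connection, and the 50 ms delay is an arbitrary illustrative value.

```python
import time

INIT_CALLS = 0  # counts how often the expensive setup actually runs


def _create_connection():
    """Hypothetical stand-in for slow setup (for example, opening a DB connection)."""
    global INIT_CALLS
    INIT_CALLS += 1
    time.sleep(0.05)  # simulated initialization latency
    return {"connected_at": time.time()}


# Module-level code runs once per execution environment, during the cold start.
CONNECTION = _create_connection()


def handler(event, context):
    # Warm invocations reuse CONNECTION and skip the setup cost.
    return {"statusCode": 200, "body": f"hello {event.get('name', 'world')}"}


print(handler({"name": "first"}, None))
print(handler({"name": "second"}, None))
print("setup ran", INIT_CALLS, "time(s)")
```

This does not remove the cold start itself, but it keeps the cost from being paid again on every request served by the same environment.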

21. What is Amazon S3, and what guarantees does it provide for durability and availability?

Amazon S3 is a scalable object storage service used for backups, archives, data lakes, and static website hosting. It stores data as objects inside buckets and is designed for high durability and availability.
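S3 Standard is designed for 99.999999999% (11 nines) annual durability of objects. A short calculation, written as a sketch with illustrative function names, shows what that figure means in expected object loss:

```python
from fractions import Fraction

# 11 nines of durability corresponds to an annual loss probability of 1 in 10^11 per object.
ANNUAL_LOSS_PROBABILITY = Fraction(1, 10**11)


def expected_objects_lost_per_year(object_count: int) -> Fraction:
    return object_count * ANNUAL_LOSS_PROBABILITY


def years_per_expected_loss(object_count: int) -> Fraction:
    """Average number of years before one object is expected to be lost."""
    return 1 / expected_objects_lost_per_year(object_count)


print(years_per_expected_loss(10_000_000))
```

For 10 million stored objects this works out to an expected loss of one object every 10,000 years. Exact fractions are used to avoid floating-point error at this scale.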

22. Explain the difference between the main AWS storage services: S3, EBS, and EFS. Provide a use case for each.

| Feature | Amazon S3 | Amazon EBS | Amazon EFS |
| --- | --- | --- | --- |
| Storage Type | Object Storage | Block Storage | File Storage |
| Access Method | Accessed through APIs and URLs | Attached to a single EC2 instance as a disk | Mounted on multiple EC2 instances as a shared file system |
| Scalability | Virtually unlimited | Scales by increasing volume size | Automatically scales as files are added |
| Performance | Suitable for large amounts of unstructured data | High-performance storage for applications and databases | Shared storage for multiple servers |
| Data Sharing | Easily shared over the internet | Typically used by one EC2 instance at a time | Can be accessed by multiple EC2 instances simultaneously |

**Use Case**

23. Explain the different S3 Storage Classes and the purpose of S3 Lifecycle Policies.

Amazon S3 provides different storage classes based on data access frequency and cost requirements.

**Purpose of S3 Lifecycle Policies**
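A lifecycle rule moves objects to cheaper storage classes as they age and can eventually expire them. The sketch below shows a configuration in the shape accepted by the S3 lifecycle API, plus a helper that reports which class an object of a given age would be in. The rule ID, the `logs/` prefix, the day counts, and the helper `storage_class_at` are all assumptions chosen for illustration.

```python
# Hypothetical rule: logs go to Standard-IA after 30 days, Glacier after 90, deleted after 365.
LIFECYCLE_CONFIGURATION = {
    "Rules": [
        {
            "ID": "archive-then-expire-logs",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
        }
    ]
}


def storage_class_at(rule, age_days):
    """Return the class an object of this age would be in, or None once it has expired."""
    if age_days >= rule["Expiration"]["Days"]:
        return None
    current = "STANDARD"
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


rule = LIFECYCLE_CONFIGURATION["Rules"][0]
for age in (10, 45, 120, 400):
    print(age, storage_class_at(rule, age))
```

The helper is only a model of the rule's intent; in practice S3 applies lifecycle actions asynchronously, so transitions may lag the configured day slightly.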

24. When would you choose a relational database like Amazon RDS versus a NoSQL database like DynamoDB?

25. What is Amazon Aurora, and how is it different from standard RDS databases?

Amazon Aurora is a high-performance relational database service offered by AWS, compatible with MySQL and PostgreSQL. It is designed for better scalability, availability, and performance compared to standard RDS databases.

| Feature | Amazon Aurora | Standard RDS |
| --- | --- | --- |
| Performance | Higher performance | Standard performance |
| Storage | Auto-scales up to 128 TB | Limited/manual scaling |
| Availability | Multiple copies across AZs | Multi-AZ support |
| Failover | Faster failover | Slower failover |
| Cost | Higher cost | Lower cost |

26. What is the purpose of Amazon CloudWatch?

Amazon CloudWatch is a monitoring and observability service used to track AWS resources and applications. It helps monitor performance, analyze logs, set alerts, and maintain the overall health of systems running on AWS.

**Purpose of CloudWatch**

27. What is the difference between Amazon ECS and Amazon EKS, and when would you use each one?

| Feature | Amazon ECS | Amazon EKS |
| --- | --- | --- |
| Platform | AWS-native container orchestration service. | Fully managed Kubernetes service. |
| Setup & Management | Simple to set up and manage. | More complex and requires Kubernetes expertise. |
| Integration | Deep integration with AWS services (IAM, VPC, ELB). | Integrates with AWS and the Kubernetes ecosystem. |
| Portability | Primarily AWS-focused. | Supports multi-cloud and Kubernetes portability. |

**When to Choose**

28. What is Infrastructure as Code (IaC), and what is the role of AWS CloudFormation?

Infrastructure as Code (IaC) is the practice of managing and provisioning infrastructure using code instead of manual configuration. It helps automate deployments, maintain consistency, and reduce human errors.

AWS CloudFormation is AWS’s native IaC service used to create and manage infrastructure through templates.
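To make the template idea concrete, the sketch below assembles a minimal CloudFormation template as a Python dict and serializes it to JSON. The logical ID `ArtifactBucket` and the choice of a single versioned S3 bucket are assumptions made for illustration; the code only builds the template text and does not deploy anything.

```python
import json

# Minimal template: one S3 bucket with versioning, plus an output exposing its name.
TEMPLATE = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Minimal sketch: one versioned S3 bucket.",
    "Resources": {
        "ArtifactBucket": {  # hypothetical logical ID
            "Type": "AWS::S3::Bucket",
            "Properties": {
                "VersioningConfiguration": {"Status": "Enabled"}
            },
        }
    },
    "Outputs": {
        # Ref on an S3 bucket resolves to the bucket name.
        "BucketName": {"Value": {"Ref": "ArtifactBucket"}}
    },
}

template_body = json.dumps(TEMPLATE, indent=2)
print(template_body)
```

Because the template is plain data, it can be stored in version control and reviewed like any other code change, which is the consistency benefit described above.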

**Role of Infrastructure as Code (IaC)**

29. Explain the difference between Amazon SNS and Amazon SQS. When would you use each one?

Amazon SNS (Simple Notification Service) and Amazon SQS (Simple Queue Service) are AWS messaging services, but they serve different purposes.

| Feature | Amazon SNS | Amazon SQS |
| --- | --- | --- |
| Type | Pub/Sub Messaging Service | Message Queue Service |
| Communication | One-to-Many | One-to-One |
| Delivery | Sends messages to multiple subscribers instantly | Stores messages until processed |
| Use Case | Notifications, alerts, event broadcasting | Background processing, decoupling applications |
| Examples | Email, SMS, Lambda, SQS | Order processing, task queues |
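The push versus pull difference, and the common pattern of fanning out one topic to several queues, can be modeled in memory. The `Topic` and `Queue` classes are hypothetical stand-ins, not the AWS SDK; features such as visibility timeouts, explicit message deletion, and retries are deliberately left out.

```python
from collections import deque


class Topic:
    """SNS-like: every published message is pushed to all subscribers."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        for callback in self.subscribers:
            callback(message)


class Queue:
    """SQS-like: messages wait in the queue until a consumer polls for them."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        return self._messages.popleft() if self._messages else None


# Fan-out: one topic delivers each order event to two independent queues.
orders = Topic()
billing_queue, shipping_queue = Queue(), Queue()
orders.subscribe(billing_queue.send)
orders.subscribe(shipping_queue.send)
orders.publish("order-1001")

print(billing_queue.receive(), shipping_queue.receive())
```

Each downstream service then consumes from its own queue at its own pace, so a slow shipping worker does not delay billing.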

**When to Use**

30. Explain the difference between Monolithic and Microservices architecture.

Monolithic and Microservices are two different software architecture approaches used to build applications.

| Feature | Monolithic Architecture | Microservices Architecture |
| --- | --- | --- |
| Structure | Single unified application. | Application divided into multiple independent services. |
| Coupling | Components are tightly coupled. | Services are loosely coupled and communicate through APIs. |
| Deployment | Entire application is deployed together. | Each service can be deployed independently. |
| Scalability | Scales as a whole application. | Individual services can be scaled separately. |
| Maintenance | Becomes harder to maintain as the application grows. | Easier to maintain and update individual services. |
| Development | Simpler to develop initially. | More complex to design and manage. |

31. Explain Blue/Green Deployment and Rolling Deployment in AWS.

Blue/Green Deployment and Rolling Deployment are software deployment strategies used to release application updates with minimal downtime and reduced risk.

**When to Use**

32. You need to provide an EC2 instance in a private subnet with access to the internet to download software patches. How would you achieve this securely?

To securely provide internet access to an EC2 instance in a private subnet:
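One common approach, assumed here, is to place a NAT gateway in a public subnet and point the private subnet's default route at it, so instances can start outbound connections while remaining unreachable from the internet. The sketch below models the two route tables and the longest-prefix lookup that selects a route. The gateway IDs and CIDR ranges are hypothetical.

```python
import ipaddress

# Hypothetical IDs and CIDR ranges, for illustration only.
PRIVATE_ROUTE_TABLE = {
    "10.0.0.0/16": "local",
    "0.0.0.0/0": "nat-0example",   # default route goes to the NAT gateway
}
PUBLIC_ROUTE_TABLE = {
    "10.0.0.0/16": "local",
    "0.0.0.0/0": "igw-0example",   # default route goes to the internet gateway
}


def next_hop(route_table, destination_ip):
    """Pick the most specific (longest-prefix) route that contains the destination."""
    address = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if address in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


print(next_hop(PRIVATE_ROUTE_TABLE, "10.0.2.15"))     # stays inside the VPC
print(next_hop(PRIVATE_ROUTE_TABLE, "151.101.1.6"))   # patch download leaves via NAT
print(next_hop(PUBLIC_ROUTE_TABLE, "151.101.1.6"))    # NAT gateway itself exits via IGW
```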

33. You are designing a serverless API backend. Which AWS services would you use, and what would the architecture look like?

A serverless API backend on AWS can be built using fully managed services for scalability, low cost, and minimal server management.

**Architecture Components**

34. How would you design a highly available and fault-tolerant architecture for a critical web application on AWS?

To design a highly available and fault-tolerant web application on AWS, resources should be distributed across multiple Availability Zones (AZs) to avoid single points of failure.
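The benefit of spreading across AZs can be estimated with a basic redundancy formula. The sketch assumes AZ failures are independent and that the application stays up as long as one AZ's copy is healthy; the 99% per-AZ figure is a hypothetical input, not an AWS number.

```python
def combined_availability(per_az_availability: float, az_count: int) -> float:
    """Availability when at least one of az_count independent copies must be up."""
    return 1 - (1 - per_az_availability) ** az_count


for azs in (1, 2, 3):
    print(azs, f"{combined_availability(0.99, azs):.6f}")
```

Under these assumptions, going from one AZ to two cuts expected unavailability from 1% to 0.01%, which is why removing single points of failure matters more than hardening any one component.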

35. How would you design a CI/CD pipeline for a containerized application on AWS?

A CI/CD pipeline for a containerized application on AWS automates code building, testing, and deployment using AWS DevOps services.

**Pipeline Components**

36. How would you troubleshoot intermittent 502 Bad Gateway errors in an application behind an ALB and Auto Scaling Group?

To diagnose intermittent 502 Bad Gateway errors in an application behind an ALB and Auto Scaling Group, follow a systematic troubleshooting approach.

37. How would you investigate a sudden increase in AWS costs and reduce expenses?

To investigate an unexpected AWS cost spike and recommend savings, follow a structured set of steps.

38. Design a scalable, fault-tolerant, and cost-effective architecture for a global photo-sharing application.

A scalable, fault-tolerant, and cost-effective global photo-sharing application should be designed to handle millions of users, provide low-latency access worldwide, and automatically scale based on demand while minimizing operational overhead.

**Architecture Components**

39. How would you respond if IAM access keys were accidentally exposed in a public GitHub repository?

If IAM access keys are accidentally exposed in a public GitHub repository, it should be treated as a security incident because the keys may already be compromised. The priority is to secure the AWS account, investigate any unauthorized activity, and prevent future exposures.
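As one preventive measure, text can be scanned for strings that look like access key IDs before it is committed. Long-term IAM user access key IDs start with `AKIA` followed by 16 uppercase letters or digits. The function below is a minimal sketch with a hypothetical name; dedicated secret-scanning tools cover many more credential formats.

```python
import re

# Long-term IAM access key IDs: "AKIA" plus 16 uppercase alphanumeric characters.
ACCESS_KEY_ID_PATTERN = re.compile(r"\b(AKIA[0-9A-Z]{16})\b")


def find_access_key_ids(text: str):
    """Return every substring that looks like a long-term access key ID."""
    return ACCESS_KEY_ID_PATTERN.findall(text)


# AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
sample = 'aws_access_key_id = "AKIAIOSFODNN7EXAMPLE"\nregion = "us-east-1"\n'
print(find_access_key_ids(sample))
```

A match only flags a candidate; any key found this way in a public location should be deactivated and rotated rather than just removed from the file.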

40. How would you migrate a 10TB on-premises Oracle database to AWS with minimal downtime?

41. How would you design a scalable and manageable AWS network architecture for a large enterprise?

To design a scalable, secure, and manageable network architecture for a large enterprise, use a hub-and-spoke model with centralized networking and governance.

**Architecture Components**

42. How would you troubleshoot Lambda timeouts when connected to RDS inside a VPC?

43. What AWS services would you use to ingest, process, and analyze real-time IoT sensor data?

  1. **Data Ingestion - Amazon Kinesis Data Streams:** Collect high-velocity data durably, with ordering preserved within each shard, from large numbers of concurrent producers.
  2. **Real-Time Processing:**
    • **AWS Lambda:** Event-driven processing for lightweight tasks like filtering, enrichment, format conversion, and alerting; integrates with SNS for notifications.
    • **Amazon Kinesis Data Analytics:** Run continuous SQL queries for time-series analysis, aggregations, and anomaly detection without managing infrastructure.
  3. **Durable Storage - Amazon S3:** Store raw and processed data in a scalable, durable, and cost-effective data lake for long-term retention and analytics.
  4. **Ad-Hoc Analytics - Amazon Athena:** Query S3 data directly using SQL, or optionally load structured data into Amazon Redshift for high-performance analytics and BI.
  5. **Optional Enhancements:** Secure data flow with IAM roles, KMS encryption, and VPC endpoints; monitor using CloudWatch and X-Ray; manage metadata and ETL with AWS Glue.

44. How would you design a zero-trust networking environment on AWS for microservices running on EKS?

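To design a zero-trust networking environment for microservices on EKS, every service-to-service request should be authenticated, authorized, and encrypted rather than trusted because of its network location.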

45. Your company has a critical application with a Recovery Time Objective (RTO) of 15 minutes and a Recovery Point Objective (RPO) of 1 minute. The application runs in a single AWS Region. What disaster recovery (DR) strategy would you recommend to meet these requirements in case of a full region failure?

To meet a 15-minute RTO and 1-minute RPO for a critical application, a Warm Standby disaster recovery strategy is recommended.

46. An EC2 instance is unable to access objects stored in an Amazon S3 bucket. How would you troubleshoot and resolve the issue?

To troubleshoot this issue, we would follow a systematic approach:

47. Design an AWS-based e-commerce platform that can automatically scale during peak shopping periods while maintaining high availability and security.

To build a scalable, highly available, and secure e-commerce platform on AWS, we would use a multi-tier architecture with automatic scaling and fault tolerance.

**Architecture Components**

48. An Auto Scaling Group is not launching new EC2 instances even though CPU utilization is above the scaling threshold. How would you troubleshoot the issue?

To troubleshoot this issue, we would verify each component involved in the Auto Scaling process.

49. An AWS Region becomes unavailable. How would you ensure your application continues serving users with minimal downtime?

To ensure business continuity during a regional outage, we would use a multi-region disaster recovery strategy.

50. Your application stores sensitive credentials such as API keys and database passwords. How would you securely manage and rotate these secrets in AWS?

To securely manage sensitive credentials, we would use AWS Secrets Manager.

51. An application generates and stores millions of files each day. How would you design the AWS storage architecture to ensure scalability, durability, and cost optimization?

To design a scalable, durable, and cost-effective storage solution, we would use Amazon S3 as the primary storage service.

**Architecture Components**

52. A microservices-based application running on EKS is experiencing communication failures between services. How would you troubleshoot the issue?

To troubleshoot communication failures between microservices on EKS, we would systematically verify networking, service discovery, and application configurations.

**Troubleshooting Steps**