Amazon Web Services (AWS): A Comprehensive Guide to Cloud Computing Services
This guide explores Amazon Web Services (AWS), a leading cloud computing platform. Learn about its core services (EC2, S3, etc.), understand its pay-as-you-go pricing model, and delve into key security concepts like key pairs for secure access to your AWS resources. A valuable resource for anyone exploring cloud computing.
AWS (Amazon Web Services) Interview Questions
What is AWS?
Question 1: What is AWS?
AWS (Amazon Web Services) is a comprehensive cloud computing platform offering a wide range of services, including compute (EC2), storage (S3), databases, networking, analytics, and more. It's a pay-as-you-go service, meaning you only pay for the resources you consume.
Components of AWS
Question 2: Components of AWS
Major AWS components:
- S3 (Simple Storage Service): Object storage for various data types (images, documents, etc.).
- EC2 (Elastic Compute Cloud): Provides scalable compute capacity (virtual machines).
- EBS (Elastic Block Store): Persistent block storage for EC2 instances.
- CloudWatch: Monitoring and logging service.
- IAM (Identity and Access Management): User authentication and authorization.
- SES (Simple Email Service): For sending emails.
- Route 53: DNS (Domain Name System) service.
Key Pairs
Question 3: What are Key Pairs?
Key pairs in AWS (a public key and a private key) are used to prove your identity when connecting to EC2 instances. AWS stores the public key on the instance, and you keep the private key. For Linux instances, the private key authenticates your SSH session; for Windows instances, it decrypts the administrator password.
S3 (Simple Storage Service)
Question 4: What is S3?
Amazon S3 (Simple Storage Service) is a scalable object storage service offered by Amazon Web Services (AWS). It stores data as objects within buckets. It is highly reliable, secure and provides high availability.
EC2 Pricing Models
Question 5: EC2 Pricing Models
EC2 instance pricing models:
- On-Demand: Pay-as-you-go; no upfront commitment.
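- Reserved Instances: Discounted pricing in exchange for a one-year or three-year commitment, with all-upfront, partial-upfront, or no-upfront payment options.
- Spot Instances: Highly discounted spare EC2 capacity. You can optionally set a maximum price; instances run while capacity is available and the Spot price is at or below that maximum. EC2 can interrupt them with a two-minute warning when it needs the capacity back.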
- Dedicated Hosts: Dedicated physical servers.
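To make the trade-off between these pricing models concrete, here is a minimal Python sketch that compares the monthly cost of one instance under each model. The hourly rate and the discount percentages are hypothetical placeholders, not AWS prices; real rates vary by instance type, Region, and commitment term.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8760 / 12)


def monthly_cost(hourly_rate: float, hours: float = HOURS_PER_MONTH,
                 discount: float = 0.0) -> float:
    """Return the cost of running one instance for `hours` at a discounted rate."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return round(hourly_rate * (1.0 - discount) * hours, 2)


# Hypothetical numbers for illustration only.
ON_DEMAND_RATE = 0.10      # USD per hour (assumed)
RESERVED_DISCOUNT = 0.40   # assumed discount for a long-term commitment
SPOT_DISCOUNT = 0.70       # assumed discount for spare capacity

costs = {
    "on_demand": monthly_cost(ON_DEMAND_RATE),
    "reserved": monthly_cost(ON_DEMAND_RATE, discount=RESERVED_DISCOUNT),
    "spot": monthly_cost(ON_DEMAND_RATE, discount=SPOT_DISCOUNT),
}
```

The comparison only covers price. Spot capacity can be interrupted, so the cheapest line is not automatically the right choice for a workload that cannot tolerate restarts.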
AWS Lambda
Question 6: AWS Lambda
AWS Lambda is a serverless compute service. You upload your code, and AWS runs it automatically based on triggers (without managing servers). You only pay for the compute time used.
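As an illustration of what "uploading your code" can look like, here is a minimal sketch of a Python Lambda handler. Lambda calls the handler with an `event` (the trigger payload) and a `context` (runtime metadata). The `name` field in the event and the response shape are assumptions made for this example, not a required format.

```python
import json


def lambda_handler(event, context):
    """Entry point that Lambda calls once per invocation.

    `event` carries the trigger payload; `context` carries runtime metadata.
    The "name" field is a hypothetical input used for illustration.
    """
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


# Local invocation for testing; in Lambda the service supplies both arguments.
response = lambda_handler({"name": "AWS"}, None)
```

Because the handler is an ordinary function, it can be unit-tested locally like this before it is deployed.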
S3 Bucket Limits
Question 7: How Many Buckets Can Be Created in S3?
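The number of buckets is capped by a per-account quota rather than being unlimited. The long-standing default was 100 buckets per account, and the quota can be raised through a service quota increase request. Each bucket's name must also be globally unique.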
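Besides being globally unique, a bucket name has to follow S3's naming rules. The sketch below checks a subset of the documented rules for general-purpose buckets (length, allowed characters, no adjacent periods, not formatted like an IP address). The function name is hypothetical, and the check cannot tell whether a name is already taken, since that requires asking AWS.

```python
import ipaddress
import re

# Lowercase letters, digits, dots, and hyphens; 3-63 characters;
# must start and end with a letter or digit.
_BUCKET_NAME_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    """Check a subset of the S3 general-purpose bucket naming rules."""
    if not _BUCKET_NAME_RE.fullmatch(name):
        return False          # wrong length or illegal characters
    if ".." in name:
        return False          # adjacent periods are not allowed
    try:
        ipaddress.IPv4Address(name)
    except ValueError:
        return True           # not an IP address, so the name passes
    return False              # names formatted like an IP address are rejected
```

For example, `is_valid_bucket_name("my-bucket-2024")` passes, while `"My_Bucket"` and `"192.168.5.4"` are rejected.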
Cross-Region Replication
Question 8: Cross-Region Replication
Cross-Region Replication in Amazon S3 copies objects asynchronously to another bucket in a different AWS region. This provides redundancy and data protection across regions.
CloudFront
Question 9: CloudFront
Amazon CloudFront is a Content Delivery Network (CDN) that caches content closer to end-users, improving performance and reducing latency.
Regions and Availability Zones
Question 10: Regions and Availability Zones
An AWS region is a geographical area; availability zones are isolated locations within a region, providing high availability.
Edge Locations
Question 11: Edge Locations
Edge locations are points in a CDN (like CloudFront) where content is cached, bringing content closer to users for faster delivery.
S3 Object Size Limits
Question 12: Minimum and Maximum Object Size in S3
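Minimum object size: 0 bytes. Maximum object size: 5 TB (terabytes). A single PUT request can upload at most 5 GB, so larger objects must be uploaded with multipart upload.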
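Large objects are uploaded to S3 in parts. As a rough sketch of the arithmetic, the function below counts how many parts an upload needs under the documented multipart limits: parts of 5 MiB to 5 GiB (the last part may be smaller) and at most 10,000 parts. The function name and the 100 MiB default part size are assumptions chosen for this example.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
MIN_PART_SIZE = 5 * MIB    # every part except the last must be at least 5 MiB
MAX_PART_SIZE = 5 * GIB
MAX_PARTS = 10_000


def plan_multipart(object_size: int, part_size: int = 100 * MIB) -> int:
    """Return how many parts a multipart upload of `object_size` bytes needs."""
    if not MIN_PART_SIZE <= part_size <= MAX_PART_SIZE:
        raise ValueError("part size must be between 5 MiB and 5 GiB")
    parts = max(1, -(-object_size // part_size))  # integer ceiling division
    if parts > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part size")
    return parts
```

With the default part size, a 1 GiB object needs 11 parts. An object at the 5 TB limit exceeds 10,000 parts at 100 MiB each, so it needs a larger part size such as 1 GiB.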
EBS (Elastic Block Store) Volumes
Question 13: EBS Volumes
EBS (Elastic Block Store) volumes provide persistent block storage for use with Amazon EC2 instances. They offer various performance options and are automatically replicated within their Availability Zone.
What is AWS?
Question 1: What is AWS?
AWS (Amazon Web Services) is a comprehensive cloud computing platform offering various services (compute, storage, databases, networking, etc.). It provides on-demand access to computing resources, often described as a "pay-as-you-go" model. This helps organizations to build, deploy, and manage applications and services in a scalable and flexible manner, reducing the need for managing physical infrastructure.
Components of AWS
Question 2: Main Components of AWS
Core AWS services:
- S3 (Simple Storage Service): Object storage for unstructured data (images, videos, etc.).
- EC2 (Elastic Compute Cloud): Provides scalable computing resources (virtual machines).
- EBS (Elastic Block Store): Persistent block storage for EC2 instances.
- CloudWatch: Monitoring and logging.
- IAM (Identity and Access Management): Manages user access and permissions.
- SES (Simple Email Service): Sends emails.
- Route 53: DNS (Domain Name System) web service.
Key Pairs in AWS
Question 3: Key Pairs
Key pairs in AWS use public-key cryptography for secure access to EC2 instances. A key pair consists of a public key, which AWS places on the instance, and a private key, which you store yourself. You present the private key to authenticate an SSH connection to a Linux instance or to decrypt the administrator password of a Windows instance.
S3 (Simple Storage Service)
Question 4: What is S3?
Amazon S3 (Simple Storage Service) is a highly scalable and durable object storage service. It stores data as objects within containers called buckets.
EC2 Instance Pricing Models
Question 5: EC2 Instance Pricing Models
EC2 offers various pricing models:
- On-Demand: Pay-as-you-go; billed per hour or second.
- Reserved Instances: Discounted pricing for long-term commitments.
- Spot Instances: Highly discounted; use excess capacity; instances can be interrupted.
- Dedicated Hosts: Dedicated physical servers for your use.
AWS Lambda
Question 6: AWS Lambda
AWS Lambda is a serverless compute service that runs code in response to events or triggers. You don't manage servers; you upload your code, and AWS handles the execution. You only pay for the compute time used.
S3 Bucket Limits
Question 7: S3 Bucket Limits
The number of buckets you can create is limited by a per-account quota, which can be increased on request. Each bucket name must be globally unique.
Cross-Region Replication
Question 8: Cross-Region Replication
Cross-Region Replication in S3 asynchronously copies objects to a bucket in a different AWS region. This provides redundancy and improved data availability.
CloudFront
Question 9: CloudFront
Amazon CloudFront is a Content Delivery Network (CDN) that delivers content to users with low latency. It caches content at edge locations around the world, speeding up delivery times.
Regions and Availability Zones
Question 10: Regions and Availability Zones
Regions are geographical areas; availability zones are isolated locations *within* a region that are designed to provide high availability. Having multiple AZs in a region increases resilience to outages.
Edge Locations
Question 11: Edge Locations
Edge locations are points in a CDN where content is cached, minimizing latency for end-users.
S3 Object Size
Question 12: Minimum and Maximum Object Size in S3
Minimum size: 0 bytes. Maximum size: 5 TB.
EBS Volumes
Question 13: EBS Volumes
EBS (Elastic Block Store) volumes provide persistent block storage for EC2 instances. They're highly durable, available, and offer various performance options.
Auto Scaling
Question 14: Auto Scaling
Auto Scaling in AWS automatically adjusts compute capacity (EC2 instances) based on demand. It maintains application performance and availability while optimizing costs.
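Benefits of Auto Scaling
- Easy Setup: Scaling policies are defined once for a group of instances rather than per instance.
- Smart Scaling Decisions: Capacity follows metrics or schedules instead of manual estimates.
- Automated Performance Maintenance: Unhealthy instances are replaced and capacity is added when load rises.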
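To show the idea of adjusting capacity to demand, here is a simplified sketch of a proportional scaling calculation: it sizes the group so that the average per-instance metric moves toward a target value, clamped to the group's minimum and maximum. This is an illustration of the concept only, not the algorithm AWS uses internally, and the function and parameter names are hypothetical.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     min_size: int, max_size: int) -> int:
    """Scale capacity proportionally so the per-instance metric approaches `target`."""
    if current < 1 or target <= 0:
        raise ValueError("current must be >= 1 and target must be > 0")
    proposed = math.ceil(current * metric / target)
    # Never go below the group's minimum or above its maximum size.
    return max(min_size, min(max_size, proposed))
```

For instance, four instances averaging 80% CPU against a 50% target would grow to seven, while the same group at 20% CPU would shrink to the configured minimum of two.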
Amazon Machine Images (AMIs)
Question 15: AMIs (Amazon Machine Images)
AMIs are templates for creating EC2 instances. They contain the operating system, software, and configurations needed for your instances.
Sharing AMIs
Question 16: Sharing AMIs
AMIs can be shared with specific AWS accounts or made public. Sharing an AMI does not transfer ownership; the owning account keeps control of it.
Elastic IP Addresses (EIPs)
Question 17: Elastic IP Addresses (EIPs)
An EIP is a static public IP address that you associate with your EC2 instance. It remains assigned to your account and can be easily transferred to other instances.
Example: Website and EC2 Instance
Using an Elastic IP avoids the problem of losing the public IP address if your EC2 instance is stopped and restarted.
S3 Storage Classes
Question 18: S3 Storage Classes
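Amazon S3 offers several storage classes that trade retrieval speed and availability against cost:
- Standard: General-purpose storage for frequently accessed data.
- Intelligent-Tiering: Automatically moves objects between access tiers as access patterns change.
- Standard-Infrequent Access (Standard-IA): Lower storage price for data accessed less often, with a per-retrieval charge.
- One Zone-Infrequent Access (One Zone-IA): Like Standard-IA but stored in a single Availability Zone, at lower cost and lower resilience.
- Glacier Instant Retrieval: Archive storage for rarely accessed data that still needs millisecond retrieval.
- Glacier Flexible Retrieval: Archive storage with retrieval times ranging from minutes to hours.
- Glacier Deep Archive: Lowest-cost archive storage, with retrieval measured in hours.
- Reduced Redundancy Storage (RRS): A legacy class that AWS no longer recommends.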
Securing S3 Buckets
Question 19: Securing S3 Buckets
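You secure S3 buckets primarily with bucket policies and IAM policies, together with S3 Block Public Access and encryption. Bucket policies are resource-based policies attached to a bucket that can grant or deny access to the bucket and its objects. Access Control Lists (ACLs) are an older mechanism that grants basic permissions on individual buckets and objects; AWS recommends keeping ACLs disabled for most use cases.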
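As one example of a bucket policy, the sketch below builds a policy document that denies any request not sent over HTTPS, using the `aws:SecureTransport` condition key. The helper name and the bucket name `example-bucket` are hypothetical. The code only produces the JSON; attaching it to a bucket is a separate step.

```python
import json


def deny_insecure_transport_policy(bucket_name: str) -> str:
    """Build a bucket policy that rejects any request not sent over HTTPS."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }
    return json.dumps(policy, indent=2)


policy_json = deny_insecure_transport_policy("example-bucket")  # hypothetical name
```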
AWS Policies
Question 20: AWS Policies
IAM policies in AWS control access to resources. Types include identity-based policies (attached to users, groups, or roles), resource-based policies (attached to resources), and others (e.g., permissions boundaries, SCPs, session policies, and ACLs).
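The core evaluation rule behind these policies is that a request is denied by default, allowed if some statement allows it, and denied again if any statement explicitly denies it. The sketch below models only that rule. It is a deliberately simplified evaluator (case-sensitive wildcard matching, no conditions, principals, or permissions boundaries), and the statement data is hypothetical.

```python
from fnmatch import fnmatchcase


def is_allowed(statements, action: str, resource: str) -> bool:
    """Evaluate simplified statements: explicit Deny wins, otherwise an Allow is needed."""
    allowed = False
    for stmt in statements:
        matches = (
            any(fnmatchcase(action, pattern) for pattern in stmt["Action"])
            and any(fnmatchcase(resource, pattern) for pattern in stmt["Resource"])
        )
        if not matches:
            continue
        if stmt["Effect"] == "Deny":
            return False      # an explicit deny overrides any allow
        allowed = True
    return allowed            # no matching allow means implicit deny


# Hypothetical statements: allow reads on the bucket, deny everything under secret/.
statements = [
    {"Effect": "Allow", "Action": ["s3:Get*"],
     "Resource": ["arn:aws:s3:::example-bucket/*"]},
    {"Effect": "Deny", "Action": ["s3:*"],
     "Resource": ["arn:aws:s3:::example-bucket/secret/*"]},
]
```

With these statements, reading `report.csv` is allowed, reading anything under `secret/` is denied despite the matching allow, and writing is denied because no statement allows it.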
EC2 Instance Types
Question 21: Types of EC2 Instances
Amazon EC2 offers a wide variety of instance types optimized for different workloads. Choosing the right instance type is crucial for performance and cost efficiency. Here's a breakdown of the main families and their characteristics:
General Purpose:
- T family (T4g, T3, T3a, T2): Burst performance. Cost-effective for applications with moderate CPU usage like microservices, small/medium databases, and dev environments.
- M family (M7g, M6g, M6i, M5, M5n, M5zn, M4): Balanced compute, memory, and networking. Suitable for a broad range of applications, including web servers, application servers, and backend processing.
Compute Optimized:
- C family (C7g, C6g, C6gn, C6i, C5, C5n, C4): High CPU performance. Ideal for compute-intensive applications like HPC, scientific modeling, batch processing, and gaming servers.
Memory Optimized:
- R family (R7g, R6g, R6gd, R5, R5n, R5b, R4): Large memory capacity. Designed for memory-intensive applications like in-memory databases, big data analytics, caching, and real-time processing.
- X family (X2gd, X2iezn, X1e, X1): Highest memory-to-vCPU ratio. Optimized for in-memory databases like SAP HANA and HPC applications needing significant memory.
Accelerated Computing:
- P family (P4d, P3, P3dn, P2): GPU-based for general-purpose GPU computing, machine learning, and HPC.
- Inf family (Inf2, Inf1): Built on AWS Inferentia accelerators for high-throughput, low-cost machine learning inference. (FPGA-based acceleration is provided by the separate F1 family.)
- G family (G5g, G5, G4dn, G4ad, G3): GPU-based, optimized for graphics-intensive applications like gaming, video streaming, and remote graphics workstations.
- DL family (DL1): Designed for training machine learning models using Habana Gaudi accelerators.
- Trn family (Trn1): Designed for training deep learning models at scale using Trainium accelerators and high networking bandwidth.
Storage Optimized:
- I family (I4i, I3en, I3, I2): High IOPS and throughput for I/O-intensive workloads like NoSQL databases, transactional databases, and data warehousing.
- D family (D3, D2): High disk throughput and large local storage. Suitable for applications needing substantial local storage, such as big data processing and log analysis.
- H family (H1): HDD-based local storage with high disk throughput. Suited to data-intensive workloads such as MapReduce and distributed file systems.
Instance Sizes:
Each family offers various sizes (e.g., nano, micro, small, etc.), impacting vCPUs, memory, network performance, and storage.
Other Considerations:
- Bare Metal Instances: Access to the underlying physical hardware.
- Spot Instances: Unused capacity at discounted prices, ideal for fault-tolerant workloads.
- Reserved Instances: Reserved capacity with discounted hourly rates for long-term commitments.
- Dedicated Hosts: Physical servers dedicated to a single AWS account.
- Capacity Reservations: Reserve capacity in a specific Availability Zone.
Remember to consult the official AWS documentation for the most up-to-date information as instance types and offerings evolve.
AWS Compute: Instance Types
Question 21: EC2 Instance Types
Amazon EC2 offers various instance types optimized for different workloads:
Instance Type | Description | Use Cases |
---|---|---|
General Purpose (T2, M4, M3) | Balanced CPU, memory, and networking | Development, small to medium-sized web applications |
Compute Optimized (C4, C3) | High CPU performance | Large compute workloads, high-performance computing |
GPU (G2) | Graphics processing unit | Graphics-intensive applications, machine learning |
Memory Optimized (R3) | High memory capacity | In-memory databases, big data analytics |
Storage Optimized (I2, D2) | High storage performance | Data warehousing, large databases |
T2 Instances: Burstable performance instances; they earn CPU credits while running below their baseline and spend them when bursting.
M4 Instances: General-purpose; offer balanced CPU, memory, and network.
M3 Instances: Previous generation of general-purpose instances.
C3 Instances: Compute-optimized; high-performance processors.
C4 Instances: Newer generation of compute-optimized instances.
G2 Instances: GPU-accelerated instances; ideal for graphics-processing tasks.
R3 Instances: Memory-optimized; large memory capacity.
I2 Instances: Storage-optimized; high-performance SSDs.
D2 Instances: Dense storage instances.
Default Storage Class in S3
Question 22: Default Storage Class in S3
The default storage class in Amazon S3 is Standard.
Amazon Snowball
Question 23: Amazon Snowball
Amazon Snowball is a physical device that you can use to transfer large amounts of data into and out of the AWS cloud. This is particularly useful when you have massive datasets that are difficult to transfer over a network.
Stopping vs. Terminating EC2 Instances
Question 24: Stopping vs. Terminating EC2 Instances
Differences:
Action | Effect | EBS Volumes |
---|---|---|
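Stop | Shuts down the instance; it can be started again later. | Remain attached and keep their data (instance store data is lost) |
Terminate | Permanently deletes the instance. | Root volume is deleted by default; other attached volumes are kept by default, depending on each volume's DeleteOnTermination setting |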
Elastic IP Addresses (EIPs)
Question 17: Elastic IP Addresses (EIPs)
An Elastic IP address (EIP) is a static public IPv4 address that is associated with your account. Unlike a regular instance IP address, an Elastic IP doesn't change if you stop or terminate your instances. You can easily disassociate and reassociate EIPs with other EC2 instances. This is useful when a DNS record or an allow list must keep pointing at the same address while the underlying EC2 instance changes.
Elastic IP Address Limits
Question 25: Elastic IP Limits
By default, you can have up to 5 Elastic IPs per region per AWS account; this quota can be increased on request.
Load Balancers
Question 26: Load Balancers
Load balancers distribute incoming traffic across multiple EC2 instances. This improves the availability and scalability of your applications. They prevent any single server from being overloaded.
VPCs (Virtual Private Clouds)
Question 27: VPCs (Virtual Private Clouds)
A VPC is a logically isolated section of the AWS cloud where you can launch AWS resources. It provides a customizable network environment for your applications: you choose the IP address range, subnets, route tables, and gateways, which gives you more security and control than a shared network.
VPC Peering Connections
Question 28: VPC Peering Connections
VPC peering connects two VPCs, allowing resources in different VPCs to communicate as if they were in the same network. This is useful for connecting development, testing, and production environments.
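One practical constraint on peering is that the two VPCs must not have overlapping CIDR blocks. A minimal sketch of that check, using Python's standard `ipaddress` module, is shown below; the function name and the example ranges are hypothetical.

```python
import ipaddress


def can_peer(cidr_a: str, cidr_b: str) -> bool:
    """Return True when two VPC CIDR blocks do not overlap."""
    net_a = ipaddress.ip_network(cidr_a)
    net_b = ipaddress.ip_network(cidr_b)
    return not net_a.overlaps(net_b)
```

For example, `can_peer("10.0.0.0/16", "10.1.0.0/16")` is `True`, while `can_peer("10.0.0.0/16", "10.0.128.0/17")` is `False` because the second range sits inside the first.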
NAT Gateways
Question 29: NAT Gateways
NAT (Network Address Translation) gateways allow instances in a private subnet to access the internet or other AWS services without having public IP addresses assigned to those instances. This improves security.
VPC Security
Question 30: Securing Your VPC
Security mechanisms for VPCs:
- Security Groups: Act as virtual firewalls for EC2 instances.
- Network ACLs (Network Access Control Lists): Act as firewalls for subnets.
RDS Database Types
Question 31: Database Types in RDS
Amazon RDS (Relational Database Service) supports various database engines:
- Amazon Aurora: A MySQL and PostgreSQL-compatible relational database engine.
- PostgreSQL: An open-source relational database.
- MySQL: An open-source relational database.
- MariaDB: An open-source relational database.
- Oracle: A commercial relational database.
S3 Default Storage Class
Question 22: Default Storage Class in S3
The default storage class in S3 is Standard. Other storage classes (like Intelligent-Tiering, Glacier) offer cost savings for less frequently accessed data.
AWS Database Services
Question 31: Database Types in RDS
Amazon RDS (Relational Database Service) supports various database engines:
- Amazon Aurora: A MySQL and PostgreSQL-compatible relational database engine optimized for the AWS cloud.
- PostgreSQL: A powerful, open-source relational database system.
- MySQL: A widely used, open-source relational database management system.
- MariaDB: A community-developed, open-source relational database management system, which is a fork of MySQL.
- Oracle: A commercial, enterprise-grade relational database management system.
- SQL Server: Microsoft's relational database management system.
Amazon RDS manages database provisioning, patching, backups, and other administrative tasks, simplifying database management in the cloud.
Amazon Redshift
Question 32: Amazon Redshift
Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. It's designed for fast query execution on large datasets using columnar storage and massive parallel processing (MPP).
Amazon SNS (Simple Notification Service)
Question 33: Amazon SNS (Simple Notification Service)
Amazon SNS is a pub/sub (publish/subscribe) messaging service. It allows you to send messages to various endpoints (email, SMS, other AWS services, etc.). It is a highly scalable messaging service.
Route 53 Routing Policies
Question 34: Route 53 Routing Policies
Route 53 routing policies manage how DNS (Domain Name System) queries are directed to resources:
- Simple: Routes traffic to a single resource; if a record contains multiple values, Route 53 returns all of them in random order.
- Weighted: Distributes traffic based on weights assigned to resources.
- Latency-based: Directs traffic to the resource with the lowest latency.
- Failover: Directs traffic to a backup resource if the primary resource fails.
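- Geolocation: Directs traffic based on user location.
- Geoproximity: Directs traffic based on the geographic location of users and resources, with an optional bias.
- Multivalue answer: Returns up to eight healthy records selected at random.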
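To illustrate how weighted routing splits traffic, the sketch below picks an endpoint with probability equal to its weight divided by the sum of all weights, which is the rule Route 53 documents for weighted records. It is a local simulation rather than a call to Route 53, and the endpoint names and weights are hypothetical.

```python
import random
from collections import Counter


def pick_endpoint(weights: dict, rng: random.Random) -> str:
    """Choose one endpoint with probability weight / sum(weights)."""
    endpoints = list(weights)
    return rng.choices(endpoints, weights=[weights[e] for e in endpoints], k=1)[0]


# Hypothetical endpoints: roughly 90% of answers should go to "blue".
weights = {"blue.example.com": 90, "green.example.com": 10}
rng = random.Random(42)  # fixed seed so the run is repeatable
tally = Counter(pick_endpoint(weights, rng) for _ in range(10_000))
```

A 90/10 split like this is a common way to send a small share of traffic to a new deployment before shifting the rest.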
SQS (Simple Queue Service) Message Size
Question 35: Maximum Message Size in SQS
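The maximum size of a message in SQS (Simple Queue Service) has long been 256 KB (kilobytes). Larger payloads are typically stored in Amazon S3, with a reference to the object sent through the queue.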
Security Groups vs. Network ACLs
Question 36: Security Groups vs. Network ACLs
Differences:
Security Mechanism | Scope | Rules | Statefulness |
---|---|---|---|
Security Group | EC2 instances | Inbound and outbound rules (only allow) | Stateful |
Network ACL | Subnets | Inbound and outbound rules (allow and deny) | Stateless |
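The difference in rule handling from the table can be sketched in a few lines. A network ACL evaluates numbered rules in ascending order and applies the first match, with an implicit deny at the end; a security group has only allow rules. The sketch models rule evaluation for a single port and leaves out statefulness, protocols, and source addresses; the function names and rule data are hypothetical.

```python
def nacl_allows(rules, port: int) -> bool:
    """Network ACL: evaluate rules in ascending rule number; the first match decides."""
    for _number, (low, high), action in sorted(rules, key=lambda rule: rule[0]):
        if low <= port <= high:
            return action == "allow"
    return False  # the implicit final rule denies anything unmatched


def security_group_allows(allowed_ports, port: int) -> bool:
    """Security group: allow-only rules; anything not listed is denied."""
    return port in allowed_ports


# Hypothetical inbound rules: (rule number, (from port, to port), action).
inbound_nacl = [
    (200, (0, 65535), "allow"),
    (100, (22, 22), "deny"),   # lower number, so it is evaluated first
]
```

Here port 22 is blocked at the subnet even though rule 200 would allow it, because rule 100 matches first. A security group cannot express that kind of explicit deny.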
IAM User Access Types
Question 37: IAM User Access Types
When creating IAM (Identity and Access Management) users, you can grant:
- Console access: Access to the AWS Management Console.
- Programmatic access: Access through the AWS API using access keys (e.g., from the AWS CLI or SDKs).
Subnets
Question 38: Subnets
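Subnets are divisions of a VPC (Virtual Private Cloud) network, each covering a range of the VPC's IP addresses within a single Availability Zone. They are used to further isolate resources (for example, public and private tiers) and customize network configurations.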
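As a small worked example of dividing a VPC range, the sketch below splits a CIDR block into equal subnets with Python's `ipaddress` module and reports the usable addresses in each. AWS reserves five addresses in every subnet (the first four and the last), which the code subtracts. The function name and the `10.0.0.0/16` range are assumptions for illustration.

```python
import ipaddress
from itertools import islice

AWS_RESERVED_PER_SUBNET = 5  # first four addresses and the last one in each subnet


def carve_subnets(vpc_cidr: str, new_prefix: int, count: int):
    """Return the first `count` subnets of the VPC range with usable address counts."""
    vpc = ipaddress.ip_network(vpc_cidr)
    result = []
    for subnet in islice(vpc.subnets(new_prefix=new_prefix), count):
        usable = subnet.num_addresses - AWS_RESERVED_PER_SUBNET
        result.append((str(subnet), usable))
    return result
```

Splitting `10.0.0.0/16` into /24 subnets this way gives 251 usable addresses per subnet rather than the 256 the prefix length suggests.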
S3 vs. EC2
Question 39: Amazon S3 vs. Amazon EC2
Differences:
Service | Purpose |
---|---|
Amazon S3 | Object storage |
Amazon EC2 | Compute (virtual machines) |
VPC Peering Across Regions
Question 40: VPC Peering Across Regions
Yes. VPC peering is supported between VPCs in different AWS regions (inter-region peering) as well as within the same region, provided the VPCs' CIDR blocks do not overlap.
Subnet Limits per VPC
Question 41: Subnet Limits per VPC
By default, you can have up to 200 subnets per VPC; this quota can be increased on request.
EC2 Launch Date
Question 42: EC2 Launch Date
Amazon EC2 was first released as a public beta in August 2006.
Amazon ElastiCache
Question 43: Amazon ElastiCache
Amazon ElastiCache is a managed in-memory data store service compatible with Redis and Memcached. It simplifies deploying and managing caching solutions in the cloud to improve application performance.
AMI Types
Question 44: Types of AMIs
Two main types of AMIs:
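- Instance Store-Backed: Root device is on instance storage. The instance cannot be stopped, and data on the root device is lost when the instance terminates or fails.
- EBS-Backed: Root device is an EBS volume. The instance can be stopped and started without losing data; the root volume is deleted on termination by default unless you change that setting.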
Amazon EMR (Elastic MapReduce)
Question 45: Amazon EMR
Amazon EMR (Elastic MapReduce) is a managed Hadoop framework in the AWS cloud. It simplifies running large-scale data processing jobs using Hadoop, Spark, and other big data technologies. You create a cluster of EC2 instances to run your jobs, and EMR manages the cluster's setup, configuration, and scaling. EMR is useful for processing massive datasets efficiently and cost-effectively.
Node Types in an EMR Cluster:
- Master Node: Orchestrates the tasks within the cluster.
- Core Node: Processes data and stores it in HDFS (Hadoop Distributed File System).
- Task Node: Processes data (doesn't store in HDFS).
Connecting EBS Volumes to Multiple Instances
Question 46: Connecting EBS Volumes to Multiple Instances
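A standard EBS (Elastic Block Store) volume can be attached to only one EC2 instance at a time. The exception is Multi-Attach, which lets a Provisioned IOPS SSD (io1 or io2) volume be attached to multiple Nitro-based instances in the same Availability Zone. You can also attach multiple EBS volumes to a single EC2 instance.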
Auto Scaling Lifecycle Hooks
Question 47: Lifecycle Hooks in Auto Scaling
Lifecycle hooks in Amazon EC2 Auto Scaling pause an instance in a wait state as it launches or before it terminates so that you can run custom actions. This gives you time to perform additional tasks (e.g., installing software before an instance enters service, or downloading logs before it is terminated).
Amazon Kinesis Firehose
Question 48: Amazon Kinesis Firehose
Amazon Kinesis Data Firehose (renamed Amazon Data Firehose in 2024) is a fully managed service for reliably loading streaming data into destinations like Amazon S3, Redshift, and other data stores. It's used for collecting and delivering real-time data streams.
Amazon Transfer Acceleration
Question 49: Amazon Transfer Acceleration
Amazon S3 Transfer Acceleration speeds up file transfers to and from S3 buckets, especially for clients located far from the bucket's AWS region. It routes transfers through Amazon CloudFront's globally distributed edge locations and then over the AWS network to the bucket.
Accessing EBS Data
Question 50: Accessing EBS Data
EBS (Elastic Block Store) volumes are block storage devices attached to EC2 instances. You can access the data on an EBS volume by mounting it as a file system within your EC2 instance.
Horizontal vs. Vertical Scaling
Question 51: Horizontal vs. Vertical Scaling
Differences:
Scaling Type | Description |
---|---|
Vertical Scaling | Increase resources (CPU, memory, etc.) of existing instances. |
Horizontal Scaling | Add more instances to your infrastructure. |
S3 Default Storage Class
Question 22: Default Storage Class in S3
The default storage class in Amazon S3 is Standard. This is the most cost-effective storage class for frequently accessed data. Other storage classes offer lower cost for less frequently accessed data.
Snowball
Question 23: Amazon Snowball
Amazon Snowball is a physical device used to transfer large datasets to and from AWS. It’s a cost-effective solution for transferring petabyte-scale data.