AWS SAA Preparation

08/27/21
moon moon

Code SAA-C02 (Study guide)

AWS has a bunch of regions, so how do you pick a region to use?

Compliance with government and local requirements
Proximity to customers for reduced latency
Available services within a region
Pricing which varies region by region

Regions have Availability Zones (min. 2, max. 6, avg. 3). Each AZ is one or more discrete data centers. They’re separated so that they are isolated from disasters, but they're connected to each other by high-bandwidth, low-latency networking.

Amazon has 216 Points of Presence (205 Edge Locations and 11 Regional Caches) in 84 cities across 42 countries.

Services

Accessing Services

Users have three options:

AWS Management Console
AWS Command Line Interface (CLI)
AWS Software Development Kit (SDK)

The AWS CLI and SDK are protected by Access Keys, which can be generated through the AWS Console. Each user manages their own access keys. These keys are just as secret as passwords, so do not share them. The Access Key ID is like a username and the Secret Access Key is like a password.

The AWS SDK allows programmatic access to AWS services and can be embedded within applications. It supports a variety of popular languages such as JavaScript/TypeScript, Python, PHP, .NET, Ruby, Java, Go, Node.js, and C++ among others. There are also mobile SDKs for iOS and Android, as well as IoT Device SDKs for embedded systems using C or Arduino. There is also AWS CloudShell, which allows CLI access from a browser. You can even create files in your CloudShell that will stick around for return sessions! It also lets you upload and download files from your CloudShell environment, which is pretty neat.

CloudShell is not available in all regions yet.

IAM - Identity and Access Management

Is a Global Service that does not require region selection.

All about Users and Groups.

When you create an account it auto creates a root account, but you should not use this account for anything. You should create Users in IAM that let people in your organization sign in and be grouped.

Groups can only contain users, not other groups. However, Users can belong to multiple groups.

Users and Groups can be assigned a JSON document called a Policy. This document describes what users and users inside of groups are allowed to do.

You do not use the root user for anything other than initial user account management. Create an Admins group with the administrator policy and add users to that if you desperately need global control. Best practice is to set up a few groups for various departments and add users to those. That way they can have baseline permissions for their jobs.

IAM comes with a large number of pre-made policies to cover most of the bases, but you can also create your own policies either by directly writing JSON or by using a visual editor.

If you’re trying to get permissions for more than one service or something covered by one or more pre-made policies, use a user group.

Some AWS services need to perform account actions on our behalf. This is achieved with IAM Roles, which are just like users but can be used by AWS services. For example, EC2 Instance roles, Lambda Function roles, and roles for CloudFormation.

Best Practices Overview

Don’t use root account for anything other than initial setup
One physical user = one AWS IAM user
Assign users to groups and assign permissions to groups
Create a strong password policy
Use and enforce the use of MFA
Create and use Roles for giving permissions to AWS services
Use Access Keys (and keep them secret) for programmatic access using the CLI or SDK
Audit permissions of your account with IAM Credentials Report
Never share IAM users and Access Keys

Policies Structure

Here is an example of what a policy looks like.

{
    "Version": "2012-10-17",
    "Id": "S3-Account-Information",
    "Statement": [
        {
            "Sid": "1",
            "Effect": "Allow",
            "Principal": {
                "AWS": ["arn:aws:iam::123456789012:root"]
            },
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": ["arn:aws:s3:::mybucket/*"]
        }
    ]
}

It consists of:

Version: policy language version, always include "2012-10-17"
Id: an optional identifier for the policy
Statement: one or more individual statements
- Sid: an optional identifier for the statement
- Effect: whether the statement allows or denies access
- Principal: account/user/role to which the policy is applied
- Action: list of actions/API calls which the policy allows or denies
- Resource: list of resources to which the actions apply
- Condition: an optional condition for when this policy is in effect
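
As a minimal sketch, the structure above can be built and sanity-checked in Python using only the standard library. The helper names `make_statement` and `check_policy`, the account ID, and the bucket name are placeholders for illustration, not anything AWS provides. The check assumes Effect, Action, and Resource are required, which suits this S3 example but is not the full IAM grammar.

```python
import json

# Assumption: these three keys are treated as required for this example only.
REQUIRED_STATEMENT_KEYS = {"Effect", "Action", "Resource"}


def make_statement(effect, actions, resources, principal=None, sid=None):
    """Build one policy statement as a plain dict (hypothetical helper)."""
    if effect not in ("Allow", "Deny"):
        raise ValueError("Effect must be 'Allow' or 'Deny'")
    statement = {"Effect": effect, "Action": list(actions), "Resource": list(resources)}
    if sid is not None:
        statement["Sid"] = sid
    if principal is not None:
        statement["Principal"] = principal
    return statement


def check_policy(policy):
    """Return a list of problems found; an empty list means the basic shape is fine."""
    problems = []
    if policy.get("Version") != "2012-10-17":
        problems.append("Version should be '2012-10-17'")
    statements = policy.get("Statement", [])
    if not statements:
        problems.append("Statement must contain at least one statement")
    for index, statement in enumerate(statements):
        missing = REQUIRED_STATEMENT_KEYS - statement.keys()
        if missing:
            problems.append(f"statement {index} is missing {sorted(missing)}")
    return problems


policy = {
    "Version": "2012-10-17",
    "Statement": [
        make_statement(
            "Allow",
            ["s3:GetObject", "s3:PutObject"],
            ["arn:aws:s3:::mybucket/*"],  # placeholder bucket name
            principal={"AWS": ["arn:aws:iam::123456789012:root"]},  # placeholder account
            sid="1",
        )
    ],
}

policy_json = json.dumps(policy, indent=4)
```

Printing `policy_json` gives a document in the same shape as the example policy, and `check_policy(policy)` returns an empty list when nothing obvious is missing.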

Policies Inheritance

[Image: Screen Shot 2021-06-29 at 11.36.42 AM.png]

Policies can be assigned to a group and all members of the group will have that policy.

Users in multiple groups will get the policies from both groups.

Inline policies can be applied to a user.
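
The inheritance rules above can be sketched as a set union: a user ends up with every policy from every group they belong to, plus any inline policies attached directly to them. The user, group, and policy names below are made up for illustration, and the function is a toy model, not how IAM evaluates permissions internally.

```python
def effective_policies(user, group_policies, memberships, inline_policies):
    """Union of policies from every group the user belongs to, plus inline ones."""
    policies = set(inline_policies.get(user, ()))
    for group in memberships.get(user, ()):
        policies |= set(group_policies.get(group, ()))
    return policies


# Hypothetical groups, users, and policy names.
group_policies = {
    "Developers": {"DevPolicy"},
    "Audit": {"AuditPolicy"},
}
memberships = {
    "alice": ["Developers"],
    "charles": ["Developers", "Audit"],
    "fred": [],
}
inline_policies = {"fred": {"FredInlinePolicy"}}

alice = effective_policies("alice", group_policies, memberships, inline_policies)
charles = effective_policies("charles", group_policies, memberships, inline_policies)
fred = effective_policies("fred", group_policies, memberships, inline_policies)
```

Here `charles` is in two groups and picks up both group policies, while `fred` is in no group and only has his inline policy.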

IAM Security Tools

IAM Credentials Report (account-level): creates a report of all of the account’s users and their credential statuses.
IAM Access Advisor (user-level): shows service permissions for the user and when they were last accessed.

Security with IAM

You can set password policies with minimum length and/or specific required character types. You can allow users to change their own passwords, and you can even set a password expiration and prevent password reuse.
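
To make the password policy settings concrete, here is a small sketch of how such a policy could be checked against a candidate password. The `PasswordPolicy` class and `violations` function are hypothetical models of the settings described above, not AWS types, and expiration and reuse prevention are left out.

```python
import string
from dataclasses import dataclass


@dataclass
class PasswordPolicy:
    """Hypothetical model of the settings described above, not an AWS type."""
    min_length: int = 8
    require_uppercase: bool = False
    require_lowercase: bool = False
    require_digit: bool = False
    require_symbol: bool = False


def violations(password, policy):
    """Return the list of rules the password breaks (empty means it passes)."""
    broken = []
    if len(password) < policy.min_length:
        broken.append("too short")
    if policy.require_uppercase and not any(c in string.ascii_uppercase for c in password):
        broken.append("needs an uppercase letter")
    if policy.require_lowercase and not any(c in string.ascii_lowercase for c in password):
        broken.append("needs a lowercase letter")
    if policy.require_digit and not any(c in string.digits for c in password):
        broken.append("needs a digit")
    if policy.require_symbol and not any(c in string.punctuation for c in password):
        broken.append("needs a symbol")
    return broken


strict = PasswordPolicy(
    min_length=12,
    require_uppercase=True,
    require_lowercase=True,
    require_digit=True,
    require_symbol=True,
)
```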

Multi Factor Authentication should definitely be enabled on root accounts and on other IAM users as well. MFA means a password that you know and a security device that you own. This is a decent means of verifying identity. In AWS the MFA options are:

Virtual MFA device: Google Authenticator, Microsoft Authenticator, Authy, etc. These Virtual MFA devices typically have support for multiple tokens on the same device so you can use that device for a bunch of accounts/IAM users.
Universal 2nd Factor (U2F) Security Key: YubiKey
Hardware Key Fob MFA Device: such as the ones provided by Gemalto
Hardware Key Fob MFA Device for AWS GovCloud (US): special key fob specifically for GovCloud accounts provided by SurePassID

Budgets and Billing

Billing and Cost Management dashboard requires activation of IAM access. This allows users to access billing information. You can view bills by months with every charge broken down and itemized by service and reason. You can also download a CSV of this information.

From the home page you can access Top Free Tier Services by Usage which shows some information about the free tier of your account.

Budgets can send you alerts based on thresholds of both actual and forecasted usage costs.
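
The threshold idea can be sketched as follows. This is only an illustration of alerting on both actual and forecasted spend; the function name, the default thresholds of 80% and 100%, and the dollar amounts are assumptions, not AWS Budgets defaults.

```python
def budget_alerts(budget, actual, forecast, thresholds=(0.8, 1.0)):
    """Return alert messages for each threshold crossed by actual or forecasted spend."""
    alerts = []
    for fraction in thresholds:
        limit = budget * fraction
        if actual >= limit:
            alerts.append(f"actual spend reached {fraction:.0%} of budget")
        if forecast >= limit:
            alerts.append(f"forecasted spend reached {fraction:.0%} of budget")
    return alerts


# Hypothetical month: 100 budgeted, 85 spent so far, 120 forecasted.
alerts = budget_alerts(budget=100.0, actual=85.0, forecast=120.0)
```

In this made-up month the forecast already crosses the full budget even though actual spend has only passed the 80% mark, which is why alerting on forecasts is useful.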

EC2 - Elastic Compute Cloud

Is region dependent (Regional Service).

Is one of the most popular services offered by AWS.

It is composed of several different parts:

EC2 Instances: Renting virtual machines in the cloud
EBS Volumes: attachable block storage to act like hard drives
ELB: distribute load across machines
ASG: scale services automatically to meet demand

There are three operating systems on offer: Windows, Linux, and macOS. Just like any virtual machine you can configure compute power and CPU cores, RAM, storage space, and network cards. In addition you can also add virtual firewalls called Security Groups and bootstrap scripts called EC2 User Data.

You can choose to decrease the amount of vCPUs available to the instance to save on licensing cost if you don’t need that many. For example, if you find an instance that has 32GB of RAM and you only need 4vCPUs you can set that.

"licensing cost" here means if an application you’re running is charging licensing per CPU core.

For HPC workloads you can also set up the instance to disable multithreading/hyper-threading for better core performance.

Networking

AWS can use both IPv4 and IPv6, but mostly you just use IPv4. IPv6 is more for IoT devices. IPv4 has a 32-bit address space, about 4.3 billion addresses in total, of which roughly 3.7 billion are usable on the public internet (and those are almost used up). IPs must be unique within their network, whether that is a private subnet or the public internet.
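
A quick sketch with Python's standard `ipaddress` module shows the size of the full 32-bit IPv4 space (public plus reserved ranges), how the RFC 1918 private ranges can be recognised, and how a uniqueness check within one network might look. The helper names and sample addresses are made up for illustration.

```python
import ipaddress

# IPv4 addresses are 32 bits wide, so the whole space holds 2**32 addresses.
total_ipv4 = ipaddress.ip_network("0.0.0.0/0").num_addresses

# The three RFC 1918 private ranges.
private_ranges = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]
private_total = sum(network.num_addresses for network in private_ranges)


def is_private(address):
    """True if the address falls inside one of the RFC 1918 ranges."""
    parsed = ipaddress.ip_address(address)
    return any(parsed in network for network in private_ranges)


def has_duplicates(addresses):
    """True if the same IP appears more than once in a network's address list."""
    parsed = [ipaddress.ip_address(address) for address in addresses]
    return len(set(parsed)) != len(parsed)
```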

Ports are process or application-specific software constructs serving as communication endpoints. There are a few common ports to know:

22: SSH
21: FTP
22: SFTP (FTP over SSH)
80: HTTP
443: HTTPS
3389: RDP

In AWS subnets are all private by default. The only way for traffic to reach the internet from a subnet is through an Internet Gateway attached to the VPC.

Internet Gateways basically function kind of like routers if you think about it too much. They’re the gateway with a public IP and internal gateway address that allows traffic to go between subnets, but in this case the other subnets are the external internet.

EC2 Instances will be assigned a private IP which remains constant for that instance. Public IPs can be automatically assigned depending on the subnet settings and, along with the public DNS name, will change every time the instance is stopped and started again. If you need a fixed public IP you can assign an Elastic IP. An Elastic IP is a public IPv4 address that you own for as long as you keep it allocated, and it can be attached to one instance at a time. By default you can only have 5 Elastic IPs per region in your account (you can ask AWS to increase that, but it’s quite rare to need them). It is recommended to avoid them and use Route 53 to assign a DNS name instead.

ENI - Elastic Network Interface

Virtual network cards used for EC2 Instances among other things. They are a logical component in a VPC and bound to a specific AZ. Since EC2 Instances can have multiple ENIs (the maximum depends on the instance type), the instances are not bound to subnets. Only the ENIs are linked to subnets. This allows you to connect an EC2 Instance to multiple subnets.

ENIs are literally just virtual network cards. Two ENIs means two IPs, means two ports with different routing rules on a router, means one port can be for one subnet and the other port can be for a different subnet.

ENIs can have the following attributes:

One primary IPv4 address plus one or more secondary IPv4 addresses
One Elastic IP
One public IPv4 address
One or more Security Groups
MAC address

ENIs can be created independently of EC2 Instances so that they can be attached or moved at will (for things like failover). When creating an ENI you can either auto-assign a private IP or manually assign one, as well as attach a Security Group. ENIs can be hot-attached.

Bootstrapping

Bootstrapping means launching commands on first start. The EC2 User Data is a script that will execute only once when the machine first starts and can be used to configure the instance. It’s used to automate boot tasks such as installing software and updates, downloading common files, and other various first time single run tasks.

These EC2 User Data scripts run as the root user, so commands in them do not need sudo.
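
As a rough sketch, a User Data script is just text that starts with a shebang line. The example below holds a hypothetical first-boot script for an Amazon Linux 2 web server in a Python string and base64-encodes it, which is the form the EC2 API expects (the console and CLI normally do this encoding for you). The script contents and the helper name are illustrative assumptions.

```python
import base64

# User data is limited to 16 KB in raw form, before base64 encoding.
MAX_USER_DATA_BYTES = 16 * 1024

# A hypothetical first-boot script for an Amazon Linux 2 instance.
USER_DATA = """#!/bin/bash
# Runs as root on first boot, so no sudo is needed.
yum update -y
yum install -y httpd
systemctl enable --now httpd
echo "Hello from $(hostname -f)" > /var/www/html/index.html
"""


def encode_user_data(script):
    """Base64-encode a script the way the EC2 API expects user data to be passed."""
    raw = script.encode("utf-8")
    if len(raw) > MAX_USER_DATA_BYTES:
        raise ValueError("user data is larger than 16 KB")
    return base64.b64encode(raw).decode("ascii")


encoded = encode_user_data(USER_DATA)
```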

EC2 Storage Options

There are a few different kinds of available storage:

Elastic Block Store: virtual block level volumes.
Elastic File System: serverless filesystem for network attached storage.
EC2 Instance Store: short-term temporary block storage for EC2 Instances.

AMI - Amazon Machine Image

AMIs are starter images (think ISO or IMG) for creating EC2 Instances. There are a bunch of starter ones as well as a marketplace for more, but you can also define your own. AMIs are region locked, so the same AMI ID will not be available across regions.

Instance Types

Each type of instance has different families as well. Instance types have the following naming convention: m5.2xlarge

m: instance class
5: instance generation (often new hardware generations)
2xlarge: instance size within the class
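
The naming convention can be sketched as a small parser. The regular expression below is an assumption that covers common names such as m5.2xlarge or c5n.large; the extra `attributes` field (for trailing letters like the n in c5n) is not part of the notes above, and unusual names may not match.

```python
import re

# Class letters, generation digits, optional attribute letters, then ".size".
INSTANCE_TYPE = re.compile(r"^([a-z]+)(\d+)([a-z-]*)\.([a-z0-9]+)$")


def parse_instance_type(name):
    """Split an instance type name into class, generation, attributes, and size."""
    match = INSTANCE_TYPE.match(name.lower())
    if match is None:
        raise ValueError(f"unrecognised instance type: {name}")
    instance_class, generation, attributes, size = match.groups()
    return {
        "class": instance_class,
        "generation": int(generation),
        "attributes": attributes,
        "size": size,
    }


parsed = parse_instance_type("m5.2xlarge")
```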

General purpose instances (T and M) are good all-around machines with a balance between compute, memory, and network. Compute optimized instances (C) are great for compute-intensive tasks like batch processing, media transcoding, high performance web servers, high performance computing (HPC), scientific modeling or machine learning, or a dedicated gaming server. Memory optimized instances (R and X) are great for working with large datasets in memory for things like databases, distributed caches, and real-time processing of big unstructured data.

Storage optimized instances (I, D, and H) are great for tasks that require high, sequential read and write access to large datasets. This includes things like high frequency online transaction processing (OLTP) systems, databases, cache for in-memory databases, data warehousing, and distributed file systems.

Spot Instances

Spot instances can have a discount of up to 90% compared to On-Demand.

When creating spot instances, you can set a maximum spot price to only get the instance when the current spot price is less than that. If the price rises past your specified maximum while an instance is running, you can set it to either stop or terminate (both of these options have a 2 minute grace period). Spot pricing varies by the hour because of changing capacity.
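
The maximum-price rule can be sketched as a small decision function. This is a toy model of the behaviour described above, not AWS's actual pricing logic; the function name and return strings are made up, and treating a price exactly equal to the maximum as acceptable is an assumption.

```python
def spot_decision(current_price, max_price, running, interruption_behavior="terminate"):
    """Decide what happens to a spot instance at the current spot price."""
    if interruption_behavior not in ("stop", "terminate"):
        raise ValueError("interruption_behavior must be 'stop' or 'terminate'")
    if current_price <= max_price:
        # Assumption: a price equal to the maximum still counts as acceptable.
        return "keep running" if running else "launch"
    if running:
        # The price rose past the maximum while the instance was running.
        return f"{interruption_behavior} after a 2 minute grace period"
    return "wait for a lower price"
```

For example, with a maximum of 0.05 and a current price of 0.07, a running instance configured with `"stop"` would be stopped after the grace period, while a pending request would simply keep waiting.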

You can also specify a Spot Block. This is a reserved spot instance for a specified time (1 to 6 hours) "without interruptions." In practice, however, the instance may be reclaimed in very rare circumstances.

Spot Requests use this information along with the desired number of instances and a launch specification to create spot instances. There are two types of spot requests: One-time Requests and Persistent Requests. A One-time Request is fulfilled once; if its instance is later stopped or terminated, nothing is relaunched. A Persistent Request stays open and will attempt to relaunch the instance once capacity is available again. A Spot Request can only be cancelled while it is in the Open, Active, or Disabled state.

Cancelling Spot Requests will not terminate any spot instances started as a result of the request. Those must be terminated manually.

Spot Fleets

Spot Fleets are a set of Spot Instances plus (optionally) some On-Demand Instances. The Spot Fleet will try its best to meet the target capacity with defined price constraints using defined Launch Pools. The fleet will choose the best Launch Pool for you and will stop launching instances once you’ve hit either capacity or budget.

Spot Fleets have a few different strategies for choosing Launch Pools:

Lowest Price: good for short workloads
Diversified: distributed across all pools
Capacity Optimized: pool with optimal capacity for number of instances

Placement Groups

Allows you to control how EC2 Instances are placed within the AWS infrastructure. There’s no direct hardware access but you can ask AWS to provision your instances a certain way in relation to each other. Placement groups have 3 strategies:

Cluster: clusters instances into a low latency and high throughput group (same rack) in a single AZ.
Spread: spread instances across different hardware (max 7 instances per group per AZ).
Partition: spread instances across 1 to 7 different partitions (sets of racks) within an AZ.
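
The limits in the list above can be sketched as a planning check. The helper below is hypothetical and only encodes the numbers from these notes (single AZ for cluster, 7 instances per AZ for spread, 1 to 7 partitions for partition); it does not call AWS.

```python
MAX_SPREAD_INSTANCES_PER_AZ = 7
MAX_PARTITIONS_PER_AZ = 7


def check_placement(strategy, instances_by_az, partitions_per_az=None):
    """Return a list of problems for a planned placement group (hypothetical helper)."""
    problems = []
    if strategy == "cluster":
        if len(instances_by_az) > 1:
            problems.append("cluster groups live in a single AZ")
    elif strategy == "spread":
        for az, count in instances_by_az.items():
            if count > MAX_SPREAD_INSTANCES_PER_AZ:
                problems.append(f"{az}: {count} instances exceeds the spread limit of 7")
    elif strategy == "partition":
        if partitions_per_az is None or not 1 <= partitions_per_az <= MAX_PARTITIONS_PER_AZ:
            problems.append("partition groups need 1 to 7 partitions per AZ")
    else:
        problems.append(f"unknown strategy: {strategy}")
    return problems
```

The AZ names passed in, such as `us-east-1a`, are just dictionary keys here and are not validated.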

Security Groups

Security Groups are the fundamental network security in AWS. They control network traffic flowing into and out of EC2 instances through use of allow rules. Security Group rules can reference IPs or other Security Groups. Security Groups are Groups, so they can be assigned to many instances and instances can have many Security Groups.

I believe the default limit is 5 Security Groups per network interface.

It can be a good idea to maintain one Security Group for SSH access separately that you can assign at will.

If you’re attempting to connect to an instance and you get a Timed Out error, the problem is with the SG. If you get a connection refused error that means the SG let it through and your application is refusing the connection.
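
That rule of thumb can be sketched with Python's standard `socket` module. The `probe` function tries a TCP connection and `classify_error` maps the failure to the likely cause described above; both names are made up for illustration, and the 3 second timeout is an arbitrary assumption.

```python
import socket


def classify_error(error):
    """Map a connection error to its likely cause, following the rule of thumb above."""
    if isinstance(error, socket.timeout):
        return "timed out: check the Security Group rules"
    if isinstance(error, ConnectionRefusedError):
        return "refused: traffic got through, check the application"
    return f"other error: {error}"


def probe(host, port, timeout=3.0):
    """Try a TCP connection and describe the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except OSError as error:
        # Timeouts and refused connections are both subclasses of OSError.
        return classify_error(error)
```

Calling `probe` with an instance's public IP and port 22 would be one way to tell whether an SSH problem is likely on the Security Group side or the instance side.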

IAM Roles and Connecting to the AWS CLI/SDK

EC2 Instances based on the Amazon Linux 2 image come with the AWS CLI (version 1) preinstalled. It is the full CLI so you could run aws configure but DO NOT ENTER ANY ACCESS KEYS. This is because those access keys will be available to other users SSHing into the instance. Instead, give the instance an IAM role with the required permissions. The AWS CLI will automatically detect this and allow you to use it.

EC2 Hibernate

The in-memory state is preserved. It’s just hibernate for fast startup. Just like with normal hibernate the contents of the RAM are stored on disk, which in this case is the root EBS volume, and it’s used to resume the OS quickly.

For this to work, the EBS volume must be encrypted! EC2 Hibernate also only supports C3 through C5, M4 through M5, and R3 through R5. The EC2 Instance RAM amount must also be less than 150GB. It is also not supported for bare metal instances. It only works on a few AMIs such as Amazon Linux 1 and 2, Ubuntu, and Windows. It is also only available for Reserved and On-Demand instances (no Spot) and instances cannot be hibernated for more than 60 days.
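
The requirements above can be collected into a simple eligibility check. The family list and limits are copied from these notes and may be out of date; `InstanceSpec` and `hibernation_blockers` are hypothetical names, not AWS API types, and the AMI and 60 day rules are left out for brevity.

```python
from dataclasses import dataclass

# Families and limits as listed in these notes; they may be out of date.
SUPPORTED_FAMILIES = {"c3", "c4", "c5", "m4", "m5", "r3", "r4", "r5"}
MAX_RAM_GB = 150


@dataclass
class InstanceSpec:
    """Hypothetical description of an instance, not an AWS API type."""
    family: str
    ram_gb: float
    root_volume_encrypted: bool
    bare_metal: bool = False
    purchase_option: str = "on-demand"


def hibernation_blockers(spec):
    """Return the reasons this instance could not hibernate, per the rules above."""
    blockers = []
    if spec.family.lower() not in SUPPORTED_FAMILIES:
        blockers.append("unsupported instance family")
    if spec.ram_gb >= MAX_RAM_GB:
        blockers.append("RAM must be less than 150 GB")
    if not spec.root_volume_encrypted:
        blockers.append("root EBS volume must be encrypted")
    if spec.bare_metal:
        blockers.append("bare metal instances are not supported")
    if spec.purchase_option not in ("on-demand", "reserved"):
        blockers.append("only On-Demand and Reserved instances are supported")
    return blockers
```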

EC2 Nitro

Nitro is the underlying platform for the next generation of EC2 Instances. EC2 was using an old virtualization technology and Nitro is a new one.

It allows for:

Better networking options (enhanced networking, HPC, IPv6)
Higher Speed EBS Volumes (Nitro is necessary for 64,000 IOPS - max 32,000 without it)

Nitro has better underlying security.

Nitro instances are C5+*, D3*, G4, I3, M5*, and a couple other ones like A1(I?) and Inf1(I?)

Basically any newish instance generation will be using it.

Capacity Reservations

This ensures you have available capacity when you need to launch EC2 instances. You can set reservations with either a planned or manual end-date for the reservation. The capacity access is immediate so you get billed as soon as it starts.

These are short reservations so there is no need for 1 or 3 year commitments.

You can specify:

Instance Type
Platform
Availability Zone
Tenancy
Instance Quantity

This can be combined with Reserved instances as well as Savings Plans for optimal cost savings.

Procedures

Least Privilege Principle: give users only the access they need to work in their role and nothing more. You don’t give more permissions than a user needs.
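
One way to picture least privilege is as a comparison between what a user has been granted and what their role actually needs. The sketch below does that with plain sets; the function name and the example actions are illustrative assumptions, and wildcards such as `s3:*` are not expanded.

```python
def review_permissions(granted, required):
    """Compare granted actions with the actions a role actually needs."""
    granted, required = set(granted), set(required)
    return {
        "excess": sorted(granted - required),   # candidates to remove
        "missing": sorted(required - granted),  # needed but not yet granted
    }


# Hypothetical example: a user who only needs to read objects from S3.
review = review_permissions(
    granted=["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
    required=["s3:GetObject"],
)
```

Anything listed under `excess` is permission the user holds beyond what the role calls for.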
