AWS Solutions Architect Associate Exam Study Notes: AWS Databases

These notes were written while working through the A Cloud Guru AWS Certified Solutions Architect - Associate online course. These notes are partly from the videos, and also from various other online sources. Primarily, they’re notes for me, but you might find them useful too.

Since the AWS platform is changing so quickly, it’s possible that some of these notes may be out of date, so please take that into consideration if you are reading them.

Please let me know in the comments below if you have any corrections or updates which you’d like me to add.

This post was last updated in March, 2019.

AWS RDS (Relational Database Service) Databases

Used for relational OLTP (Online Transaction Processing)

There is no data charge between primary and secondary instances

DMS (Database Migration Service) uses replication to ensure that any changes made to the source DB while a migration is in progress are also applied to the target.

RDS security groups can be configured to allow traffic from another security group, e.g. set the source to ‘custom’, and allow traffic from your security group such as ‘MyWebDMZ’.
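
As a rough sketch of what such a rule looks like in code, the helper below builds one ingress permission that references a source security group rather than an IP range. The dict follows the IpPermissions shape used by the EC2 AuthorizeSecurityGroupIngress API; the function name, the group ID and the MySQL port 3306 are assumptions made for illustration.

```python
def sg_ingress_from_group(source_group_id, port=3306, protocol="tcp"):
    """Build one ingress permission that allows traffic from another security group."""
    return {
        "IpProtocol": protocol,
        "FromPort": port,
        "ToPort": port,
        # Reference a security group instead of a CIDR range.
        "UserIdGroupPairs": [{"GroupId": source_group_id}],
    }


# Hypothetical ID standing in for the 'MyWebDMZ' group.
rule = sg_ingress_from_group("sg-0123456789abcdef0")
print(rule)
```

With boto3, a dict like this could be passed as IpPermissions=[rule] to the EC2 client's authorize_security_group_ingress call, with GroupId set to the DB's security group.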

Some of the common causes of not being able to connect to a DB instance on AWS are:

The DB is still being created
The local firewall is stopping communication traffic
The security groups for the DB have not been properly configured
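
A quick way to narrow down which of these causes applies is to test plain TCP reachability before involving a database client. This is a minimal sketch using only the Python standard library; the function name is my own, and the endpoint in the comment is a made-up placeholder.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS lookup failures.
        return False


# Example (hypothetical endpoint):
# can_reach("mydb.abc123.us-east-1.rds.amazonaws.com", 3306)
```

A timeout usually points at a firewall or security group, while a DNS failure suggests the instance (and its endpoint) does not exist yet.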

Creating a read replica does not prevent connection to the source DB.

RDS MySQL has a table size limit of 6TB, so make sure you don’t let your tables get too large. More best practices for working with MySQL on RDS

Backups

Automated backups are recoverable at any time within the configurable retention period i.e. between 1 and 35 days.

There is 1 full daily snapshot, and transaction logs for the remainder of the day.

Point in time recovery is supported during the retention period.
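
To make the retention window concrete, here is a small sketch that checks whether a given point in time could still be restored. It is a simplification (it ignores any lag between the latest restorable time and now), and the function name is an assumption.

```python
from datetime import datetime, timedelta


def is_restorable(target, now, retention_days):
    """Return True if `target` falls inside the automated backup retention window."""
    if not 1 <= retention_days <= 35:
        raise ValueError("retention period must be between 1 and 35 days")
    earliest = now - timedelta(days=retention_days)
    return earliest <= target <= now


now = datetime(2019, 3, 15, 12, 0)
print(is_restorable(datetime(2019, 3, 10, 9, 30), now, retention_days=7))  # True
print(is_restorable(datetime(2019, 3, 1, 9, 30), now, retention_days=7))   # False
```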

Automated backups are enabled by default. You get free storage matching the size of the DB.

Choose a backup window during a time of reduced load; storage I/O may be suspended during the backup, and users may experience increased latency.

When restoring from an automated backup or manual snapshot, the restored version will be a new RDS instance with a new DNS endpoint.

Encryption

Encryption at rest is supported for:

MySQL
Oracle
SQL Server
PostgreSQL
MariaDB
AuroraDB (more recently)

Encrypting existing DBs is not supported. To do this, you’ll need to create a new encrypted instance, and migrate data to it. The encryption key can be stored in KMS.

Failover

Multi-AZ supports failover to the same DNS endpoint. Note that Amazon RDS uses DNS names, not IP addresses.

When Multi-AZ failover is enabled, AWS creates a primary RDS instance, and synchronously replicates data to a standby instance in a different AZ.

If the primary DB instance fails, the following happens:

The standby replica (which is stored in a different AZ to the primary) is promoted to become the new primary.
The DNS record of the DB instance is changed to point to the new primary.
The original primary DB instance is terminated, and a new standby replica is created.
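
The steps above can be sketched as a toy model. It is not how RDS is implemented; the class, the endpoint name and the AZ names are all invented to show that clients keep using the same DNS endpoint while the instances behind it swap roles.

```python
class MultiAzDeployment:
    """Toy model of a Multi-AZ deployment: one stable endpoint, two instances."""

    def __init__(self, endpoint, primary_az, standby_az):
        self.endpoint = endpoint  # DNS name that clients connect to
        self.primary_az = primary_az
        self.standby_az = standby_az

    def fail_over(self, replacement_az):
        """Promote the standby and create a fresh standby in `replacement_az`."""
        self.primary_az = self.standby_az
        self.standby_az = replacement_az
        # The endpoint is untouched: only the record behind it changes.


db = MultiAzDeployment("mydb.example.internal", "us-east-1a", "us-east-1b")
db.fail_over(replacement_az="us-east-1a")
print(db.endpoint, db.primary_az, db.standby_az)
```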

Failover is for DR (Disaster Recovery) only. It is not possible to use your DR DB as a read replica.

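Failover and Multi-AZ are supported for:

MySQL
Oracle
MariaDB
PostgreSQL
SQL Server (via database mirroring)

Aurora does not use a separate Multi-AZ standby instance; its storage is replicated across AZs, and an Aurora replica is used as the failover target.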

Read replicas

Read replicas:

Are for scaling, not for DR.
Are only supported by:
- MySQL
- PostgreSQL
- MariaDB
Can be promoted to a standalone, writable DB
Have their own DNS address, and are not Multi-AZ

RDS read replicas are not supported by SQL Server or Oracle. Aurora uses its own replica types, described in the Aurora section.

For a DB, up to 5 read replica copies are supported

It’s possible to have read replicas of read replicas, but watch out for latency.

Read replicas can be created in a different region from the source DB for MySQL, MariaDB, and PostgreSQL.

Scaling

DynamoDB - scales on the fly. No downtime.

RDS - scale via a bigger instance size, or read replicas
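
Scaling reads with replicas means the application has to decide which endpoint each query goes to. The sketch below shows one simple approach, assuming writes go to the primary and SELECT statements rotate across replicas; the class and the endpoint names are hypothetical, and the SELECT check is deliberately naive.

```python
from itertools import cycle


class EndpointRouter:
    """Send writes to the primary endpoint and spread reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary if no replicas exist.
        self._readers = cycle(replicas or [primary])

    def endpoint_for(self, sql):
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary


router = EndpointRouter("primary.example", ["replica-1.example", "replica-2.example"])
print(router.endpoint_for("SELECT * FROM users"))          # replica-1.example
print(router.endpoint_for("SELECT * FROM orders"))         # replica-2.example
print(router.endpoint_for("UPDATE users SET name = 'x'"))  # primary.example
```

Because replication to read replicas is asynchronous, a read routed this way may briefly return stale data.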

Aurora

Aurora FAQ

Aurora is intended to compete with Oracle, and is MySQL compatible.

Only runs on AWS infrastructure. This means that it’s not possible to run a local Aurora DB.

Up to 5x better performance than MySQL, at 1/10 the cost of commercial DBs

Scaling

The minimum storage of Aurora is 10GB, scaling in 10GB increments from then on up to 64TB.

No need to provision storage in advance.
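
The storage growth rule can be written out as a short calculation. This is only an illustration of the 10GB increments described above, assuming 1TB = 1024GB; the function and constant names are my own.

```python
import math

STEP_GB = 10
MAX_GB = 64 * 1024  # 64TB, assuming 1TB = 1024GB


def aurora_allocated_gb(used_gb):
    """Return the storage allocated for `used_gb` of data, grown in 10GB steps."""
    if used_gb > MAX_GB:
        raise ValueError("data exceeds the 64TB Aurora limit")
    # Always allocate at least one 10GB step.
    steps = max(1, math.ceil(used_gb / STEP_GB))
    return min(steps * STEP_GB, MAX_GB)


print(aurora_allocated_gb(3))     # 10
print(aurora_allocated_gb(10))    # 10
print(aurora_allocated_gb(10.5))  # 20
```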

Scales up to 32 vCPUs and 244GB of memory

Maintains 2 copies of data in each AZ across 3 AZs, giving 6 copies of the data in total.

High Availability and Replication

There are two types of replicas available:

Aurora - up to 15. Supports auto-failover and self-healing
MySQL - up to 5. Does not support auto-failover

To test failover, it’s possible to simulate failure by rebooting an instance.

Aurora is designed to handle the loss of up to 2 copies of data without affecting write availability, and up to 3 copies without affecting read availability.
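
Those fault-tolerance numbers can be expressed as a small check: with 6 copies, writes survive the loss of up to 2 copies and reads survive the loss of up to 3. The function and constant names are invented for this sketch.

```python
TOTAL_COPIES = 6         # 2 copies in each of 3 AZs
MAX_LOST_FOR_WRITES = 2
MAX_LOST_FOR_READS = 3


def aurora_availability(lost_copies):
    """Return (can_write, can_read) after losing `lost_copies` of the 6 copies."""
    if not 0 <= lost_copies <= TOTAL_COPIES:
        raise ValueError("lost_copies must be between 0 and 6")
    return lost_copies <= MAX_LOST_FOR_WRITES, lost_copies <= MAX_LOST_FOR_READS


for lost in range(5):
    print(lost, aurora_availability(lost))
```

Losing a whole AZ removes 2 copies, so both reads and writes stay available; losing an AZ plus one more copy leaves the data readable but not writable.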

DynamoDB

DynamoDB FAQ

DynamoDB is:

Non-relational
A Key-Value store
Extremely important for the Developer exam. Only an overview is needed for the Solutions Architect exam.
Always on SSD - magnetic disks are not supported
Spread across 3 distinct geographic data centers (meaning 3 different facilities; AWS is not specific about whether this means different AZs, countries, etc.)

DynamoDB consists of:

Tables (similar to collections)
Items (similar to rows or documents), which contain attributes (key-value pairs, similar to fields)

DynamoDB tables support a primary key which can be either:

a single-attribute partition key, e.g. UserId
a composite partition-sort key, e.g. UserId (partition), Timestamp (sort)
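
To show how a composite key behaves, here is an in-memory stand-in for a table keyed by UserId (partition) and Timestamp (sort). It is a toy data structure, not the DynamoDB API; the class and method names are assumptions chosen to echo the real operations.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for a table with a composite partition-sort key."""

    def __init__(self):
        # partition key -> {sort key -> item}
        self._partitions = defaultdict(dict)

    def put_item(self, partition_key, sort_key, item):
        self._partitions[partition_key][sort_key] = item

    def query(self, partition_key):
        """Return all items for one partition key, ordered by sort key."""
        partition = self._partitions.get(partition_key, {})
        return [partition[k] for k in sorted(partition)]


table = ToyTable()
table.put_item("user-1", 1552600000, {"action": "login"})
table.put_item("user-1", 1552500000, {"action": "signup"})
table.put_item("user-2", 1552550000, {"action": "login"})
print(table.query("user-1"))  # [{'action': 'signup'}, {'action': 'login'}]
```

The partition key selects the group of items, and the sort key determines their order within that group.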

Provisioned throughput capacity controls the read and write capacity of a table. When you create a table, you specify how much provisioned throughput capacity you want to reserve for reads and writes (RCU - Read Capacity Units, WCU - Write Capacity Units). DynamoDB will then reserve the necessary resources to meet your throughput needs while ensuring consistent, low-latency performance. Your provisioned throughput capacity can be increased or decreased as necessary after table creation.

DynamoDB cross-region replication is suitable for:

Efficient disaster recovery - by replicating tables in multiple data centers, you can switch over to using DynamoDB tables from another region in case a data center failure occurs.
Faster reads
Easier traffic management
Easy regional migration
Live data migration

Reads and Writes

DynamoDB supports eventually consistent reads (usually consistent within 1 second), and strongly consistent reads

Write capacity units are billed in blocks of 10

1 write capacity unit = 1 write per second (for an item up to 1KB)

Read capacity units are billed in blocks of 50

1 read capacity unit represents (for an item up to 4KB):

1 strongly consistent read per second
2 eventually consistent reads per second
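
A worked calculation makes the capacity units easier to remember. The sketch below applies the rules listed here, plus the item-size rounding from the DynamoDB documentation (1KB per write unit, 4KB per read unit), which goes beyond what these notes state. The function names are my own.

```python
import math


def write_capacity_units(writes_per_sec, item_kb=1):
    """WCUs needed: one unit covers one write per second of up to 1KB."""
    return writes_per_sec * math.ceil(item_kb / 1)


def read_capacity_units(reads_per_sec, item_kb=4, strongly_consistent=True):
    """RCUs needed: one unit covers one strongly consistent read per second
    of up to 4KB, or two eventually consistent reads per second."""
    units = reads_per_sec * math.ceil(item_kb / 4)
    return units if strongly_consistent else math.ceil(units / 2)


print(write_capacity_units(100))                            # 100
print(read_capacity_units(100))                             # 100
print(read_capacity_units(100, strongly_consistent=False))  # 50
print(read_capacity_units(100, item_kb=6))                  # 200
```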

DynamoDB can be expensive for writes, but is cheap for reads.

Data Warehousing via Redshift

OLTP (Online Transaction Processing) via RDS
OLAP (Online Analytical Processing) via Redshift

Columnar storage - data is stored sequentially, and is organised by column
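
A tiny example shows the difference between row and column layouts. This is plain Python with made-up data, not Redshift itself; it only illustrates why an aggregate over one column is cheap when values are grouped by column.

```python
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]

# Column-oriented layout: one contiguous list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# An aggregate over one column only touches that column's values.
total_amount = sum(columns["amount"])
print(columns["region"])  # ['EU', 'US', 'EU']
print(total_amount)       # 250.0
```

Values of the same type stored next to each other also tend to compress well.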

Redshift has a 1MB block size

Single node (160GB)

Multi-node:

Leader node - manages client connections and receives queries
Compute nodes - for data storage and queries. Can have up to 128 of these

Redshift supports advanced compression

Doesn’t require indexes or materialised views
Analyses your data and picks the best compression method for it

Supports automatically distributing your query among nodes for massively parallel processing.

Charged for compute node hours; you’re only charged for compute nodes, not for leader nodes.

Charged for 1 unit per node per hour for the billing period
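
The billing rule amounts to a simple multiplication over compute nodes only. The hourly rate and node count below are hypothetical numbers chosen for the example, not real Redshift prices.

```python
def redshift_compute_cost(compute_nodes, hours, hourly_rate_per_node):
    """Cost for a billing period: compute nodes only, leader node not charged."""
    node_hours = compute_nodes * hours
    return node_hours * hourly_rate_per_node


# Hypothetical: 4 compute nodes for a 720-hour month at $0.25 per node-hour.
print(redshift_compute_cost(4, 720, 0.25))  # 720.0
```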

Data is encrypted in transit using SSL, and encrypted at rest using AES256

Is not designed for Multi-AZ

Designed for management reports / BI (Business Intelligence)

In the event of an AZ outage, data snapshots can be restored to a different AZ

ElastiCache

ElastiCache can be used to increase performance on read-heavy databases.
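
One common way to use a cache in front of a read-heavy database is the cache-aside pattern, sketched below. A plain dict stands in for the Redis or Memcached client and another dict stands in for the database, so all names here are illustrative only.

```python
class CacheAside:
    """Cache-aside reads: check the cache first, fall back to the database."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db
        self._cache = {}  # stands in for a Redis or Memcached client
        self.db_reads = 0

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        # Cache miss: read from the database and remember the result.
        self.db_reads += 1
        value = self._load_from_db(key)
        self._cache[key] = value
        return value


fake_db = {"user:1": "Alice"}  # stands in for an RDS query
store = CacheAside(fake_db.get)
store.get("user:1")
store.get("user:1")
print(store.db_reads)  # 1
```

A real setup would also need expiry or invalidation so that cached values do not go stale after writes.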

Only a shallow level of knowledge of Redis vs Memcached is needed for the AWS Solutions Architect Associate exam. The Professional exam covers deeper scenarios.

ElastiCache:

Supports:
- Memcached
  - Single-AZ
- Redis - similar to Memcached, but generally preferable; Redis was developed using lessons learned from Memcached
  - Multi-AZ

Want to read more?

Check out the AWS Certified Solutions Architect Associate All-in-One Exam Guide on Amazon.com. The book is getting great reviews, was updated to cover the new 2018 exam SAA-C01, and is available on Kindle and as a paperback book.

See my full exam tips here: AWS Solutions Architect Associate Exam Tips

And click here to see all of my notes: AWS Solutions Architect Associate Exam Notes
