These notes were written while working through the A Cloud Guru AWS Certified Solutions Architect - Associate online course. These notes are partly from the videos, and also from various other online sources. Primarily, they’re notes for me, but you might find them useful too.
Since the AWS platform is changing so quickly, it’s possible that some of these notes may be out of date, so please take that into consideration if you are reading them.
Please let me know in the comments below if you have any corrections or updates which you’d like me to add.
AWS RDS (Relational Database Service) Databases
Used for relational OLTP (Online Transaction Processing)
There is no data charge between primary and secondary instances
DMS (Database Migration Service) uses replication to ensure that any changes made to the source DB while a migration is in progress are also applied to the target.
RDS security groups can be configured to allow traffic from another security group, e.g. set the source to ‘custom’, and allow traffic from your own security group, such as ‘MyWebDMZ’.
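As a rough illustration, the rule described above could be written as the `IpPermissions` structure that boto3's `authorize_security_group_ingress` call accepts. The group ID, the helper name `build_db_ingress_rule`, and the MySQL port 3306 are assumptions made up for this example; the dictionary is only built here, not sent to AWS.

```python
def build_db_ingress_rule(source_group_id, port=3306):
    """Build an ingress permission that allows traffic from another security group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        # The source is a security group ID rather than a CIDR range
        "UserIdGroupPairs": [{"GroupId": source_group_id}],
    }


# Hypothetical ID standing in for the 'MyWebDMZ' security group
rule = build_db_ingress_rule("sg-0123456789abcdef0")
print(rule)
```

In a real setup this would be passed as `IpPermissions=[rule]`, together with the `GroupId` of the security group attached to the DB.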
If you can’t connect to the DB, possible causes include:
- The DB is still being created
- The local firewall is stopping communication traffic
- The security groups for the DB have not been properly configured
Creating a read replica does not prevent connection to the source DB.
RDS MySQL has a table size limit of 6TB, so make sure you don’t let your tables get too large. See the AWS documentation for more best practices for working with MySQL on RDS.
Automated backups are recoverable at any time within the configurable retention period i.e. between 1 and 35 days.
There is 1 full daily snapshot, and transaction logs for the remainder of the day.
Point in time recovery is supported during the retention period.
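A minimal sketch of the retention window check, assuming a simplified model where any moment between "now minus the retention period" and "now" is restorable (the real service lags slightly behind the current time). The function name `can_restore_to` is hypothetical.

```python
from datetime import datetime, timedelta


def can_restore_to(target, now, retention_days):
    """Return True if `target` falls inside the automated backup retention window."""
    if not 1 <= retention_days <= 35:
        raise ValueError("retention period must be between 1 and 35 days")
    earliest = now - timedelta(days=retention_days)
    return earliest <= target <= now


now = datetime(2024, 1, 31, 12, 0)
inside = can_restore_to(datetime(2024, 1, 25, 9, 30), now, retention_days=7)
outside = can_restore_to(datetime(2024, 1, 20, 9, 30), now, retention_days=7)
print(inside, outside)  # True False
```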
Automated backups are enabled by default. You get free storage matching the size of the DB.
Choose a backup window during a time of reduced load; storage I/O may be suspended during backup, and users may experience increased latency.
When restoring from an automated backup or manual snapshot, the restored version will be a new RDS instance with a new DNS endpoint.
Encryption at rest is supported for:
- SQL Server
- AuroraDB (more recently)
Encrypting existing DBs is not supported. To do this, you’ll need to create a new encrypted instance, and migrate data to it. The encryption key can be stored in KMS.
Multi-AZ supports failover to the same DNS endpoint. Note that Amazon RDS uses DNS names, not IP addresses.
When Multi-AZ failover is enabled, AWS creates a primary RDS instance, and synchronously replicates data to a standby instance in a different AZ.
If the primary DB instance fails, the following happens:
- The standby replica (which is stored in a different AZ to the primary) is promoted to become the new primary.
- The DNS record of the DB instance is changed to point to the new primary.
- The original primary DB instance is terminated, and a new standby replica is created.
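The failover steps above can be sketched with a toy model. The class name, the endpoint name, and the AZ labels are made up for illustration, and the sketch assumes the replacement standby ends up in the AZ of the failed primary, which the notes do not state.

```python
class MultiAZDeployment:
    """Toy model: one DNS endpoint, a primary and a standby in different AZs."""

    def __init__(self, endpoint, primary_az, standby_az):
        self.endpoint = endpoint
        self.primary_az = primary_az
        self.standby_az = standby_az

    def resolve(self):
        # Clients connect via the DNS name, which points at the current primary
        return (self.endpoint, self.primary_az)

    def fail_over(self):
        # The standby is promoted; the new standby is assumed to be created
        # in the AZ that hosted the failed primary
        self.primary_az, self.standby_az = self.standby_az, self.primary_az


db = MultiAZDeployment("mydb.example.internal", "az-a", "az-b")
before = db.resolve()
db.fail_over()
after = db.resolve()
print(before, after)
```

The endpoint string is the same before and after the failover, which is why applications should connect by DNS name rather than by IP address.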
Failover is for DR (Disaster Recovery) only. It is not possible to use your DR DB as a read replica.
Failover and Multi-AZ are only supported for:
- MariaDB
Read replicas:
- Are for scaling, not for DR.
- Are only supported by:
- Can be promoted to a standalone, writable DB
- Have their own DNS address, and are not Multi-AZ
AWS Oracle DB does not support read replicas
For a DB, up to 5 read replica copies are supported
It’s possible to have read replicas of read replicas, but watch out for latency.
Read replicas are supported in separate regions from the source DB for both MySQL and MariaDB, but NOT PostgreSQL.
DynamoDB - scales on the fly. No downtime.
RDS - scale via a bigger instance size, or read replicas
Aurora is intended to compete with Oracle, and is MySQL compatible.
Only runs on AWS infrastructure. This means that it’s not possible to run a local Aurora DB.
Up to 5x better performance than MySQL, at 1/10 the cost of commercial DBs
The minimum storage of Aurora is 10GB, scaling in 10GB increments from then on up to 64TB.
No need to provision storage in advance.
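The storage scaling described above can be sketched as a small calculation. Aurora does this automatically; the function name is hypothetical, and the sketch assumes 1TB = 1000GB so that the limit is a whole number of 10GB steps.

```python
import math

INCREMENT_GB = 10
MAX_GB = 64 * 1000  # 64TB, assuming 1TB = 1000GB


def aurora_allocated_gb(data_gb):
    """Storage allocated for a given data size, rounded up to the next 10GB step."""
    if data_gb > MAX_GB:
        raise ValueError("exceeds the 64TB limit")
    # At least one step, because the minimum storage is 10GB
    steps = max(1, math.ceil(data_gb / INCREMENT_GB))
    return steps * INCREMENT_GB


for size in (3, 10, 10.5, 25):
    print(size, "->", aurora_allocated_gb(size))
```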
Scales up to 32 vCPUs and 244GB of memory
Maintains 2 copies of data in each AZ, across 3 AZs, giving 6 copies of the data in total.
High Availability and Replication
There are two types of replicas available:
- Aurora - up to 15. Supports auto-failover and self-healing
- MySQL - up to 5. Does not support auto-failover
To test failover, it’s possible to simulate failure by rebooting an instance.
Aurora is designed to handle the loss of up to 2 copies of data without affecting write availability, and up to 3 copies without affecting read availability.
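Those tolerances can be expressed as a short check. This is only a sketch built from the numbers in these notes (6 copies, 2 per AZ); the function name is made up.

```python
TOTAL_COPIES = 6
MAX_LOST_FOR_WRITES = 2  # writes survive the loss of up to 2 copies
MAX_LOST_FOR_READS = 3   # reads survive the loss of up to 3 copies


def aurora_availability(copies_lost):
    """Report whether reads and writes remain available after losing some copies."""
    if not 0 <= copies_lost <= TOTAL_COPIES:
        raise ValueError("copies_lost must be between 0 and 6")
    return {
        "writes": copies_lost <= MAX_LOST_FOR_WRITES,
        "reads": copies_lost <= MAX_LOST_FOR_READS,
    }


one_az_down = aurora_availability(2)       # a whole AZ holds 2 copies
az_plus_one_down = aurora_availability(3)  # an AZ plus one more copy
print(one_az_down, az_plus_one_down)
```

With 2 copies per AZ, losing an entire AZ still leaves both reads and writes available, while losing an AZ plus one further copy leaves the data readable but not writable.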
- A Key-Value store
- Extremely important for the Developer exam. Only an overview is needed for the Solutions Architect exam.
- Always on SSD - magnetic disks are not supported
- Spread across 3 distinct geographic data centers (meaning 3 different facilities; AWS does not specify whether this means different AZs, countries, etc.)
DynamoDB consists of:
- Tables (comparable to collections in a document store)
- Items (comparable to documents or rows), which contain attributes, i.e. key-value pairs (comparable to fields)
DynamoDB tables support a primary key which can be either:
- a single-attribute partition key, e.g. UserId
- a composite partition-sort key, e.g. UserId (partition), Timestamp (sort)
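To make the composite key concrete, here is a toy in-memory table keyed by (UserId, Timestamp). It is not the DynamoDB API; the class `ToyTable` and its methods are hypothetical and only show how a partition key groups items and a sort key orders them.

```python
class ToyTable:
    """In-memory stand-in for a table with a composite (partition, sort) primary key."""

    def __init__(self):
        self._items = {}  # (partition key, sort key) -> attributes

    def put_item(self, user_id, timestamp, attributes):
        # Writing the same primary key again replaces the existing item
        self._items[(user_id, timestamp)] = attributes

    def query(self, user_id):
        """Return all items for one partition key, ordered by sort key."""
        keys = sorted(k for k in self._items if k[0] == user_id)
        return [(k[1], self._items[k]) for k in keys]


table = ToyTable()
table.put_item("alice", 1700000200, {"action": "logout"})
table.put_item("alice", 1700000100, {"action": "login"})
table.put_item("bob", 1700000150, {"action": "login"})
events = table.query("alice")
print(events)
```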
Provisioned capacity / provisioned throughput capacity controls the read and write capacity on a table. When you create a table, you specify how much provisioned throughput capacity you want to reserve for reads and writes (RCU - Read Capacity Units, WCU - Write Capacity Units). DynamoDB will then reserve the necessary resources to meet your throughput needs while ensuring consistent, low-latency performance. Your provisioned throughput capacity can be changed after table creation, and can be increased or decreased as necessary.
DynamoDB cross-region replication is suitable for:
- Efficient disaster recovery - by replicating tables in multiple data centers, you can switch over to using DynamoDB tables from another region in case a data center failure occurs.
- Faster reads
- Easier traffic management
- Easy regional migration
- Live data migration
Reads and Writes
DynamoDB supports eventually consistent reads (usually consistent within 1 second) and strongly consistent reads.
Write capacity units are billed in blocks of 10
1 write capacity unit = 1 write per second
Read capacity units are billed in blocks of 50
1 read capacity unit represents:
- 1 strongly consistent read per second
- 2 eventually consistent reads per second
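The capacity unit definitions above lend themselves to a quick calculation. This sketch adds item-size limits that are not stated in these notes: it assumes one read unit covers an item of up to 4KB and one write unit covers an item of up to 1KB. The function names are made up.

```python
import math


def read_capacity_units(reads_per_second, item_size_kb, strongly_consistent=True):
    """Estimate the RCUs needed for a given read rate and item size."""
    # Assumption: each read of an item consumes one unit per 4KB (rounded up)
    units_per_read = math.ceil(item_size_kb / 4)
    total = reads_per_second * units_per_read
    if not strongly_consistent:
        # One unit covers 2 eventually consistent reads per second
        total = math.ceil(total / 2)
    return total


def write_capacity_units(writes_per_second, item_size_kb):
    """Estimate the WCUs needed for a given write rate and item size."""
    # Assumption: each write consumes one unit per 1KB (rounded up)
    return writes_per_second * math.ceil(item_size_kb)


print(read_capacity_units(100, 3))                             # 100
print(read_capacity_units(100, 3, strongly_consistent=False))  # 50
print(read_capacity_units(10, 6))                              # 20
print(write_capacity_units(10, 1.5))                           # 20
```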
DynamoDB can be expensive for writes, but is cheap for reads.
Data Warehousing via Redshift
- OLTP (Online Transaction Processing) via RDS
- OLAP (Online Analytical Processing) via Redshift
Columnar storage - data is stored sequentially, and is organised by column
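A small sketch of what "organised by column" means in practice, using made-up order data. It only illustrates the layout idea, not how Redshift stores blocks internally.

```python
rows = [
    {"order_id": 1, "region": "EU", "amount": 120},
    {"order_id": 2, "region": "US", "amount": 80},
    {"order_id": 3, "region": "EU", "amount": 200},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)
# An aggregate over one column only needs to read that column's values
total = sum(columns["amount"])
print(columns)
print(total)  # 400
```

Because values of the same column sit next to each other, an aggregate query touches far less data than it would in a row-oriented layout, and similar values tend to compress well.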
Redshift has a 1MB block size
Single node (160GB)
- Leader node - manages client connections and receives queries
- Compute nodes - for data storage and queries. Can have up to 128 of these
Redshift supports advanced compression
- Doesn’t require indexes or materialised views
- Analyses your data and picks the best compression method for it
Supports automatically distributing your query among nodes for massively parallel processing.
Charged for compute node hours; you’re only charged for compute nodes, not for leader nodes.
Billed at 1 unit per node per hour over the billing period, e.g. a 3-node cluster running for a 30-day month incurs 3 x 24 x 30 = 2,160 node hours.
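The billing rule can be sketched as simple arithmetic. The price used below is a placeholder, not a real Redshift rate, and the function names are hypothetical.

```python
def billable_node_hours(compute_nodes, hours):
    """Node hours billed: compute nodes only, the leader node is not counted."""
    return compute_nodes * hours


def estimated_cost(compute_nodes, hours, price_per_node_hour):
    """Estimated compute charge for the billing period."""
    return billable_node_hours(compute_nodes, hours) * price_per_node_hour


# A cluster with 1 leader node and 4 compute nodes, running for 720 hours
node_hours = billable_node_hours(compute_nodes=4, hours=720)
cost = estimated_cost(compute_nodes=4, hours=720, price_per_node_hour=0.25)  # placeholder price
print(node_hours, cost)  # 2880 720.0
```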
Data is encrypted in transit using SSL, and encrypted at rest using AES-256
Is not designed for Multi-AZ
Designed for management reports / BI (Business Intelligence)
In the event of an AZ outage, data snapshots can be restored to a different AZ
ElastiCache can be used to increase performance on read-heavy databases.
Only a shallow level of knowledge of Redis vs Memcached is needed for the AWS Solutions Architect Associate exam. The Professional exam covers deeper scenarios.
- Redis - similar to Memcached, but generally preferable; Redis was developed using lessons learned from Memcached
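To show why a cache helps a read-heavy database, here is a minimal sketch of one common pattern, cache-aside reads. The pattern is my choice for the example rather than something the course prescribes; a plain dict stands in for a Redis or Memcached cluster, and `fake_db` is made-up data.

```python
class CachedReader:
    """Cache-aside reads: check the cache first, fall back to the database on a miss."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db
        self._cache = {}   # stands in for a Redis or Memcached cluster
        self.db_reads = 0  # counts how often the database is actually hit

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        # Cache miss: read from the database and remember the result
        self.db_reads += 1
        value = self._load_from_db(key)
        self._cache[key] = value
        return value


fake_db = {"user:1": "Alice"}  # hypothetical data
reader = CachedReader(fake_db.get)
first = reader.get("user:1")
second = reader.get("user:1")
print(first, second, reader.db_reads)  # Alice Alice 1
```

The second read is served from the cache, so the database is only queried once. A real setup would also need expiry or invalidation so that cached values do not go stale.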