AWS Relational Database Service (RDS)  sandbox 

Scope

AZ scope - provide an RDS DB subnet group, which is a collection of subnets across multiple AZs
multi-az
cross-region read replicas
cross-region backup snapshots to help disaster recovery

Engines

Questions

  1. RDS MySQL vs Aurora MySQL?
  2. Option groups?

Deployment Options

Which deployment option to choose? https://aws.amazon.com/blogs/database/choose-the-right-amazon-rds-deployment-option-single-az-instance-multi-az-instance-or-multi-az-database-cluster/

Single AZ

Single instance running in a single AZ. Best suited for development workloads.

A read replica can be configured in the same or a different region. This read replica can be manually promoted if the primary instance fails.
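The manual promotion step can be sketched in Python as below. This is a minimal sketch, assuming `rds` is a boto3 RDS client created in the replica's region (for example `boto3.client("rds", region_name=...)`); the function name `promote_replica` and the identifier `replica-1` are made up for illustration. Promotion is one-way: the replica becomes a standalone instance and stops replicating from the source.

```python
def promote_replica(rds, replica_id, backup_retention_days=7):
    """Promote a read replica to a standalone DB instance.

    `rds` is assumed to be a boto3 RDS client for the region that
    hosts the replica. A non-zero backup retention period turns on
    automated backups for the promoted instance.
    """
    response = rds.promote_read_replica(
        DBInstanceIdentifier=replica_id,
        BackupRetentionPeriod=backup_retention_days,
    )
    # The call returns immediately; the instance keeps changing state
    # in the background until the promotion completes.
    return response["DBInstance"]["DBInstanceStatus"]
```

After the promotion finishes, the application still has to be pointed at the promoted instance's endpoint, which is part of why RTO for this setup is hard to bound.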

Typical RPO is ~5 minutes, and RTO could be several hours depending on the nature of the disaster.

Multi AZ Instance

This is an active-passive setup.

Do note that in case of a multi-AZ deployment, both instances are NOT in use at the same time. The secondary instance is merely kept up-to-date by synchronous replication, and is promoted to be the primary instance if the primary fails.

RPO is ZERO due to the synchronous replication.
RTO could be longer due to a variety of factors, such as:
copying transaction logs
lazy loading from S3 to EBS
the instance class’s I/O throughput
rolling back uncommitted transactions
rolling forward in-memory committed transactions

During failover, active queries or transactions are cancelled, so it is recommended to maintain some mechanism to monitor query cancellations.
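One possible shape for such a mechanism is a small wrapper that retries a cancelled query and counts how often that happens. This is only a sketch: `QueryCancelled` is a hypothetical stand-in for whatever error the database driver in use raises when a query is cut off, and `CancellationMonitor` is an invented name, not an AWS or driver API.

```python
import time


class QueryCancelled(Exception):
    """Hypothetical stand-in for the driver error seen when a failover cancels a query."""


class CancellationMonitor:
    """Retries cancelled queries and keeps a running count of cancellations."""

    def __init__(self):
        self.cancelled = 0

    def run(self, query_fn, retries=3, delay_seconds=0.0):
        """Call query_fn, retrying up to `retries` times if it is cancelled."""
        for attempt in range(retries + 1):
            try:
                return query_fn()
            except QueryCancelled:
                self.cancelled += 1  # export this counter to a metrics system
                if attempt == retries:
                    raise  # give up and surface the error to the caller
                time.sleep(delay_seconds)
```

In practice the counter would be published as a metric and alarmed on, and retries should only wrap statements that are safe to run again.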

Replication runs in one direction only, from the primary to the secondary, so this cannot be used for an active/active setup.

For multi-AZ deployments, there is a possibility of an outage during the time it takes for the instance to fail over (~60 seconds).

OS maintenance on multi-AZ deployments happens in steps: the secondary is updated first, then the primary fails over to the secondary, and then the update continues on the old primary.

A DB engine version upgrade, however, happens on both the primary and the secondary at the same time, causing an outage during the operation. An upgrade on a read replica happens independently of the source instance.

Failing back to the original primary is difficult.

Best suited for business critical applications that need high availability and low RPO/RTO.

Due to the nature of the setup, however, it is not the best scaling option for high-read scenarios. It is better to use it in conjunction with read replicas, or to use a Multi AZ DB cluster instead.

Avoid caching DNS data for the DB instances for long; set the TTL to less than 30 seconds. During a failover, a higher TTL might result in the application still trying to use the failed instance.
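If the application keeps its own DNS cache, the TTL rule can be sketched as below. This is an illustrative sketch, not an AWS-provided helper: `ShortTtlResolver` is an invented name, and it only controls caching inside the application. Caches in the OS resolver or in a runtime such as the JVM have their own settings.

```python
import socket
import time


class ShortTtlResolver:
    """Caches hostname lookups for a short, fixed TTL."""

    def __init__(self, ttl_seconds=5.0, resolve=socket.gethostbyname,
                 clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._resolve = resolve  # injectable so it can be faked in tests
        self._clock = clock
        self._cache = {}  # hostname -> (address, expires_at)

    def lookup(self, hostname):
        """Return a cached address if still fresh, otherwise resolve again."""
        now = self._clock()
        cached = self._cache.get(hostname)
        if cached and cached[1] > now:
            return cached[0]
        address = self._resolve(hostname)
        self._cache[hostname] = (address, now + self.ttl_seconds)
        return address
```

With a TTL well under 30 seconds, a reconnect after failover resolves the instance endpoint again and picks up the address of the new primary.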

Multi AZ DB Cluster

Backups

Automated and manual snapshots.
Automated snapshots can be copied cross-region in the same account but not cross-account.
Manual snapshots can be copied cross-region and cross-account, provided they are not using option groups with persistent or permanent options like TDE, timezone etc.
Snapshots encrypted with the default encryption key also cannot be shared; they need to be re-encrypted with a custom key, which is then shared along with the snapshot.
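The re-encrypt-then-share flow for a manual snapshot could look roughly like this. It is a sketch under assumptions: `rds` is a boto3 RDS client, the function name and all identifiers are made up, and granting the target account access to the custom KMS key (through the key policy) is a separate step that is not shown.

```python
def share_encrypted_snapshot(rds, source_snapshot_id, target_snapshot_id,
                             kms_key_id, target_account_id):
    """Copy a snapshot under a custom KMS key, then share the copy.

    `rds` is assumed to be a boto3 RDS client. Sharing the KMS key
    itself with the target account is not handled here.
    """
    # Step 1: copying with KmsKeyId re-encrypts the snapshot under the custom key.
    rds.copy_db_snapshot(
        SourceDBSnapshotIdentifier=source_snapshot_id,
        TargetDBSnapshotIdentifier=target_snapshot_id,
        KmsKeyId=kms_key_id,
    )
    # Step 2: wait until the copy is available before changing its attributes.
    rds.get_waiter("db_snapshot_available").wait(
        DBSnapshotIdentifier=target_snapshot_id
    )
    # Step 3: add the target account to the snapshot's restore attribute.
    rds.modify_db_snapshot_attribute(
        DBSnapshotIdentifier=target_snapshot_id,
        AttributeName="restore",
        ValuesToAdd=[target_account_id],
    )
```

The target account can then copy or restore the shared snapshot, as long as it is also allowed to use the key.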

Change Data Capture (CDC)

Only for MySQL.
Captures metadata about all changes in a table. Use for on-going replication during migration.
Ref MS doc

RDS Proxy

Supports Aurora, RDS

When to use RDS Proxy?

Too many connections to the database.
Offload the logic to prepare a database connection from your Lambda. Make use of IAM permissions instead to provide access to the database.
SaaS or eCommerce applications that prioritize low latency often need a ready pool of database connections to work with.
It makes failover transparent for applications. Since it bypasses DNS (how?, why?), it routes the preserved connections to the new database endpoint, thereby reducing failover times for Aurora by 66%!
RDS Proxy can help avoid out-of-memory errors on databases using smaller instance classes like T2, T3.
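
The IAM-based access mentioned above can be sketched as follows, assuming `rds` is a boto3 RDS client and the proxy has IAM authentication enabled. The function name, the endpoint and the user name are hypothetical; the returned dictionary is just a convenient shape for illustration, not the parameter format of any particular driver.

```python
def proxy_connection_params(rds, proxy_endpoint, db_user, port=3306):
    """Build connection parameters that use an IAM token as the password.

    `rds` is assumed to be a boto3 RDS client. The token is short-lived,
    so it should be generated right before opening a connection.
    """
    token = rds.generate_db_auth_token(
        DBHostname=proxy_endpoint,
        Port=port,
        DBUsername=db_user,
    )
    # Pass these to the database driver; the connection to the proxy
    # is expected to use TLS when IAM authentication is in use.
    return {
        "host": proxy_endpoint,
        "port": port,
        "user": db_user,
        "password": token,
    }
```

This keeps database passwords out of the Lambda code: the function's IAM role decides whether it may connect, and the proxy hands out pooled connections.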