AWS Relational Database Service (RDS) sandbox
Scope
AZ scope: provide an RDS subnet group, which is a collection of subnets across multiple AZs
multi-az
cross-region read replicas
cross-region backup snapshots to help disaster recovery
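The subnet group note under Scope can be illustrated with a small sketch. It builds the arguments for a boto3 `create_db_subnet_group` call and checks that the subnets span at least two AZs. The helper name, the subnet IDs, and the group name are hypothetical, and no AWS call is made here.

```python
from collections import defaultdict


def build_subnet_group_request(name, description, subnets):
    """Build keyword arguments for an RDS create_db_subnet_group call.

    `subnets` is a list of (subnet_id, availability_zone) pairs.
    This helper is illustrative only; it is not part of any AWS SDK.
    """
    by_az = defaultdict(list)
    for subnet_id, az in subnets:
        by_az[az].append(subnet_id)

    # A DB subnet group is expected to cover at least two AZs.
    if len(by_az) < 2:
        raise ValueError("subnets must cover at least two Availability Zones")

    return {
        "DBSubnetGroupName": name,
        "DBSubnetGroupDescription": description,
        "SubnetIds": [subnet_id for subnet_id, _ in subnets],
    }


request = build_subnet_group_request(
    "sandbox-subnets",  # hypothetical group name
    "Subnets for the RDS sandbox",
    [("subnet-aaa", "us-east-1a"), ("subnet-bbb", "us-east-1b")],
)
# With boto3 this could be passed as: rds_client.create_db_subnet_group(**request)
```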
Engines
Questions
- RDS MySQL vs Aurora MySQL?
- Option groups?
Deployment Options
Which deployment option to choose? https://aws.amazon.com/blogs/database/choose-the-right-amazon-rds-deployment-option-single-az-instance-multi-az-instance-or-multi-az-database-cluster/
Single AZ
Single instance running in a single AZ. Best suited for development workloads.
A read replica can be configured in the same or a different region. This read replica can be manually promoted if the primary instance fails.
Typical RPO is about 5 minutes, and RTO could be several hours depending on the nature of the disaster.
Multi AZ Instance
This is an active-passive setup.
Do note that in case of a multi-AZ deployment, both instances are NOT in use at the same time. The secondary instance is merely kept up-to-date by synchronous replication, and is promoted to be the primary instance if the primary fails.
RPO is ZERO due to the synchronous replication.
RTO could be longer due to a variety of factors, such as:
copying transaction logs
lazy loading from S3 to EBS
instance class’s I/O throughput
rolling back uncommitted transactions
rolling forward in-memory committed transactions
During failover, active queries or transactions are cancelled, so it is recommended to maintain some mechanism to monitor query cancellations.
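One possible way to monitor cancellations is to wrap database calls in a small runner that counts them and retries. This is a minimal sketch, not a driver feature: `QueryCancelled` is a hypothetical stand-in for whatever error the database driver raises, and retrying is only safe when the operation is idempotent.

```python
import time


class QueryCancelled(Exception):
    """Hypothetical stand-in for the driver error raised when a query is cancelled."""


class FailoverAwareRunner:
    """Runs an operation, counting and retrying cancelled queries."""

    def __init__(self, max_attempts=3, backoff_seconds=1.0, sleep=time.sleep):
        self.max_attempts = max_attempts
        self.backoff_seconds = backoff_seconds
        self.sleep = sleep
        self.cancellations = 0  # in a real setup this would feed a metric or alarm

    def run(self, operation):
        for attempt in range(1, self.max_attempts + 1):
            try:
                return operation()
            except QueryCancelled:
                self.cancellations += 1
                if attempt == self.max_attempts:
                    raise
                # Linear backoff gives the failover time to complete.
                self.sleep(self.backoff_seconds * attempt)
```

The `cancellations` counter is the part that addresses monitoring; the retry loop is an optional convenience.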
Replication is one-way only (primary to secondary), so this cannot be used for an active/active setup.
For multi-AZ deployments, there is a possibility of an outage during the time it takes for the instance to fail over (~ 60 seconds).
OS maintenance on multi-AZ deployments happens in steps: the secondary is updated first, the primary then fails over to the secondary, and the update continues on the old primary.
A DB engine version upgrade, however, happens on both primary and secondary at the same time, causing an outage during this operation. The upgrade on a read replica happens independently of the source instance.
Falling back to primary is difficult.
Best suited for business critical applications that need high availability and low RPO/RTO.
Due to the nature of the setup, however, it is not the best scaling option for read-heavy scenarios. Better to use it in conjunction with read replicas or use a Multi AZ DB cluster instead.
Avoid caching DNS data of the DB instances for long; set a TTL of less than 30 seconds. During a failover, a higher TTL might result in the application still trying to use the failed instance.
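The TTL advice can be sketched as an application-side resolver cache that refuses TTLs of 30 seconds or more. This is an illustration under assumptions: the class name is hypothetical, and many runtimes and drivers have their own DNS caching settings that would need to be configured instead.

```python
import socket
import time


class ShortTtlResolver:
    """Caches hostname lookups for a short, bounded time."""

    def __init__(self, ttl_seconds=5.0, resolve=socket.gethostbyname, clock=time.monotonic):
        if ttl_seconds >= 30:
            raise ValueError("keep the DNS TTL under 30 seconds")
        self._ttl = ttl_seconds
        self._resolve = resolve
        self._clock = clock
        self._cache = {}  # hostname -> (address, expires_at)

    def lookup(self, hostname):
        now = self._clock()
        cached = self._cache.get(hostname)
        if cached is not None and cached[1] > now:
            return cached[0]
        # Entry is missing or expired: resolve again so a failover is picked up.
        address = self._resolve(hostname)
        self._cache[hostname] = (address, now + self._ttl)
        return address
```

The resolver and clock are injectable so the behaviour can be checked without a network or real waiting.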
Multi AZ DB Cluster
Backups
Automated and manual snapshots.
Automated snapshots can be copied cross-region in the same account but not cross-account.
Manual snapshots can be copied cross-region and cross-account, provided they are not using option groups with persistent or permanent options such as TDE or time zone.
Snapshots encrypted with the default encryption key also cannot be shared; they must be re-encrypted with a custom key, which is then shared along with the snapshot.
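The re-encrypt-then-share flow could look roughly like the sketch below. It assumes `rds` is a boto3 RDS client (or any object with the same methods); the function name and all identifiers passed to it are hypothetical. Sharing the KMS key itself with the target account is a separate step that is not shown.

```python
def share_manual_snapshot(rds, source_snapshot_id, target_snapshot_id,
                          custom_kms_key_id, target_account_id):
    """Copy a snapshot under a custom KMS key, then share the copy.

    `rds` is assumed to behave like a boto3 RDS client.
    """
    # Copying with KmsKeyId re-encrypts the snapshot under the custom key.
    rds.copy_db_snapshot(
        SourceDBSnapshotIdentifier=source_snapshot_id,
        TargetDBSnapshotIdentifier=target_snapshot_id,
        KmsKeyId=custom_kms_key_id,
    )

    # Wait until the copy is available before changing its attributes.
    rds.get_waiter("db_snapshot_available").wait(
        DBSnapshotIdentifier=target_snapshot_id
    )

    # Grant the target account permission to restore from the copy.
    rds.modify_db_snapshot_attribute(
        DBSnapshotIdentifier=target_snapshot_id,
        AttributeName="restore",
        ValuesToAdd=[target_account_id],
    )
```

A call might look like `share_manual_snapshot(boto3.client("rds"), "snap-src", "snap-shared", "alias/my-key", "123456789012")`, with every argument being a placeholder.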
Change Data Capture (CDC)
Only for SQL Server (the MS doc reference below suggests this is the SQL Server CDC feature, not MySQL).
Captures metadata about all changes in a table. Useful for ongoing replication during migration.
Ref MS doc
RDS Proxy
When to use RDS Proxy?
Too many connections to the database.
Offload the logic to prepare a database connection from your Lambda. Make use of IAM permissions instead to provide access to the database.
SaaS or eCommerce applications that prioritize low latency often need a ready pool of database connections to work with.
It makes failover transparent for applications. Since it bypasses DNS (how? why?), it routes the preserved connections to the new database endpoint, thereby reducing failover times for Aurora by up to 66%.
RDS Proxy can help avoid out-of-memory errors on databases using smaller instance classes like T2 and T3.
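The IAM-based access mentioned above might be sketched as follows for a Lambda connecting through the proxy. It assumes `rds` is a boto3 RDS client, whose `generate_db_auth_token` returns a short-lived token used in place of a password. The function name, the endpoint, and the keys of the returned dictionary are hypothetical; the actual connection arguments depend on the database driver in use.

```python
def proxy_connection_settings(rds, proxy_endpoint, port, db_user, region):
    """Build connection settings that use an IAM auth token as the password.

    `rds` is assumed to behave like a boto3 RDS client.
    """
    token = rds.generate_db_auth_token(
        DBHostname=proxy_endpoint,
        Port=port,
        DBUsername=db_user,
        Region=region,
    )
    return {
        "host": proxy_endpoint,  # connect to the proxy, not the DB instance
        "port": port,
        "user": db_user,
        "password": token,       # short-lived token instead of a stored secret
        "require_tls": True,     # IAM authentication is expected to run over TLS
    }
```

The Lambda execution role would also need permission to connect as that database user, which is configured in IAM rather than in code.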