Using Amazon Aurora as a datasource with high availability in Connext: Yenlo’s WSO2 Cloud solution
Within Yenlo’s WSO2 Cloud solution Connext, we use the Amazon Aurora service for the datasources used by the WSO2 products. Amazon Aurora offers a database with high availability endpoints. Out of the box it offers 2 named endpoints for access to the database, a cluster endpoint offering read/write access and a read-only endpoint.
This means you don’t have to configure failover in your application as you can use the cluster endpoint which automatically always resolves to the database instance offering write access.
To configure database access within the WSO2 products the JDBC mysql driver is used together with connection pooling. The connection pool will make sure that when the application needs a database connection it is available and in working condition.
<datasource> <name>WSO2_USERSTORE_DB</name> <jndiConfig> <name>jdbc/WSO2UM_DB</name> </jndiConfig> <definition type="RDBMS"> <configuration> <url>jdbc:mysql://rds-cluster.endpoint:3306/database?autoReconnect=true</url> <username>mysqluser</username> <password>mysqlpassword</password> <driverClassName>com.mysql.jdbc.Driver</driverClassName> <maxActive>30</maxActive> <maxWait>60000</maxWait> <testOnBorrow>true</testOnBorrow> <validationQuery>SELECT 1</validationQuery> <validationInterval>30000</validationInterval> <defaultAutoCommit>false</defaultAutoCommit> </configuration> </definition> </datasource>
The configuration above serves as an example for a JDBC userstore in a WSO2 product, it serves purely as an example and there should not directly be used on production systems as it for instance a plain text password.
What we see here is that it is configured that the connection pool will check the connection before turning it over to the service asking for it by using the testOnBorrow setting. When testing the connection, it will execute the validationQuery which is by following the best practice “SELECT 1” and will only work if the connection is alive.
When Amazon Aurora is forced to execute a failover when a node fails it will promote a reader node to be the master node and will remove the read_only setting. The former master node will be restarted in read_only mode, if that succeeds it will be back up but not usable for writing operations anymore and will act as a node for read access.
As the connection pool is not aware of the underlying Amazon Aurora solution it will not be notified of a failover. So, when the failover happens quickly and the node is restarted successfully in read_only mode, the connection pool will have connections to a read node which it will validate as working connections. The validation will execute correctly, however when the application tries to use these connections it will evidently run into Database errors claiming the database to be in read_only mode.
The connection pool will not be able to recover these connections and so the product has to be restarted to recover this.
Using maxAge parameter
It is possible to set a Max age for a connection in the connection pool. This will evict connections that are older than the setting from the pool when they are released by the application. This will make sure that the application will recover from a Amazon Aurora failover when the faulty connections in the pool have timed out.
Using an enhanced validationQuery
Amazon Aurora offers a specific parameter readable in a query to denote a node is read_only, namely “innodb_read_only”. In other failover solutions this parameter might be called “read_only”. Another solution to the failover problem is to set a validation query that will fail if run on a read_only host. By making use of the above properties in a case statement which will evaluate to an exception when run on a read_only host, the connection pool will mark the connection as faulty and discard it.
A way to do this is to let the query return more than 1 row which is not expected by the validationQuery. By using for example, a select from query on one of the default MySQL tables which are known to contain more than 1 row it will do just that.
Select case when @@read_only + @@innodb_read_only = 0 then 1 else (select table_name from information_schema.tables) end as `1`
Use the MariaDB JDBC driver
One final solution that can be applied is using the MariaDB JDBC driver, if possible under project conditions. This driver supports Amazon Aurora and is able to recognize the topology of the Database cluster. It has also built in failover support.
For more information see this blog about MariaDB JDBC driver on Amazon.
Want to learn more around our WSO2 Cloud offering? Have a look at our 24/7 Managed WSO2 Cloud solution called Connext, the most flexible and affordable cloud solution for WSO2 API management, WSO2 Enterprise Service Bus and WSO2 Identity and Access Management. All of them making use of this Amazon Aurora services to deliver high-availability and scalability out-of-the-box.