- What is Replication?
- Replication Concept
- Features of Replication
- Restrictions of Replication
- Methods of Replication in Altibase HDB
- Replication conflict
- Why Replication Conflict occurs
- How to resolve conflict
- Replication type
- Lazy Replication
- Eager Replication
- ORACLE FailOver vs ALTIBASE HDB FailOver
Replication is a technique for sending information about the changes to the contents of a single database over a network to one or more other databases.
Replication enables data in one ALTIBASE instance to be replicated to one or more other ALTIBASE instances.
The basic idea behind replication in Altibase is the use of the log replay method.
To support the replication feature of Altibase, a local server transfers transaction logs to a remote server when the logs change.
The remote server "replays" the received logs to its database, that is, it implements the changes that have been recorded in the transaction logs.
The steps below show how ALTIBASE replicates changes when a DML statement is issued.
(1) A client issues a SQL statement, for example, "INSERT INTO T1 .... ".
(2) A service thread executes the SQL statement. Following the WAL (Write-Ahead Logging) protocol, the service thread first writes the changes to the redo log files.
(3) The Replication Sender thread watches the redo logs. As soon as the service thread writes its change log, the Replication Sender thread reads the log and translates it into an XLOG.
(4) The Replication Sender sends the XLOG to the Replication Receiver in the remote ALTIBASE instance over TCP/IP.
(5) The Replication Receiver receives the XLOG and applies it to its local ALTIBASE instance.
Like the service thread, the Replication Receiver writes its change log to the redo log before changing the buffer pool or memory tablespace.
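The five steps above can be sketched as a small in-memory simulation. This is only an illustration of the log replay idea, not ALTIBASE code: the `RedoRecord` and `XLog` layouts, the `Node` and `Sender` classes, and the use of a queue in place of the TCP/IP connection are all assumptions made for the example.

```python
from dataclasses import dataclass
from queue import Queue
from typing import Optional


@dataclass(frozen=True)
class RedoRecord:
    """Change record written by a thread (hypothetical layout)."""
    lsn: int
    table: str
    op: str                  # "INSERT", "UPDATE" or "DELETE"
    pk: int
    values: Optional[dict]


@dataclass(frozen=True)
class XLog:
    """Replayable logical form of a redo record (hypothetical layout)."""
    lsn: int
    table: str
    op: str
    pk: int
    values: Optional[dict]


class Node:
    """One database instance: a redo log plus in-memory tables."""

    def __init__(self, name):
        self.name = name
        self.redo_log = []
        self.tables = {}

    def _log_then_apply(self, table, op, pk, values):
        # WAL: append to the redo log before touching the data.
        record = RedoRecord(len(self.redo_log) + 1, table, op, pk, values)
        self.redo_log.append(record)
        rows = self.tables.setdefault(table, {})
        if op == "DELETE":
            rows.pop(pk, None)
        else:
            rows[pk] = values
        return record

    def execute(self, table, op, pk, values=None):
        """Step 2: a service thread executes a client statement."""
        return self._log_then_apply(table, op, pk, values)

    def apply_xlog(self, xlog):
        """Step 5: the receiver replays an XLOG, logging first as well."""
        return self._log_then_apply(xlog.table, xlog.op, xlog.pk, xlog.values)


class Sender:
    """Steps 3 and 4: read new redo records, translate, and send them."""

    def __init__(self, source, network):
        self.source = source
        self.network = network
        self.sent_upto = 0   # number of redo records already sent

    def poll(self):
        new_records = self.source.redo_log[self.sent_upto:]
        for rec in new_records:
            self.network.put(XLog(rec.lsn, rec.table, rec.op, rec.pk, rec.values))
        self.sent_upto += len(new_records)
        return len(new_records)


def drain(network, target):
    """Receiver loop: apply every XLOG currently waiting on the network."""
    applied = 0
    while not network.empty():
        target.apply_xlog(network.get())
        applied += 1
    return applied


local, remote = Node("local"), Node("remote")
network = Queue()            # stands in for the TCP/IP connection
sender = Sender(local, network)

local.execute("T1", "INSERT", 1, {"c1": "a"})   # step 1: client statements
local.execute("T1", "UPDATE", 1, {"c1": "b"})
local.execute("T1", "INSERT", 2, {"c1": "c"})
sender.poll()
drain(network, remote)
print(remote.tables == local.tables)
```

After the sender has polled and the receiver has drained the queue, both nodes hold the same rows, and the remote node has written its own redo records for the replayed changes.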
- ALTIBASE provides the replication feature in the form of a database object called a "Replication Object." You can administer and monitor a Replication Object by querying it with SQL. ALTIBASE also provides an easy way to configure a Replication Object.
- When you create a Replication Object, you can specify the tables that you want to replicate. In that case, only the specified tables are replicated.
- ALTIBASE replication works over TCP/IP. In other words, you can configure a replication environment between ALTIBASE instances even if they are physically separated.
- A SELECT statement does not cause any replication issue because it does not generate any redo log I/O in ALTIBASE.
- A single Altibase HDB node can accommodate up to 32 replications (replication objects).
- By default, DDL statements (ALTER TABLE, RENAME TABLE, TRUNCATE TABLE, DROP TABLE ...) are not permitted on tables that participate in replication. You can allow DDL statements in a replication environment by changing an ALTIBASE property; however, this is not recommended because data integrity cannot be guaranteed.
- Only table data can be replicated. Other objects such as synonyms, views, procedures, and triggers are not replicated.
- Each pair of replicated tables must have identical structures, including table schema, column sizes, and indexes.
- Altibase replication does not resolve conflicts automatically. This is described below.
- All tables that participate in replication must have a PRIMARY KEY constraint.
- Users cannot update PRIMARY KEY columns.
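Two of the restrictions above (a PRIMARY KEY on every replicated table, and identical structure on both sides) lend themselves to a pre-flight check before replication is configured. The sketch below is a hypothetical helper, not an ALTIBASE utility: `TableDef` and `replication_problems` are names invented for this example, and the table definitions are assumed to have been collected by some other means.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableDef:
    """Hypothetical description of one table on one instance."""
    name: str
    columns: tuple                     # ((column_name, type_with_size), ...)
    primary_key: tuple                 # column names; empty if no PRIMARY KEY
    indexes: frozenset = frozenset()   # each index is a tuple of column names


def replication_problems(local, remote):
    """Return a list of reasons why the pair does not meet the restrictions."""
    problems = []
    for side, table in (("local", local), ("remote", remote)):
        if not table.primary_key:
            problems.append(f"{side} table {table.name} has no PRIMARY KEY")
    if local.columns != remote.columns:
        problems.append("column definitions differ")
    if local.primary_key != remote.primary_key:
        problems.append("PRIMARY KEY definitions differ")
    if local.indexes != remote.indexes:
        problems.append("index definitions differ")
    return problems


orders_a = TableDef("ORDERS", (("id", "INTEGER"), ("memo", "VARCHAR(100)")), ("id",))
orders_b = TableDef("ORDERS", (("id", "INTEGER"), ("memo", "VARCHAR(50)")), ("id",))

print(replication_problems(orders_a, orders_a))
print(replication_problems(orders_a, orders_b))
```

An empty list means the pair passes these particular checks; it says nothing about the other restrictions in the list.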
- Redo log-based Replication takes place in real time.
It works by converting the redo logs in the SM module, which is responsible for data storage and management via concurrency control and recovery, into a replayable logical form and sending them.
Some conversion expense is incurred, but replication performance is good.
Active-Standby topology means that data manipulation occurs in only one ALTIBASE instance.
Note that data manipulation here means changing data, which generates redo log records.
A SELECT statement does not change any data; therefore, you can deploy read-only applications against the standby ALTIBASE instance.
Active-Active topology means that data manipulation occurs in two or more ALTIBASE instances at the same time.
This topology can maximize hardware utilization; however, you must find a way to resolve conflicts.
The term "Replication Conflict" refers to the case where records that have the same primary key, or records subject to constraints, in replication target tables on two or more database servers are changed by their respective local transactions.
A replication conflict can cause a data mismatch between ALTIBASE instances.
It is very important to understand why replication conflicts occur and how to resolve them.
There are two main reasons why replication conflicts occur.
- No global lock
ALTIBASE does not use any global lock in a replication environment. If Transaction A occurs in ALTIBASE #1, the Replication Sender automatically sends the changes of Transaction A to the Receiver in ALTIBASE #2, and that Receiver creates a new transaction to apply the received log. The transaction created by the Replication Receiver is independent of Transaction A; the two do not run in the same context.
- Network latency
Although ALTIBASE replication sends changed data to the other ALTIBASE instances as soon as a transaction is issued, the receiver cannot apply a change as fast as the service thread makes it, because of network latency. If a developer is not aware of this latency, transaction ordering cannot be guaranteed.
Here's an example.
Assume that an application has to process its transactions in order: transaction #1, transaction #2, transaction #3, transaction #4.
The application sends its transactions to both ALTIBASE instances without considering replication conflicts.
The application has two connections, one to each ALTIBASE instance, and the instances are configured in an active-active topology.
Suppose that the application sends its transactions as below.
- Connection to ALTIBASE #1 : transaction #1, transaction #3
- Connection to ALTIBASE #2 : transaction #2, transaction #4
In this case, the result of the transactions may be wrong, because ALTIBASE does not guarantee the order of transactions across instances even if the application sent them in order.
If an application has to preserve the order of its transactions, it must use only one ALTIBASE instance while performing them.
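The effect of losing the transaction order can be shown with a deliberately simplified model. Here each transaction is a function applied to a single value, which is an assumption made for illustration; actual replication ships logged row changes, and the interleaving shown is only one of several that network latency could produce.

```python
def apply_all(start, transactions):
    """Apply (name, change) pairs to a starting value, in the given order."""
    value = start
    for _name, change in transactions:
        value = change(value)
    return value


# Four order-sensitive transactions on one value (hypothetical workload).
tx1 = ("transaction #1", lambda v: v + 50)
tx2 = ("transaction #2", lambda v: v * 2)
tx3 = ("transaction #3", lambda v: v - 30)
tx4 = ("transaction #4", lambda v: v * 3)

# Order the application intended.
intended = apply_all(100, [tx1, tx2, tx3, tx4])

# ALTIBASE #1 runs its local #1 and #3 first, then replays #2 and #4.
instance_1 = apply_all(100, [tx1, tx3, tx2, tx4])

# ALTIBASE #2 runs its local #2 and #4 first, then replays #1 and #3.
instance_2 = apply_all(100, [tx2, tx4, tx1, tx3])

print(intended, instance_1, instance_2)
```

In this model neither instance reaches the intended value, and the two instances also disagree with each other, which is the data mismatch described above.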
The one thing to remember when resolving replication conflicts is to keep the replication receiver and the service thread independent of each other.
If a service thread and the receiver thread handle the same row (say, row #1) at the same time, a replication conflict may occur. If you prevent the replication receiver and the service thread from accessing the same rows in the same ALTIBASE instance, the conflict is avoided.
In other words, the key is to prevent the two from interfering with each other by handling the same data.
Here's a simple example.
Application #1 handles only the ORDER table and Application #2 handles only the CUSTOMER table.
The service thread and the replication receiver then work completely independently, because neither interferes with the other by handling the same data.
Likewise, if Application #1 handles record #1 in the ORDER table and Application #2 handles record #2 in the same table, the replication conflict problem is avoided.
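One way to enforce this separation in an application is to route every write through a function that decides which instance owns the data. The sketch below is an assumption-laden illustration: the routing functions, the odd/even key split, and the audit helper are all hypothetical and are not provided by ALTIBASE.

```python
from collections import defaultdict

INSTANCES = ("ALTIBASE #1", "ALTIBASE #2")

# Table-level ownership, as in the ORDER / CUSTOMER example.
TABLE_OWNER = {"ORDER": "ALTIBASE #1", "CUSTOMER": "ALTIBASE #2"}


def owner_by_table(table):
    """Each table is written through exactly one instance."""
    return TABLE_OWNER[table]


def owner_by_key(primary_key):
    """Row-level ownership: record #1 goes to instance #1, record #2 to #2, and so on."""
    return INSTANCES[(primary_key - 1) % len(INSTANCES)]


def rows_written_by_multiple_instances(writes):
    """Audit helper: return (table, pk) pairs that more than one instance wrote."""
    writers = defaultdict(set)
    for instance, table, pk in writes:
        writers[(table, pk)].add(instance)
    return sorted(key for key, who in writers.items() if len(who) > 1)


# Writes routed by primary key never touch the same row from two instances.
routed = [(owner_by_key(pk), "ORDER", pk) for pk in (1, 2, 3, 4, 1, 2)]
print(rows_written_by_multiple_instances(routed))

# Unrouted writes can hit the same row from both sides and may conflict.
unrouted = [("ALTIBASE #1", "ORDER", 1), ("ALTIBASE #2", "ORDER", 1)]
print(rows_written_by_multiple_instances(unrouted))
```

Whether to split by table or by key depends on the workload; the point is only that the owner of a row is decided in one place, so a service thread and a replication receiver never change the same row on the same instance.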
ALTIBASE provides two replication modes, lazy and eager, which you choose when you create a Replication Object.
Lazy replication is the default replication mode in ALTIBASE. When a transaction is issued,
the service thread does not wait for replication to complete.
There is no synchronization mechanism between the service thread and the replication sender thread.
Lazy replication has both benefits and risks that you should weigh before using it.
Eager replication focuses more on ensuring data consistency than lazy replication does.
When a transaction is issued, the service thread waits for replication to complete.
The replication sender then transfers the redo log to the replication receiver, and the replication receiver notifies the replication sender thread when the transaction completes.
The replication sender thread then wakes the blocked service thread, and the client can issue another transaction.
If the receiver thread fails to apply the redo log, the service thread's transaction fails.
The disadvantage of eager replication is that replication can degrade transaction performance.
Compared with lazy replication, performance may decrease by 50% or more.
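The difference between the two modes comes down to whether the committing thread waits for an acknowledgement. The following is a minimal sketch under stated assumptions: `Primary`, `Replica`, and the in-process queue are stand-ins invented for this example, the receiver is assumed never to fail, and nothing here reflects how ALTIBASE implements its threads internally.

```python
import queue
import threading


class Replica:
    """Stands in for the remote instance and its replication receiver."""

    def __init__(self):
        self.rows = {}

    def apply(self, pk, value):
        self.rows[pk] = value


class Primary:
    """Local instance with one sender thread; mode is 'lazy' or 'eager'."""

    def __init__(self, replica, mode):
        if mode not in ("lazy", "eager"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.rows = {}
        self._replica = replica
        self._outbox = queue.Queue()
        self._sender = threading.Thread(target=self._send_loop, daemon=True)
        self._sender.start()

    def _send_loop(self):
        # Sender: ship each change, then signal the acknowledgement.
        while True:
            pk, value, acked = self._outbox.get()
            self._replica.apply(pk, value)
            acked.set()
            self._outbox.task_done()

    def commit(self, pk, value):
        """Service thread: apply locally, hand the change to the sender."""
        self.rows[pk] = value
        acked = threading.Event()
        self._outbox.put((pk, value, acked))
        if self.mode == "eager":
            acked.wait()     # block until the replica has applied the change
        return acked         # lazy mode returns without waiting

    def wait_until_replicated(self):
        """Block until every queued change has been applied remotely."""
        self._outbox.join()


eager_replica = Replica()
eager = Primary(eager_replica, "eager")
eager.commit(1, "paid")
print(eager_replica.rows)    # already replicated when commit() returns

lazy_replica = Replica()
lazy = Primary(lazy_replica, "lazy")
lazy.commit(1, "paid")       # may return before the replica has the row
lazy.wait_until_replicated()
print(lazy_replica.rows)
```

In eager mode every commit pays for a round trip to the replica, which is where the performance cost described above comes from; in lazy mode a commit can return while the change is still in flight.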