Introduction: The idea of a distributed database system is to have multiple copies of a database spread out on multiple computers, some geographically diverse. The general requirements are to be able to tolerate a "seldom connected" scenario, synchronize over both a LAN and a WAN, have relatively low overhead during the synchronization operation, allow for edits to take place on any copy of the database but resolve conflicts either during or after synchronization, and finally allow for deletion to take place while databases are not connected.
Sync vs. Async: At the highest level, synchronization of databases can be done synchronously or asynchronously. For applications such as transaction processing where workloads may be split among multiple databases, there is a significant need for synchronous replication. These systems typically have very high performance interconnect and are always online and available. This is not the application that tFM is targeted to perform since there are plenty of large, well established solutions in this space and no one needs another.
Asynchronous replication is easier to implement, but has the shortcomings that data can be stale or worse, edited with difference contents in two different locations. This requires that changes be logged and either not committed until all databases are ready, or be able to be undone with some reasonable conflict resolution process. The application scenario that tFM is targeted at is where one copy of the database might be on a laptop and not connected to a network at all. Edits may be performed on the laptop while away from a network, but as soon as it is connected to the network a synchronization process should be started to make the databases consistent. If inconsistencies are found, they can be flagged and resolved at a later time. Further edits to inconsistent records should generate an alert, but may continue as the user will have to decide which edits to keep/overwrite later.
Triggers vs. Change Logs: The two common ways to implement synchronization are triggers that detect when a record is being edited and invoke a synchronous change process, or change logs that keep track of what has changed and allow synchronization to occur later. The choice of asynchronous replication forces change logs to be implemented. These should be maintained until all peer databases are consistent, then they can be deleted. In the current tFM implementation, logs are kept only after databases have been initially synchronization.
Record Deletion: One of the difficult things to implement with lazy database synchronization is record deletion. The first implementation of it wound up with every record that was deleted being put back from the other linked database. The current thinking about how to do deletions is to effectively implement a two-phase delete operation where in the first phase the record contents are deleted, but the RECID and TSTAMP fields stay around
Initial Synchronization Phases:
- Top level database comparison
- Record level comparisons
- Local Record Newer
- Remote Record Newer
- Flag deletions