Bayou Paper Review
Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System
Buzzwords
- weakly replicated storage system for bad network connectivity
- eventual consistency
- dependency checks
- conflict resolution
- rollback and redo in serialization order
- tentative and committed
- session guarantees
- anti-entropy, epidemic algorithms
- logical clocks for timestamp
- log instead of data
Summary
The operations in the disconnected or weakly connected network are often
valuable in the real world. Many real-world applications require the
operations of immediate reads and writes of a local replica even
disconnected just as Git
or Dropbox
. So this paper proposes the idea
of eventual consistency
which relays on some novel mechanisms in this
paper to relay on the local database and conflict resolution to merge
the logs.
More precisely, Bayou is designed for mobile network, and each server
has its own database and logs for reads and writes. So the clients’ reads
and writes just access to any local server’s database which may have
some uncommitted logs called tentative logs
. Two different clients may
have logs in different orders which may produce different outputs. So,
conflicts should be solved between different devices. Firstly, Dayou
uses global ordering to order the logs, which relays on
Lamport logical clock
with undo and redo mechanisms
to acquire
casual consistency. For example, when two server’s ordering is
different, one of two needs to rollback the previously executed writes
and redo them in a global serialization order based on the timestamps.
What’s more, when the orders are same for each server, Bayou use user
defined dependency check
and conflict resolution
to solve the update
conflicts for different conflict purposes. Moreover, Bayou uses a
manually selected primary
to take responsibility the committing
updates for solving the problems that a long-delayed update in a disconnected device may rollback lots of logs and committed logs needs
all of the services to establish the connections.
On the implementation side, Bayou uses logs with version vectors instead
of real data which we can rely on the prefix property to simplify
syncing. In addition, Bayou uses tuple stores similar to the database in
6.824 lab3
which just simplify the logs compaction.
strength of the paper
The idea of
eventual consistency
is currently state-at-the-art algorithm for solving the disconnected network.Bayou proposes a general conflict resolution algorithm based on
dependency check
andconflict resolution
, which is more general than other algorithms at that time.Bayou uses logs instead of real data to simplify the syncing and rely on the
version vector
which is an interesting idea to me. It’s more lightweight and simple.
Weakness of the paper
The implementation of
undo and re-execution logs
isn’t discussed in the paper which is an interesting point towards me.The
primary commit protocol
also needs more discussions otherwise it may be unclear for those who have no background of this algorithm.The conference scheduler example is not user-friendly, so some improvements may be required.
The
access control
part maybe not too important in the paper.
Recommandation
We can use the timestamps for each tuple, so when there is a
long-delayed updates, you can bypass the rollback by just appending it
if the timestamps of all elements in the updates are larger than the
one in the database. Which may occur frequently in google calendar
Apps
in my view of point.