How I’m making our Calendar collaborative in real-time with Redux and Operational Transformation

Written by Alex Mundiñano, Software Engineer | January 25, 2023

Calendar is the new homepage of Optibus: the central source of truth for which schedules are operational on which days.

Many managers and schedulers have it open at the same time, so it needs to show current information automatically, without a manual refresh. Concurrent edits must be handled, mistakes should be trivial to undo, and everyone must see the same, consistent state.

Real-time collaboration is helpful to any editable UI with concurrent users. It prevents mistakes and gives users confidence. It’s especially valuable for enterprises that have many users and lots of changing data. You can apply these techniques to make your app higher quality and wow your customers.

I’ll explain my high-level plan to add real-time collaboration to our Calendar web app:

Save and share changes in real-time with Redux.
Resolve conflicts with Operational Transformation.
Support undo and redo.

(You don’t need any prior knowledge of Redux or Operational Transformation.)

Let’s begin with Redux…

Redux

Redux plays an important role in our real-time collaboration solution because:

Redux actions can be posted to the server to persist changes. (One endpoint to rule them all!)
Redux actions can be broadcast to other clients and applied through the same reducers, as if everyone had done the action themselves.
Redux actions capture user intent, so they can be more easily compared, composed, and transformed. (This is essential for Operational Transformation, which is covered further down this post.)

Here’s a diagram of the data flow:

The alternatives are cumbersome:

Passing a diff does not capture intent, so conflicting diffs are tricky to resolve.
Passing the whole updated document does not allow for concurrent edits.

Also, actions can be appended to a log, which is useful for auditing, and you could recompute the state from scratch by replaying that log. What we’re doing is eerily similar to event sourcing, a pattern often paired with CQRS (Command Query Responsibility Segregation).
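Computing the state from scratch amounts to folding the reducer over the action log. Here is a minimal sketch of that idea; the `IState` and `IAction` types and the `ADD_NAME` action are toy placeholders, not our real document model:

```typescript
// Toy types for illustration only; the real document and actions are richer.
interface IState {
  names: string[];
}

interface IAction {
  type: 'ADD_NAME';
  name: string;
}

const initialState: IState = { names: [] };

// A pure reducer: same state and action in, same state out.
const reducer = (state: IState, action: IAction): IState => ({
  names: [...state.names, action.name],
});

// Replaying the whole log through the reducer rebuilds the current state.
const replay = (log: IAction[]): IState => log.reduce(reducer, initialState);
```

Because the reducer is pure, replaying the same log always yields the same state, which is what makes the log usable for auditing.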

Persistable Redux actions

Each Redux action is either persistable or ephemeral. Persistable actions apply to the document state and shall be posted to the server, whereas ephemeral actions only exist in the UI.

Here’s an example hierarchy:
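The original diagram is not reproduced here, so the following is only a guess at how such a hierarchy could be typed. Only `ADD_TASK` appears elsewhere in this post; the other action names and the `isPersistable` helper are hypothetical:

```typescript
// Hypothetical action hierarchy; only ADD_TASK is taken from this post.
interface ITask {
  title: string;
}

// Persistable actions change the document and are posted to the server.
type IPersistableAction =
  | { type: 'ADD_TASK'; payload: { id: string; task: ITask } }
  | { type: 'DELETE_TASK'; payload: { id: string } };

// Ephemeral actions only affect local UI state and never leave the browser.
type IEphemeralAction =
  | { type: 'SELECT_TASK'; payload: { id: string } }
  | { type: 'TOGGLE_SIDEBAR' };

type IAction = IPersistableAction | IEphemeralAction;

// Type guard used to decide whether an action must be posted to the server.
function isPersistable(action: IAction): action is IPersistableAction {
  switch (action.type) {
    case 'ADD_TASK':
    case 'DELETE_TASK':
      return true;
    default:
      return false;
  }
}
```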

Post Redux actions to the server

First, let’s upload persistable actions to the server as well as dispatching them to Redux. (You can write a Redux middleware or thunk.)
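As a rough sketch, a middleware for this could look like the code below. The types are minimal stand-ins for Redux’s own so the example is self-contained, and both the `persistable` flag and the injected `postAction` transport are assumptions made for illustration:

```typescript
// Minimal stand-ins for Redux types so the sketch is self-contained.
interface IAction {
  type: string;
  persistable?: boolean; // assumed flag marking actions that must reach the server
}
type Dispatch = (action: IAction) => IAction;
type Middleware = (store: { dispatch: Dispatch }) => (next: Dispatch) => Dispatch;

// `postAction` is a hypothetical transport, e.g. a fetch() call to an actions endpoint.
function createPostMiddleware(postAction: (action: IAction) => void): Middleware {
  return () => (next) => (action) => {
    if (action.persistable) {
      postAction(action); // upload the action to the server...
    }
    return next(action); // ...as well as dispatching it locally
  };
}
```

Injecting the transport keeps the middleware easy to test and leaves the choice of HTTP client open.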

Next, define a server route to save the action to the database, so the change is still there when you refresh your browser.

Persist Redux actions to the database

To persist the action, let’s write a function that executes Mongo or SQL queries. We can’t reuse the client-side reducer because the database is not an object in memory, unless we load and save the whole database on every action. So each action needs a client-side reducer and a server-side persister, and they must produce the same end result to keep things consistent.

Here’s what a persister might look like in TypeScript with Mongo:
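The original snippet did not survive intact, so this is a reconstruction around the one statement that did (the `insertOne` call on `calendarTasks`). The `IDb` and `ICollection` interfaces are minimal stand-ins for the MongoDB driver types, and the shape of `IPersistableAction` is assumed to match the reducer’s:

```typescript
// Minimal stand-ins for the MongoDB driver, so the sketch is self-contained.
// The real driver's db.collection(name).insertOne(doc) has the same shape.
interface ICollection {
  insertOne(doc: object): Promise<unknown>;
}
interface IDb {
  collection(name: string): ICollection;
}

interface ITask {
  title: string;
}

// Assumed action shape, mirroring the ADD_TASK case of the reducer.
type IPersistableAction = {
  type: 'ADD_TASK';
  payload: { id: string; task: ITask };
};

// Server-side counterpart of the reducer: one case per persistable action.
async function persist(db: IDb, action: IPersistableAction): Promise<void> {
  switch (action.type) {
    case 'ADD_TASK': {
      const { id, task } = action.payload;
      await db.collection('calendarTasks').insertOne({ id, ...task });
      break;
    }
  }
}
```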

And for comparison, here’s what a reducer might look like:

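// `update` is assumed to come from the immutability-helper package.
import update from 'immutability-helper';

const reducer = (state: IState, action: IPersistableAction): IState => {
  switch (action.type) {
    case 'ADD_TASK': {
      const { id, task } = action.payload;
      // Immutably set state.tasks[id] to the new task.
      return update(state, { tasks: { [id]: { $set: task } } });
    }
    default:
      // Unknown actions leave the state untouched.
      return state;
  }
};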

Broadcast Redux actions to all clients

Next, we need users to see each other’s changes in real-time. We need to broadcast the actions to all the clients, as well as persisting them. We can implement this like so:

Connect each client to the server with a WebSocket.
When the server receives an action, also send that action to every WebSocket client.
When the client receives an action via its WebSocket, dispatch that action to its own store.
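The three steps above can be sketched as follows. `ISocket` is a minimal stand-in for a WebSocket connection (anything with a `send` method), and `ActionHub` and `onMessage` are hypothetical names, not part of any library:

```typescript
// Minimal stand-in for a WebSocket connection: anything that can send a string.
interface ISocket {
  send(data: string): void;
}

interface IAction {
  type: string;
}

// Server side: keeps track of connected clients and fans actions out to them.
class ActionHub {
  private clients: ISocket[] = [];

  connect(socket: ISocket): void {
    this.clients.push(socket);
  }

  disconnect(socket: ISocket): void {
    this.clients = this.clients.filter((client) => client !== socket);
  }

  // Called once the server has received (and persisted) an action.
  broadcast(action: IAction): void {
    const message = JSON.stringify(action);
    this.clients.forEach((client) => client.send(message));
  }
}

// Client side: parse each incoming message and dispatch it to the local store.
function onMessage(data: string, dispatch: (action: IAction) => void): void {
  dispatch(JSON.parse(data) as IAction);
}
```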

This works, but it’s naive… What if two users do conflicting actions at the same time? Some clients will end up showing a different state! This can easily happen in reality, because network requests are not instant. In this sequence diagram, let’s say that action A is to rename the schedule to ‘Alex’, and action B is to rename it to ‘Ofek’:

In the end, the server, Alex, and anyone else will see ‘Alex’, but Ofek will see ‘Ofek’ instead. The source of truth is whatever state the server holds, which is why we need the server to broadcast accepted actions to everyone.

Pending actions queue

One solution is to broadcast the accepted actions to all clients, including the client that posted the action, so that everyone applies them in the order the server accepted them. However, the posting client would then dispatch its own action twice! What we can do is not dispatch the action initially, but instead append it to a local queue of pending actions.

dispatch({ type: 'POSTED_ACTION', action });

When we receive an action from the server, it may be one we posted earlier, so we should remove it from the pending queue.

dispatch({ type: 'RECEIVED_ACTION', action });
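A reducer handling these two wrapper actions might look roughly like this. It assumes every persistable action carries a unique `id` so that a broadcast can be matched against the pending queue; the toy `IDocument` and `documentReducer` stand in for the real ones:

```typescript
// Assumption: each persistable action has a unique id assigned by the client.
interface IPersistableAction {
  id: string;
  type: string;
}

// Toy document that just records which action types were applied.
interface IDocument {
  appliedTypes: string[];
}

interface IReduxState {
  pending: IPersistableAction[];
  document: IDocument;
}

type IQueueAction =
  | { type: 'POSTED_ACTION'; action: IPersistableAction }
  | { type: 'RECEIVED_ACTION'; action: IPersistableAction };

// Stand-in for the real document reducer.
const documentReducer = (doc: IDocument, action: IPersistableAction): IDocument => ({
  appliedTypes: [...doc.appliedTypes, action.type],
});

const rootReducer = (state: IReduxState, action: IQueueAction): IReduxState => {
  switch (action.type) {
    case 'POSTED_ACTION':
      // Not applied to the document yet; just remember that we posted it.
      return { ...state, pending: [...state.pending, action.action] };
    case 'RECEIVED_ACTION':
      // Broadcast by the server: apply it, and drop it from pending if it was ours.
      return {
        pending: state.pending.filter((p) => p.id !== action.action.id),
        document: documentReducer(state.document, action.action),
      };
    default:
      return state;
  }
};
```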

Optimistic view

Now that we’re not dispatching actions locally until the server accepts and broadcasts them, we’ve introduced some lag in the UI. Changes no longer feel instantaneous, and users complain.

Fortunately, we can compute an optimistic view by combining the state with the list of pending actions. This is what it would look like if the server were to accept all the pending actions. This gives an instant feel to the UX.

This can be achieved with a memoized selector (createSelector comes from the Reselect library), which recomputes only when either the main document or the list of pending actions changes. Here’s an example in TypeScript:

import { createSelector } from 'reselect';

interface IReduxState {
  // Actions posted to the server but not yet broadcast.
  pending: IPersistableAction[];
  // Known state of the server.
  document: IDocument;
}

// Optimistic view: the server's document with all pending actions applied on top.
const selectDocument = createSelector(
  [
    (state: IReduxState): IDocument => state.document,
    (state: IReduxState): IPersistableAction[] => state.pending
  ],
  (document: IDocument, pendingActions: IPersistableAction[]): IDocument =>
    pendingActions.reduce(
      (state, action) => documentReducer(state, action),
      document
    )
);

So far, so good: all clients are eventually consistent. But what if some concurrent actions are conflicting?

What if one user edits an object but another user deletes it, at the same time?
What if one user edits a property but another user edits it to a different value, at the same time?
What if two users do the exact same thing at the same time?

We can apply Operational Transformation to resolve conflicts like these. Let’s see how it works…

Operational Transformation

For OT to work with Redux actions, we need them to point to the specific version of the document that they were conceived upon. I’ve added an integer ‘version’ property to our document state, which starts at 0 and increments each time an action is successfully applied. I’ve also added an integer ‘version’ property to our persistable Redux actions, recording the document version the action was created against, so the server can tell which other actions happened in the meantime.

When our server encounters an action that applies to a previous version of the state, it needs to make the action work on the current version of the state. This is called transformation, and we need to transform the action against every other action that was successfully applied between the previous state and now.
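One way to picture this is a loop over the actions the incoming action has not yet seen. This is a sketch under two assumptions: the server keeps a log where entry `i` is the action that moved the document from version `i` to `i + 1`, and a transform returns `null` to reject. The names `rebase` and `Transform` are made up for illustration:

```typescript
interface IVersionedAction {
  type: string;
  version: number; // document version the action was created against
}

// Returns the transformed action, or null to reject it.
type Transform = (
  action: IVersionedAction,
  applied: IVersionedAction
) => IVersionedAction | null;

// Assumption: log[i] is the action that moved the document from version i to i + 1.
function rebase(
  incoming: IVersionedAction,
  log: IVersionedAction[],
  transform: Transform
): IVersionedAction | null {
  let current: IVersionedAction = incoming;
  // Transform against every action applied since the version the client saw.
  for (const applied of log.slice(incoming.version)) {
    const next = transform(current, applied);
    if (next === null) {
      return null; // intent could not be preserved, so reject
    }
    current = next;
  }
  // The action now applies to the current version of the document.
  return { ...current, version: log.length };
}
```

An action created against the latest version skips the loop entirely and is returned as it came in.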

The goal of the transform function is to preserve the intent of the action, with respect to another action that has sneakily happened just before. If it cannot do that, then it must reject the action.

If the action affects something that no one else has touched concurrently, then it should be able to make it through the transformation unscathed.

Let’s look at the first conflict case: a user edits an object that has just been deleted. What should we do? Common sense (at least in our specific domain) tells us that the edit no longer makes sense and the object should remain deleted. We can model this visually with a state diagram:

Specifically, if the delete happens first, our transform function is called `t(edit, delete)` and it rejects the action. If the edit happens first, then it’s called `t(delete, edit)` and it returns the delete action unscathed. Either way, the object is deleted.
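In code, that rule could be written along these lines. The action shapes (`EDIT_TASK`, `DELETE_TASK`) are hypothetical, and returning `null` is this sketch’s way of rejecting an action:

```typescript
// Hypothetical action shapes for the edit-versus-delete case.
type IAction =
  | { type: 'EDIT_TASK'; id: string; title: string }
  | { type: 'DELETE_TASK'; id: string };

// t(action, applied): make `action` work after `applied` has already happened,
// or return null to reject it.
function t(action: IAction, applied: IAction): IAction | null {
  // Actions on different objects never conflict.
  if (action.id !== applied.id) {
    return action;
  }
  // t(edit, delete): the object is gone, so the edit no longer makes sense.
  if (action.type === 'EDIT_TASK' && applied.type === 'DELETE_TASK') {
    return null;
  }
  // t(delete, edit) and every other pair pass through unchanged in this sketch.
  return action;
}
```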

Now let’s investigate the second conflict case: a user edits a property, but another user edits it to a different value, at the same time. Here’s a state diagram:

It’s apparent that the order of operations matters here. Our transform function can either keep the prior edit or accept the subsequent edit. Alternatively, the action ID may be used as a deterministic tie-breaker. It doesn’t matter which action wins, because we want the users to confront it and decide what to do.
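The three options can be compared side by side in a small sketch. The `Policy` names and the `IEditAction` shape are invented for illustration, and the tie-breaker simply lets the lexicographically smaller action ID win:

```typescript
// Hypothetical shape of an edit to one property of one object.
interface IEditAction {
  actionId: string;
  targetId: string;
  property: string;
  value: string;
}

type Policy = 'keep-prior' | 'accept-subsequent' | 'lowest-action-id';

// Transforms `action` against an edit that was applied just before it.
// Returns null when the incoming edit loses and must be rejected.
function transformEdit(
  action: IEditAction,
  applied: IEditAction,
  policy: Policy
): IEditAction | null {
  const conflicts =
    action.targetId === applied.targetId && action.property === applied.property;
  if (!conflicts) {
    return action; // different object or property: nothing to resolve
  }
  if (policy === 'keep-prior') {
    return null; // the earlier edit stands
  }
  if (policy === 'accept-subsequent') {
    return action; // the later edit overwrites
  }
  // Deterministic tie-breaker: the same edit wins whichever one arrived first.
  return action.actionId < applied.actionId ? action : null;
}
```

Only the tie-breaker policy gives the same winner regardless of arrival order; the other two depend on which edit the server happened to receive first.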

Finally, let’s review the third conflict case: two users do the exact same thing at the same time. It depends on what the action is. If it’s an edit or a delete, then no matter how it is handled, the result should be the same. However, if the action is to add an item or increment a number, then we need to think carefully about this.

Imagine both users intended the number to go up by one, but because they both did it, it goes up by two. Not what anyone wanted! To resolve this, we could model the action as an edit, or code our transform function to discard the latter increment. For additions, we could discard the latter addition if it’s identical to the first.
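Discarding the latter of two identical actions could be sketched like this. The `INCREMENT` and `ADD_ITEM` shapes are hypothetical, and what counts as “identical” (here, same counter or same item name) is a domain decision:

```typescript
// Hypothetical non-idempotent actions.
type IAction =
  | { type: 'INCREMENT'; counterId: string }
  | { type: 'ADD_ITEM'; item: { name: string } };

// Assumed notion of "the exact same thing": same counter, or same item name.
function isSameIntent(a: IAction, b: IAction): boolean {
  if (a.type === 'INCREMENT' && b.type === 'INCREMENT') {
    return a.counterId === b.counterId;
  }
  if (a.type === 'ADD_ITEM' && b.type === 'ADD_ITEM') {
    return a.item.name === b.item.name;
  }
  return false;
}

// Discard the latter action when it duplicates one that was just applied.
function t(action: IAction, applied: IAction): IAction | null {
  return isSameIntent(action, applied) ? null : action;
}
```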

Convergence property

If the transform function produces the same effect no matter which client’s actions are received first, then we can rely solely on OT for consistency, instead of a pending actions queue.

I’ve chosen not to enforce this because I don’t want to guarantee that every pair of actions transforms consistently. It’s useful that the server is the strict source of truth, and that it can reject actions.

Undo and redo

We can undo any action by inverting it to cancel out the effect of the original action. Redo is just undoing the undo, that is, inverting twice. To undo an addition is trivial, but what if we edited or deleted information? We need to faithfully restore that discarded information.

There are 2 ways to architect this:

Lossless actions keep the old data in the document.
Fat actions move the old data into the action itself.

A lossless approach may simply mark an object as deleted:
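The original example is missing, so here is a guess at what it could look like. The `deleted` flag, the `RESTORE_TASK` action, and the `visibleTaskIds` helper are all assumptions:

```typescript
interface ITask {
  title: string;
  deleted?: boolean; // soft-delete flag: the data stays in the document
}

interface IState {
  tasks: Record<string, ITask>;
}

// Undoing a delete needs its own action type in this approach.
type IAction =
  | { type: 'DELETE_TASK'; id: string }
  | { type: 'RESTORE_TASK'; id: string };

function reducer(state: IState, action: IAction): IState {
  const task = state.tasks[action.id];
  if (task === undefined) {
    return state; // unknown task: nothing to do
  }
  const deleted = action.type === 'DELETE_TASK';
  return { tasks: { ...state.tasks, [action.id]: { ...task, deleted } } };
}

// The UI must filter out soft-deleted tasks itself.
function visibleTaskIds(state: IState): string[] {
  return Object.keys(state.tasks).filter((id) => !state.tasks[id]?.deleted);
}
```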

A fat action may look like this:
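The original example is missing here too, so this is a sketch of the idea rather than our actual code. The delete action carries the task it removes, which means its inverse is just an ordinary `ADD_TASK`; the `invert` helper is a hypothetical name:

```typescript
interface ITask {
  title: string;
}

interface IState {
  tasks: Record<string, ITask>;
}

type IAction =
  | { type: 'ADD_TASK'; id: string; task: ITask }
  // Fat action: carries the data it removes, so it can be undone later.
  | { type: 'DELETE_TASK'; id: string; task: ITask };

function reducer(state: IState, action: IAction): IState {
  if (action.type === 'ADD_TASK') {
    return { tasks: { ...state.tasks, [action.id]: action.task } };
  }
  // DELETE_TASK: the data really leaves the document.
  const tasks = { ...state.tasks };
  delete tasks[action.id];
  return { tasks };
}

// Undo is the inverse action; redo is the inverse of the inverse.
function invert(action: IAction): IAction {
  return action.type === 'ADD_TASK'
    ? { type: 'DELETE_TASK', id: action.id, task: action.task }
    : { type: 'ADD_TASK', id: action.id, task: action.task };
}
```

Because add and delete are each other’s inverses, no extra action types are needed for undo.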

I’m leaning towards the ‘fat action’ approach because it avoids accumulating invisible, deleted data in the document, and doesn’t require new action types to undo.

When undo is unfeasible

If undoing an action does not make sense, then it should not be allowed. For example, if an external entity is deleted, then it would not make sense to undo any actions on that entity that were applied before it was deleted. This gets super complicated, so you may wish to restrict undoing to recent actions.

Wrapping up

I hope I’ve stirred your thoughts on real-time collaboration and given you ideas for your own app. There are many approaches, and I recommend you explore them to find what makes sense for you. Bear in mind that the aim is to provide a pleasant user experience and a codebase that newcomers can maintain, not to resolve every potential conflict and edge case.

This was about streaming the persistable actions, but real-time collaboration is more than this. You could also stream the ephemeral actions, such as UI selections, current page, and pointer position, to give your users a tangible presence within your metaverse/app.

By the way, we’re hiring talented software people, so if this captures your imagination perchance, get in touch.
