Real-Time Collaboration Issues and Solutions

A real-time collaborative editor had been on the Codeanywhere roadmap for some time. It was one of those features that everyone was asking for, and it would mean we no longer needed things like screen-sharing software for remote collaboration. Implementing it, however, proved to be no simple task.

The most difficult part of creating a real-time collaborative editor (RTCE) is letting users write their code the way they are used to while still staying in sync with changes made by others in the session.

Problem

To illustrate the problem, let's say we have two users (Bob and John) collaborating on a text file that contains only a string: "abc".
The first operation comes from Bob:

O1 = Insert[0, "x"] (to insert character "x" at position "0")

Immediately after that, John sends:

O2 = Delete[2, "c"] (to delete character "c" at position "2")

Bob has already executed his operation O1, which resulted in the string "xabc". When he then applies O2 as received and deletes the character at position 2, he removes "b" instead of "c", and his text becomes "xac".
On the other end, John executes his own operation (O2) first, so his text first becomes "ab" and then, after inserting the character "x", it becomes "xab".

As you can see, Bob and John didn't end up with the same version ("xac" ≠ "xab"), so their collaboration wasn't successful. We need a way to guarantee that everyone is in sync after all operations have been executed.
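The divergence is easy to reproduce. The following is a minimal sketch, not Codeanywhere's code: the `Insert` and `Delete` classes and the `apply` helper are hypothetical names invented for this illustration, and the delete is deliberately naive (it removes whatever sits at the given position).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int
    text: str


@dataclass(frozen=True)
class Delete:
    pos: int
    text: str  # the text the sender expected to remove


def apply(doc: str, op) -> str:
    """Apply an operation exactly as received, with no adjustment."""
    if isinstance(op, Insert):
        return doc[:op.pos] + op.text + doc[op.pos:]
    # Naive delete: drops len(op.text) characters at op.pos,
    # without checking that they match op.text.
    return doc[:op.pos] + doc[op.pos + len(op.text):]


o1 = Insert(0, "x")  # Bob's operation
o2 = Delete(2, "c")  # John's operation

bob = apply(apply("abc", o1), o2)   # Bob runs O1, then O2
john = apply(apply("abc", o2), o1)  # John runs O2, then O1
print(bob, john, bob == john)
```

Running it shows that the two orders produce different strings, which is exactly the problem a collaborative editor has to solve.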

Client-Side Recalculation of Operations

Our first solution to this problem was to keep a stack of operations on the server, using WebSockets (http://en.wikipedia.org/wiki/WebSocket) for communication. Every time a client makes a change, the operation is pushed onto the stack, but it is accepted only if the client already has all previous operations from the stack.
If the client has not received the latest changes, the server refuses the operation, and the client must execute all the operations it missed, recalculate its own changes and try again.

For example, if John tries to push his operation O2 = Delete[2, "c"], it will be refused by the server because he hasn't received the O1 operation from Bob. So he retrieves O1 = Insert[0, "x"] and recalculates O2 into O2' = Delete[3, "c"] (since the string is now "xabc", not "abc").
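The refuse-and-retry cycle can be sketched roughly as follows. This is an illustration under stated assumptions, not the original implementation: `Server`, `push`, `since` and `recalculate` are hypothetical names, the network layer is left out, and `recalculate` only handles the simple position shifts from the example (overlapping deletes are not covered).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Op:
    kind: str  # "insert" or "delete"
    pos: int
    text: str


class Server:
    def __init__(self):
        self.stack = []  # accepted operations, oldest first

    def push(self, op, client_revision):
        """Accept op only if the client has seen every op on the stack."""
        if client_revision != len(self.stack):
            return False  # refused: the client is behind
        self.stack.append(op)
        return True

    def since(self, revision):
        """Operations the client has not seen yet."""
        return self.stack[revision:]


def recalculate(op, missed):
    """Shift op's position to account for one missed operation."""
    if missed.kind == "insert" and missed.pos <= op.pos:
        return replace(op, pos=op.pos + len(missed.text))
    if missed.kind == "delete" and missed.pos < op.pos:
        return replace(op, pos=op.pos - len(missed.text))
    return op


server = Server()
o1 = Op("insert", 0, "x")
o2 = Op("delete", 2, "c")

server.push(o1, client_revision=0)              # Bob is up to date: accepted
john_revision = 0
first_attempt = server.push(o2, john_revision)  # John has not seen O1: refused

o2_prime = o2
for missed in server.since(john_revision):      # catch up on what he missed
    o2_prime = recalculate(o2_prime, missed)
john_revision = len(server.stack)
second_attempt = server.push(o2_prime, john_revision)
print(first_attempt, o2_prime, second_attempt)
```

The second push succeeds because John's operation now refers to the document as the server sees it.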

In this way, users can collaborate with each other and be sure that they will stay in sync.

But there is a downside to this approach. The client has to recalculate its operation whenever the server refuses it. To do that, all refused changes have to be rolled back and then applied again on top of the new ones. This can take up to 200 ms, which is a lot of time by developer standards.

To fix this issue, we decided to move the operation recalculation to the server side.

Operational Transformation

The operation recalculation that we did on the client side is similar to Operational Transformation, or OT (http://en.wikipedia.org/wiki/Operational_transformation). OT was originally designed for consistency maintenance and concurrency control in collaborative editing of plain text documents. Two decades of research have extended its capabilities and expanded its applications to include group undo, locking, conflict resolution, operation notification and compression, group-awareness, HTML/XML and tree-structured document editing, collaborative office productivity tools, application-sharing and collaborative computer-aided media design tools. In 2009, OT was adopted as a core technique behind the collaboration features in Apache Wave and Google Docs.

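Applied to the example above, OT transforms John's O2 against Bob's O1 into O2' = Delete[3, "c"], while O1 needs no adjustment on John's side. Both users then end up with the same text: "xab".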
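To make the idea concrete, here is a minimal sketch of a transformation function for single-character inserts and deletes. It is a textbook-style illustration and not the algorithm used by ShareJS or Codeanywhere. The names `Op`, `apply`, `transform` and the `op_wins_ties` flag are assumptions made for this example.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Op:
    kind: str  # "insert" or "delete"
    pos: int
    char: str  # exactly one character


def apply(doc: str, op: Optional[Op]) -> str:
    if op is None:  # the operation was cancelled out by a transform
        return doc
    if op.kind == "insert":
        return doc[:op.pos] + op.char + doc[op.pos:]
    return doc[:op.pos] + doc[op.pos + 1:]


def transform(op: Op, other: Op, op_wins_ties: bool) -> Optional[Op]:
    """Rewrite op so it can be applied after other has been applied."""
    if other.kind == "insert":
        if op.kind == "insert":
            # Two inserts at the same position: the flag decides who goes first.
            stays = op.pos < other.pos or (op.pos == other.pos and op_wins_ties)
        else:
            stays = op.pos < other.pos
        return op if stays else replace(op, pos=op.pos + 1)
    # other is a delete
    if op.kind == "delete" and op.pos == other.pos:
        return None  # both users deleted the same character
    if op.pos > other.pos:
        return replace(op, pos=op.pos - 1)
    return op


o1 = Op("insert", 0, "x")  # Bob
o2 = Op("delete", 2, "c")  # John

o2_prime = transform(o2, o1, op_wins_ties=False)  # what Bob applies after O1
o1_prime = transform(o1, o2, op_wins_ties=True)   # what John applies after O2

bob = apply(apply("abc", o1), o2_prime)
john = apply(apply("abc", o2), o1_prime)
print(o2_prime, bob, john)
```

With the transformed operations, both sides converge on the same string regardless of the order in which they received the changes.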

Solution

We have to say that, although OT is the solution, implementing it is a hard task in its own right. There are tons of algorithms with different tradeoffs, mostly trapped in academic papers, and implementing any of them correctly is very time-consuming. Luckily, there is a Node.js library developed by Joseph Gentle, an ex-Google Wave engineer, called ShareJS (http://sharejs.org/). ShareJS's purpose is to ease the pain of implementing OT in a web-based project, so it is perfect for Codeanywhere.

Although ShareJS was still in its early stages of development, it proved to be a good basis for our implementation. We improved ShareJS and added features such as authentication for access to our RTCE sessions, access to our communication layer, and additional operations for things like transferring cursor positions and checking the availability of users, among others.

Going Further

That is how we implemented a real-time collaborative editor in Codeanywhere and finally enabled people to collaborate in real time and even pair program remotely.

But, as always, there was still one feature missing in RTCE, and that was the users' ability to use multiple cursors in their real-time collaboration sessions. This was something we had to do all on our own.
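One common way to keep remote cursors in the right place is to shift each cursor index whenever a text operation arrives. The sketch below illustrates that general idea only; it is not taken from our pull request, and `transform_cursor`, `shift_all` and the `cursors` dictionary are hypothetical names for this example.

```python
def transform_cursor(cursor: int, kind: str, pos: int, length: int) -> int:
    """Return where a cursor ends up after an insert or delete of `length` characters at `pos`."""
    if kind == "insert":
        # Assumption: an insert at the cursor's own position pushes the cursor right.
        return cursor + length if pos <= cursor else cursor
    # Delete removes the range [pos, pos + length).
    if cursor <= pos:
        return cursor
    # A cursor inside the deleted range collapses to the start of the range.
    return max(pos, cursor - length)


def shift_all(cursors: dict, kind: str, pos: int, length: int) -> dict:
    """Apply the same shift to every cursor of every user."""
    return {
        user: [transform_cursor(c, kind, pos, length) for c in positions]
        for user, positions in cursors.items()
    }


# Each user may hold several cursors in the document "abc".
cursors = {"bob": [1], "john": [3, 0]}

after_insert = shift_all(cursors, "insert", 0, 1)       # Insert[0, "x"]
after_delete = shift_all(after_insert, "delete", 3, 1)  # Delete[3, "c"]
print(after_insert, after_delete)
```

Whether a cursor sitting exactly at the insert position should move or stay put is a design choice. This sketch moves it, which suits remote edits but may not suit the local user's own typing.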

Since we could not find an existing solution and had to come up with one of our own, we decided to give back to the community and publish it. So feel free to check out our pull request on GitHub to see how we solved it (https://github.com/share/share-codemirror/pull/6).

Hopefully, this will help you in creating your own RTCE. Or, if you just need one to use with your fellow developers, feel free to log in now and start collaborating in real time.
