Recently Pivotal Labs decided to push mobile devices to the limit by connecting multiple devices to one another so that they behave as one. Through the use of image recognition, optical character recognition, persistent low-cost connections and a whole lot of ingenuity, Pivotal Labs was able to put together an innovative and unique experience. Below, we have collected the accounts of the key individuals involved in bringing this piece of innovation and technology to life.
How we did it
Engineers at Pivotal Labs came up with an idea to use multiple devices to orchestrate a unique unified experience. With so many people interested in contributing to this idea, a hackathon ensued where engineers worked in groups, each group contributing one of the separate pieces needed to make this happen.
While most of the tasks were accomplished during the hackathon, a few lingered beyond it, such as building the stand and working on the general stability of the framework. The source code is available for those who want to try to replicate this flow.
OTA and Architecture
Before we could start building the applications that would run in this unified experience, we needed to build a software layer to run these applications.
We split the main communication layer into 3 parts: the server, the client and the server client. You can read in more detail the idea behind each of these here.
The server would be our communication protocol layer (RabbitMQ and Apache in this case), the client would be the phone applications, and the server client would be Java applications running server side.
In short, the server clients and the clients would communicate with each other through the server. The server client would keep track of the entire state of the unified experience and report to each client its particular state with respect to the unified experience. Each client would then render its respective state. The clients would also collect sensor events such as touch and send this information back to the server client. Think of it as one big MVC where the views are the clients, the controllers are the server clients and the models are replicated on both sides.
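The message flow described above can be sketched roughly as follows. This is a minimal illustration only: the `Broker` class is an in-memory stand-in for the real server (RabbitMQ in our case), and the queue names, message fields and the touch counter are all hypothetical, chosen just to show events going up and per-device state coming back down.

```python
from collections import defaultdict, deque


class Broker:
    """In-memory stand-in for the message server (hypothetical, not RabbitMQ)."""

    def __init__(self):
        self.queues = defaultdict(deque)  # queue name -> pending messages

    def publish(self, queue, message):
        self.queues[queue].append(message)

    def drain(self, queue):
        """Return and remove every message waiting on a queue."""
        messages = list(self.queues[queue])
        self.queues[queue].clear()
        return messages


class ServerClient:
    """Controller: owns the full state and tells each client what to render."""

    def __init__(self, broker, device_ids):
        self.broker = broker
        self.state = {device_id: {"touches": 0} for device_id in device_ids}

    def process_events(self):
        for event in self.broker.drain("events"):
            if event["type"] == "touch":
                device_id = event["device"]
                self.state[device_id]["touches"] += 1
                # Report only this device's slice of the overall state.
                self.broker.publish(f"device.{device_id}", dict(self.state[device_id]))


class Client:
    """View: forwards sensor events and renders whatever state it is sent."""

    def __init__(self, broker, device_id):
        self.broker = broker
        self.device_id = device_id
        self.rendered = None

    def touch(self):
        self.broker.publish("events", {"type": "touch", "device": self.device_id})

    def poll(self):
        for state in self.broker.drain(f"device.{self.device_id}"):
            self.rendered = state


broker = Broker()
controller = ServerClient(broker, ["01", "02"])
phone = Client(broker, "01")

phone.touch()                # the client reports a sensor event
controller.process_events()  # the server client updates its model and replies
phone.poll()                 # the client renders its own state
print(phone.rendered)
```

The point of the split is that a client never needs to know about any other device: it only sends events and renders the state it is handed.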
To get the interactive and responsive experience we wanted, we knew we had to rethink which technologies we needed to bring to mobile. In this case we brought RabbitMQ’s implementation of AMQP to help move messages between devices.
We built libraries and services that would manage all the connection details, allowing the client developers to focus on the content they were transmitting as opposed to the delivery system. This made it possible to develop the individual games and applications in a very short time, in some cases a single day.
Architecture diagram of libraries
The red parts represent the sections that app developers were responsible for. The rest was provided by the libraries team.
We had heard of the OpenCV (Open Source Computer Vision) library before, and it looked powerful with many potential applications. We started looking for an opportunity to play around with it and realized that one of the interesting problems that we’d want to solve for the Device Wall would be to determine where the device screens were relative to each other. We took this as an opportunity to give OpenCV a spin!
OpenCV is a library that focuses on image processing: it analyzes and manipulates images. This was useful once the devices were set up on the wall, because we could take a picture of them to determine where their screens were relative to each other and to identify each device.
The process starts by running an app on each of the devices. This application communicates with a server to get a unique id. The devices then each show a white screen with their id in the center. We take a picture of all these devices and analyze it.
We detect contours in the picture and throw out those that don’t have exactly four corners or that fall below a minimum area. At this point we’ve found all the rectangles of sufficient size in the picture.
Luckily, device bodies have rounded corners while their screens are perfectly rectangular, so we detect only the screens.
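The filtering rule can be illustrated without OpenCV. The sketch below assumes the contours have already been simplified to lists of corner points (in OpenCV this would typically involve calls such as `findContours` and `approxPolyDP`, though the exact calls are not recorded here). The function names and the area threshold are made up for the example.

```python
def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def find_screens(polygons, min_area):
    """Keep only four-cornered polygons whose area is at least min_area."""
    return [p for p in polygons if len(p) == 4 and polygon_area(p) >= min_area]


candidates = [
    [(0, 0), (100, 0), (100, 200), (0, 200)],  # phone-sized screen: kept
    [(0, 0), (5, 0), (5, 5), (0, 5)],          # four corners but tiny: noise
    [(0, 0), (50, 0), (25, 40)],               # triangle: wrong corner count
]
screens = find_screens(candidates, min_area=1000)
print(screens)
```

The minimum area matters because a photo of a wall contains plenty of small four-cornered shapes (buttons, reflections, labels) that are not screens.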
We then iterate over all the various rectangles we’ve found to find the outermost points.
These define the boundaries of the virtual screen that they all make up together.
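Finding the outermost points amounts to taking the minimum and maximum coordinates over every corner of every rectangle. Here is a small sketch, with a hypothetical function name and sample coordinates in photo pixels:

```python
def virtual_screen_bounds(rectangles):
    """Return (left, top, right, bottom) enclosing every corner of every rectangle."""
    xs = [x for rect in rectangles for x, _ in rect]
    ys = [y for rect in rectangles for _, y in rect]
    return min(xs), min(ys), max(xs), max(ys)


detected = [
    [(10, 20), (110, 20), (110, 220), (10, 220)],    # a phone screen
    [(130, 40), (330, 40), (330, 340), (130, 340)],  # a tablet screen
]
bounds = virtual_screen_bounds(detected)
print(bounds)
```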
We use the virtual screen’s dimensions and fit an image to it. Knowing all the screens’ points relative to this virtual screen, we can cut out pieces of that image and show them on their respective devices.
The final step is determining which screen belongs to which device. We crop each screen individually from the picture using ImageMagick, then run the open source optical character recognition (OCR) program Tesseract to detect the two-digit id in the middle of the screen. We used two digits since this greatly improved the accuracy over using single-digit numbers.
Finally we can match each id to its screen and determine the position of each screen relative to the virtual screen. We write all this out into a JSON string, which the server will use to cut an image up and deliver to the devices.
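As an illustration of that last step, the sketch below turns a mapping from device id to screen rectangle into a JSON string. The field names (`virtualScreen`, `devices`, `x`, `y` and so on) are hypothetical, since the real schema is not shown here. It also assumes axis-aligned rectangles and expresses each screen as a fraction of the virtual screen, so the server could crop any image regardless of its resolution.

```python
import json


def build_layout(screens_by_id):
    """Map device ids to screen rectangles (left, top, right, bottom) in photo
    pixels and return a JSON string with positions relative to the virtual screen."""
    rects = list(screens_by_id.values())
    left = min(r[0] for r in rects)
    top = min(r[1] for r in rects)
    right = max(r[2] for r in rects)
    bottom = max(r[3] for r in rects)
    width, height = right - left, bottom - top

    devices = []
    for device_id, (l, t, r, b) in sorted(screens_by_id.items()):
        devices.append({
            "id": device_id,
            "x": (l - left) / width,   # offset from the virtual screen's left edge
            "y": (t - top) / height,   # offset from the virtual screen's top edge
            "width": (r - l) / width,
            "height": (b - t) / height,
        })
    return json.dumps({"virtualScreen": {"width": width, "height": height},
                       "devices": devices})


layout_json = build_layout({
    "07": (10, 20, 110, 220),
    "12": (130, 40, 330, 340),
})
print(layout_json)
```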
The Memory Game is a game enjoyed by children of all ages around the world.
The rules are simple: flip up a card to see what it is, then flip up another card to see if you can find the same card. If the cards are the same, the cards stay flipped up. If they are different, the cards flip over again. Repeat until all cards are flipped up.
We built this game into the Xtreme Screen by making each phone represent a card, and each tablet represent two cards. If the devices add up to an odd number of card slots, one phone will remain empty or a tablet will only show one card.
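The slot counting can be sketched like this. The device list, the `"phone"`/`"tablet"` labels and the choice to leave the last slot empty are assumptions made for the example; the rule it illustrates is simply that cards come in pairs, so an odd slot has to stay blank.

```python
def card_slots(devices):
    """List every (device id, position) slot: one per phone, two per tablet."""
    slots = []
    for device_id, kind in devices:
        for position in range(2 if kind == "tablet" else 1):
            slots.append((device_id, position))
    return slots


def usable_slots(slots):
    """Drop the last slot when the count is odd so every card has a partner."""
    return slots[: len(slots) - len(slots) % 2]


wall = [("01", "phone"), ("02", "tablet"), ("03", "phone"),
        ("04", "tablet"), ("05", "phone")]
all_slots = card_slots(wall)
playable = usable_slots(all_slots)
print(len(all_slots), len(playable))
```

With this wall there are seven slots, so six are playable and phone "05" stays empty.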
The game state resides on the Memory Game server client. When the game is initialized, all cards are shuffled and flipped down. The assignments are sent to the clients. When the user touches a card, events are sent to the server client. The server client responds by sending a message back to the client telling it to flip up the card if appropriate. If the two flipped cards don’t match, it then flips them back over. If they do match, it does nothing while there are more cards to be flipped, or displays a win message once the user has flipped over all the cards and won the game.
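A minimal sketch of that server-side logic follows. It is written in Python for brevity rather than as the Java server client, and the class name, message tuples and slot ids are all hypothetical. Each call to `touch` returns the messages the server client would send back to the clients.

```python
import random


class MemoryGame:
    def __init__(self, slot_ids, rng=None):
        if len(slot_ids) % 2:
            raise ValueError("need an even number of card slots")
        rng = rng or random.Random()
        faces = list(range(len(slot_ids) // 2)) * 2  # two of each card face
        rng.shuffle(faces)
        self.faces = dict(zip(slot_ids, faces))  # slot id -> card face
        self.matched = set()                     # slots that stay face up
        self.pending = None                      # first card of the current turn

    def touch(self, slot):
        """Handle a touch event and return the messages to send to the clients."""
        if slot in self.matched or slot == self.pending:
            return []  # card is already face up: ignore the touch
        messages = [("flip_up", slot, self.faces[slot])]
        if self.pending is None:
            self.pending = slot
            return messages
        first, self.pending = self.pending, None
        if self.faces[first] != self.faces[slot]:
            messages += [("flip_down", first), ("flip_down", slot)]
        else:
            self.matched.update((first, slot))
            if len(self.matched) == len(self.faces):
                messages.append(("win",))
        return messages


game = MemoryGame(["01", "02", "03", "04"])
game.faces = {"01": 0, "02": 1, "03": 0, "04": 1}  # fix the deal for a predictable demo
log = [game.touch(slot) for slot in ["01", "02", "01", "03", "02", "04"]]
for messages in log:
    print(messages)
```

Keeping the card faces only on the server side means a client learns what a card is at the moment it is told to flip it up.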
We used custom view animations to animate the card images in the Android app. In order to save time, we leveraged the Flip 3D view transition project.