This week we were treated to a lunchtime tech-talk by Blaine Cook of Twitter. He came to talk to us about Starling, the all-Ruby message queue system that runs much of Twitter. Blaine spoke about the history and motivation for creating Starling, then showed how it worked, and talked about possible future enhancements and directions for further development.
Starling looks quite simple to use. The Starling server speaks the memcache protocol, so to talk to it you just need to load the memcache-client gem and create a client instance. Note that the Starling server doesn’t use memcached in its implementation at all; it just speaks the protocol.
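To make the "speaks the protocol, but isn't memcached" point concrete, here is a minimal Ruby sketch. It assumes that a Starling-style server treats the memcache `set` command as an enqueue and `get` as a dequeue. `ToyQueueServer`, `set_request`, and `get_request` are hypothetical names invented for this illustration, and the toy understands only those two commands, entirely in memory. It is not Starling's implementation.

```ruby
# Toy in-memory stand-in for a queue server that understands a tiny subset of
# the memcache text protocol. Hypothetical illustration only.
class ToyQueueServer
  def initialize
    # One FIFO array per queue name, created on first use.
    @queues = Hash.new { |hash, name| hash[name] = [] }
  end

  # Takes one raw request string and returns the raw response string.
  def handle(request)
    header, body = request.split("\r\n", 2)
    command, key, _flags, _exptime, bytes = header.split(' ')
    case command
    when 'set' # assumed semantics: append the payload to the named queue
      @queues[key] << body.byteslice(0, Integer(bytes))
      "STORED\r\n"
    when 'get' # assumed semantics: remove and return the oldest payload
      item = @queues[key].shift
      item ? "VALUE #{key} 0 #{item.bytesize}\r\n#{item}\r\nEND\r\n" : "END\r\n"
    else
      "ERROR\r\n"
    end
  end
end

# Builds a memcache "set" request: key, flags, expiry, byte count, then data.
def set_request(queue, data)
  "set #{queue} 0 0 #{data.bytesize}\r\n#{data}\r\n"
end

# Builds a memcache "get" request for a single key.
def get_request(queue)
  "get #{queue}\r\n"
end

server = ToyQueueServer.new
server.handle(set_request('jobs', 'first'))
server.handle(set_request('jobs', 'second'))
first_reply  = server.handle(get_request('jobs'))
second_reply = server.handle(get_request('jobs'))
empty_reply  = server.handle(get_request('jobs'))
```

Unlike a cache, the toy hands each stored value back exactly once and in insertion order: the first `get` carries `first`, the second carries `second`, and the third is a bare `END`. Against a real server you would not build these strings by hand; a memcache client library would send them for you.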
Blaine also shared some interesting bits about why he built Starling. It basically comes down to this: every other solution had some problem that made it unsuitable for Twitter to use. Here’s the list:
- rq (by Ara Howard) – nfs/disk based, high latency
- DRb – not robust under load
- Rinda – very slow! O(n) for take operations
- Apache ActiveMQ – super complex
- RabbitMQ – Erlang dependency
In the last few months we’ve seen a lot of Starling-like things appear, some inspired by Starling itself.
- beanstalkd – in-memory, with a memcached-inspired protocol; not persistent or recoverable
- bj – database backed
- thruqueue – uses Thrift protocol, ugly
- sparrow – Starling imitator
- ap4r – full-featured
There are some interesting new directions for Starling. Currently, Starling incurs some overhead from polling on both the client and server sides. Kevin Clark and Chris Wanstrath have hacked it to run on EventMachine to eliminate the polling. It’s not clear what happens if a client dies while its request is waiting to be filled. Some issues with load balancing and starvation also need to be looked at, and there are opportunities to build a richer client API.
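The polling overhead is easy to picture with a small Ruby sketch. Everything here is a stand-in chosen for illustration: `poll_for_message` and its `fetch` argument are hypothetical names, and the blocking half uses Ruby's built-in `Queue` rather than EventMachine, so it shows the general idea of waiting to be woken up instead of asking repeatedly, not how the EventMachine port actually works.

```ruby
# Polling consumer: repeatedly asks a non-blocking source for work and sleeps
# between empty replies. `fetch` is a hypothetical callable that returns a
# message, or nil when nothing is waiting.
def poll_for_message(fetch, interval: 0.01, max_attempts: 100)
  attempts = 0
  while attempts < max_attempts
    attempts += 1
    message = fetch.call
    return [message, attempts] unless message.nil?
    sleep(interval) # wasted wall-clock time whenever the queue is empty
  end
  [nil, attempts] # gave up without receiving anything
end

# Simulated source that is empty for the first three requests.
replies = [nil, nil, nil, 'hello']
polled_message, round_trips = poll_for_message(-> { replies.shift })

# Blocking consumer: Queue#pop parks the thread until a producer pushes,
# so no empty requests are made while waiting.
inbox = Queue.new
consumer = Thread.new { inbox.pop }
inbox << 'hello'
blocked_message = consumer.value
```

In the polling half, three of the four round trips return nothing, and each one costs a request plus a sleep. The blocking half makes no empty requests at all, but it also raises the question noted above: a consumer that is parked waiting has to be cleaned up somehow if it dies before a message arrives.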
This is another case of NIH syndrome. I can’t believe Apache ActiveMQ was dismissed as “super complex”. It is actually very simple to use and to start building with. And as you mentioned in your last paragraph about the features missing from the current version of Starling, it will become equally complex once those features are baked in. That in itself will take time to implement and test, whereas ActiveMQ is already here, tested, and deployed in a number of production environments.
January 26, 2008 at 11:54 pm
It would be interesting to hear what is super complex about ActiveMQ. Installation, maintenance, API, all of the above?
The Spring configuration for ActiveMQ doesn’t look too bad: http://activemq.apache.org/spring-support.html
What Ruby support for ActiveMQ already exists? Is it comparable to Spring’s, in richness and complexity?
Then again, if Twitter doesn’t currently have Java as part of their architecture, that can be a compelling reason to Invent it Here. If Starling is clean, well-tested, stable Ruby that they understand and can easily support (because they wrote and tested it), and they can avoid a dependence on the Java ecosystem, that could save a lot of long-term cost – maybe even enough to offset Inventing it Here.
– Chad
January 28, 2008 at 8:41 am
Joann,
for me, it’s a question of NIH versus LIM (Less is More). To get ActiveMQ up and running, realistically you need a recent JDK (not always easy on older Linux installs, and still confusing for non-Java developers), Maven, the ActiveMQ install, a configured Broker, and an understanding of all the different terminology that’s used. There’s a web interface, with a whole series of permissions settings and XML-based configuration files on top of the ActiveMQ configuration settings that are available.
In practice, all these options just aren’t needed for simple installs. Remember that when Rails first shipped, it was essentially a stripped down MVC framework whose only contribution was a less verbose syntax than existing frameworks. Now that Rails has grown to a more complex system, simpler frameworks like Merb and Camping have emerged.
When we started work on Starling, the STOMP client libraries were rumored to be unstable and problematic, and the documentation on the ActiveMQ site was complicated and emphasized the flexibility of ActiveMQ rather than how one might actually use it (try taking a look at the ActiveMQ JavaDocs!). I think it says a lot that I was able to implement a new queue server and leverage an existing set of libraries in much less time than it would have taken to hook up the Java-centric world of JMS into the RESTful world of Ruby and Rails.
One of the things that I tried to emphasize at the Brown Bag was that I don’t take adding new features lightly. Any new functionality will need to have a proven use case (ideally, someone will come to me with a problem that they’re actually having). Once the use-case is identified, the solution needs to have a simple and semantically clear implementation.
I guess the bottom line is that I don’t see Starling as a re-implementation of ActiveMQ; there are similar features and functionality, but the approach is fundamentally different, with different requirements driving the process. If ActiveMQ makes sense for a project (e.g., if you need message transformation or topics with guaranteed per-consumer delivery), then by all means it makes sense to absorb the extra start-up cost. For most people’s needs, Starling implements extremely simple store-and-forward messaging with as close to no overhead as possible.
January 28, 2008 at 9:52 pm
I feel obliged to join the chorus of ‘unfair’ here :-)
RabbitMQ does not require any knowledge of Erlang to use. Please try one of the many language clients. Hey, there is a STOMP client now too.
For installation we have many packages designed to make your life easy:
http://www.rabbitmq.com/download.html
Most of these installs are trivial. Probably the hardest ‘mainstream’ install is on MacOSX, because we have not fully packaged for that O/S yet, and even that is just a few shell commands. Here is a well known Java person describing his experience with MacOSX: http://rossmason.blogspot.com/2008/02/erlang-on-os-x.html
And – feel free to get in touch!
alexis
February 18, 2008 at 5:43 pm
Alexis,
We run Erlang at Twitter for ejabberd; it’s a great tool, and works fantastically. However, for most PHP or Ruby developers, it’s fine as long as it’s a black box. I once spent a morning learning Erlang because the primary Mnesia server went down, which made it not-a-black-box. The second that happens, the dependency becomes a liability.
I think there’s a tendency in the Erlang world to expose the language as a management and runtime interface, which can be powerful, but can also be a major stumbling block if you have to use it to interact with servers (e.g., to do clustering in RabbitMQ). Whatever can be done to remedy that (and it looks like the majority of operations with RabbitMQ happen through a nice clean interface) is helpful to those just getting started with Erlang-based servers.
Keep in mind that Starling is only intended to solve a limited set of problems; you wouldn’t want to use it to do video streaming, for example. On the other hand, setup and connection to an essentially unlimited number of named queues is one line of code, using a library that most websites already have installed. Sending and receiving messages are one line each. Simplicity here is the primary feature, and in designing Starling the Erlang and STOMP/AMQP dependencies were potential liabilities.
There’s a good chance that at some point in the future we’ll need additional functionality that Starling may not be able to provide with a clean, consistent interface. At that time, we’ll definitely re-evaluate the options vis-a-vis their complexity. The reality is that for most simple web sites, every new server is a major decision, and minimizing the potential points of (human) failure is a real concern.
Hopefully that addresses the question of unfairness! :-)
February 21, 2008 at 11:23 pm