CouchDB + Ruby: Perform Like a Pr0n Star – Matt Aimonetti
Matt Aimonetti is part of the Merb team.
- memory usage
- number of servers
You need reliability
- Nice public interfaces
- no discrimination
- reaction under load
- handle concurrent connections
- number of requests per second
Every p0rn star needs a trick
Why aren’t you a porn star now? Because of RDBMS
Why does an RDBMS have problems?
- most/all fields become optional; the schema changes and becomes hard to deal with
- joins hurt performance (you end up adding indices/tables)
- replication is difficult when using multiple masters
- auto-incremental ids
What other kinds of approaches can we use?
Distributed Hash Table
- fault tolerant
- used by p2p and IM
- memcached is another example
Some other projects:
Project Voldemort, Tokyo Cabinet, Redis
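As a rough illustration of the distributed hash table idea, the sketch below hashes each key to pick a primary node and copies the value to the next node, so losing one node does not lose the data. The `TinyDHT` class, the node names and the replica count are all made up for this example; real systems use more elaborate placement and failure detection.

```ruby
require 'digest'

# Toy distributed hash table: keys are spread over nodes by hashing,
# and each value is copied to a second node for fault tolerance.
# TinyDHT and its node names are illustrative, not a real library.
class TinyDHT
  def initialize(node_names, replicas: 2)
    @names = node_names
    @replicas = replicas
    @nodes = node_names.map { |name| [name, {}] }.to_h # one Hash per node
    @down = []
  end

  # Nodes responsible for a key: the hashed position plus its successors.
  def nodes_for(key)
    start = Digest::MD5.hexdigest(key.to_s).to_i(16) % @names.size
    (0...@replicas).map { |i| @names[(start + i) % @names.size] }
  end

  def put(key, value)
    nodes_for(key).each { |name| @nodes[name][key] = value }
  end

  # Read from the first responsible node that is still up.
  def get(key)
    name = nodes_for(key).find { |n| !@down.include?(n) }
    name && @nodes[name][key]
  end

  # Mark a node as failed (assumption: failures are reported to us).
  def fail!(name)
    @down << name
  end
end

dht = TinyDHT.new(%w[node_a node_b node_c])
dht.put('user:42', 'Matt')
dht.fail!(dht.nodes_for('user:42').first) # the primary goes down
puts dht.get('user:42')                   # still served by the replica
```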
What is CouchDB?
It is an Apache project written in Erlang. It uses SpiderMonkey, Mozilla's JavaScript engine.
Can be used as a key/value store.
It can also be used as a schema-less store: format your data as a JSON object and dump the whole thing into the database.
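A minimal sketch of what "schema-less" means in practice, using only Ruby's standard `json` library. The `DB` hash stands in for a CouchDB database and `fetch_doc` is a hypothetical helper; the field names are invented. Nothing here talks to a real server.

```ruby
require 'json'

# Two documents in the same "database" with different fields: no schema
# has to be declared or migrated. Field names here are made up.
card = {
  '_id'   => 'card_1',
  'type'  => 'Card',
  'name'  => 'Matt',
  'langs' => ['Ruby', 'Erlang']
}
note = { '_id' => 'note_1', 'type' => 'Note', 'body' => 'Talk notes' }

# Stand-in for a CouchDB database: a key/value store of JSON strings.
DB = {}
[card, note].each { |doc| DB[doc['_id']] = JSON.generate(doc) }

# Look a document up by its key and turn the JSON back into a Hash.
def fetch_doc(id)
  JSON.parse(DB.fetch(id))
end

puts fetch_doc('card_1')['langs'].inspect
puts fetch_doc('note_1').keys.inspect
```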
CouchDB comes with a nice web interface called Futon that lets you inspect your database and its contents.
CouchDB is decentralized.
You can do replication between multiple masters.
Optimized for the web, more reads than writes.
- full ACID compliance
- HTTP REST interface
- caching (couchdb uses etags) – this makes caching with existing HTTP caching technology (nginx+memcache, varnish, etc) really easy.
- built-in conflict management using MVCC (multiversion concurrency control)
- every save of a record is stored as a new revision
- document attachments are attached as document stubs, which also get replicated to different nodes and don’t use a lot of memory
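The revision and caching points can be sketched with a toy in-memory store. `RevStore` and `RevStore::Conflict` are hypothetical names, and the way the revision string is built here is only an approximation. The idea it shows is that a write must quote the current revision or be rejected (CouchDB answers such a request with a 409 Conflict), and that the revision can double as an ETag for conditional reads (a 304 Not Modified over HTTP).

```ruby
require 'digest'
require 'json'

# Toy in-memory store imitating CouchDB-style revisions.
# RevStore and Conflict are hypothetical names, not a real API.
class RevStore
  Conflict = Class.new(StandardError)

  def initialize
    @docs = {} # id => { rev: "N-hash", body: Hash }
  end

  # A write succeeds only if the caller holds the current revision.
  def put(id, body, rev = nil)
    current = @docs[id]
    raise Conflict, "stale revision for #{id}" if current && current[:rev] != rev
    raise Conflict, "#{id} has no revision #{rev}" if current.nil? && rev

    number = current ? current[:rev].to_i + 1 : 1
    new_rev = "#{number}-#{Digest::MD5.hexdigest(JSON.generate(body))}"
    @docs[id] = { rev: new_rev, body: body }
    new_rev
  end

  # Conditional read: a client that already holds the current revision
  # gets :not_modified and no body.
  def get(id, if_none_match: nil)
    doc = @docs.fetch(id)
    return [:not_modified, nil] if if_none_match == doc[:rev]

    [:ok, doc]
  end
end

store = RevStore.new
rev1 = store.put('article_1', { 'title' => 'CouchDB' })
rev2 = store.put('article_1', { 'title' => 'CouchDB + Ruby' }, rev1)

begin
  store.put('article_1', { 'title' => 'Lost update' }, rev1) # rev1 is stale
rescue RevStore::Conflict => e
  puts "conflict: #{e.message}"
end

status, = store.get('article_1', if_none_match: rev2)
puts status
```

Rejecting stale writes instead of locking is what lets several masters accept writes independently and sort out conflicts afterwards.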
How does this relate to Ruby?
Everything is a JSON object.
You can define properties so you get an attribute reader/writer, and do document.attr
There is no SQL in CouchDB; you run queries like Card.first, Card.all, Card.get('matt_aimonetti')
The problem is that mapping documents onto objects in an object-oriented language doesn’t always work. If you have dependent objects (such as Card and Address), you declare what the nested data should be cast as:
property :questions, :cast_as => ['Question']
save_callback :before, :generate_slug_from_title
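To make the `property` and `:cast_as` declarations above more concrete, here is a self-contained sketch of how such a macro could work on top of a plain Hash. `TinyDocument`, `Survey` and the sample data are hypothetical and written for this example only; this is not the implementation of the library used in the talk, just an imitation of its declaration style.

```ruby
# Minimal sketch of a document class backed by a Hash, with a `property`
# macro and a :cast_as option. TinyDocument is hypothetical; it only
# imitates the style of the declarations in the notes.
class TinyDocument
  # Per-class registry of properties that need casting.
  def self.casts
    @casts ||= {}
  end

  # Defines a reader and a writer that go through the attributes Hash.
  def self.property(name, options = {})
    key = name.to_s
    casts[key] = options[:cast_as] if options[:cast_as]
    define_method(name) { @attributes[key] }
    define_method("#{name}=") { |value| @attributes[key] = value }
  end

  def initialize(attributes = {})
    @attributes = {}
    attributes.each do |key, value|
      @attributes[key.to_s] = cast(key.to_s, value)
    end
  end

  def to_hash
    @attributes
  end

  private

  # ['Question'] means "array of Question"; 'Question' means a single one.
  def cast(key, value)
    target = self.class.casts[key]
    return value unless target

    if target.is_a?(Array)
      klass = Object.const_get(target.first)
      value.map { |item| klass.new(item) }
    else
      Object.const_get(target).new(value)
    end
  end
end

class Question < TinyDocument
  property :text
end

class Survey < TinyDocument
  property :title
  property :questions, :cast_as => ['Question']
end

survey = Survey.new('title' => 'Talk feedback',
                    'questions' => [{ 'text' => 'Why CouchDB?' }])
puts survey.questions.first.text
survey.title = 'Conference feedback'
puts survey.title
```

Naming the target class as a string, as in `['Question']`, means the class only has to exist when a document is instantiated, not when the property is declared.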
Relationships can be done, but you have to decide how to do it, and there are a lot of ways.
When to use couch?
- When you need to scale your database and availability is more important than consistency.
- When your data is decentralized (you have more than one master)
- When you need to compute data
- Analytics – combined with a traditional RDBMS, using CouchDB to get statistics
- Personal finance – Bank accounts in different countries – download all accounts, and can process in one place – attach PDF files, desktop app, etc.
- Medical Record System – Many patients with visits, history, records, etc.
- Distributed e-commerce sites (one main website working with multiple partners; the data can be replicated easily). In this case, CouchDB is much faster than an RDBMS, because of drugs, compounds, and complex structures.
Q: Can you use couchdb with objects that have attributes that change often?
A: Yes you can use it.
Q: There are many ways to do relationships. Can you give an example?
A: Take a blog where an article has many comments. You could store the article as one document and add the comments inside it. The problem is that every new comment means sending the whole document again, which increases the risk of conflicts. Instead, create a separate comment document that just refers to the article. Then define a view that aggregates an article with all of its comments. Alternatively, query for all comments that belong to the article.
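The article/comments answer can be sketched in plain Ruby. This imitates what a view does (a map step emits keyed rows, rows are kept sorted by key, and a lookup by article id returns the article followed by its comments) without talking to CouchDB. The document fields and the helper names `map_doc`, `build_view` and `article_with_comments` are invented for the example.

```ruby
# Pure-Ruby imitation of a view: a map step emits [key, doc] rows, rows
# are sorted by key, and a lookup returns an article followed by its
# comments. Document fields are invented for the example.
DOCS = [
  { '_id' => 'a1', 'type' => 'article', 'title' => 'CouchDB + Ruby' },
  { '_id' => 'c1', 'type' => 'comment', 'article_id' => 'a1', 'body' => 'Nice talk' },
  { '_id' => 'a2', 'type' => 'article', 'title' => 'Merb' },
  { '_id' => 'c2', 'type' => 'comment', 'article_id' => 'a1', 'body' => 'What about joins?' }
].freeze

# Map step: an article sorts first (0), its comments after it (1).
def map_doc(doc)
  case doc['type']
  when 'article' then [[[doc['_id'], 0], doc]]
  when 'comment' then [[[doc['article_id'], 1], doc]]
  else []
  end
end

# Build the sorted "index" once; ties are broken by document id.
def build_view(docs)
  docs.flat_map { |doc| map_doc(doc) }
      .sort_by { |key, doc| key + [doc['_id']] }
end

# Query: all rows whose key starts with the given article id.
def article_with_comments(view, article_id)
  view.select { |key, _doc| key.first == article_id }
      .map { |_key, doc| doc }
end

view = build_view(DOCS)
article, *comments = article_with_comments(view, 'a1')
puts article['title']
comments.each { |comment| puts "  - #{comment['body']}" }
```

Because each comment is its own document, adding one never rewrites the article, which is what avoids the conflicts described in the answer.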