Steve Conover's blog
XSS #1: There's a huge cross-site scripting hole if you put user input into a meta refresh tag. Its content attribute takes a URL, and a "data:" (or "javascript:") URL there can carry arbitrary JavaScript.
XSS #2: Cross-site scripting resources, from an internal mailing list:
"I've gained a new appreciation for the importance of carefully thinking through security and escaping in RoR: there's more than just h()'ing all your user-entered data."
XSS vulnerabilities - http://ha.ckers.org/xss.html.
Very useful catalog of different XSS vectors. Includes some utilities to base64-, URL- and hex-encode attacks so you can test out your apps.
General OWASP wiki - http://www.owasp.org/index.php/Main_Page. Lots of useful information here. OWASP is a nonprofit group chartered to improve the security of webapps in general.
Security Guide for RoR - http://www.lulu.com/product/download/owasp-ruby-on-rails-security-guide/4489819 - general guidelines/things to think about for securing RoR apps.
Loofah - http://github.com/flavorjones/loofah is supported by a fellow Pivot and provides fast and good sanitization built on Nokogiri, albeit slightly slower on short strings than brittle regular expressions. It's in production at several companies.
"Loofah excels at HTML sanitization (XSS prevention). It includes some nice HTML sanitizers, which are based on HTML5lib’s whitelist, so it most likely won’t make your codes less secure."
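To illustrate the point in the quote above that h() alone is not enough, here is a minimal Ruby sketch. It assumes a user-supplied URL is being dropped into a meta refresh tag. The helper names (safe_redirect_url, meta_refresh_tag) and the http/https allow-list are made up for this example, and ERB::Util.html_escape stands in for Rails' h(). HTML-escaping leaves a "javascript:" URL untouched, so the scheme has to be checked separately:

```ruby
require "erb"
require "uri"

# Hypothetical allow-list for this sketch; adjust to what your app really needs.
ALLOWED_SCHEMES = %w[http https].freeze

# Returns the URL only if it parses and uses an allowed scheme, else the fallback.
def safe_redirect_url(user_url, fallback = "/")
  uri = URI.parse(user_url.to_s.strip)
  ALLOWED_SCHEMES.include?(uri.scheme.to_s.downcase) ? uri.to_s : fallback
rescue URI::InvalidURIError
  fallback
end

# Builds the tag: scheme check first, then HTML-escape for the attribute context.
def meta_refresh_tag(user_url)
  url = ERB::Util.html_escape(safe_redirect_url(user_url))
  %(<meta http-equiv="refresh" content="0;url=#{url}">)
end

# Escaping alone does nothing to this payload: there is nothing in it to escape.
puts ERB::Util.html_escape("javascript:alert(1)")
puts meta_refresh_tag("javascript:alert(1)")
puts meta_refresh_tag("http://example.com/a?b=1&c=2")
```

This only covers one context (a URL inside an attribute). For sanitizing user-entered HTML, a library such as Loofah is the better tool.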
Happy New Year
We've pooled some Pivot shared tech news feeds and made this feedburner feed:
http://feeds.feedburner.com/pivotal-news-network
The content is in the spirit of Blabs, so we hope readers here might find it to be useful. See what you think.
by Russell Edens. He has a great take on why Erector is interesting, complete with code examples:
With erector [views] are first class plain old ruby objects. Why is this good? It gives you all the tools of inheritance and mixin's for your views. That is cool. Especially for an application with multiple views of the same underlying models. You can refactor your views into base classes that derive and render the same data in different ways. This is object oriented design for views. Nice.
I've seen object oriented view code in other languages and it leads to some very powerful re-use that all OO programmers can understand. The most ambitious of these attempts was by an HR company ...[that] created their own markup language that was object oriented. The nature of HR data is that it has very complicated rules regarding who can see what data and when. The OO design of the language allowed that to be abstracted to the base classes and a functional programmer simply focused on the problem at hand. They took it further, as all commercial enterprise applications do, and they allowed the customer to define new models and views. Those views were very easy to write with this advanced data access logic abstracted out. Their customers loved it. They wrote very advanced business applications on top of this abstraction.
Views as simple classes, methods, and objects in Ruby - perfect!
Erector Hello World:
class Hello < Erector::Widget
  def content
    html do
      head do
        title "Hello"
      end
      body do
        text "Hello, "
        b "world!"
      end
    end
  end
end
For more see the Erector user guide.
We made a code change and deployed to demo, and all of a sudden some of our Ruby processes were eating a ton of CPU against our full dataset.
In the Java world you can send a SIGQUIT to any running Java process and get a thread dump. Go ahead, run a Java process and kill -3 it.
You can get this in the Ruby world by using xray:
sudo gem install xray
Drop this into your code:
require "xray/thread_dump_signal_handler"
Now:
kill -3 <ruby pid>
Look in the log file where you're sending stdout. You'll see something like:
=============== XRay - Done ===============
/usr/lib64/ruby/gems/1.8/gems/eventmachine-0.12.0/lib/eventmachine.rb:224:in `call'
_ /usr/lib64/ruby/gems/1.8/gems/eventmachine-0.12.0/lib/eventmachine.rb:224:in `run_machine'
_ /usr/lib64/ruby/gems/1.8/gems/eventmachine-0.12.0/lib/eventmachine.rb:224:in `run'
_ /usr/lib64/ruby/gems/1.8/gems/thin-1.0.0/lib/thin/backends/base.rb:57:in `start'
_ /usr/lib64/ruby/gems/1.8/gems/thin-1.0.0/lib/thin/server.rb:150:in `start'
_ /usr/lib64/ruby/gems/1.8/gems/thin-1.0.0/lib/thin/controllers/controller.rb:80:in `start'
_ /usr/lib64/ruby/gems/1.8/gems/thin-1.0.0/lib/thin/runner.rb:173:in `send'
_ /usr/lib64/ruby/gems/1.8/gems/thin-1.0.0/lib/thin/runner.rb:173:in `run_command'
_ /usr/lib64/ruby/gems/1.8/gems/thin-1.0.0/lib/thin/runner.rb:139:in `run!'
_ /usr/lib64/ruby/gems/1.8/gems/thin-1.0.0/bin/thin:6
_ /usr/bin/thin:19:in `load'
_ /usr/bin/thin:19
(that's thin patiently waiting to service the next request)
A one-line code drop-in results in a powerful new inspection tool. Pretty neat.
For bonus points:
ps ax | grep "thin server" | grep -v grep | awk '{print $1}' | xargs kill -3
For more bonus points, stick this in a capistrano task and grab the thread dumps from your logs, and you'll have a cluster-wide snapshotting tool.
We kill -3'd our CPU-eating thins and discovered a directory scan problem introduced by a recent code change - totally obvious from the thread dump. Now we're nailing it down with a failing perf unit test and fixing the problem.
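For readers curious what a SIGQUIT thread dump amounts to, here is a minimal plain-Ruby sketch. It is not xray's implementation and its output does not match xray's format. The method names (thread_dump, install_thread_dump_handler) are made up for this example, and it relies on Thread#backtrace, which may not be available on older Ruby 1.8 interpreters:

```ruby
# Writes the current backtrace of every live thread to the given IO.
def thread_dump(io = $stderr)
  Thread.list.each do |thread|
    io.puts "Thread #{thread.object_id} (#{thread.status})"
    # backtrace is nil for a dead thread, so guard against that.
    (thread.backtrace || []).each { |frame| io.puts "  #{frame}" }
  end
end

# Installs the dump as the handler for SIGQUIT (what `kill -3 <pid>` sends).
def install_thread_dump_handler(signal = "QUIT", io = $stderr)
  Signal.trap(signal) { thread_dump(io) }
end

thread_dump($stdout)
```

Call install_thread_dump_handler once at boot, then kill -3 the process and look wherever stderr is being logged.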
I'm here as part of the Best Buy Remix crew, hanging out in Mashery's Circus Mashimus all weekend. Come by, have a beer, and check out Remix and other interesting API stuff if you're at SXSW.
We're in a room near the front of the convention center, not far from the Pepsi booth.
-Steve
By Steve Conover and Brian Takita
Peer-to-Patent, one of Pivotal Labs' clients, got Slashdotted last week, and we had no trouble handling the load. The site was just as responsive as it always is, and we didn't come close to having a scale problem.
Moral of the story: the technology for serving static web pages is old, boring, and extremely scalable. If you have the type of site that can be page-cached, do so aggressively, starting with the front page and any pages likely to be linked to. We got a huge payoff for the engineering time that we invested in our page-caching strategy.
Highlights:
- We moved away from Rails page-caching and developed our own "holeless cache", which uses a symlink trick (see below) to instantly and "holelessly" switch to a new version of a cached page. (The cache "hole" is the time between the expiration or purge of a cached page and the time when it's regenerated. The danger is that in that time your Mongrels can be saturated with requests - something we proved to ourselves could easily happen.)
Here's our symlink trick, using the front page as an example:
1. Have index.html point to index.html.current
2. If index.html.current is >= 20 minutes old:
   - Copy index.html.current to index.html.old
   - Point index.html to index.html.old
   - Rewrite index.html.current by asking Rails for the page (using the process method)
   - Repoint index.html back at index.html.current
3. Repeat step 2 every minute using a cron job.
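The steps above can be sketched in plain Ruby roughly as follows. This is an illustration under assumptions, not our production code: the method names (point_symlink, refresh_holeless) are made up, and the block passed in stands in for "asking Rails for the page". The symlink is swapped by creating a temporary link and renaming it over the old one, so there is never a moment when index.html is missing:

```ruby
require "fileutils"

MAX_AGE_SECONDS = 20 * 60

# Repoints `link` at `target` (same directory) without a gap:
# build a temporary symlink, then rename it over the existing one.
def point_symlink(link, target)
  tmp = "#{link}.tmp#{Process.pid}"
  FileUtils.rm_f(tmp)
  File.symlink(File.basename(target), tmp)
  File.rename(tmp, link)
end

# Regenerates the cached page if it is stale. The block must return the new HTML.
# Returns true if the page was regenerated, false if it was still fresh.
def refresh_holeless(dir, name = "index.html", max_age: MAX_AGE_SECONDS, now: Time.now)
  link    = File.join(dir, name)
  current = "#{link}.current"
  old     = "#{link}.old"
  return false if File.exist?(current) && now - File.mtime(current) < max_age

  if File.exist?(current)
    FileUtils.cp(current, old)   # keep serving the previous version...
    point_symlink(link, old)     # ...while the new one is being written
  end
  File.write(current, yield)     # the slow part: render the page
  point_symlink(link, current)
  true
end
```

A cron job (or any scheduler) would call refresh_holeless once a minute with a block that renders the page.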
For cache expiration that's model-based, we make a call from the model observer class to our holeless cache routine, instead of using Rails cache sweepers. So, instead of just deleting the cached page we regenerate it in place.
It was important to write tests that proved that the HTML we generated for cached pages looked exactly the same in different "modes" (user logged in vs not, for example). This forced us to push modal decision logic out of Markaby templates and into JavaScript, meaning that view-oriented RSpec tests asserting modal differences became useless. We rewrote them as Selenium tests.
Performance/load testing: we tried several tools and approaches and found that a simple Ruby script that launches wget requests (that write to /dev/null) in many separate threads worked best for us.
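A script along those lines might look like the sketch below. It is a reconstruction of the idea, not the script we actually ran: the method name load_test, its parameters, and the default thread and request counts are assumptions. It shells out to wget with -q (quiet) and -O /dev/null (discard the body), and the command is a parameter so another client can be swapped in:

```ruby
require "benchmark"

# Default client: quiet wget that throws the response body away.
WGET = ["wget", "-q", "-O", "/dev/null"].freeze

# Fires threads * requests_per_thread requests at url and summarizes the results.
def load_test(url, threads: 10, requests_per_thread: 10, command: WGET)
  results = Queue.new
  workers = Array.new(threads) do
    Thread.new do
      requests_per_thread.times do
        ok = false
        seconds = Benchmark.realtime { ok = system(*command, url) }
        results << [ok == true, seconds]   # system returns nil if the command is missing
      end
    end
  end
  workers.each(&:join)

  samples = []
  samples << results.pop until results.empty?
  total = samples.sum { |_, seconds| seconds }
  { requests: samples.size,
    failures: samples.count { |ok, _| !ok },
    average_seconds: samples.empty? ? 0.0 : total / samples.size }
end
```

Usage would be something like load_test("http://localhost:3000/", threads: 50), run against a staging box rather than production.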
We send down exactly one .js and one .css file. If you are sending down more than one of each of these to the browser, you have a performance problem. Fix it with asset packager.
Update: one clarification about the cron job: we deploy this "automatically" using capistrano.







