I just released cinabox. It is intended to be The Simplest Thing That Could Possibly Work to set up a Continuous Integration (CI) server, using cruisecontrolrb (CCRB).
In addition to being a (hopefully) useful tool to help people easily set up CI systems for various platforms and languages, it is also an experiment in simplicity and minimalism:
- The project consists of only two simple scripts: one shell script to bootstrap Ruby, and one Ruby script to set up cruisecontrolrb.
- In the scripts, readability and simplicity are favored over clever abstractions and DRYness. Hopefully, even people who don’t know shell scripting or Ruby can read the scripts and easily understand the commands they execute.
- A standardized environment is assumed: a dedicated Ubuntu 8.04 system, Ruby 1.8.6, and the latest dependencies via aptitude. PCs and virtual machines are cheap, and Linux and CCRB are free. There’s really no reason you shouldn’t be able to run a dedicated CI box. If this environment doesn’t work for you for some reason, the scripts should be self-explanatory enough that you can easily hack them up to work for your environment (and contribute your version back to the project!).
- I use the magic fairy dust of GitHub to eliminate build scripts, release scripts, packaging, versions, and pretty much all the regular boring overhead of a project. The README.txt is my only documentation. The GitHub “Download Tarball” link automatically provides packaging and uniquely-named packages (by the git hex commit id) for each “release”.
I’m pretty pleased with how this turned out. I hope it will lower the barrier for people to start trying out Continuous Integration, as well as provoke some thought about simplicity and minimalism. I’ve tried it out on a few flavors of Ubuntu VMs and my personal box, and it works for me. Please let me know what you think, and feel free to offer any suggestions for improvement.
Ask for Help
“We are getting 504 Gateway errors, and we think it is because our mongrels are freezing up due to an inability to allocate memory. What should we do?”
Without more info on the problem, a few possibilities were suggested, such as the OS swap thrashing, or the OS having no more memory to allocate.
One suggestion was to cut your swap space down to 0 in an attempt to verify that your mongrels are asking for too much, basically removing OS swapping to disk from the equation.
Another suggestion was to boost your swap up to some insanely large size, also to take it out of the equation. The theory: we know mongrel can leak memory, we trust the OS to keep actively used memory in RAM, and we have plenty of disk space, so why put your OS in the position of not granting a mongrel what it is asking for?
Neither solution seems ideal, but, whatever, we are pragmatists. If you combine either one with periodic monitoring of the system using top/ps/vmstat, at least your mongrels can keep running, and this may buy you time to figure out why mongrel is so memory hungry.
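That periodic monitoring is easy to script. A rough sketch of the kind of check you could run from cron (the helper name is ours, and it assumes `ps` output with pid and RSS in KB as the first two columns, e.g. from `ps -eo pid,rss,args`):

```ruby
# Hypothetical helper: scan ps output for mongrel processes whose
# resident set size (RSS, in KB) exceeds a threshold.
def oversized_mongrels(ps_output, limit_kb)
  ps_output.lines.
    select { |line| line.include?("mongrel") }.
    map    { |line| pid, rss = line.split; [pid.to_i, rss.to_i] }.
    select { |_pid, rss| rss > limit_kb }
end

sample = <<PS
1234 512000 mongrel_rails start -p 3000
1235  98304 mongrel_rails start -p 3001
PS
oversized_mongrels(sample, 200_000)  # => [[1234, 512000]]
```

From there you can log, alert, or restart the offending mongrel while you hunt for the underlying leak.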
- If you want better out-of-the-box error messaging, you can use one or both of the following plugins:
If you choose to use both, however, ORDER DOES MATTER (use the order specified above); otherwise the validates_associated one just doesn’t seem to work.
- Hash iteration is very expensive (this includes my_hash.keys, my_hash.to_a, etc.). We think this is related to the way hashes are stored in large, sparsely populated hashtables. If you can, avoid iterating over a hash, and if you must, try using a SequencedHash (provided by the collections gem), which solves this by storing entries in both a traditional hashtable and an array, allowing for fast random access (the hashtable) as well as fast iteration (the array).
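For illustration, the core idea behind SequencedHash can be sketched in a few lines (a toy sketch, not the collections gem’s actual implementation):

```ruby
# Toy sketch of the SequencedHash idea: a Hash for O(1) lookup plus an
# Array of keys for fast, ordered iteration.
class SequencedHashSketch
  def initialize
    @hash = {}
    @keys = []
  end

  def []=(key, value)
    @keys << key unless @hash.key?(key)
    @hash[key] = value
  end

  def [](key)
    @hash[key]
  end

  def each
    # iterate the keys array, never the hashtable itself
    @keys.each { |k| yield k, @hash[k] }
  end
end
```

The trade-off is a little extra memory and bookkeeping on insert, in exchange for iteration that never has to walk empty hash buckets.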
Ask for Help
“We want to load a different set of libraries for our selenium test than our regular tests. We tried to create a ‘selenium’ environment and pass that to the rake:test task but that didn’t work, anyone know why?”
You cannot run the rake test tasks in a non-‘test’ environment, because the ‘test’ environment is hard-coded into the test task; passing a different RAILS_ENV seems to have only the effect of telling the ‘test’ environment which database to base its schema on.
Proposed workaround: pass a second environment variable, e.g. selenium=true, and switch on that. (It’s not ideal, so we are still open to better solutions.)
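A sketch of that workaround in a test helper (the SELENIUM variable name and the required file names are ours, for illustration):

```ruby
# Branch the test helper on a custom environment variable, since the
# Rails test tasks always force the 'test' environment. Invoke with e.g.:
#   rake test SELENIUM=true
def selenium_mode?
  ENV["SELENIUM"] == "true"
end

if selenium_mode?
  # require "selenium_helper"  # selenium-specific libraries (hypothetical)
else
  # require "test_helper"      # the regular test stack (hypothetical)
end
```

The branch keeps RAILS_ENV as ‘test’ (so schema handling still works) while letting you load a different library set.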
- If you have a “target” method on your model, things will get a bit weird when you try to access this method through an association. Since associations have their own “target” method, you actually need to call association.target.target, or, probably better, don’t create methods called “target”.
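The collision looks something like this (a minimal, hypothetical proxy standing in for the real association proxy):

```ruby
# Minimal stand-in for Rails' association proxy: the proxy's own
# `target` accessor (which returns the wrapped record) shadows any
# model method of the same name.
class Widget
  def target
    "the model's own target"
  end
end

class ProxySketch
  def initialize(record)
    @record = record
  end

  def target               # the proxy's accessor wins
    @record
  end

  def method_missing(name, *args, &block)
    @record.send(name, *args, &block)
  end

  def respond_to_missing?(name, include_private = false)
    @record.respond_to?(name, include_private) || super
  end
end

proxy = ProxySketch.new(Widget.new)
proxy.target          # => the Widget instance, not the string you wanted
proxy.target.target   # => "the model's own target"
```

Any method the proxy defines itself never reaches method_missing, which is exactly why the model’s “target” gets shadowed.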
- Since Time.now always returns the time in the local time zone, if you use it in your fixtures but then run your app under a different time zone, the times in your fixtures will be incorrect. Use the Active Support helpers such as 0.days.ago instead, or, if you have a time zone configured in your environment, you can use Time.zone.now.
Ask for Help
“How can I test the route helpers in RSpec? If I’m passing a complex set of options to a helper I’d like to test that it’s giving me what I expect.”
Nobody had any serious suggestions, although many humorous testing scenarios were mentioned.
- When using time zones in Rails 2.1, if you specify a zone, any datetime ActiveRecord attributes will be returned in that zone. E.g. if you specify Eastern Time and later request changed_at from an ActiveRecord object, it will be returned in ET. However, Time.now is always returned in the local time zone, regardless of TZ settings. This isn’t necessarily bad or unexpected behavior, but it can lead to test failures if you save a time to an ActiveRecord object, get it back, and then compare the values. One workaround is to use Time.zone.now, which always respects the current time zone, although this doesn’t help with large existing codebases.
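A plain-Ruby illustration of why such comparisons bite (no Rails required; the epoch value is arbitrary):

```ruby
# Two Time objects for the same instant compare equal, but their
# wall-clock values differ across zones, so attribute- or string-level
# comparisons in tests can fail even when the instant is correct.
t   = Time.at(1_220_000_000)   # a fixed instant, as epoch seconds
est = t.getlocal("-04:00")     # the same instant, Eastern daylight time
utc = t.getutc

est == utc            # => true: same instant
est.hour == utc.hour  # => false: wall clocks differ (4 vs. 8)
```

Equality on Time compares instants, but anything that inspects hours, dates, or string forms sees the zone, which is where fixture comparisons go wrong.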
Ask for Help
“We’re using FlexMock for some of our Test::Unit unit tests, and recently added some new tests; nothing that exercises new parts of the code or creates new mocks. However, for some reason, when we call previously existing mocks we get errors from RSpec. These are not exceptions or assertion failures, but full-stop errors, as if there were a syntax error. It turns out RSpec reopens the Test::Unit::TestCase class and overwrites some behavior, although the cause of the errors remains entirely unclear. Anyone know why it would do that, and how to prevent it in the future?”
A few people mumbled about RSpec magic, but actual help was not immediately forthcoming.
I’ve been passionate about Extreme Programming and Agile software development practices since first hearing Kent Beck speak back in 2002. But it took five years, and finding a job where I was expected by management to be Agile every day (TDD, pairing, etc.), before I was able to actually call myself an Agile engineer.
I’m sure that there are more of you out there who want to be more Agile. And you want more Agile engineers so that you’ll have reasonable people to work with and learn from.
In the past 18 months I’ve picked up a lot of effective small practices that you won’t find in the White Book. Things like making sure you pick the next story no matter your comfort level. Or fixing a red CI build ASAP.
In order to spread the knowledge, I’m working on a presentation about the ‘practices that work’ to share with potential Agile Engineers at a future software conference near you.
So what day-to-day practices make you more Agile?
I just had a discussion with a co-Pivot about the resentment that many teams develop about Continuous Integration – especially when the release process requires a green tag from CI, and a broken build is standing in the way.
As anyone who has worked with me will attest, I’m hardcore on CI and consider any team which leaves a build red for longer than a workday to be sorely lacking in discipline.
OK, OK, there are always extenuating circumstances, but I still believe that most resentment of CI stems from underlying antipatterns and smells, rather than problems with CI itself. For example:
- “The Customer Has To See This Feature RIGHT NOW”: Frequent releases are a great thing, but if you cannot wait for a green build to deploy, you have some deeper problems. Often, this is because a team doesn’t manage customer expectations well. The customer should understand that CI is a critical part of the Agile process which ensures that only reliable, quality releases get pushed to staging or production. Any problem which is preventing a green build should be fixed before the release is deployed. If the customer is not willing to allow you that time and flexibility, perhaps they are too addicted to new features, and the entire team needs to have a heart-to-heart about Code Debt in the next retrospective.
- “It Works For Me, But Fails On CI”: The important question is which environment is more like production: your development environment or CI? If you are developing on Windows or Mac, and your production box is some other flavor of *nix, then your CI box should be as close as possible to production. Ideally, you should be able to log on to the CI instance and debug the failing test there. Usually, your CI box is simply not configured correctly. If it is hard to keep your CI environment in sync with production, then perhaps you should look into automation (because you KNOW you or your sysadmin will probably forget to do the same thing when you push to production, right?). If the problem is that your development environment is not the same as production, and it is a legitimate problem, then CI just saved you some stress on the next deploy.
- “Intermittent” Failures: Same deal as the prior point. CI runs your tests much more than you do. For web apps, it hopefully runs them in more browsers than you do. In my experience, many “intermittent” bugs are real bugs which are just very hard to isolate. It could be an AJAX bug that only happens when the site is run remotely, not via localhost. It could be a performance problem which only shows up on a slower system, not your fastest-on-the-market dev box. It could be a dependency on an external resource that happens to be unavailable sometimes, such as a web service, remote storage, etc. Again, just being aware of these issues puts you ahead of the game. For browser bugs, dig in and find out WHY it is failing intermittently. It may be a real bug. For intermittent outages of external resources, you may just have to live with it, but you don’t have to live with the intermittent failures in CI. Mock out the resource or disable the tests in the CI environment. Yes, this is OK, especially if you leave them enabled in your development environment. Another option is to automatically repeat these tests a few times with a delay, and only fail the entire build if they fail repeatedly. Big services like Amazon or Google might drop a request occasionally, but still respond to a subsequent request.
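The repeat-with-delay idea can be as simple as a helper like this (the name and defaults are ours):

```ruby
# Re-run a flaky block a few times before letting the failure count
# against the build; only repeated failure propagates.
def with_retries(attempts = 3, delay = 1)
  tries = 0
  begin
    tries += 1
    yield
  rescue StandardError
    if tries < attempts
      sleep delay
      retry
    end
    raise
  end
end
```

Wrap only the calls that hit external resources, and keep the retry count low so genuine regressions still fail fast.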
- Slow Test Suites: This is an insidious problem, because once your suite is slow, it is often a monumental effort to make it fast again. It is much better to be proactive, and monitor any slow-running tests like a hawk, relentlessly mocking out slow resources or replacing broad functional tests with faster, more targeted unit tests. You can also always split your tests into different suites, running your fastest tests continuously, and the entire slow deploy-test suite only nightly or periodically. As long as your customer isn’t addicted to immediate features, it should be fine to only deploy from nightly builds.
- The Failing Test That “Doesn’t Matter”: This is my pet peeve. Whenever I break CI, I fix it ASAP. If I ignore a “minor” broken test, the next thing I check in may be a major FUBAR which gets past my local tests for some reason (see prior points). Some who know me might even say it is LIKELY to be a major FUBAR. The point is, I don’t trust myself or my local box; I trust CI. Now, if ANOTHER developer breaks the build and tries to tell me they are not going to fix it because it is a “minor” problem, that really chaps my hide. They are ripping huge holes in my nice safety net, forcing me to expend much more time and attention on the tests I run in my local environment, and causing me more stress and work in general. Stop making excuses, and fix the damn build NOW, or comment out the failing test.
Now, I’m sure that all of the above points can be debated or shown to be inapplicable in a specific situation. Plus, if you are dealing with imperfect CI and development tools (which is always the case), you will have some degree of pain which is directly attributable to CI. It would be great to hear about some of these situations in the comments.
Bottom Line: Integration is always one of the most painful parts of software development. Doing integration with high quality and low risk is even harder. Most developers who have been on a non-Agile project of any significant size have experienced days-long integration hell and ulcer-inducing all-night production deployments. Continuous Integration doesn’t make that pain and stress go away, but it does break it down into small, bite-sized pieces that can be easily handled on a daily basis. All for the low, low cost of being proactive and disciplined, which makes you a better developer anyway.
Ask for Help
It was suggested that perhaps this is a timing issue. Maybe some required JS for the form hasn’t loaded before Selenium is trying the event.
One workaround would be to test only the form submission called by on-click instead of the click itself.
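Another generic way to handle the timing is to poll for readiness before firing the event. A hand-rolled sketch (Selenium RC’s own waitForCondition, if your client exposes it, does a similar job):

```ruby
# Poll a readiness check until it returns true, or raise on timeout.
def wait_until(timeout = 10, interval = 0.1)
  deadline = Time.now + timeout
  until yield
    raise "condition not met within #{timeout}s" if Time.now > deadline
    sleep interval
  end
  true
end
```

You would call it with a block that checks whatever the form’s JS sets up, e.g. `wait_until { page_ready? }`, before triggering the click (`page_ready?` being whatever readiness check your app provides).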
- If you need to add inflections in Rails, make sure you do it before your initializer block in environment.rb. If you put it after the initializer, it will cause your routes to be re-parsed, which will slow down app/test startup significantly. For example:
ActiveSupport::Inflector.inflections do |inflect|
  inflect.irregular "criterion", "criteria"
  inflect.irregular "Criterion", "Criteria"
end

Rails::Initializer.run do |config|
  # ...
end
There is some strangeness with Rails’ handling of invalid dates outside of the range for a Time object, for example February 31, 1900. This involves Rails turning it into a Time object, and then back into a Date.
Multiple people have found an interesting edge-case bug in MySQL which results in inconsistent ordering. It has been reproduced in versions 5.1a and 5.1b, but we haven’t navigated the MySQL bug system yet to find/report it. Here are the query conditions to reproduce it, which could be common in test scenarios for some projects:
* LIMIT <= 5
* ORDER BY id DESC
* > 5 rows in table
* Where clause has index (not compound with id)
We had an excellent tech talk on Vertebra from Ezra Zygmuntowicz and the folks at EngineYard. If you’ve ever been a sysadmin responsible for many boxes, you’ll appreciate the awesome potential of Vertebra…
NetBeans and symlinks: a team was having problems with symlinks in SVN disappearing after a client edited the project in NetBeans. No, he was not using Windows…
Here are some tips for working with acts_as_paranoid (We have helpers for some of these in our common libraries):
Add :scope for validates_uniqueness_of
validates_uniqueness_of :user_id, :scope => :deleted_at
Add :conditions for named_scope with :joins
named_scope :not_deleted,  # the scope name here is illustrative
  :select => "users.*",
  :joins => :paranoid_relation,
  :conditions => 'paranoid_relations.deleted_at IS NULL'
Ask for Help