Interesting Things
- Hpricot versions 0.4 and 0.5.x are giving us segmentation faults. Why, Why?
- Rails gotcha: Rails’ automatically generated finder methods will not work properly if you have a class name with the word ‘And’ in it. For example, a RentalCarCompany might have many CarsAndTruck. You will have problems if you try to call RentalCarCompany.find_by_cars_and_truck_ids because the _and_ is reserved for methods such as find_by_name_and_address.
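A rough sketch of why the finder name above is ambiguous, assuming the dynamic finder splits the part after find_by_ on _and_ to get attribute names. The helper finder_attribute_names is hypothetical and only mimics that splitting in plain Ruby; it is not Rails' own code.

```ruby
# Hypothetical helper (not part of Rails): mimics splitting a dynamic
# finder name into attribute names on the "_and_" separator.
def finder_attribute_names(method_name)
  match = /\Afind_by_(\w+)\z/.match(method_name.to_s)
  return nil unless match # not a find_by_* style name

  match[1].split("_and_")
end

# The intended use: two real attributes joined by _and_.
p finder_attribute_names(:find_by_name_and_address)
# The gotcha: one attribute whose own name contains _and_ is split in two.
p finder_attribute_names(:find_by_cars_and_truck_ids)
```

The second call yields "cars" and "truck_ids" rather than the single attribute cars_and_truck_ids, so the finder would look for two attributes that do not exist.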
I’ve had a problem with segmentation faults in Hpricot before, although I cannot remember which version. I also forget exactly how I determined it (I think I gdb’ed the core file), but it was due to a buffer overflow on a char[1024] array that wasn’t big enough to handle ASP.NET session tracking strings (which is done with fields). The strings were far larger than 1024 bytes.
So you might want to check whether there are any very long quoted values in the HTML you’re parsing. If so, it may be the same problem. I was under the impression Why knew about this and fixed it in the latest release, but I could be mistaken. I decided against using the library because of memory/CPU usage and fell back on Ruby’s String#scan.
I just found the ticket for it: http://code.whytheluckystiff.net/hpricot/ticket/13. It looks like it’s slated for the 0.6 release.
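One way to run the check suggested above, as a minimal sketch using String#scan. The helper name long_quoted_values, the 1024-byte threshold (taken from the buffer size mentioned in the comment), and the __VIEWSTATE sample field are all assumptions for illustration; the regex only looks at simple quoted attribute values and is not a real HTML parser. It uses String#bytesize, which needs Ruby 1.8.7 or later.

```ruby
# Assumed threshold: the 1024-byte buffer size mentioned in the comment.
LIMIT = 1024

# Hypothetical helper: returns every single- or double-quoted attribute
# value in the HTML string that is longer than `limit` bytes.
def long_quoted_values(html, limit = LIMIT)
  html.scan(/=\s*(?:"([^"]*)"|'([^']*)')/) # pairs of [double, single] captures
      .flatten
      .compact                              # drop the capture that did not match
      .select { |value| value.bytesize > limit }
end

# Sample input with one oversized hidden field (field name is illustrative).
html = %(<input type="hidden" name="__VIEWSTATE" value="#{'x' * 2000}" />)

long_quoted_values(html).each do |value|
  puts "Found a quoted value of #{value.bytesize} bytes"
end
```

If this reports anything on a page that crashes the parser, the oversized value is a reasonable first suspect.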
December 12, 2007 at 11:50 pm