Ilya’s slides are already on the web.
A few random notes:
- In 1994-1995, plain term frequency was the state of the art in search engine relevancy.
- State of the art today = TF-IDF = Term Frequency – Inverse Document Frequency
- gratr (http://rubyforge.org/projects/gratr/), a graph theory gem – gets slow after about 1,000 nodes but can manage about a million.
- Doing math in Ruby itself is not the best idea. Use GSL (the GNU Scientific Library) through one of the Ruby binding gems.
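
To make the TF-IDF note above concrete, here is a minimal pure-Ruby sketch. It is not from Ilya's slides: the toy corpus and the helper names (`DOCS`, `tf`, `idf`, `tf_idf`) are made up for illustration, and it assumes the common unsmoothed variant where IDF is log(N / number of documents containing the term). Real search engines use tuned variations of this.

```ruby
# Toy corpus (hypothetical): each document is an array of lowercase tokens.
DOCS = [
  %w[ruby is a dynamic language],
  %w[ruby gems extend ruby],
  %w[search engines rank documents]
].freeze

# Term frequency: occurrences of the term divided by the document length.
def tf(term, doc)
  doc.count(term).to_f / doc.size
end

# Inverse document frequency: log(N / number of documents containing the term).
# Returns 0.0 when no document contains the term, to avoid dividing by zero.
def idf(term, docs)
  containing = docs.count { |doc| doc.include?(term) }
  return 0.0 if containing.zero?

  Math.log(docs.size.to_f / containing)
end

# TF-IDF: a term scores high when it is frequent in this document
# but rare across the corpus as a whole.
def tf_idf(term, doc, docs)
  tf(term, doc) * idf(term, docs)
end

DOCS.each_with_index do |doc, i|
  puts "doc #{i}: ruby=#{tf_idf('ruby', doc, DOCS).round(4)}"
end
```

The second document mentions "ruby" twice in four words, so it outscores the first for that term, while the third scores zero. Plain term frequency would rank documents the same way for a single term; the IDF factor matters once a query mixes common and rare terms.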