Trampoline Systems


Content

Machines

Ideas, thoughts and observations from Trampoline's technical brains

Archive for the ‘Ruby’ Category

mccraig

ActiveRecord-JDBC plugin for working with MySQL master-slave configurations

By craig mcmillan on March 20th, 2009

here’s a little plugin for ActiveRecord-JDBC which enables simple use of MySQL master-slave configurations

active-record-jdbc-mysql-master-slave

jan

type discussion on irc (my eyez)

By Jan Berkel on February 5th, 2009

A java programmer, a scala dev and a ruby guy meet on #irc. Says the java guy to the… wait, this is not the beginning of a joke.

jan: Map<String,Map<String,String>>  argggghh
thepete: mmm, readable code, yummy
David: Map<String, Map<String, String>> is perfectly reasonable. :)
       The real problem is that Java makes you declare it twice.
jan: Map<String, Map<String, Object>> m =
       new HashMap<String, Map<String, Object>>();
mccraig: stop that !
mccraig: my eyez
David: val m = new HashMap[String, Map[String, Object]]
thepete: ze goggles;
David: or val m = new HashMap[(String, String), Object] if you prefer. :)
jan: what type will m have? Map?
David: val m : Map[(String, String), Object] = new HashMap
mccraig: m = {}
David: m["1"] = "stuff"; m[1] # Why is this returning nil??? :(
mccraig: yeah i know, but whatever
jan: MapWithIndifferentAccess
mccraig: my eyez hurt less
David: my eyez weep
mccraig: u can get eyedrops for that
David: You can get gogglez. :-)
jan: can i blog this? :)

emma

Has Many + non default primary key loads incorrect data in Rails 2.2.2

By Emma Persky on January 29th, 2009

 

I found an interesting bug in Rails 2.2.2 yesterday. I couldn’t find a similar bug on the Rails Lighthouse, so I created a new ticket. What was most interesting, though, was how quickly the Rails core team picked up the bug and assigned it to someone.

It turns out that the bug had already been fixed in the current master branch of the Rails git repo, though apparently no one had noticed its existence, because I can’t find any references to it anywhere. I guess the fix in ActiveRecord, which is almost identical to my fix below, will form part of the next release, whenever that is.

I assume this is probably also the case for other has_* relationships, but have not verified.

I have a has_many association from class Foo to class Bar, where, for this specific relationship, the primary key on Foo is not id, nor is the foreign key on Bar id.

class Foo < ActiveRecord::Base
  has_many :bars, :primary_key => 'a_non_standard_key_name', :foreign_key => 'another_non_standard_key_name'
end

The relationship is one way: I have no need to navigate from Bar back to Foo, only to call a_foo.bars.

This works fine when working with a single object, but breaks down when you want to use eager association preloading to avoid the n+1 query problem of loading bars for many foos.

When performing the following, you find that

foos = Foo.find :all, :include => :bars
foos.first.bars # => [SOMETHING_UNEXPECTED]

The reason is that ActiveRecord creates the preloading query based on the default primary key of Foo (normally id).

It queries for Bar.another_non_standard_key_name matching Foo.id, not Foo.a_non_standard_key_name.

This causes seriously unexpected behaviour, and could easily go unnoticed since no errors are thrown.
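
To make the failure mode concrete, here is a minimal plain-Ruby sketch of the lookup described above. It is not ActiveRecord's actual preloading code: the Structs, the sample rows and the preload helper are hypothetical stand-ins, and the only point is that matching on the wrong parent key quietly returns the wrong children instead of raising.

```ruby
# Hypothetical stand-ins for the two models; no database or Rails involved.
Foo = Struct.new(:id, :a_non_standard_key_name)
Bar = Struct.new(:id, :another_non_standard_key_name)

foos = [Foo.new(1, 20), Foo.new(2, 10)]
bars = [Bar.new(100, 10), Bar.new(101, 20), Bar.new(102, 1)]

# Simplified preload: group the children by foreign key once, then give each
# parent the group that matches the value of the chosen parent key.
def preload(foos, bars, parent_key)
  by_foreign_key = bars.group_by { |bar| bar.another_non_standard_key_name }
  foos.each_with_object({}) do |foo, result|
    result[foo.id] = by_foreign_key.fetch(foo.send(parent_key), []).map { |bar| bar.id }
  end
end

buggy   = preload(foos, bars, :id)                      # matches on Foo.id
correct = preload(foos, bars, :a_non_standard_key_name) # matches on the declared key

puts "buggy:   #{buggy.inspect}"
puts "correct: #{correct.inspect}"
```

Both calls succeed, which is the problem: the first simply hands back bars that belong to some other key value, or none at all.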

I have found the hook in ActiveRecord where this functionality should be included and monkey patched it for my system, because I need it now. I can’t vouch for its correctness, but we have many, many specs for our product and none of them have broken because of this.

I’m running frozen rails 2.2.2

vendor/activerecord/lib/active_record/association_preload.rb, line 221

Change

primary_key_name = reflection.through_reflection_primary_key_name

to

primary_key_name = reflection.through_reflection_primary_key_name || reflection.options[:primary_key]

Hope this helps someone!

daniel

JRuby + Clojure’s Immutable Data Structures = Easy to maintain, application data-model.

By Daniel Kwiecinski on January 22nd, 2009

Implementing an application with a rich data model that can be updated by multiple UI controls and many concurrent threads, with undo/redo functionality on top, can be cumbersome. The functional programming paradigm, with its immutable data structures, turned out to be useful for easing this task.

Because all good developers are lazy, one should look to reuse existing tools rather than reinvent them, especially when a good one already exists. I tried to follow that path. Since we are using JRuby as our language of choice here at Trampoline, I decided to look more closely at Clojure’s immutable data structures. Using Java classes from JRuby is straightforward and is described in many places on the web already (here, here & here). What I didn’t know was how to use Clojure’s objects from JRuby. Apparently Clojure’s data structures are delivered as pre-compiled Java classes, so no runtime interpretation or compilation of Clojure scripts is needed. The task turned out to be very easy.

A simple implementation of a graph data structure with no deletion functionality is as short as:

basic_graph1

 
 

To make Clojure collections look more like Ruby ones, one can define aliases for their methods:

persistent_map

 

Unfortunately (or fortunately, due to the different contract) we cannot do this with all the methods, particularly the mutating ones. That’s because the semantics of Ruby’s = (assignment operator) is to return the value being assigned, and the same applies to the []= method. So even if we redefine []=(key, val) so that the method returns the updated version of the collection, the Ruby interpreter will step in and wrap the whole method, so that it eventually returns val. Anyway, whether this is good or bad is a topic for a whole other post.
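
The []= behaviour can be seen in plain Ruby, with no Clojure involved. In this small sketch TinyPersistentMap and its assoc method are hypothetical names invented for the illustration, standing in for a persistent collection whose "mutators" return a new version.

```ruby
# Hypothetical immutable map: every "update" builds and returns a new map.
class TinyPersistentMap
  def initialize(entries = {})
    @entries = entries.dup.freeze
  end

  def [](key)
    @entries[key]
  end

  # Returns the updated copy, but index-assignment syntax throws it away.
  def []=(key, val)
    TinyPersistentMap.new(@entries.merge(key => val))
  end

  # An ordinary method name keeps its return value.
  def assoc(key, val)
    TinyPersistentMap.new(@entries.merge(key => val))
  end
end

m = TinyPersistentMap.new

from_assignment = (m[:a] = 1)          # the expression evaluates to 1, not to a map
from_send       = m.send(:[]=, :a, 1)  # calling the method explicitly returns the new map
from_assoc      = m.assoc(:a, 1)

puts from_assignment.inspect
puts from_send[:a].inspect
puts m[:a].inspect                     # the original map is untouched
```

So the updated collection is only reachable through an explicit call or a differently named method, which is why an alias for []= is of little use on an immutable structure.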

jan

rails 2.2 + jruby + jetty = win

By Jan Berkel on November 27th, 2008

In case you missed it, rails 2.2 recently got released, finally promising thread safety among some other things. Thread safety has always been neglected by the rails core team: the standard way to scale up in rails (pre 2.2) is to run multiple processes, which makes deployment a lot harder (I think there’re at least 10 different ways to deploy rails apps at the moment, and people still come up with new solutions: apache+fcgi, mongrel, mongrel_cluster, thin, phusion, rack…).

Why has thread safety become a priority all of a sudden? I suspect one of the drivers is JRuby, which is now a viable alternative to MRI Ruby, and which also has the nice property of mapping Ruby threads to native threads. Another factor might be the arrival of merb, the new kid on the ‘ruby web framework’ block. Merb has been designed with thread safety in mind, and is now starting to get a lot of attention (1.0 has just been released).

Now, with a thread safe rails, JRuby might become the platform of choice for deploying rails apps, especially given the performance progress the JRuby team is making. Having real threads does make a huge difference, reducing the memory footprint and making better use of multi-core CPUs.

There are a couple of ways to deploy a rails application in JRuby; glassfish seems to be the recommended choice at the moment. However, glassfish is anything but easily embeddable, so I tried jetty as an option. Compared to glassfish, jetty is solid and proven (version 7 will be released soon), small and easily embeddable.

I didn’t want to use warbler (no web.xml please!), instead I used a combination of JRuby-Rack + Jetty7 and tied everything together with a simple JRuby script.

 server = org.mortbay.jetty.Server.new
 thread_pool = org.mortbay.thread.QueuedThreadPool.new
 thread_pool.min_threads  = 5  # adjust as needed
 thread_pool.max_threads  = 50
 server.set_thread_pool(thread_pool)
 connector = org.mortbay.jetty.nio.SelectChannelConnector.new
 connector.port = 3000
 server.add_connector(connector)  # register the connector so the server listens on the port

 context = org.mortbay.jetty.servlet.Context.new(nil, "/",
    org.mortbay.jetty.servlet.Context::NO_SESSIONS)
 context.add_filter("org.jruby.rack.RackFilter", "/*",
    org.mortbay.jetty.Handler::DEFAULT)
 context.set_resource_base(RAILS_DIR)
 context.add_event_listener(org.jruby.rack.rails.RailsServletContextListener.new)
 context.set_init_params(java.util.HashMap.new(
    'rails.root'=> '.', 'public.root' => 'public',
    'org.mortbay.jetty.servlet.Default.relativeResourceBase' => '/public',
    'jruby.max.runtimes' => '1'))
 context.add_servlet(org.mortbay.jetty.servlet.ServletHolder.new(
     org.mortbay.jetty.servlet.DefaultServlet.new), "/")

 server.set_handler(context)
 server.start

This will run jetty on port 3000, dispatching all requests for dynamic content to a single JRuby instance. It is important to set “jruby.max.runtimes” to 1, so it’ll create a shared application runtime for you, otherwise you’ll get the old one runtime per thread model.

On the rails side you need “config.threadsafe!” in the configuration file. Autoloading of classes will then be disabled, be sure to load all your dependencies upfront in environment.rb. We haven’t actually used this in production, but some initial tests look very promising (mongrel: 23req/s, jetty: 50 req/s). Also, deployment will be a lot easier, because static and dynamic content can be served by one single process.

david

Computing connected graph components via SQL

By David MacIver on November 19th, 2008

Hi, I don’t post here much. I’m one of the devs working on SONAR, focusing mostly on theme extraction.

As with many applications, SONAR’s data crunching is basically relational database driven. We keep thinking about experimenting with graph DB based approaches, but never manage to find quite a compelling enough reason - there’s no way we’d give up the relational approach entirely, so it needs to be a really big win to be worth the annoyance of having to maintain two different types of database in sync with each other.

Unfortunately this sometimes leaves us in the unenviable position of having to do graph algorithms in SQL. This is about as much fun as you might imagine it to be. Most recent challenge: Computing the connected components of a graph in SQL.

There’s always the option of loading it into memory and doing it there of course. But the graph in question is rather large. With our little demo data sets it would be fine to do that, but any larger (e.g. on a real live SONAR deployment) and this starts to sound like a really bad idea.

It turns out this is surprisingly simple to do once you have the key insight. I couldn’t find anything on the web explaining this though, so thought I’d write a post about it in case anyone else needs to do the same. It’s not rocket science, but hopefully this will save someone some time.

Consider the following setup:

create table if not exists items(
  id int primary key,
  component_id int
);

create table if not exists links(
  first int references items (id),
  second int references items (id)
);

We consider entries in links as undirected edges in a graph and we want to update items so that all items in the same component have the same component_id and distinct components have distinct component ids.

We’ll do this incrementally and merge. In order to do this we need a new table which we’ll use as scratch space (this should be a temporary table, but MySQL has irritating restrictions on temporary tables which make this not work):

create table if not exists components_to_merge(
  component1 int,
  component2 int);

(Side note: All SQL here is tested only on MySQL. It shouldn’t be hard to make it work on any other database though).

The idea is that at each step we’ll merge components, using components_to_merge to map components to the component they’ll be merged with.

So we start with a set of candidate components. That’s simple enough: We take each node as a potential starting component.

Now at each stage we merge components by finding links between them. For every potential component we look at all other potential components it’s connected to via some link. We insert all component-component links into components_to_merge. This is straightforward enough:

insert into components_to_merge
select distinct t1.component_id c1, t2.component_id c2
from links
join items t1
on links.first = t1.id
join items t2
on links.second = t2.id
where t1.component_id != t2.component_id;

insert into components_to_merge
select component2, component1 from components_to_merge; -- ensure symmetry

So, now we have a list of groups to merge in this table. If the table is empty then we’re done - all the groups are maximal (and because of the way we constructed them they’re connected - at each point they were built by joining together two connected sets which were then connected to each other), so the component_ids currently in items describe the actual components. If not, we now reassign components:

    update items
    join (
      select component1 source, min(component2) target
      from components_to_merge
      group by source
    ) new_components
    on new_components.source = component_id
    set items.component_id = least(items.component_id, target)

This step merges each component with another one (although “merging” conveys a slightly inaccurate sense of what happens. Consider a graph 1-2-3. The first step would result in 1 and 2 being assigned component_id 1, and 3 being assigned component_id 2. So the {3} component took the place of the {2} component).

What’s the complexity of this code? Well, it’s not amazing, but it’s not terrible either. The complexity of each step in the loop is probably somewhere around O(n log(n)), depending on exactly what the database does with it. The worst-case number of queries is the size of the largest component. It’s obviously an upper bound, as the number of groups decreases by at least one each time. To see that it’s achieved, consider an extension of the 1-2-3 example where we have a graph 1-2-…-n. Then at each stage what happens is that we end up with components [1], [2], …, [n] -> [1, 2], [3], …, [n] -> [1, 2, 3], [4], …, [n], etc, taking n steps to terminate. On the other hand, a complete graph terminates in one step.

I suspect that the expected run time is O(log(n)), with each part of the component chosen approximately doubling in size each time, but I confess to not actually having bothered to run the maths: For our particular use case at the moment this is fine - it turns out the graph we’re considering is fairly sparse and tends to have small components, so for the moment this is more than fast enough. On the other hand, it would be nice to have a better guaranteed time, so if anyone has a smarter approach I’d love to hear it.

Anyway, here’s some sample code that ties all of this together: http://code.trampolinesystems.com/components.rb
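
Separately from the linked script, the loop can be simulated in memory to watch how many rounds it takes. The sketch below is a plain-Ruby stand-in rather than the SQL itself: connected_components is a hypothetical helper, ids plays the role of the items table and links the links table. It rebuilds the list of pairs from scratch on every round, which assumes components_to_merge is emptied between iterations.

```ruby
# In-memory stand-in for the SQL loop described in the post.
def connected_components(ids, links)
  # Every item starts out as its own candidate component.
  component = {}
  ids.each { |id| component[id] = id }
  rounds = 0

  loop do
    # components_to_merge: one pair per cross-component link, in both directions.
    pairs = []
    links.each do |first, second|
      c1, c2 = component[first], component[second]
      next if c1 == c2
      pairs << [c1, c2] << [c2, c1]
    end
    break if pairs.empty? # nothing left to merge, so the ids are final

    # select component1, min(component2) ... group by component1
    target = {}
    pairs.each do |source, other|
      target[source] = other if target[source].nil? || other < target[source]
    end

    # set component_id = least(component_id, target), based on the old ids
    updated = {}
    component.each do |id, c|
      updated[id] = target.key?(c) ? [c, target[c]].min : c
    end
    component = updated
    rounds += 1
  end

  [component, rounds]
end

chain, chain_rounds = connected_components([1, 2, 3, 4], [[1, 2], [2, 3], [3, 4]])
split, split_rounds = connected_components([1, 2, 3, 4, 5], [[1, 2], [4, 5]])

puts "chain: #{chain.inspect} after #{chain_rounds} merge rounds"
puts "split: #{split.inspect} after #{split_rounds} merge rounds"
```

For the chain 1-2-3-4 this takes three merge rounds before the pass that finds nothing left to merge, in line with the linear worst case; two disjoint pairs plus an isolated node settle after a single round.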

mccraig

removing global fixtures for ruby tests

By craig mcmillan on July 2nd, 2008

global fixtures are evil, but we’ve got a bunch of unit tests depending on them, so we still need them around

here’s a neat [and generally fast, though a degenerate O(#tables^2) case is possible] way of deleting all fixtures without invoking db dependent ways of ignoring foreign-key constraints, and without loading all the objects into memory :

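# collect the model class for every table, skipping tables with no matching model
classes = ActiveRecord::Base.connection.tables.map {|t|
  t.singularize.camelize.constantize rescue nil
}.compact.reject{|cls| !cls.ancestors.include?(ActiveRecord::Base)}

# keep retrying until every delete_all has succeeded : a delete which fails
# [e.g. on a foreign-key constraint] leaves its class in the list for the next pass
while classes.size > 0
  classes = classes.select{|c|
    begin
      c.delete_all
      false
    rescue
      true
    end
  }.reverse!
end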
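
to see how the retry loop behaves without a database, here is a small sketch. FakeTable is a hypothetical stand-in for a model class : its delete_all raises while any table referencing it still has rows, which is roughly what a foreign-key constraint does. the loop itself is the same as above.

```ruby
# Hypothetical stand-in for an ActiveRecord model with foreign-key constraints.
class FakeTable
  attr_reader :name, :attempts

  def initialize(name, referenced_by = [])
    @name = name
    @referenced_by = referenced_by
    @rows = 1
    @attempts = 0
  end

  def empty?
    @rows.zero?
  end

  # Fails until every table that references this one has been emptied.
  def delete_all
    @attempts += 1
    raise "foreign key violation on #{name}" unless @referenced_by.all? { |t| t.empty? }
    @rows = 0
  end
end

comments = FakeTable.new("comments")
posts    = FakeTable.new("posts", [comments])
users    = FakeTable.new("users", [posts, comments])

classes = [users, posts, comments] # worst ordering : parents first
passes = 0
while classes.size > 0
  passes += 1
  classes = classes.select{|c|
    begin
      c.delete_all
      false
    rescue
      true
    end
  }.reverse!
end

puts "emptied everything in #{passes} passes"
```

here the first pass only manages to empty comments, and the second pass clears posts and then users.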

jan

Springy 0.3 released

By Jan Berkel on August 2nd, 2007

No big changes this time, mainly compatibility fixes for JRuby 1.0. It is now also possible to build the project using Maven, for those too afraid to use rake. Documentation and code for springy are available here.

I’m also happy to announce that Craig Walls, the author of “Spring in Action”, is going to talk about Springy as part of his “Spring Cleaning: Tips for Managing XML Clutter” talk at this year’s No Fluff Just Stuff series of events as well as the Spring Experience 2007 in Florida.

jan

Java and Rails integration with GoldSpike

By Jan Berkel on July 13th, 2007

While trying to create a unified testing framework (shared between Rails and our Java backend code) I came across ActiveRecordJDBC which is an adapter to use JDBC drivers with JRuby on Rails. It works fine, although it can be a bit complicated to get a DRY database.yml configuration. The goal is to get rid of our dbunit/manually crafted database tests on the Java side by using ActiveRecord fixtures. After some research I found out about the Rails integration project (now called GoldSpike), which tries to make it easy to deploy a Rails app on a Java servlet container such as jetty. As far as I know Thoughtworks uses this approach to deploy their new product, Mingle. GoldSpike is under constant development but it is already usable, although a few patches were required. After everything was set up, a simple


$ rake war:standalone:create
Reading user configuration
Assembling web application
Adding Java library commons-pool-1.3
Adding Java library rails-integration-1.1.1
Adding Java library activation-1.1
Adding Java library mysql-connector-java-5.0.5
Adding Java library bcprov-jdk14-124
Adding Java library jruby-complete-1.0
Adding web application
Adding Ruby gem ActiveRecord-JDBC version 0.4
Creating web archive

creates a .war file which can be directly deployed to a container. I’ve never been a big fan of war files, but in the context of JRuby+Rails it makes perfect sense, because Ruby and all the required gems can be packaged up in a single file. No need to worry about missing gems, C bindings or wrong versions of the installed software, all you need to deploy is Java and jetty (which is pretty lightweight).

Not quite sure what the implications of this are, but it seems like a good alternative to mongrel/CRuby deployments. Performance-wise it looks good, too, though we haven’t done any benchmarking. GoldSpike doesn’t have a project homepage yet, but you can find it on the jruby-extras project page. Let’s wait and see what this will mean for the adoption of Rails in corporate environments (Oracle is currently looking for Rails developers to join the Enterprise 2.0 team, for example).

jan

growl-lastfm

By Jan Berkel on July 10th, 2007

We use last.fm a lot in the office - but one thing I always found annoying was that there’s no easy way to find out what’s currently playing (you need to go to the web page and hit refresh, very distracting) so I knocked up a little ruby script which uses growl to display the currently playing song.


#!/usr/bin/env ruby
require 'rubygems'
require 'ruby-growl'
require 'hpricot'
require 'cgi'
require 'open-uri'

raise "#{$0} <user>" if ARGV.empty?
user = ARGV[0]
already_notified = []

while true do
  now_listening = Hpricot(open("http://www.last.fm/user/#{user}"))/".nowListening td.subject"
  unless now_listening.empty?
    currently_played = now_listening.last.inner_text.strip
    unless already_notified.include?(currently_played)
      already_notified << currently_played
      g = Growl.new("localhost", "ruby-growl", ["ruby-growl Notification"])
      g.notify("ruby-growl Notification", "#{user} is listening to",
               CGI::unescapeHTML(currently_played))
    end
  end
  sleep 60
end

Make sure that you have hpricot and ruby-growl installed (gem install hpricot ruby-growl -y), then copy and paste the script (or download it) and start it with a last.fm username as parameter. You also need to have “Listen for incoming notifications” and “Allow remote registration” checked in growl’s preferences (System Preferences | Growl | Network).