Posted by Robby Russell 311 days ago
A few weeks ago, I moved RubyURL from Subversion to Git. During that process, I used my GitHub invite and decided to go ahead and open up the source code.
It’s currently a whopping 92 LOC with a 1:2.5 code-to-spec ratio. (I had a goal to keep it below 100 LOC.)
Feel free to grab it and help contribute. This has served almost 14 million redirects since August 2007 and is running on a Rails Boxcar.
To grab it with git, run: git clone git://github.com/robbyrussell/rubyurl.git
Feel free to submit tickets to the RubyURL ticket system.
Enjoy!
UPDATE Ryan McGeary was kind enough to be the first person to help track down a bug and submit patches. :-)
Tags:
rubyurl,
ruby_on_rails,
programming,
planet_argon,
boxcar,
git,
subversion,
github,
open,
source,
rails,
rspec
Posted by Chris 672 days ago
Kyle Shank of the RadRails team has mentioned that he and Matt are both working on a web startup. Looks like the priority of RadRails is lower for them - after all, RadRails doesn't make money.
It's a shame that this sort of thing happens, but I can't say I'm all that surprised. I've been working on RDT for nearly 4 years now and I can definitely say that people just don't pay for free things. You can beg for donations, but you shouldn't expect them. Given the amount of time and effort - and the sheer number of downloads - it just doesn't pay the bills to run an open source project that passively solicits donations. I estimate the per-user donations for RDT to be at about 1.4 cents*. And if we take out the one large donor? .00071 cents per user.
That doesn't quite cut it for rent and food, unless of course you get the entire world to use your product.
I wish Kyle and Matt well and hope that others from the community step forward and help lead the project onward.
Update: Looks like RadRails isn't dying off - it's getting new ownership.
* This estimate assumes we count RadRails users as RDT users, because RadRails contains RDT. It also uses just the raw zip downloads from Sourceforge for both projects. There is a large number of users we are not counting here who have downloaded via Eclipse's update site mechanism, and who use RDT from other distributions available.
Posted by Robby Russell 693 days ago
In this new series, Get to Know a Gem, we’re going to take a look at hpricot.
What is Hpricot?
WhyTheLuckyStiff released Hpricot in July of 2006 in an effort to bring fast HTML parsing to the masses. It’s currently unknown what prompted it, but my guess would be that Why is secretly scraping all the pages on the internet that archive the future. To speed it up, Why has written the Hpricot scanner in C, to be much faster than the other options available in Ruby.
Installation
Installation is, as with most gems, very simple.
$ sudo gem install hpricot
Password:
Need to update 23 gems from http://gems.rubyforge.org
.......................
complete
Select which gem to install for your platform (powerpc-darwin8.7.0)
1. hpricot 0.5 (ruby)
2. hpricot 0.5 (mswin32)
3. hpricot 0.4 (mswin32)
4. hpricot 0.4 (ruby)
5. Cancel installation
> 1
Great, let’s now play with it!
Usage
In this first example, we’re going to use Hpricot to parse a web page through the Open-URI library. For this, we’ll need to require a few libs.
require 'rubygems'
require 'hpricot'
require 'open-uri'
Now that we have the libraries loaded, we can create a new Hpricot object. In this example, we’ll load the PLANET ARGON About page.
# Open the PLANET ARGON about page
page = Hpricot( open( 'http://www.planetargon.com/about.html' ) )
Great, let’s have some parsing fun. Let’s search for every div with a class name of team. Hpricot will return an array of the elements that match the search, so we can ask it how many it found.
page.search( "//div[@class='team']" ).size
=> 7
Great, this is a good sign that I need to add several people to the website. :-)
If we want to peek at the first instance of this class, we can do:
page.search( "//div[@class='team']" ).first
=> {elem "\n" {elem
{elem "Robby Russell" } ", Founder & Executive Director"
} ....SNIP
You’ll notice that there is a <strong> element within the results, so we can search deeper into this tree.
page.search( "//div[@class='team']" ).first.search( "//strong" )
=> # "Robby Russell" }]>
Hpricot provides a method named inner_html, which will return the contents within the element.
page.search( "//div[@class='team']" ).first.search( "//strong" ).inner_html
=> "Robby Russell"
Let’s now iterate through each of the elements and output all of the team member names.
# search for each team member div and iterate through them
page.search( "//div[@class='team']" ).each do |team|
  puts team.search( "//strong" ).inner_html
end
Robby Russell
Allison Beckwith
Brian Ford
Nicole Fritz
Alain Bloch
Audrey Eschright
Gary Blessington
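If you’d like to try the same XPath queries without hitting the network, here is a minimal sketch. Note that it swaps in REXML, which ships with Ruby, in place of Hpricot, so the method calls differ from the ones above. The markup is a hypothetical stand-in for the About page (only two team entries, plus one unrelated div), and REXML needs it to be well-formed, which real-world HTML often is not.

```ruby
require 'rexml/document'

# Hypothetical markup standing in for the About page structure.
# It is an assumption for illustration, not the real page source.
html = <<~HTML
  <html>
    <body>
      <div class="team">
        <p><strong>Robby Russell</strong>, Founder &amp; Executive Director</p>
      </div>
      <div class="team">
        <p><strong>Allison Beckwith</strong></p>
      </div>
      <div class="sidebar">
        <p><strong>Not a team member</strong></p>
      </div>
    </body>
  </html>
HTML

doc = REXML::Document.new(html)

# Same XPath expression as in the Hpricot examples.
team_divs = REXML::XPath.match(doc, "//div[@class='team']")

# Search relative to each div (note the leading dot) for its strong element
# and collect the text inside it.
names = team_divs.map do |team|
  REXML::XPath.first(team, ".//strong").text
end

puts team_divs.size
puts names
```

The sidebar div is there to show that the class predicate really filters: only the two team divs are matched, so only their names are collected.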
So, there you have it. A quick and basic introduction to using Hpricot for parsing HTML content. Hpricot can also parse XML, and it accepts CSS selectors as well as XPath expressions. For more examples, please visit the HpricotBasics page.
Final Thoughts
I’m going to guess that Why built this for hoodwink.d, which I’ve been a regular user of for a long time. I hadn’t spent much time with the XPath syntax before, and playing around with Hpricot has given me a much better understanding of it.
As mentioned at the beginning of this post, I am going to make Get to Know a Gem a regular feature on my blog. If you know of a lesser-known gem that needs some attention, please send me a suggestion.
Until next time…