Posts from Gluttonous...
Posted by kev less than a minute ago
We’re out, searching Wikipedia and FreeBase. Proud doesn’t begin to describe my feelings about what’s been created. I think it speaks for itself, and I’ve been using it instead of Wikipedia for the last few months.
Some people even seem to really get it. And that’s an amazing high.
Huge thanks are not only due to the Powerset team (I love working with these people), but also to all of the open source projects we’re making use of. Hadoop. HBase. Thrift. Ruby. Merb. Rails. god. Mongrel. Mootools. RabbitMQ. The ActiveMessaging Project. Memcache and MemcacheDB. Erlang. Fuzed. YAWS. Countless others. All those people who take the time to answer our questions, and respond to our bugs, and consider our patches, and write interesting articles, and make our code better. You guys rock, and we couldn’t do it without you. So much love is heading in your direction right now. Thank you, thank you, thank you.
I’m exhausted, and going to bed. Good night, and good luck. And try the app. I’m stoked.

Posted by kev 18 days ago
With apologies to Gruber.
Original text from TechCrunch.
It doesn’t really matter if Twitter’s Chief Architect Blaine Cook was fired
or resigned. The important thing is that he’s gone now, and this gives
Twitter the opportunity to hire someone (or a team) who may actually be able
to scale the nearly two year old service and keep it live.
I haven’t done any research or spoken to anyone about it, so don’t ask, but I have found a scapegoat.
Cook was directly responsible for scaling Twitter, and he very much failed
in his job.
Twitter is down sometimes, and I’m angry about it.
A year ago he spoke at the Silicon Valley Ruby Conference about scaling
Rails applications. His presentation suggested Twitter’s problems were
behind them, but in fact some of their biggest stumbles hadn’t occurred yet.
Note in particular slide 9 of that presentation, where Cook says about
scaling Rails apps like Twitter: “It’s Easy. Really.”
I found a posted slide from a public conference during my “investigative reporting” phase. Oh, and Twitter is on Rails, and I can blame Rails. It’s Easy. Really.
Whether Twitter’s woes were all on Cook’s shoulders or not, he should not
have been boasting about solving the problem last year.
I’m high as a kite.
Meanwhile, Twitter has made at least three key hires this year on the
technical side. Lee Mighdoll joined as VP Engineering and Operations in
January. And this week they hired two scaling experts - John Kalucki and
Steve Jenson (“known for his work scaling Blogger and Blogspot”).
And they hired someone, yada yada yada, sprinkle random facts in so nobody notices my complete incompetence. Perfect!

Posted by kev 36 days ago
Just drop this in your Rakefile. This is slightly modified from something I’m using in production.
Dissect! Enjoy! Explanation (read: spoilers) after the jump.
begin
  require 'rake_remote_task'
  APP_NAME = "someapp"
  DEPLOY_ROOT = "/usr/local/share/applications/#{APP_NAME}"
  ON_DEPLOY_RESTART = ["someappd"]
  role :app_server, "myserver.com"
  # Tarball name for the current HEAD commit
  def archive
    commit = `git rev-list --max-count=1 --abbrev=10 --abbrev-commit HEAD`.chomp
    file = "#{APP_NAME}-#{commit}.tar.gz"
  end
  def restart_daemons
    ON_DEPLOY_RESTART.each do |app|
      run "sudo god restart #{app}"
    end
  end
  namespace :deploy do
    task :build do
      sh "git archive --format=tar HEAD | gzip > #{archive}"
    end
    remote_task :push => :build do
      rsync archive, "/tmp"
    end
    desc "Install a release from the latest commit"
    remote_task :install => :push do
      # Releases are named YYYYMMDD-NN; bump NN if one already exists for today
      date_stamp = Time.now.strftime("%Y%m%d")
      last_release = run("ls #{DEPLOY_ROOT}/rels | sort -r | head -n 1").chomp
      if last_release =~ /#{date_stamp}\-(\d+)/
        serial = $1.to_i + 1
      else
        serial = 0
      end
      rel = ("%d-%02d" % [date_stamp, serial])
      rel_dir = "#{DEPLOY_ROOT}/rels/#{rel}"
      run "sudo mkdir -p #{rel_dir}"
      run "sudo tar -xzvf /tmp/#{archive} -C #{rel_dir} && rm -rf /tmp/#{archive}"
      run "sudo ln -s -f -T #{rel_dir} #{DEPLOY_ROOT}/current"
      restart_daemons
    end
    desc "Rollback to the previous release"
    remote_task :rollback do
      current_link = run("ls -alF #{DEPLOY_ROOT} | awk '/current -> .*/ { print $NF }'").chomp
      current = File.basename(current_link)
      # Newest first, so the first release older than current is the previous one
      releases = run("ls #{DEPLOY_ROOT}/rels | sort -r").split("\n")
      previous = releases.find {|rel| current > rel}
      raise "No previous release" if previous.nil?
      run "sudo ln -s -f -T #{DEPLOY_ROOT}/rels/#{previous} #{DEPLOY_ROOT}/current"
      restart_daemons
      puts "Moved to #{previous}"
    end
    desc "Rollforward to the next release"
    remote_task :rollforward do
      current_link = run("ls -alF #{DEPLOY_ROOT} | awk '/current -> .*/ { print $NF }'").chomp
      current = File.basename(current_link)
      releases = run("ls #{DEPLOY_ROOT}/rels | sort -r").split("\n")
      # The list is newest first; search it oldest first so we land on the
      # release right after current instead of jumping to the newest one
      next_rel = releases.reverse.find {|rel| current < rel}
      raise "No next release" if next_rel.nil?
      run "sudo ln -s -f -T #{DEPLOY_ROOT}/rels/#{next_rel} #{DEPLOY_ROOT}/current"
      restart_daemons
      puts "Moved to #{next_rel}"
    end
  end
rescue LoadError => e
  puts "NOTE: Install vlad to get Kevin's awesome deployment tasks"
end
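The release naming in the install task is probably the least obvious bit, so here is a minimal sketch of it pulled out into a plain function. `next_release` is a hypothetical helper name of mine, not part of the Rakefile above or of vlad, and it assumes release directories are named YYYYMMDD-NN as in the task.

```ruby
# Hypothetical helper (not part of the Rakefile or of vlad): given the newest
# existing release name and today's date, pick the next release name.
def next_release(last_release, today = Time.now)
  date_stamp = today.strftime("%Y%m%d")
  # Bump the serial if the newest release is from today, otherwise start at 0.
  serial = last_release =~ /\A#{date_stamp}-(\d+)\z/ ? $1.to_i + 1 : 0
  "%s-%02d" % [date_stamp, serial]
end

day = Time.local(2008, 4, 1)
puts next_release("20080401-03", day)  # 20080401-04
puts next_release("20080331-07", day)  # 20080401-00
puts next_release("", day)             # 20080401-00
```

Because the names sort lexically in date-then-serial order, plain string comparison is enough for the rollback and rollforward tasks to find neighbours.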

Posted by kev 141 days ago
This fails horribly.
The solution is to make sure you’re only building for your architecture:
Odysseus:ext kev$ sudo -s
bash-3.2# ARCHFLAGS='-arch i386' gem install postgres
Building native extensions. This could take a while...
Successfully installed postgres-0.7.9.2007.12.22
Installing ri documentation for postgres-0.7.9.2007.12.22...
Installing RDoc documentation for postgres-0.7.9.2007.12.22...

Posted by kev 153 days ago
Reposted from my message to rubinius-dev. Congrats to the whole Rubinius team. This was entirely a group effort, and one hell of an achievement.
Here's the first Mongrel handler running on Rubinius:
http://pastie.caboo.se/paste/asset/126441/Picture_4.png
From this code:
$:.unshift "/Users/kev/code/mongrel/mongrel-1.1.1/lib"
puts "Requiring mongrel"
require 'mongrel'
class HelloHandler < Mongrel::HttpHandler
  def process(request, response)
    response.start(200) do |head, out|
      head["Content-Type"] = "text/html"
      out.write "Hello World! I'm running on Rubinius!"
    end
  end
end
server = Mongrel::HttpServer.new("0.0.0.0", 3000)
puts "Started Server"
server.register("/hello", HelloHandler.new)
puts "Registered handler"
t = server.run
t.join
***THE CATCH (as this may be viewed by many people)***
This isn't completely complete. rb_global_variable was #define'd out
to do nothing (so no garbage collection on the global vars), and there
was a slight modification from the trunk to make global aliasing
ignore the fact that the globals just weren't there. Mongrel's
http11.c was also _slightly (very very slightly)_ modified to use the
rb_str_get_char_* methods we've decided to move to from RSTRING()->ptr
and RSTRING()->len, and I haven't gotten around to defining ALLOC_N
yet, so it was changed to a simple malloc. That's it though.
And it seems to run. And I feel like I need to run around the block.
It's in 9976301ba.
WOOOOOOOOOOOOOOOOOOOOO!

Posted by kev 153 days ago
Install the do_postgres gem against postgresql82 on MacPorts:
Make sure that /opt/local/lib/postgresql82/bin/ is in your path. You need pg_config easily accessible. Then run:
sudo gem install do_postgres -- --with-pgsql-include-dir=/opt/local/include/postgresql82/ --with-pgsql-lib=/opt/local/lib/postgresql82/
Autotest with Rspec on Merb with a Leopard install using the supplied Ruby (whew)
This will break because it can’t find the “spec” command. It searches the configured bin directory, which with the supplied ruby is /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin.
ln -s /usr/bin/spec /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin/spec

Posted by kev 166 days ago
Because I hadn’t implemented DFS in Ruby before, and it’s just so damn easy.
Update: Phillip rightly pointed out in the comments that with the yield at the end, it’s actually post-order traversal, not depth first search per se.
class TreeNode
  attr_reader :name
  def initialize(name)
    @name = name
    @children = []
  end
  def add_node(node)
    @children << node
  end
  def each_depth_first
    @children.each do |child|
      child.each_depth_first do |c|
        yield c
      end
    end
    yield self
  end
end
root = TreeNode.new("root")
root.add_node( a = TreeNode.new("a"))
a.add_node( b = TreeNode.new("b"))
a.add_node( c = TreeNode.new("c"))
c.add_node( d = TreeNode.new("d"))
root.add_node(e = TreeNode.new("e"))
e.add_node(f = TreeNode.new("f"))
e.add_node(g = TreeNode.new("g"))
root.each_depth_first do |child|
  puts child.name
end
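To make the distinction from the update concrete, here is a small sketch that puts the two orders side by side. `Node`, `each_pre_order` and `each_post_order` are names I made up for the comparison; the tree has the same shape as the one above. Both walks are depth-first, they only differ in when the node itself is yielded.

```ruby
# Minimal stand-in for the TreeNode class above, with both traversal orders.
class Node
  attr_reader :name

  def initialize(name, *children)
    @name = name
    @children = children
  end

  # Pre-order: yield the node first, then descend into its children.
  def each_pre_order(&block)
    yield self
    @children.each { |child| child.each_pre_order(&block) }
  end

  # Post-order: descend first, yield the node last (what the code above does).
  def each_post_order(&block)
    @children.each { |child| child.each_post_order(&block) }
    yield self
  end
end

# Same shape as the tree built in the post.
tree = Node.new("root",
  Node.new("a", Node.new("b"), Node.new("c", Node.new("d"))),
  Node.new("e", Node.new("f"), Node.new("g")))

pre = []
tree.each_pre_order { |n| pre << n.name }
post = []
tree.each_post_order { |n| post << n.name }

puts pre.join(" ")   # root a b c d e f g
puts post.join(" ")  # b d c a f g e root
```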

Posted by kev 185 days ago
I’ve apparently “hacked” someone’s unborn children. Or something.
And, among the lists of names, includes someone who goes by the name of Kevin Clark. And, if you were to take it even one step further, and INVESTIGATE this person, you would also come to the conclusion that he is a computer hacker who resides (or has resided) in San Francisco before.
Another coincidence, I suppose?
Update: She says in one of her latest comments that she has a gun now. It’s amusing, but please do not poke/provoke her. Look, but do not touch.
Update 2: She’s still accusing me of multiple felonies, but seems to have leveled off. I don’t think she’s going to track me down at this point. Woo personal safety. Oh, and technorati is apparently involved now:
I believe Mr. Clark is somehow routing fake websites through technorati via powerset and is doing something illegal.

Posted by kev 190 days ago
They’re all up at once. Wow.
Mine felt good, but it’s long. Rather long. 50 minutes fairly non-stop, ~600 megs long. Find some time before watching.
Episode 036: The Return of Kevin Clark
Kevin Clark takes a break from Powerset to give a full-throttle talk
on using Merb as a JSON-RPC service, god, gem2rpm, and heckle.
Episode 035: ActiveRecord Backup & MimetypeFu
Matt Aimonetti demonstrates his newest plugins: ActiveRecord Backup
and MimetypeFu.
Episode 034: Intro to JRuby
Brian Chapados shows how to install and work with the latest JRuby
release.
Episode 033: Life on Edge
If you’re a Rails junkie, you’ll want to develop on Edge Rails. Matt
Clark explains how to get started and shares some of the challenges
of working on Edge.
Episode 032: Capistrano
Rob Kaufman takes on Capistrano 2. What is it? How does it work?
What’s changed since version 1?
Episode 031: Seaside
Roger Whitney explores Seaside, the web application framework based
on Smalltalk.
Episode 030: Tuneshelf
Dominic Damian talks about his experiences building Tuneshelf, a web
application that allows music fans to keep track of their favorite
music albums.
Episode 029: Big Stinking Piles (of data)
What do you do when third-party data vendors don’t speak REST? Rob
Kaufman discusses real-world techniques for importing and exporting
data. (This talk was also given at RailsConf 2007.)
Episode 028: Simple Sidebar Plugin
Ryan Felton shows how to use Simple Sidebar plugin to DRY up sidebar
content in applications.
Episode 027: Headliner and Styler
Patrick Crowley talks about his newest plugins: Headliner and Styler.
Episode 026: ActsAsSolr
Rob Kaufman shows how easy it is to integrate Solr powered search
into your Rails application using the ActsAsSolr plugin.
Episode 025: Ajax CSS Star Rating with ActsAsRateable
Ryan Felton shows off how to build an Ajax-powered, CSS star rater
using the ActsAsRateable plugin and Komodo Media’s CSS Star Rating
Redux technique.
Episode 024: Using Ruby + Amazon SQS to build backdoors
Brian Chapados talks about using Ruby and Amazon’s Simple Queue Service
web service to build backdoors into systems.

Posted by kev 195 days ago
I haven’t yet decided if this is a good idea or not.
I’ll be at RubyConf this weekend. Say hello, if you get the urge.
require "test/unit"
require 'rubygems'
require 'mocha'
require 'stubba'
module ForwardsToEnumerable
  def self.included(klass)
    klass.extend(ClassMethods)
  end
  module ClassMethods
    def forward_to_enum(instance_var, *meths)
      meths.each do |meth|
        class_eval <<-METH
          def #{meth}(*args, &block)
            #{instance_var.to_s}.each do |i|
              i.send(:#{meth}, *args, &block)
            end
          end
        METH
      end
    end
  end
end
class ForwardsToArray
  include ForwardsToEnumerable
  forward_to_enum :@array, :foo, :bar, :baz
  def initialize(array)
    @array = array
  end
end
class TestForwardsToArray < Test::Unit::TestCase
  def test_forward_to_enum
    items = [mock(), mock(), mock()]
    items.each {|i| i.expects(:foo); i.expects(:bar); i.expects(:baz) }
    f = ForwardsToArray.new(items)
    f.foo
    f.bar
    f.baz
  end
end
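For comparison, a sketch of the same idea using `define_method` and `instance_variable_get` instead of a `class_eval` string, with no mocha needed to try it. This is a different technique from the code above, not a drop-in replacement, and `ForwardsEach`, `Recorder` and `Group` are hypothetical names used only here.

```ruby
# Same forwarding idea, built with define_method rather than a class_eval string.
# ForwardsEach, Recorder and Group are hypothetical names for this example.
module ForwardsEach
  def forward_to_each(instance_var, *meths)
    meths.each do |meth|
      define_method(meth) do |*args, &block|
        # Look the collection up at call time and pass the call to each item.
        instance_variable_get(instance_var).each do |item|
          item.send(meth, *args, &block)
        end
      end
    end
  end
end

# Tiny stand-in for a mock: remembers every call to foo.
class Recorder
  attr_reader :calls

  def initialize
    @calls = []
  end

  def foo(*args)
    @calls << [:foo, *args]
  end
end

class Group
  extend ForwardsEach
  forward_to_each :@items, :foo

  def initialize(items)
    @items = items
  end
end

recorders = [Recorder.new, Recorder.new]
Group.new(recorders).foo(1, 2)
p recorders.map(&:calls)
```

As in the original, the forwarded method returns whatever `each` returns (the collection), so the per-item return values are thrown away.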

Posted by kev 233 days ago
def <=>(other)
  return 0 if self.version == other.version and self.rel == other.rel
  versions = self.version.split(/[^[:alnum:]]/).push self.rel
  other_versions = other.version.split(/[^[:alnum:]]/).push other.rel
  return 1 if versions.size > other_versions.size
  return -1 if versions.size < other_versions.size
  versions.size.times do |n|
    if versions[n] =~ /[^\d]/ && other_versions[n] =~ /[^\d]/
      comparison = (versions[n] <=> other_versions[n])
    elsif versions[n] !~ /[^\d]/ && other_versions[n] !~ /[^\d]/
      comparison = (versions[n].to_i <=> other_versions[n].to_i)
    else
      comparison = -1
    end
    return comparison unless comparison.zero?
  end
  return 0
end
Original version sort was here.
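A quick way to see it working: the sketch below drops the method above, unchanged apart from whitespace, into a throwaway class and sorts a few releases. `Package` and its `to_s` are my own scaffolding for the demo, and I'm assuming `rel` is a string like the version segments.

```ruby
# Package is a hypothetical wrapper; the <=> body is the method from the post.
class Package
  include Comparable
  attr_reader :version, :rel

  def initialize(version, rel)
    @version = version
    @rel = rel
  end

  def <=>(other)
    return 0 if self.version == other.version and self.rel == other.rel
    versions = self.version.split(/[^[:alnum:]]/).push self.rel
    other_versions = other.version.split(/[^[:alnum:]]/).push other.rel
    return 1 if versions.size > other_versions.size
    return -1 if versions.size < other_versions.size
    versions.size.times do |n|
      if versions[n] =~ /[^\d]/ && other_versions[n] =~ /[^\d]/
        comparison = (versions[n] <=> other_versions[n])
      elsif versions[n] !~ /[^\d]/ && other_versions[n] !~ /[^\d]/
        comparison = (versions[n].to_i <=> other_versions[n].to_i)
      else
        comparison = -1
      end
      return comparison unless comparison.zero?
    end
    return 0
  end

  def to_s
    "#{version}-#{rel}"
  end
end

packages = [
  Package.new("1.2.10", "1"),
  Package.new("1.2.9", "2"),
  Package.new("1.2.9", "1")
]
puts packages.sort.join(" ")  # 1.2.9-1 1.2.9-2 1.2.10-1
```

Numeric segments compare as integers, so 1.2.10 lands after 1.2.9 where a plain string sort would put it first. Note that the size check means a version with more segments always sorts higher, whatever the segments contain.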

Posted by kev 235 days ago
/* take care of the case where the two version segments are */
/* different types: one numeric and one alpha */
if (one == str1) return -1; /* arbitrary */
if (two == str2) return -1;
– rpm/lib/misc.c

Posted by kev 235 days ago
/usr/bin/git-quickserver
#!/bin/sh
git daemon --verbose --reuseaddr --export-all --base-path='.'
sisyphus:~/code/god kev$ git quickserver
sisyphus:~ kev$ git clone git://localhost/ somethin
Initialized empty Git repository in /Users/kev/somethin/.git/
remote: Generating pack...
remote: Done counting 1469 objects.
remote: Deltifying 1469 objects...
100% (1469/1remote: 469) done
Indexing 1469 objects...
remote: Total 1469 (delta 905), reused 1461 (delta 902)
100% (1469/1469) done
Resolving 905 deltas...
100% (905/905) done
(via KirinDave)

Posted by kev 241 days ago
if self.stillHopeful:
    # oh, cruel reality cuts deep. no joy for you. This is the
    # first failure. This flunks the overall BuildSet, so we can
    # notify success watchers that they aren't going to be happy.
    self.stillHopeful = False
    self.status.giveUpHope()
    self.status.notifySuccessWatchers()
– Buildbot Source

Posted by kev 268 days ago
require 'yaml'
require 'pp'
pp YAML.load(`svn info`)
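This works because the "Key: value" lines svn prints happen to read as a YAML mapping. Here is a sketch you can run without a working copy; the sample text is made up in that shape, not captured from a real repository.

```ruby
require 'yaml'

# Made-up sample in the "Key: value" shape that `svn info` prints.
sample = <<~INFO
  Path: trunk
  URL: http://svn.example.com/repo/trunk
  Revision: 42
  Node Kind: directory
INFO

info = YAML.load(sample)
puts info["Revision"] + 1  # 43, since YAML already parsed the value as an Integer
puts info["URL"]           # http://svn.example.com/repo/trunk
```

The flip side of getting types for free is that any value YAML treats specially will not come back as a plain string, so treat it as a convenience hack rather than a parser.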

Posted by kev 291 days ago
I hate the fact that googling syslog ruby didn’t turn up anything useful, and the rdoc doesn’t seem to be there. So, I’m posting the README from the extension in Ruby’s source. This is as much for me as for you. Using PageRank for good is.. well.. good I’d assume.

Posted by kev 325 days ago
Powerset is fairly well-known in the Ruby community, but there’s a certain amount of confusion as to what we use it for. As a consequence, I’m regularly asked what the front end is going to be written in, and just as regularly have to leave the question unanswered. But today I’m happy to announce that we are, in fact, launching our front-end on Ruby.
Cool, huh? For everyone’s sanity (and in avoidance of some of the flame wars to ensue), do note that we are going to be using Ruby (the language) but not necessarily Ruby on Rails (the web framework).
In the spirit of Powerset’s newfound openness, I’d like to take some time to explain why we’re making this decision where others might not.
Why Ruby?
1. We’ve already got the brains
One thing we haven’t kept secret is that we’ve hired some of the best Ruby developers around. Our total number of day in day out Ruby developers is somewhere around 10, and I’m constantly humbled to be working with this team. We’ve got the people and they have the skills, so it makes sense to apply them.

2. Ruby is already being used throughout the company
We’ve always spoken in general terms about how much Ruby is being used internally, but let’s get specific: a substantial part of our infrastructure is being written in Ruby or being accessed through Ruby services. Our scientists use Ruby to interact with our core language technology. Our packaging infrastructure is Ruby. A big portion of our system administration work is all done with Ruby. Frankly, we as an organization use Ruby a whole heck of a lot.
Additionally, all of our product demos and prototypes are also in Ruby. We’ve got an interesting mix of Rails, Merb and Camping apps (depending on the scope of the project) connecting to tiny Ruby services which hook into our various back-end systems. Day to day, the majority of the product team is hacking in Ruby in some capacity.
3. We’re not worried about scaling
So, inevitably, whenever we talk about Ruby or Rails scaling these days someone brings up Twitter and its scaling problems in the past. Twitter is right down the block from our offices and I know several of the devs personally, so before we made a final decision I arranged a sit-down with Twitter’s lead developer, Blaine Cook, to talk about the situation. Blaine was kind enough to let me bring along our Search Architect (and former search architect at Yahoo!) Chad Walters, our Head of Product Scott Prevost, and our COO Steve Newcomb, to poke and prod and get their questions answered. The simple fact is that Ruby wasn’t the source of Twitter’s woes. As often happens with rapidly growing sites, they ran into architectural problems. Some design decisions don’t hurt until they reach a massive scale and at that point you have to rethink your approach. In an email, Blaine writes:
For us, it’s really about scaling horizontally - to that end, Rails and
Ruby haven’t been stumbling blocks, compared to any other language or
framework. The performance boosts associated with a “faster” language would
give us a 10-20% improvement, but thanks to architectural changes that Ruby
and Rails happily accommodated, Twitter is 10000% faster than it was in
January
This is great news for Twitter, but even better for us because we don’t have the bottlenecks that they’ve struggled with – databases, instant messaging servers, and regularly recycling cache systems – which makes scaling horizontally much, much smoother. At that point, our scaling issue doesn’t concern Ruby. For a search engine, the front-end is largely just a templating system and the real work happens in the back when we process your query.
What does this mean for the community?
When writing this article, at some point I had to sit down and ask myself why anyone should care we’re adopting Ruby for the front-end. For me, it comes down to the fact that we’re good for the community as a whole.
First off, the fact that Powerset is deploying on Ruby means you’ve got one more high traffic site (potentially very high traffic) using Ruby in production. It’s one more case study, and one more example that Ruby as a whole is ready for the big show.
Personally, I think the more interesting and useful thing to take away from this is that as we do the heavy lifting, building up infrastructure around all the aspects of Ruby development and deployment in the company, we’re selecting large chunks to be open-sourced. I’ve got a list of things I’d love to put out into the wild (which is encouraged, and actually suggested by my manager. Man, I love this place) as soon as I can find the time. Already Tom Werner and Dave Fayram have pushed out Ruby to Erlang bindings and a sweet little (in-development) web server called Fuzed, I’ve gotten to hack at Merb, and a fair amount of Rails patches have come directly from work in-house. Hopefully the community will be able to benefit from our code as much as we have.
Obviously we don’t have a search product open to the public yet, but we’ll be launching Powerlabs in September. In Powerlabs, you’ll be able to play with our products and give us feedback. If you want to keep track of what Powerset is doing, sign up.

Posted by kev 326 days ago
Spot the pattern?
('1'..'10').to_a
('2'..'10').to_a
('2'..'20').to_a
('3'..'20').to_a
('3'..'30').to_a
('4'..'30').to_a
('4'..'40').to_a
(2..10).to_a
('2'.to_i .. '10'.to_i).to_a

Posted by kev 371 days ago
One of the many awesome things about working at Powerset is the guys I get to hack with. Tonight, my buddies Tom Preston-Werner, Chris Van Pelt, and I were feeling whimsical. Full source, with quicksilver hook and startup scripts can be found here, but this is the meat:
growl_handler.rb
require 'rubygems'
require 'ruby-growl'
module Jakl
  class GrowlHandler
    def initialize
      if `which growlnotify` =~ /^no .+ in/
        @strategy = :ruby
        @growl = Growl.new("localhost", "jakl", ["jakl_message"])
      else
        @strategy = :command
      end
    end
    def notify(group, name, message)
      case @strategy
      when :command
        img_path = File.join(File.dirname(__FILE__), '../../assets/jakl.png')
        `growlnotify -n jakl --image
      when :ruby
        @growl.notify("jakl_message", "#{name} (#{group})", message)
      else
        raise StandardError.new('Invalid strategy')
      end
    end
  end
end
client.rb
require 'rubygems'
require 'net/dns/mdns-sd'
require 'base64'
module Jakl
  class Client
    DNSSD = Net::DNS::MDNSSD
    @@debug = false
    def self.debug=(value)
      @@debug = value
    end
    def self.debug
      @@debug
    end
    def initialize(options={})
      default_options = {
        :default_recv => "jakl",
        :timeout => 2,
        :login => ENV['USER']
      }
      @options = default_options.merge(options)
    end
    def send(message, recv=nil)
      recv ||= @options[:default_recv]
      recv = recv.split(',').collect {|g| g.strip }
      puts "Sending: '#{message}' to '#{recv.join(',')}'" if @@debug
      find_recipients = DNSSD.resolve('jakl', '_jakl._tcp') do |r|
        puts "Found jakl service at #{r.target}" if @@debug
        recvs = r.text_record['recvs'].split(',').collect {|g| g.strip }
        puts " responds to: #{recvs.join(', ')}" if @@debug
        if (succ_recvs = recvs & recv).any?
          puts "Sending to: #{r.target}:#{r.port}" if @@debug
          data = [@options[:login], succ_recvs.first, message].map do |s|
            Base64.encode64(s)
          end.join(';')
          TCPSocket.new(r.target, r.port).send(data, 0)
        end
      end
      sleep @options[:timeout]
      find_recipients.stop
    end
  end
end
server.rb
require 'rubygems'
require 'eventmachine'
require 'net/dns/mdns-sd'
require 'base64'
$:.unshift File.dirname(__FILE__)
require 'growl_handler'
module JaklEventServer
  def receive_data(data)
    name, recv, message = data.split(';').map {|s| Base64.decode64(s) }
    Jakl::GrowlHandler.new.notify(recv, name, message)
    $stderr.puts "Name: #{name}, Recipient: #{recv}, Message: #{message}" if Jakl::Server.debug
  end
end
module Jakl
  class Server
    DNSSD = Net::DNS::MDNSSD
    @@debug = false
    def self.debug=(value)
      @@debug = value
    end
    def self.debug
      @@debug
    end
    def initialize(options={})
      default_options = {
        :recvs => "jakl",
        :timeout => 5,
        :login => ENV['USER']
      }
      @options = default_options.merge(options)
      validate_login!
    end
    def start
      DNSSD.register('jakl', '_jakl._tcp', 'local', 4180,
        {'recvs' => @options[:recvs], 'login' => @options[:login]})
      EventMachine::run {
        EventMachine::start_server "0.0.0.0", 4180, JaklEventServer
        puts "Listening for howls on 4180"
      }
    end
    def validate_login!
      name_validation = DNSSD.resolve('jakl', '_jakl._tcp') do |r|
        if [r.text_record['login'],
            r.text_record['recvs'].split(',')].flatten.include? @options[:login]
          puts "The name #{@options[:login]} is already taken. Sorry :\\"
          exit 1
        end
      end
      @options[:recvs] = [@options[:recvs].split(',') << @options[:login]].join(',')
      sleep 3
      name_validation.stop
    end
  end
end
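The wire format the client and server agree on is just three Base64 fields joined by semicolons. Here is a sketch of that round trip with no network or mDNS involved. `encode_message` and `decode_message` are hypothetical helper names, not part of jakl, and I've used `pack("m")` / `unpack1("m")`, the primitives that `Base64.encode64` and `Base64.decode64` wrap, so it runs without any extra requires.

```ruby
# Sketch of the jakl wire format: login, recipient and message, each
# Base64-encoded, joined by ';'. Helper names are hypothetical.
def encode_message(login, recv, message)
  [login, recv, message].map { |s| [s].pack("m") }.join(';')
end

def decode_message(data)
  data.split(';').map { |s| s.unpack1("m") }
end

wire = encode_message("kev", "jakl", "lunch; now?")
name, recv, message = decode_message(wire)
puts "#{name} -> #{recv}: #{message}"  # kev -> jakl: lunch; now?
```

Encoding each field is what makes the semicolon a safe separator: the Base64 alphabet never contains one, so a message with a `;` in it still splits into exactly three parts.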
