First Seven Languages

My #FirstSevenLanguages:

  1. BASIC - I remember programming BASIC on my parents' IBM PC around the age of 6. For a long time it was mostly copying simple code from books and writing tight for loops. Eventually I progressed to QBASIC and some simple graphics stuff.
  2. C - My dad taught me C when I was getting fed up with BASIC. The only thing I remember writing was a crappy RPG based on the places I hung out with friends.
  3. TI-BASIC - I wrote a bunch of small programs that automated some geometry/trig/calculus rules for my TI-82 in high school. I also wrote some games, tic-tac-toe and two-player checkers among the ones I recall.
  4. Pascal - I learned Pascal in high school, as part of the AP Computer Science program. My proudest achievements were Uno and Battleship, both with simple AI opponents.
  5. Perl 5 - I learned Perl in my free time my freshman year at college. I created a small community website (wsvw1u.com) using CGI.
  6. C++ - The Computer Science courses at RPI were primarily in C++ at the time I was there. For parts of courses, I also dabbled in MIPS assembly, Smalltalk, and Scheme.
  7. JavaScript/JScript - My first full-time job after graduation was writing tools for the 24-hour NOC at Priceline.com. We used a mix of Perl, server-side JScript, and JavaScript.

JavaScript Flickr carousel

My wife and I made something a couple of weeks ago, and her name is Simone. Our families and some friends want to see lots of pictures, but most people don’t want us to flood their Facebook feeds with baby photos (nor do we want to). So we’ve mostly been uploading them to Flickr, but I wanted a slightly simpler page where people could just see a carousel slideshow of photos of our daughter. I couldn’t find a JavaScript carousel that integrated with Flickr out of the box, but it was easy enough to wire Flickr up to Fotorama myself.

Here is the final product (and beautiful pictures of my girl). Here are the easy steps to do it yourself:

  1. Sign up for a Flickr API key. If this is a non-commercial site, it's free and instant.
  2. Grab the download or copy the hotlink markup from the setup page in the Fotorama docs.
  3. Add this markup to your page:
    <div class="carousel" data-auto="false"></div>
    There are a lot of configuration options you can add besides that. Here's the full list.
  4. And now the final step, the JavaScript that populates the carousel:
      <script type="text/javascript">
        $(function() {
          var AddPhotosToCarousel = function(data) {
            var imgs = [];
            $.each(data.photoset.photo, function(index, photo) {
              imgs.push({img: photo['url_m'],
                         thumb: photo['url_sq'],
                         caption: photo['title']});
            });
            $('.carousel').fotorama({ data: imgs });
          };
          $.getJSON('https://api.flickr.com/services/rest/?method=flickr.photosets.getPhotos&api_key=XXXXXXXXXX&photoset_id=YYYYYYYYYY&format=json&extras=url_sq,url_m&jsoncallback=?', AddPhotosToCarousel);
        });
      </script>
    

You must put your API key and your photoset’s id into that big URL you pass to $.getJSON(). If you want the photos to show up in reverse order (as I did), change imgs.push() to imgs.unshift(). You can display things other than photosets (search results and such), but you’ll need to dig into the Flickr API docs to build those queries yourself.


A bunch of sort algorithms

Whenever I’m trying to get back into the swing of building, optimizing, and evaluating algorithms, my first step is always to write a whole bunch of sorting implementations. I’m also trying to improve my knowledge of the core syntax of Python. So here are four sorts in Python: insertion, merge, heap, and quick. (The insertion and heap sort implementations are both in-place. The other two are not.)
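
As a taste, here's a minimal sketch of what the in-place insertion sort might look like (illustrative only; not necessarily the exact code linked above):

import random

def insertion_sort(a):
    """Sort list a in place by growing a sorted prefix."""
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift larger elements right to make room for key
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

nums = random.sample(range(100), 10)
expected = sorted(nums)
assert insertion_sort(nums) == expected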

The second step is probably going to be to implement a data structure I’ve never done before. Last time, it was a min-max heap in PHP. I’m thinking maybe a B-tree?

Update 3 Sept: Here is my implementation of a splay tree. Far simpler than I remembered, so I challenged myself to do it without parent links in the node objects.
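
For reference, the parent-link-free trick is usually done top-down: as you search, you tear the tree into a left piece and a right piece, then reassemble around the target. Here's a sketch of the standard top-down splay (the textbook Sleator-Tarjan formulation, not necessarily my linked code):

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def splay(root, key):
    """Bring key (or its closest neighbor) to the root, with no parent links."""
    if root is None:
        return None
    header = Node(None)              # collects the torn-off left/right pieces
    left = right = header
    t = root
    while True:
        if key < t.key:
            if t.left is None:
                break
            if key < t.left.key:     # zig-zig: rotate right first
                y = t.left
                t.left, y.right = y.right, t
                t = y
                if t.left is None:
                    break
            right.left = t           # link t into the right piece
            right = t
            t = t.left
        elif key > t.key:
            if t.right is None:
                break
            if key > t.right.key:    # zig-zig: rotate left first
                y = t.right
                t.right, y.left = y.left, t
                t = y
                if t.right is None:
                    break
            left.right = t           # link t into the left piece
            left = t
            t = t.right
        else:
            break
    left.right, right.left = t.left, t.right    # reassemble
    t.left, t.right = header.right, header.left
    return t

Insertion and deletion then become a splay followed by a couple of pointer swaps at the root.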


Choosing random keys

Say you had code that generated random keys. These random keys were 6 letters long, all caps, with no duplicates. Here are a few, as an example (four keys per row):

NTCYAR  DHIEWM  INBVTX  IOELUC
RKNBJX  GKRANB  DRYVQU  YIFKTS
VAUPSG  ALWPOS  CERSUY  WAHJVM
MTXJSZ  RNLFXZ  VFIEXT  VEOKIH

What are the chances that the key that you generate will be in alphabetical order? For instance, of the sixteen above, only one is in alphabetical order (CERSUY, in the third row). And then, if you think you have that, generalize: for any string of k distinct characters chosen from a set of n, what are the chances that they will be in order? My answer after the break.

I spent several minutes trying to solve this the straightforward way. You only have 21 choices for the first letter, because if it’s anything after U, there’s no way to have five more letters that come after it. Then for the second letter, you have 22-a_1 (where a_1 is the index of the first letter) choices. For instance, if the first letter was E, you have only 22-5, or 17 choices – namely anything between F and V. After going through all six letters, you end up with this ungainly thing:

[21 (22-a_1) (23-a_2) (24-a_3) (25-a_4) (26-a_5)] / n!

It’s probably possible to simplify that somewhat by looking at reductive cases – you can model a two-letter key as a grid of allowed possibilities, but even that would get pretty challenging pretty quickly. If you look at it from another direction, however, things get much easier. Consider a single key, let’s say “NTCYAR” (the first in the table above). How many possible permutations of that key are there? Simple: 6*5*4*3*2*1, or 6!. Of those, how many are in alphabetical order? Even simpler: 1 (ACNRTY). In fact, this is true for any set of six letters you choose – you could have picked those same letters in any of 6! possible orderings, and only one is in alphabetical order. So there you go. Your answer is:

1/6! = 1/720 ≈ 0.139%

The interesting thing? The probability doesn’t depend on the length of your alphabet, only on the length of the key. The generalized probability is simply:

1/k!
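
You can sanity-check that with a quick simulation (a sketch: draw six distinct uppercase letters and count how often they come out already sorted):

import random
import string

trials = 1_000_000
hits = 0
for _ in range(trials):
    key = random.sample(string.ascii_uppercase, 6)  # 6 distinct letters
    if key == sorted(key):
        hits += 1

print(hits / trials)   # hovers around 1/720 ≈ 0.00139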

Watching every MLB team play a game

Last April, to no one in particular, I asked the following question:

"What's the shortest possible trip (in miles) to see every MLB team play at least one game this season?"

It became clear, after a brief discussion with some friends, that the shortest possible trip is somewhere on the order of a hundred miles. Citi Field in Queens and Yankee Stadium in the Bronx are only 6 miles apart. Since the Mets and the Yankees are in different leagues and each team plays one series at home against every other team in its own league, you could just spend the whole season going back and forth between the two stadiums. (In fact, I’d be surprised if at least one New York baseball fan with time and money to burn hadn’t done exactly this.)

In order to avoid this “trivial” solution, a modification to the puzzle would have to be introduced. After throwing around a bunch of attempts, I hit upon the perfect goal: 15 games, 15 stadiums, 30 teams. You’d see no team play more than once, you’d be in no stadium more than once.

Now that I had a problem worth solving, how to solve it? The sheer number of games makes it pretty clear that this is a task that can’t be solved by brute force. There are 2430 games in a regular season of baseball. The number of possible selections of 15 of those games is 4.45 × 10^38. Even if I could check a trillion schedules a second (which I can’t), it would still take 14 billion billion years (at which point, even the Red Sox would probably have a new stadium). If you visualized the problem as a graph, each game would be a node and each travel day would be an edge. The problem was that every game had an edge leading to every game that came after it. What I needed to do first was prune this graph.
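
(That count is just "2430 choose 15," which is easy to sanity-check:)

from math import comb

print(f"{comb(2430, 15):.3e}")   # ≈ 4.45e+38 possible 15-game selections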

So I made some assumptions for the sake of making a reasonable road trip. I wouldn’t want more than 2 days off between games, and I wouldn’t want to travel more than 500 miles on any given day. This change trimmed the 3-million-plus edges into a mere 86,000. But an exhaustive search would still take a prohibitively long time: at that same impossible speed, I’d have my answer in merely 11 billion years. I made the problem more than a billion times smaller, but realistically I wasn’t any closer to the solution.
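
The pruning predicate itself is tiny. Something like this sketch (the date encoding and parameter names are made up for illustration):

MAX_OFF_DAYS = 2
MAX_MILES_PER_DAY = 500

def edge_allowed(day1, day2, miles):
    """Can a road-tripper attend a game on day2 (miles away) after a game
    on day1? Days are ordinal dates; miles is stadium-to-stadium distance."""
    travel_days = day2 - day1
    if travel_days < 1:
        return False                          # must be a later day
    if travel_days - 1 > MAX_OFF_DAYS:
        return False                          # more than 2 days off
    return miles <= MAX_MILES_PER_DAY * travel_days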

It was about this time that I decided I was never going to get the perfect answer. I would have to resign myself to a Good Enough plan that could be calculated before the heat death of the universe (or better yet, before Opening Day). I brushed up on my Traveling Salesman Problem and pathfinding algorithms like A*, Dijkstra’s, and Floyd-Warshall. But my problem had a couple of quirks that made those approaches unsuitable:

  • With both A* and Dijkstra's, you have defined starting and destination nodes. I did not want to specify either. I wanted the schedule (and the map) to dictate the best time of the season to take the trip.
  • Pretty much every algorithm I read about optimizes for total distance, and nothing else. A perfect road trip might start at Wrigley and end at US Cellular a month later, but it most definitely would not be the shortest distance between those two points.
  • Most importantly, I was very particular about exactly which nodes could be on the same schedule. Two games might not have an overlap, but once I decided to go to both of them, the options for the third game get narrowed down. Likewise for every step along the way.

After three or four false starts (and one moment where a bug in my data structure made an exhaustive search run a couple quadrillion times faster than it should have), I happened upon a promising technique. A commenter on Stack Overflow recommended (in response to a purposely vague question) doing a sort of prioritized breadth-first search. What I ended up doing was starting with a list of all of the one-game plans (2430 plans, one for each game). And then I did this:

  1. Remove the plan with the shortest average leg length from the list (ties are unimportant -- pick arbitrarily)
  2. Look at every game that you could add to the end of that plan that meets the restrictions: 2 or fewer off days, no more than 500 miles of travel per day.
  3. For every one of those games, check if it's legal to add to this plan (check for team duplications against every game already on the plan)
  4. If there are no duplicates, add it to the list of plans. Since each game has a couple dozen possible "next" games, this will likely result in the number of plans in the list growing.
  5. If there are more than some number of plans in the list, discard the longest ones. (After some experimentation, I went with about half a million. Much more on this another time.)

And I repeated until the “shortest plan” was 15 games long. On the first try, it took about 3 and a half hours on my laptop (Core i5 with 4GB of RAM) and had to inspect more than 12.5 million potential plans before finding one that was a full 15 games long. The solution was almost 5,500 miles long, and included back-to-back 800-mile legs from Los Angeles to Denver and then to St. Louis. That did not strike me as optimal. I considered what could be preventing the discovery of a good plan: the shortest eight-game plan might not yield any short nine-game plans, and if the list fills up and discards the longer eight-game plans before they can even be checked, it will never pursue the most promising leads. In truth, I run four lists in parallel, and they fit 2^19 - 1 elements each (a little over 500k). But the lists are full after only three minutes of inspecting plans (less than half a million). These lists take up more than 2GB of memory right now.
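
Roughly, in Python, the whole loop looks something like this (a loose sketch of a single-list version with made-up data shapes, not my actual four-list implementation):

import heapq

MAX_PLANS = 500_000   # roughly the cap I used

def find_trip(games, successors, miles):
    """games[g] = (date, home_team, away_team); successors[g] = games
    reachable from g under the travel rules; miles[g1, g2] = distance
    between their stadiums. All of these shapes are made up for the sketch."""
    # Start with every one-game plan; priority = average leg length (0 so far).
    plans = [(0.0, 0.0, (g,)) for g in range(len(games))]
    heapq.heapify(plans)
    while plans:
        avg_leg, total, plan = heapq.heappop(plans)  # shortest average leg
        if len(plan) == 15:
            return plan, total               # first full plan popped wins
        seen = {t for g in plan for t in games[g][1:]}
        for nxt in successors[plan[-1]]:
            if seen & set(games[nxt][1:]):
                continue                     # team duplication: skip
            new_total = total + miles[plan[-1], nxt]
            new_plan = plan + (nxt,)
            heapq.heappush(plans,
                           (new_total / (len(new_plan) - 1), new_total, new_plan))
        if len(plans) > MAX_PLANS:           # discard the longest plans
            plans = heapq.nsmallest(MAX_PLANS, plans)
            heapq.heapify(plans)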

I chose to try the script on an Extra Large Amazon Elastic Compute Cloud instance. This includes not quite as much CPU power as my laptop, but four times the memory. I also changed the way the algorithm treated “the shortest average leg length”, by giving plans with more legs a bonus. It found this schedule in less than ten minutes, and failed to find a better one even after searching through 20+ million more. (This is a thousand miles shorter than the one I found earlier, and the only really long leg is from Boston to Minneapolis.)

2012-07-18: Blue Jays @ New York Yankees

  • Travel 6.5 mi → 2012-07-20: Dodgers @ New York Mets
  • Travel 91.2 mi → 2012-07-21: Giants @ Philadelphia Phillies
  • Travel 121.6 mi → 2012-07-22: Braves @ Washington Nationals (DC)
  • Travel 306.6 mi → 2012-07-24: Tigers @ Cleveland Indians
  • Travel 309.9 mi → 2012-07-27: Cardinals @ Chicago Cubs
  • Travel 76.4 mi → 2012-07-30: Astros @ Milwaukee Brewers
  • Travel 326.4 mi → 2012-07-31: Padres @ Cincinnati Reds
  • Travel 250.6 mi → 2012-08-03: Angels @ Chicago White Sox
  • Travel 408.8 mi → 2012-08-06: Diamondbacks @ Pittsburgh Pirates
  • Travel 196.9 mi → 2012-08-07: Mariners @ Baltimore Orioles
  • Travel 358.4 mi → 2012-08-08: Rangers @ Boston Red Sox
  • Travel 1128 mi → 2012-08-11: Rays @ Minnesota Twins (Minneapolis)
  • Travel 414.8 mi → 2012-08-14: Athletics @ Kansas City Royals (Missouri)
  • Travel 564.5 mi → 2012-08-16: Marlins @ Colorado Rockies (Denver)

Total distance: 4560.5 mi

Remember, this isn’t the best you could do, but it’s probably close, and it was computable in a very reasonable amount of time. Interestingly, the shortest 13-game road trip is about half as long (about 2400 miles, with no leg longer than 350). Picking up those last couple of games is quite expensive. It’s been suggested that seeing a team more than once for the sake of saving several hundred miles might be acceptable – but establishing an algorithmic rule might take some time. Stay tuned for further tweaks!

Now: time to rent a car and block off a month of vacation.

Here is the final version of the code that I used for this post.


Perl backticks and open files

When trying to write some debugging code, I noticed something very strange. I could never get fuser to admit that its parent had a file open when called from a Perl script:

$ perl -e 'open(F, ">> /tmp/foo") or die; `fuser -a /tmp/foo`;'
/tmp/foo:

But when I replaced the fuser call with a sleep, and then called fuser from a different shell, I’d get Perl’s pid, as expected. I was about to post this question to SuperUser and decided to try this just to make sure:

$ fuser . | grep -c $$
1

Wait, what? That worked? So I asked myself: Is this something specific to Perl? Is this something about the way Perl is running child processes? If only There Was More Than One Way To Do It. Oh wait, there is.

$ perl -e 'open(F, ">> /tmp/foo") or die; system("fuser -a /tmp/foo");'
/tmp/foo:  3733

I haven’t been able to find documentation about what exactly backticks do with open filehandles that makes fuser report the wrong information, but system() clearly doesn’t do it. I just thought I’d document it here for the next person who is trying to do the same thing.


Stack Overflow

I’m not a huge fan of karma-accumulation websites. You know what kind of site I mean: Other people can vote up or down your stories and comments, and you get points when you’re voted up, and lose points when you’re voted down. There are lots of these, each with their own scoring foibles: Digg and Reddit are two of the most well-known. Slashdot has done it for a while (in fact, I think they’re why I still call this concept “karma”). The concept is good: You limit the requirements for moderators by allowing users to essentially moderate each other. If a post or comment gets enough down votes, it might vanish. If a post gets a lot of up votes, it becomes more prominent. Fewer dedicated moderators generally means lower overhead costs and, in theory, a “fairer” moderation policy.

In practice, though, there are a lot of things to not like about these schemes. Accumulating points becomes an end in itself, which leads to posting a lot of simple or purposely-misrepresented posts (“whoring”). When points are used to control if your posts are seen at all, creating multiple sockpuppet accounts to vote up your own contributions (“gaming”) is inevitable. On the larger sites, an above-the-fold link can mean tens of thousands of hits; gaming then becomes a lucrative business. Preventing whoring and gaming through algorithms can work, but search engine optimizers are smart and tenacious. It’s a continuous arms-race.

Stack Overflow is a point (“reputation”) accumulating site that somehow works extremely well. At first glance, it is a simple programming question-and-answer forum where you gain reputation for answering questions and having those answers voted up or chosen as “correct” by the asker. But there are a number of things I haven’t seen anywhere else that make Stack Overflow excel where other sites struggle. The most obvious difference is that you are able to do more things as you gain reputation. You can’t even vote up questions or answers until you’ve received a couple of votes of your own, raising the barrier to entry for sockpuppets and gaming. Maybe more importantly, points accumulation isn’t single-dimensional. Stack Overflow gives you badges for activities that are difficult to game, and displays your badge count right next to your reputation.

But there’s a bigger reason why this works on Stack Overflow. There’s no real benefit to gaming or whoring. Since the focus of the website is question-and-answer, instead of directing traffic to other websites, there’s almost no way to make a profit from having a question voted way up. And the relative smallness of the audience compared to some of the giants out there means that even if there was a way to turn a profit, focusing on other sites would still give you a better return. What’s left to be seen is if Stack Overflow can keep such a high signal-to-noise ratio as it continues to grow.


Why I'm Still Writing Java

Just over a year ago, I was sitting in a classroom at Sun’s virtually deserted Burlington campus, learning a new programming language from a living human in person for possibly the first time in my whole life. The class itself was a little slow; it definitely wasn’t geared towards experienced developers. But it was more effective learning than fiddling at work had been, since it successfully prevented all distractions. Now that I’ve had a year to work with the language (and have gone back and forth to programming in Perl), there are several reasons why I’m still using Java.

  1. Strong typing: Perl has no type safety, and I used to think that was an asset. But when you have a dozen programmers working on hundreds of source files, type conflicts can easily get introduced and go undetected for a long time. Objects are hashes, references are scalars, and arrays are quietly scalar-ized.
  2. WAR files: Releasing a new version of a website is brainless with Java. See that .war file that was built by Eclipse or Maven? Copy it to your server. Done.
  3. JUnit, Hibernate, Spring, log4j: Perl has a lot of freely available modules. But not a single one of them is as useful as Hibernate alone. It encapsulates database objects transparently, and is remarkably flexible. We've got a lot of awkward legacy database schemas, and without Hibernate's flexibility, we'd be building database objects by hand with JDBC. Spring's dependency injection and session management, JUnit's unit testing mechanisms, log4j's logging simplicity, and Maven's build architecture mean we spend less time planning and re-planning the infrastructure of our applications and more time implementing functionality.
  4. Eclipse: I know that there are Perl plugins for Eclipse (EPIC in particular), but they never added that much useful functionality as far as I was concerned. Having a full-featured IDE with method completion and inline error display saves me huge amounts of time.
  5. Everything is a reference: Java's object-oriented nature reminds me a lot of C++ (that's the OO language I have the most experience with) except for the lack of the object/pointer paradigm. The fact that everything (okay, okay, besides primitives) is a reference keeps me from having to remember all of my pointer-management skills.

I could probably come up with a dozen other reasons that I enjoy programming in Java, but those are the big ones that make my life easy.


Lessons learned from Film Addict

It’s been a week since I posted Film Addict. It got posted to kottke.org, Slashfilm, USAToday.com, and became a Twitter and Facebook meme on some level. At this point, it has received 100,000 pageviews, and it was filled out 37,000 times (peaking at a rate of over 1000 per hour, but at this point still about once a minute).

Lessons learned:

  • If I use strict and warnings, I should keep an eye on the Apache error log. I generated 10GB of errors in the first 24 hours, filling up /var on my webhost, which resulted in 8 hours of downtime during the Slashfilm surge.
  • Google Adsense now allows "personal" domains. I tried to sign up in 2004, when Weboggle was all the rage, and I was turned down because it lived under plutor.org. I'm not sure when the policy changed, but I applied this week and was approved on Thursday.
  • Google ads don't really make you much money. At this point, the ads on the Film Addict page have made me $2.90, but even if I had had them from day one, I'd only have made about $15.
  • I sleep fine at night, even after creating something clearly intended to be meme-fodder.
  • With enough data, you can get awesome bell curves. (Histogram of number of movies seen.)

It didn’t end up going much of anywhere on Digg or Reddit or Delicious. I’m not sure why, but I think part of it might have been the individual nature of the form. You could compare your list against your friends', but there was no community interest. Another factor (especially on Delicious) was the unique id in every posted URL; there could never be enough posts of a single URL to get any momentum.


Film Addict

A couple friends of mine posted one of those lame Facebook chain note things today. It was a list of a couple hundred movies, mostly franchises targeted at 18-to-25-year-olds from the past decade or so (think Scream, Saw, American Pie, etc.) with some others (mostly very popular) thrown in. “Copy this list to your profile and check off the ones you’ve seen”. The list’s arbitrary inclusion criteria angered me, so I decided to make my own, with a better interface than “copy and paste it yourself”. The list is IMDb’s Top 250.

I posted it to MetaFilter Projects. Pretty much immediately, Mathowie Twittered it. At this point, more than 200 people have taken it already. Give it a try, compare your list to mine.


Generating random user_ids

At work, each new user is assigned a totally random alphanumeric 12-character ID. They’re random instead of sequential because this is what goes into the user’s cookie (and in some cases into URLs) and we didn’t want the IDs to be discoverable. Sometimes we need to do what we call a subscriber load and generate thousands (or sometimes many thousands) of IDs at once. The subload process tends to be very slow, and one of my co-workers was tasked with making it faster. While profiling the code, he discovered that a big time sink was the ID generation procedure. After more research, we discovered that it was written in 2004 and had never been modified after the original checkin. It was hundreds of lines long, used all kinds of global variables (Perl didn’t have static variables until 5.10), and involved big math with a magic prime number close to 70^12. Worse, it was implemented as a hash function. And it was always passed a salt. And that salt was always random.

We replaced it with this code:

my @chars = ('A' .. 'Z', 'a' .. 'z', '0' .. '9');

sub randid {
    my $rv = '';
    for my $i (1 .. 12) {
        # rand(@chars) evaluates @chars in numeric context (62),
        # so this picks a uniformly random character
        $rv .= $chars[rand(@chars)];
    }
    return $rv;
}

It used to generate about 100 IDs per second. Now it can do 175,000 per second.


Redirect referer test

A web user is looking at page A. He clicks on a link for page B. That page has a META Refresh to page C. What is the value of HTTP_REFERER for that last request? What if the redirect was a Status 307? Or a location.replace() JavaScript call? What if he’s using Opera? I’ve been doing some redirect referer tests this week and I have results for some of the most common browser/OS combinations. I hope to expand them further.


Live photocasting Halloween

Sometime during the planning of our two parallel Halloween parties, Chris and I realized we needed some way to allow the two groups to communicate. Videocasting was our first thought, but we didn’t have the equipment or the knowledge. But when it came to taking photos and putting them on websites, we had all kinds of both. Using something Chris had written a while ago as a guide, I wrote a quick script to pull the photos off the camera, resize them to a reasonable resolution, and upload them to our web server. I then wrote a CGI that would pick a random photo from each location and place them side-by-side. Chris asked me to make the algorithm weight towards newer photos, which was far easier than it would have been if we had been uploading to a service like Flickr or something.
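
The weighting itself can be a one-liner. Something like this sketch (in Python for illustration; the function name and photo-list shape are made up):

import random

def pick_weighted(photos):
    """Pick a random photo, weighted toward newer ones.
    Assumes photos is ordered oldest to newest; photo i gets weight i+1,
    so the newest is picked about twice as often as the middle one."""
    weights = range(1, len(photos) + 1)
    return random.choices(photos, weights=weights, k=1)[0]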

After a rough (and late) start in Boston, things went well. Philly took some naughty shots early, which got people riled up here, and for a period of time things were pretty lewd. Eventually it became family-safe fun time party photos and some gentle photo-jabs were traded between the sister parties. Sometime around midnight, Chris texted me “this is the best thing we’ve ever done.” I agree. That said, we learned things. The script had all kinds of bugs (mostly because I wrote it without having the camera we were going to use or the software). The CGI was too heavily weighted towards new photos. And whereas Chris had done this before and had a neat photobooth setup (a side room, tripod, IR trigger), Boston had a camera that had to be hand-shot and plugged back into the laptop every few minutes.

Will we ever be able to learn from the mistakes we made, and try out a new iteration of the script? I certainly hope so. Maybe we can hook in a third city. California friends, I’m gesturing in your direction.


Upgrade and coComment

Some behind-the-scenes stuff: I’ve finally upgraded this blog from WordPress 1.5 to 2.0.4. While I was at it, I decided to install the coComment plugin, so that you can follow your comments with coComment. They have a new Firefox extension that makes it even easier than before. Almost trivial, in fact.

Update, 13 Aug - I've created a small plugin to add the slash:comments element to my RSS feed. That should fix the discussion counts on my home page, and it's a cleaner solution than hacking the WP code directly.


Four random Saturday links

Here’s a random Saturday link-dump:

  1. Don't believe BusinessWeek's bubble-math - Web 2.0 plus shoddy journalism equals a firm foundation for another bubble. BusinessWeek takes a made up number, multiplies it by a rumored percentage, contradicts itself several times, and most readers are probably just thinking "Wow, what a smart kid!" Related: A hilarious parody.
  2. Saved locations on Google Maps - This is a great thing. I've been waiting for some sort of smart auto-complete on Google Maps since day one. The interface is a little crusty (I wish I could click on a bubble anywhere and say "save this location" instead of having to have all locations saved), but I'm certain this is just release number one.
  3. No Space World, and Mario Galaxy could be available at launch. Or rather, no one has yet verified that Mario Galaxy won't be available at launch. Related: The early October release rumors still seem to have some air in them.
  4. Two Cool Bash Tricks - Holy cow. Both are total life savers, but the second more than the first. Redirecting output to two files before you can diff them is a big pain in the neck. (via)

Starfleet employee database design

How would you design a database of the names of Starfleet employees? At the very least, you’d have to handle standard European Human-style names (given name, family name); Bajoran-style names (family name, given name); and Klingon-style names (given name, son/daughter of father’s name, and you’d probably need to have House name in there, too). I’m certain there are others that I’m leaving out and that we’re unaware of. How would you do this?

Corollary: How would you alphabetize a list of Starfleet employees names? Would Ro Laren come before or after Jean-Luc Picard?
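
One hedged sketch of an answer (illustrative only; the field names are made up): store each name in its culture's own order, plus an explicit sort key, instead of pretending given/family columns fit everyone:

from dataclasses import dataclass

@dataclass
class StarfleetName:
    display_name: str   # rendered in the culture's own order
    sort_key: str       # what alphabetizing should actually use
    culture: str        # drives any parsing/formatting rules

crew = [
    StarfleetName("Jean-Luc Picard", "Picard, Jean-Luc", "human"),
    StarfleetName("Ro Laren", "Ro, Laren", "bajoran"),  # family name first
    StarfleetName("Worf, son of Mogh", "Worf", "klingon"),
]
for officer in sorted(crew, key=lambda n: n.sort_key):
    print(officer.display_name)

Under that scheme, Ro Laren comes after Jean-Luc Picard: both sort on family name, wherever that name happens to sit in the display form.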


Writing quines

I wrote my first quine today, in Perl. It was surprisingly simple:

$_=q($_=$$;
s/\$\$/q($_)/;
print;
);
s/\$\$/q($_)/;
print;

Next step: ETA.

Update 13:03: Someone beat me to it. Warning: the first line of the source is 11k characters long, which crashed my browser.


Three Words, Five Minutes

I wrote Three words, five minutes this morning after about a year of procrastinating. It was initially going to be big and complicated, but since that was the source of the procrastination, I made it a single frameset with Writeboard as the writing interface. TWFM makes it surprisingly easy to just go and get a little writing exercise, and even if it never inspires something longer, it’s worth it.

Nomad has been ignoring work with it all day.


String to DOM

I was working on a Greasemonkey script yesterday, and I ran into a problem for a second time. I had data (in this case XML; in the previous case, HTML) that I had retrieved via GM_xmlhttpRequest() (Greasemonkey's cross-site-capable and slightly more flexible implementation of the XMLHttpRequest object). I wanted to grab some small piece of information out of the huge string, but the easiest way I could think of to do this was with big ugly regular expressions. The data was goddamn XML; why couldn't I use DOM? The ideal solution would be for Greasemonkey to add a responseDocument field to the data it passes to the onload callback. Since the ideal solution is not currently available, I had to cobble something else together.

Update 9 Nov: I found a far better solution today. I can hardly believe it.

var xmlDoc;
if (typeof(DOMParser) != "undefined") {
    var parser = new DOMParser();
    // x is the raw XML string that came back from GM_xmlhttpRequest
    xmlDoc = parser.parseFromString(x, "application/xhtml+xml");
}

Greasemonkey Scripts

I discovered the Firefox extension Greasemonkey in early March 2005. I had probably heard of it before then, but I don't know why it had taken so long for me to really look at it. It's exactly the kind of thing that appeals to me: injecting JavaScript (often with DOM actions, possibly with Ajax) into any web page. Every once in a while, I come up with an idea for a one-off script. They're all available from my profile on Greasemonkeyed.com.


Source SDK next week

The most recent news post for Steam contains this gem:

Next week we will be releasing the Source SDK, along with a surprise for the community.


Inefficient sort algorithms

Flashback with me to my CompSci days (and realize how much I’ve forgotten, and how easy it is to remember) with this inefficient sort algorithms analysis. Check out the O(n! ^ 2) code at the end!
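
The classic of the genre is bogosort, which is merely expected O(n · n!): shuffle until sorted. A sketch for flavor (the linked analysis's O(n! ^ 2) monster goes much further):

import random

def is_sorted(a):
    return all(a[i] <= a[i + 1] for i in range(len(a) - 1))

def bogosort(a):
    """Shuffle until sorted: expected O(n * n!) runtime."""
    while not is_sorted(a):
        random.shuffle(a)
    return a

print(bogosort([3, 1, 4, 1, 5]))  # [1, 1, 3, 4, 5]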


Making GD::Graph beautiful

Before this morning, I never really looked too carefully at the pie charts I was generating with GD::Graph. Then I had an epiphany, and suddenly they were the ugliest things anywhere. So I set about making them more beautiful.

  1. Anti-aliasing. I generate the pie four times larger than it needs to be, and then I use copyResampled from the GD library to shrink it down. No more jaggies, and the rather obvious stray pixels disappear.
  2. Three-dimensional shadow. To emphasize the 3d look, I had to make a small change to the GD::Graph::pie module itself. Mine was in /usr/lib/perl5/site_perl/5.8.4/GD/Graph/pie.pm, but YMMV. I applied this patch:

     --- pie.pm	2004-10-12 11:10:30.272020883 -0400
     +++ /usr/lib/perl5/site_perl/5.8.4/GD/Graph/pie.pm	2004-10-12 07:55:54.000000000 -0400
     @@ -240,8 +240,13 @@
          for (my $i = 0; $i < @values; $i++)
          {
     +        my @col = $self->pick_data_clr($i + 1);
     +
              # Set the data colour
     -        my $dc = $self->set_clr_uniq($self->pick_data_clr($i + 1));
     +        my $dc = $self->set_clr_uniq(@col);
     +
     +        # Set the 3d shadow color (15% darker)
     +        my $sc = $self->set_clr_uniq( map {$_ * 0.85} @col );
      
              # Set the angles of the pie slice
              # Angle 0 faces down, positive angles are clockwise
     @@ -292,7 +297,7 @@
              {
                  $self->{graph}->fillToBorder(
                      $fill->[0], $fill->[1] + $self->{pie_height}/2,
     -                $ac, $dc);
     +                $ac, $sc);
              }
          }
  3. Minor size tweaks. I think the graphs look much better when they're thicker (thickness about 10% of total image height) and when the image is a little larger; I even used 120x90, a magic 4:3 ratio.

DOM Image Zoom

Here it is, the next in my unbounded series of DOM experiments. DOM Image Zoom. Enjoy!


Drag and drop DOM

I do three kinds of things at work:

  1. Work. Sub-things: programming, monitoring, management
  2. Not work. Sub-things: surfing, programming
  3. Semi-work. Usually proof-of-concept programming. This is the most interesting thing I do. WEBoggle started as a "what's this DOM thing I keep hearing about" semi-work. Here's another one.