Perl backticks and open files

When trying to write some debugging code, I noticed something very strange. I could never get fuser to admit that its parent had a file open when called from a Perl script:

$ perl -e 'open(F, ">> /tmp/foo") or die; `fuser -a /tmp/foo`;'
/tmp/foo:

But when I replaced the fuser call with a sleep, and then called fuser from a different shell, I’d get Perl’s pid, as expected. I was about to post this question to SuperUser and decided to try this just to make sure:

$ fuser . | grep -c $$
1

Wait, what? That worked? So I asked myself: Is this something specific to Perl? Is this something about the way Perl is running child processes? If only There Was More Than One Way To Do It. Oh wait, there is.

$ perl -e 'open(F, ">> /tmp/foo") or die; system("fuser -a /tmp/foo");'
/tmp/foo: 3733

I haven’t been able to find documentation about what exactly backticks do with open filehandles that makes fuser report the wrong information, but system() clearly doesn’t do it. I just thought I’d document it here for the next person who is trying to do the same thing.
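
One possible explanation, untested against the setup above: backticks capture the child's stdout and return it as the value of the expression, while system() leaves the child's stdout connected to the terminal. The psmisc fuser man page says fuser writes only the PIDs to stdout and everything else to stderr, so in the backticks version the PID may simply have been captured and then thrown away. Here is a minimal sketch of that capture behavior that doesn't need fuser at all. The child one-liner, the $child and $captured names, and the fake PID 1234 are all made up for illustration, and the quoting assumes a Unix shell.

```perl
use strict;
use warnings;

# Hypothetical stand-in for fuser: writes a fake PID to stdout and a
# filename label to stderr.
my $child = qq{"$^X" -e 'print STDOUT q(1234); print STDERR q(/tmp/foo: )'};

# Backticks (qx) capture the child's stdout; its stderr still goes
# straight to the terminal.
my $captured = qx{$child};
print "\ncaptured by backticks: [$captured]\n";

# system() captures nothing: both streams reach the terminal.
system($child);
print "\n";
```

If this is what happened, printing the result of the backticks expression should show the missing PID.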


Why I'm Still Writing Java

Just over a year ago, I was sitting in a classroom at Sun’s virtually deserted Burlington campus, learning a new programming language from a living human in person for possibly the first time in my whole life. The class itself was a little slow; it definitely wasn’t geared towards experienced developers. But it was more effective learning than fiddling at work had been, since it successfully prevented all distractions. Now that I’ve had a year to work with the language (and have gone back and forth to programming in Perl), there are several reasons why I’m still using Java.

  1. Strong typing: Perl has almost no compile-time type checking, and I used to think that was an asset. But when you have a dozen programmers working on hundreds of source files, type mismatches are easy to introduce and can go undetected for a long time. Objects are usually just blessed hashes, references are scalars, and arrays are quietly scalar-ized.
  2. WAR files: Releasing a new version of a website is brainless with Java. See that .war file that was built by Eclipse or Maven? Copy it to your server. Done.
  3. JUnit, Hibernate, Spring, log4j: Perl has a lot of freely available modules. But not a single one of them is as useful as Hibernate alone. It encapsulates database objects transparently, and is remarkably flexible. We've got a lot of awkward legacy database schemas, and without Hibernate's flexibility, we'd be building database objects by hand with JDBC. Spring's dependency injection and session management, JUnit's unit testing mechanisms, log4j's logging simplicity, and Maven's build architecture mean we spend less time planning and re-planning the infrastructure of our applications and more time implementing functionality.
  4. Eclipse: I know that there are Perl plugins for Eclipse (EPIC in particular), but they never added much useful functionality as far as I was concerned. Having a full-featured IDE with method completion and inline error display saves me huge amounts of time.
  5. Everything is a reference: Java's object-oriented nature reminds me a lot of C++ (that's the OO language I have the most experience with) except for the lack of the object/pointer paradigm. The fact that everything (okay, okay, besides primitives) is a reference keeps me from having to remember all of my pointer-management skills.
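
The "quietly scalar-ized" complaint in item 1 is easy to reproduce. Below is a minimal sketch of Perl accepting, without any warning, code that a Java compiler would reject as a type error. The variable names and the describe helper are made up for illustration.

```perl
use strict;
use warnings;

my @ids = ('a1', 'b2', 'c3');

# Assigning an array to a scalar silently yields its length, not its contents.
my $count = @ids;

# A reference is just another scalar, so nothing stops it from being passed
# where a plain string was expected.
my $ref = \@ids;

# Hypothetical helper that expects a plain string.
sub describe {
    my ($name) = @_;
    return "name is $name";
}

print "count: $count\n";       # count: 3
print describe($ref), "\n";    # something like "name is ARRAY(0x...)"
```

Neither line triggers a complaint, even with strict and warnings enabled.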

I could probably come up with a dozen other reasons that I enjoy programming in Java, but those are the big ones that make my life easy.


Generating random user_ids

At work, each new user is assigned a totally random alphanumeric 12-character ID. They’re random instead of sequential because this is what goes into the user’s cookie (and in some cases into URLs) and we didn’t want the IDs to be discoverable. Sometimes we need to do what we call a subscriber load and generate thousands (or sometimes many thousands) of IDs at once. The subload process tends to be very slow, and one of my co-workers was tasked with making it faster. While profiling the code, he discovered that a big time sink was the ID generation procedure. After more research, we discovered that it was written in 2004 and had never been modified after the original check-in. It was hundreds of lines long, used all kinds of global variables (Perl didn’t have static variables until 5.10), and involved big math with a magic prime number close to 7012. Worse, it was implemented as a hash function. And it was always passed a salt. And that salt was always random.

We replaced it with this code:

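# Alphabet of 62 alphanumeric characters to draw from.
my @chars = ('A' .. 'Z', 'a' .. 'z', '0' .. '9');

# Return a random 12-character alphanumeric ID.
sub randid {
    my $rv = '';    # lexical, so nothing leaks into a global
    for my $i (1 .. 12) {
        # rand(@chars) yields a float in [0, 62); the array index truncates it.
        $rv .= $chars[rand(@chars)];
    }
    return $rv;
}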

It used to generate about 100 IDs per second. Now it can do 175,000 per second.
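
Those numbers will vary by machine. Below is a rough way to measure the throughput yourself, using only core Perl (Time::HiRes ships with it). The ids_per_second helper and the count of 100,000 are arbitrary choices for illustration, not part of the code we shipped.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

my @chars = ('A' .. 'Z', 'a' .. 'z', '0' .. '9');

# Same generator as above, written compactly.
sub randid {
    my $rv = '';
    $rv .= $chars[rand(@chars)] for 1 .. 12;
    return $rv;
}

# Hypothetical helper: generate $count IDs and return IDs per second.
sub ids_per_second {
    my ($count) = @_;
    my $start = time;
    randid() for 1 .. $count;
    my $elapsed = (time - $start) || 1e-9;    # guard against a zero reading
    return $count / $elapsed;
}

printf "%.0f IDs per second\n", ids_per_second(100_000);
```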