Hacking MT 2.6; Or, I Don’t Understand Perl

By Evan — September 17, 2006 — Web

So I’m trying to make a couple of minor modifications to my Movable Type installation’s MT::App::Comments Perl module, in order to make life a little more difficult for comment spammers. My goal is hampered slightly because I don’t know a damn thing about Perl.

In fact, the only language I know reasonably well is Java. Which means that any time I look at code for Language X, my brain does its best to overlay Java syntax on top of Language X, adding in any bits of Language X that I might have accidentally picked up over the years.^[1] For example, I can read C++, sort of, because Java’s syntax was designed to be a lot like C++. Except that C++ has all these funny characters sprinkled in: ampersands and asterisks and God knows what else. And when my brain goes down into the depths to retrieve the meaning of these funny characters, it unlocks a door called “CS 5”,^[2] and I curl up in a fetal ball and start rocking back and forth and moaning, and the only thing that snaps me out of it is hearing my co-worker Jud singing “The Lunchtime Song.” Yes, we really have a lunchtime song. Want to make something of it?

Anyway, Movable Type is written in Perl, and the module that handles comments contains several hundred lines of code, some of which looks like:

if (!$q->param('text')) {
    return $app->handle_error($app->translate("Comment text is required."));
}

Now, even my poor little Java-addled brain can understand what this means. For starters, I understand “if” and curly braces. I understand the “!”, the “return”, and the basic idea behind “handle_error(stuff);”. Heck, I even think I understand “$q->param('text')”! And in fact, if you try to submit a comment and you leave the “text” field blank, you do indeed get an error message:

Comment text is required.

Try it! It’s fun!

So as a test of my elite Perl-hacking skills, I changed the snippet above to:

if (!$q->param('text')) {
    return $app->handle_error($app->translate("Comment text is required, silly."));
}

But to my great disappointment, submitting a comment with no text resulted in:

Comment text is required.

This could mean one of two things:

The Perl module is cached in memory, and the server is not picking up my change.
The actual error message is specified somewhere else in the code, and the snippet I was editing is just a red herring.

The red herring possibility seems unlikely. First, it would be perverse to create a module named MT::App::Comments, have a snippet that appears to handle the empty text field error, but actually handle the error somewhere else. I mean, I know the Perl world is whacky and crazy and There’s! More! Than! One! Way! To Do It!, but seriously. Second, I grepped through all of MT’s Perl code and did not find that particular error string anywhere other than in the Comments.pm file.

That leaves us with the caching possibility. To test this, I renamed the Comments.pm file, and lo! this completely broke comments. Excellent! Except that when I moved the file back, comment functionality resumed, but the system still wasn’t picking up my changes. So it seems that the system’s cache does bother to check whether any of its Perl module files have moved, but it doesn’t bother to waste time checking whether the module contents have actually changed. Perish the thought!

I can understand a caching system that is extremely “sticky” — even if you totally mess with the underlying files, the system doggedly continues to run with whatever it’s got in memory, until you somehow force the system to re-initialize. And I can understand a caching system that continually monitors the state of its files and obediently re-reads the code if it detects any changes. But the in-between behavior makes no sense. Why would you break if a file is missing, but not bother to read the file again when the file magically reappears? I have a hard time believing that this is actually the case. I must be misunderstanding how Perl is working here. Then again, after encountering the utter stupidity of pod2html for Perl 5.8 on Red Hat,^[3] I’m willing to believe anything.
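
For what it's worth, Perl's own module loading sits at the “sticky” end of that spectrum: within a single process, require loads a file once, records it in the built-in %INC hash, and does not look at the file again. A persistent setup such as mod_perl would keep that state between requests, while plain CGI starts a fresh process every time; which of those applies to this installation is not stated here. The sketch below is only a demonstration of the require behavior, using a throwaway module named Greeting (a made-up name, nothing to do with Movable Type).

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Throwaway module file in a temporary directory (hypothetical name).
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/Greeting.pm";

# (Re)write the module so that Greeting::text() returns the given string.
sub write_module {
    my ($text) = @_;
    open my $fh, '>', $file or die "Cannot write $file: $!";
    print {$fh} "package Greeting; sub text { '$text' } 1;\n";
    close $fh or die "Cannot close $file: $!";
    return;
}

write_module('required.');
require $file;                      # first load: file is compiled
my $first = Greeting::text();

write_module('required, silly.');
require $file;                      # already listed in %INC: nothing happens
my $second = Greeting::text();

delete $INC{$file};                 # forget that the file was loaded
require $file;                      # now it is read and compiled again
my $third = Greeting::text();

print "first:  $first\nsecond: $second\nthird:  $third\n";
```

The first two values are the old text; only the third load, after the %INC entry is deleted, picks up the edited file.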

Update: Success! Turns out I was grepping in the wrong place. The hack is now in place. See comments.

1. This is sort of like trying to read French by A) looking for words that are very similar in English and B) sprinkling in the thirty-odd actual French words that you dimly remember from high school. Which is, coincidentally, exactly how I read French. Quel surprise!

2. After taking that class, I swore I would never, ever work for the computer industry. And if I ever had to write software, it would only be to do something useful, like solving a physics problem.

3. As opposed to pod2html for Perl 5.8 on FreeBSD, which basically works okay-ish.

16 thoughts on “Hacking MT 2.6; Or, I Don’t Understand Perl”

Wade says:

September 17, 2006 at 7:12 pm

Long-time reader (via Auros), first-time poster. If you upgraded to the latest version of MT, 3.3 — which is now totally free for personal use — you’d get the ability to authenticate users via TypeKey, OpenID (the LJ system), or via a local members list. Switching to TypeKey cut down on my comment spam, which had gotten bad enough that I was thinking of shutting down comments.

If you don’t decide to upgrade, a lot of edge cases were taken care of inside MT 2.6 with unique, non-modularized code — a big reason they did a rewrite and rearchitect for 3.0. Look for some more code that looks like it does the same thing, you might not be changing the right code.

Evan says:

September 17, 2006 at 10:31 pm

Hi Wade! TypeKey and OpenID are a possibility, but I’m worried about them being too high a barrier for casual visitors. I’m pretty fed up with comment spammers, but TypeKey is a pretty big hammer.

What I’d really like to do is change my comment form’s “profile”. The default MT comments form presents certain form fields, and I think that if I change the names of those fields (or add a required field or two), that will defeat some of the spammers.
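
As a rough illustration of the "add a required field" idea, here is a minimal sketch. It is not the hack that was eventually installed (that code is not shown anywhere in this post); the field name favorite_color, the helper rejection_reason, and the second error message are all made up, and a plain hash reference stands in for the submitted form.

```perl
use strict;
use warnings;

# Hypothetical extra field that the stock comment form does not have.
my $EXTRA_FIELD = 'favorite_color';

# Return an error message for a submission, or undef if it passes.
# $params is a hash reference standing in for the submitted form fields.
sub rejection_reason {
    my ($params) = @_;
    return 'Comment text is required.' unless $params->{text};
    return 'Missing required field.'   unless $params->{$EXTRA_FIELD};
    return;
}

# A script that only knows the stock field names gets rejected.
my $bot_post   = { text => 'Buy pills' };
my $human_post = { text => 'Nice post', favorite_color => 'green' };

for my $post ($bot_post, $human_post) {
    my $reason = rejection_reason($post);
    print defined $reason ? "rejected: $reason\n" : "accepted\n";
}
```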

I might upgrade to MT 3.x at some point, but only if the spammers become so intolerable that I have no other option…

Evan says:

September 17, 2006 at 11:12 pm

In the meantime, I am still confused about why moving or deleting an underlying Perl module cripples Movable Type, but changing the content of a Perl module has no effect whatsoever. I’m pretty sure I’m changing the right code…

Auros says:

September 18, 2006 at 12:30 pm

I don’t understand much of the MT model, though I did once (partly because it was an interesting challenge, and partly because it was a favor for a girl I had a crush on) hack a module for displaying your NetFlix queue into a form that would display your GreenCine queue. I even added functionality that would let it locally cache the images of the disc covers, rather than simply SRCing the IMG tags back to the site. I was fairly proud of that trick.

Sadly, the module broke when they redid the site, since the pattern matching to find the key info no longer worked. And I never did get around to fixing it again.

In any case, there are a number of web platforms that require some sort of “reinit” command to read in changes. E.g., I know that on the ArsDigita Community System, written in the Tcl scripting language (which might be located somewhere between Perl and Python, in terms of its flexibility, but with some evil elements of Smalltalk thrown in to drive you nuts), when you change files, you basically have to restart the whole HTTP server to get it to read in the changes.

I do agree it’s odd, though, that moving the file would cause problems if it’s not reading the code… I’m guessing the key is somewhere in those two functions wrapped around the message.

Evan says:

September 18, 2006 at 8:21 pm

I doubt those functions are doing anything *that* special. And I have no idea how to reinitialize Movable Type without bouncing the server. I should probably post on my web host’s forum and see if anyone there knows what’s going on…

Jacques Distler says:

September 18, 2006 at 10:07 pm

Internationalization.

Think about what the function

$app->translate()

might possibly do.

If you don’t care about the ability to suddenly start spouting error messages in Japanese, maybe you want to pass a string directly to $app->handle_error()
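
To make the hint concrete, here is a toy sketch of what a translate-style function might do. It is an assumption for illustration, not Movable Type's actual implementation: the %lexicon table, its single French entry, and the free-standing translate sub are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical lexicon: English phrase => localized phrase.
my %lexicon = (
    'Comment text is required.' => 'Le texte du commentaire est requis.',
);

# Toy stand-in for a translate() method: return the localized phrase when
# the lexicon has one, otherwise hand back the original string unchanged.
sub translate {
    my ($phrase) = @_;
    return exists $lexicon{$phrase} ? $lexicon{$phrase} : $phrase;
}

print translate('Comment text is required.'), "\n";
print translate('Comment text is required, silly.'), "\n";
```

In a scheme like this one, the English string doubles as the lookup key, so an edited string simply misses the table and comes back as typed.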

Despite my own travails, I might still second Wade’s suggestion that you switch to MT 3.32 before investing too much time in hacking the code.

There’s a lot of built-in spam-filtering technology in 3.3x that’s much easier to configure than stumbling through the Perl code for MT::App::Comments. Heck, there’s even an API for writing your own comment spam filters (should you ever decide to become a Master of Perl, yourself).

Evan says:

September 18, 2006 at 11:16 pm

Success!

Wade’s advice, “Look for some more code that looks like it does the same thing, you might not be changing the right code,” was dead-on. It turns out that the module that handles comment error messages and rejections is not under the MT lib/ directory as I thought; it’s in a parallel extlib/ directory. Specifically, it’s contained in a 3rd-party module I installed a while back, one that came along with MT-Blacklist.

The MT-Blacklist module has some functionality that depends on the standard Comments.pm module, which is why I was able to break MT by renaming the module, adding garbage text, or even commenting out certain key functions. But the MT-Blacklist module overrides a lot of functionality too, and it just so happens that the hack I wanted to make was in the part of the code where the MT-Blacklist module takes over. So those changes to Comments.pm, as long as they were valid Perl, were being ignored.
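
How MT-Blacklist actually hooks in is not shown here, but one common Perl mechanism that produces exactly this symptom is replacing a subroutine in another package at runtime. The sketch below assumes that mechanism; the package names Stock::Comments and Plugin::Override and both subs are invented for illustration.

```perl
use strict;
use warnings;

package Stock::Comments;    # made-up stand-in for the standard module

sub error_text { return 'Comment text is required.' }
sub post       { return 'Error: ' . error_text() }

package Plugin::Override;   # made-up stand-in for the third-party module

# Replace the stock error_text by assigning a new code reference to its glob.
sub install {
    no strict 'refs';
    no warnings 'redefine';
    *{'Stock::Comments::error_text'} = sub {
        return 'Comment text is required (plugin).';
    };
    return;
}

package main;

my $before = Stock::Comments::post();
Plugin::Override::install();
my $after  = Stock::Comments::post();

print "$before\n$after\n";
```

Once install has run, editing the body of the stock error_text changes nothing, yet removing the stock package still breaks post, which the plugin never replaced.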

Anyway, I’ve now implemented my little 3-line hack in the correct module. As far as cutting down the spam goes, I got a fair bit of mileage out of simply renaming my comment script to “talk.cgi”… for a few months, anyway. Let’s see if this latest change has any effect.

Andrei says:

September 19, 2006 at 4:36 pm

‘Thank you for playing Perl. You lose. Game Over.’

Evan says:

September 19, 2006 at 5:23 pm

Hey, it beats XSL.

Andrei says:

September 19, 2006 at 6:07 pm

Saying that Perl is easier than XSL is like saying that K2 is shorter than Everest.

Evan says:

September 20, 2006 at 11:28 am

I don’t buy your K2/Everest analogy. I’m not the world’s biggest fan of Perl, but seriously:

<xsl:variable name="foo" select="'bar'" /> ?!?

But hey, the payoff for all that verbosity is, I can… use an XSL stylesheet to transform an XSL stylesheet into another XSL stylesheet! Or something like that. I guess we should just thank our lucky stars that XPath isn’t expressed as XML, as some of the designers apparently wanted.

Auros says:

September 20, 2006 at 12:24 pm

Hmm. Rewriting a function, then leaving the original in. Sort of like C++ overloaded operators, except without the useful element of, you know, preserving both functionalities and using them in appropriate (and, one hopes, obvious) contexts.

It’d be nice if the process for installing the module did something like insert a line, right at the top of each function it’s replacing, explaining that
# this code is not in use, due to installation of module Foo; see /path/file.pm for the new version

Evan says:

September 20, 2006 at 1:14 pm

Nah… a Real Man (TM) would have read through and understood all of the Perl modules, both standard and 3rd-party. Then everything would have been trivial.

The only real source of truth is the code! Documentation is for losers.

Andrei says:

September 20, 2006 at 5:43 pm

At least XSL is readable. Perl people like to say that Perl is easy. But it goes like this:

Lesson 1
my $a = 2; print "$a * $a = ", $a * $a, "\n";

Lesson 2
my $index = ( grep({$files[$_] =~ m/\.copy$/} reverse(0..$#files) ) )[0]; splice(@files, 0, $index);

<insert Raiders of the Lost Ark head-melting scene here>
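
For anyone whose head has not melted yet, Lesson 2 can be unpacked with sample data. This is a sketch with a made-up file list, and it assumes the pattern was meant to match a literal dot before "copy".

```perl
use strict;
use warnings;

my @files = qw(a.txt b.copy c.txt d.copy e.txt);   # made-up sample data

# Walk the indices from last to first, keep those whose name ends in
# ".copy", and take the first hit: the index of the last ".copy" entry.
my $index = ( grep { $files[$_] =~ m/\.copy$/ } reverse(0 .. $#files) )[0];

# Remove everything before that index, so the list now starts there.
splice(@files, 0, $index);

print "index = $index, remaining = @files\n";
```

With this list, $index is 3 and @files ends up holding d.copy and e.txt.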

Henci says:

September 21, 2006 at 12:38 am

“The only real source of truth is the code! Documentation is for losers.”

Aren’t you, um, a tech writer? Which means you, well, write documentation for a living? In light of which, your exhortation seems kind of counterproductive. Just asking.

— Henci

Auros says:

September 21, 2006 at 5:23 pm

I write considerably more complex Perl statements than the things you gave as examples… but I make the names fairly instructive. For example…

$syl_trans = join '', map { $dev2rom{$_} } split //, $syl;

This is the second most important line in my code for transliterating Devanagari to Roman. We’ve previously manipulated a syllable in a few ways to deal with irregularities, and lastly we get its transliteration, by splitting the syllable into individual letters (the null pattern // indicating that we split on nothingness, i.e. between every letter), mapping those letters from Devanagari to Roman (using the dev2rom table), and then joining the results back together (with the null string, since we don’t want to insert anything between elements of the output).
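
That line can be tried in isolation with a tiny table. The two entries and the romanized spellings below are assumptions for illustration only; the real dev2rom table and its transliteration scheme are not given in this comment.

```perl
use strict;
use warnings;

# Toy lookup table with two entries (hypothetical romanizations).
my %dev2rom = (
    "\x{915}" => 'k',     # DEVANAGARI LETTER KA
    "\x{940}" => 'ii',    # DEVANAGARI VOWEL SIGN II
);

# One syllable: the letter KA followed by the vowel sign II.
my $syl = "\x{915}\x{940}";

# Split into single characters, map each through the table, rejoin.
my $syl_trans = join '', map { $dev2rom{$_} } split //, $syl;

print "$syl_trans\n";
```

With this toy table the result is the string kii.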

Earlier, we’ve split a word into its syllables, using a complex regular expression — but that’s also quite readable, if you know Devanagari and Perl regexps. There are no literals in it at all; they’re all replaced by variables, which contain a string of characters and are put into brackets to form a class — things like [$bindu] (the term for nasalization modifiers that appear above the vowel), or [$cons] (the base consonants). The class [abc] matches any of the letters a, b, or c. By using a variable instead ( $var = 'abc', m/ [$var] /ox ) I can define the class right up at the top, and then not worry about it again. This has the added bonus that I can then copy most of the code from my Devanagari transliterator for use in a Bengali or Tamil or Gujarati version, by just changing the class definitions and fiddling with the code that handles irregular cases.
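
The class-in-a-variable trick can be sketched with plain ASCII. The classes and the word below are made up and far simpler than a real Devanagari syllable pattern, and the o modifier is left off since it only affects how often the pattern is recompiled.

```perl
use strict;
use warnings;

# Character classes defined once, up top (made-up ASCII stand-ins).
my $cons   = 'bn';
my $vowels = 'a';

my $word = 'banana';

# With the x modifier, whitespace in the pattern is ignored, so the classes
# can be spaced out. Each match is one consonant followed by one vowel.
my @syllables = $word =~ m/ ( [$cons] [$vowels] ) /gx;

print join('-', @syllables), "\n";
```

Changing the two class definitions is enough to retarget the same pattern, which is the reuse described above.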

You shouldn’t have to know the details of this particular script in order to make sense of it, but I think it’s reasonable to expect anyone looking at the code to learn basic Perl, and to know a bit about the domain the script is dealing with (in this case, Indic languages). Like, you need to know that $_ basically means “the thing I’m currently thinking about”, so in this context it’s referring to the letter that’s being mapped through the hash table. $whatever means a scalar (any single value, could be a number, a string, or even a pointer to something), and if a scalar is referenced as $foo{key} it’s the member of the table foo named key. And you should know how to read a regexp (what a [class] is, the fact that whitespace doesn’t matter because I added the x modifier, and so on).

Comments are closed.
