(this is a continuation of the series started in this post, so you may want to start there)
New tool for today: Rocco Caputo's amazingly handy Lexical::Persistence module. We're going to use this to persist lexical variables - i.e. those declared with 'my $var' - between invocations. But before we can do that, we need to break Devel::REPL out a bit and move it from doing a simple string eval on the supplied line to building a subroutine reference from it. So, -now- we change $self->execute($line) in run_once to call $self->eval($line) instead, and expand the eval code as follows -
sub eval {
my ($self, $line) = @_;
my ($to_exec, @rest) = $self->compile($line);
return @rest unless defined($to_exec);
my @ret = $self->execute($to_exec);
return @ret;
}sub compile {
my $_REPL = shift;
my $compiled = eval $_REPL->wrap_as_sub($_[0]);
return (undef, $_REPL->error_return("Compile error", $@)) if $@;
return $compiled;
}sub wrap_as_sub {
my ($self, $line) = @_;
return qq!sub {\n!.$self->mangle_line($line).qq!\n}\n!;
}sub mangle_line {
my ($self, $line) = @_;
return $line;
}sub execute {
my ($self, $to_exec, @args) = @_;
my @ret = eval { $to_exec->(@args) };
return $self->error_return("Runtime error", $@) if $@;
return @ret;
}sub error_return {
my ($self, $type, $error) = @_;
return "${type}: ${error}";
}
The end result here is that we end up with '1 + 1;' in the REPL becoming "sub {\n1 +1;\n}\n" before execution; since perl treats the final expression in a sub as an implicit return if no explicit one is present, everything continues to work as before - but we now have a bunch of extra hooks with which to modify the execution flow (we'll need the hook offered by mangle_line today, and the error_return one will come in handy for printing backtraces and similar should we desire those later).
The reason for the odd setup in 'sub compile' is to ensure only the $_REPL variable is in scope when the compilation takes place; this means that we have any other variable name to ourself in the code being executed (and accidental specification of any other variable without declaring it will cause a compile-time error).
So, main code refactored, on with Devel::REPL::Plugin::LexEnv -
package Devel::REPL::Plugin::LexEnv;
use Moose::Role;
use namespace::clean -except => [ 'meta' ];
use Lexical::Persistence;has 'lexical_environment' => (
isa => 'Lexical::Persistence',
is => 'rw',
required => 1,
lazy => 1,
default => sub { Lexical::Persistence->new }
);
The -except in namespace::clean is new, and actually indicates a mistake I made in the first article - fortunately, since Devel::REPL itself inherits a meta method from Moose::Object it didn't actually break anything, but in the case of a plugin we can't unimport it since a role isn't a class and can't have a superclass. I'm using a slightly more verbose formatting for the attribute for clarity; anybody who prefers one or t'other should leave a comment to that effect. Also, note that the isa type is actually referencing a type constraint (see Moose::Util::TypeConstraints for the list of standard ones and the functions to create custom ones), which since it doesn't exist Moose automatically creates for us as a subtype of Object which requires that the value passes ->isa('Lexical::Persistence'). Now for the meat -
around 'mangle_line' => sub {
my $orig = shift;
my ($self, @rest) = @_;
my $line = $self->$orig(@rest);
my $lp = $self->lexical_environment;
return join('', map { "my $_;\n" } keys %{$lp->get_context('_')}).$line;
};around 'execute' => sub {
my $orig = shift;
my ($self, $to_exec, @rest) = @_;
my $wrapped = $self->lexical_environment->wrap($to_exec);
return $self->$orig($wrapped, @rest);
};
Ok, this is slightly involved for not many lines of code. So in order to make it clearer and ease your mind that I'm still sane, we'll examine it in reverse order.
return $self->$orig($wrapped, @rest);
Ok, by this point we've got a subroutine reference that's been wrapped by the Lexical::Persistence context, so we pass that off to the base Devel::REPL execute to deal with normally (note @rest currently doesn't do anything - passing it around all over the place is basically a politeness in case another plugin author wants to use it).
my $wrapped = $self->lexical_environment->wrap($to_exec);
This is the bit where Lexical::Persistence does its magic. It wraps the subroutine reference built by the $repl->compile process in code that fills out the lexical environment of the subroutine from the data it has stored - so if the $lp has data for a $foo variable and it sees a 'my $foo' lexical in the subref, it sets the value of that lexical to its data for it before execution - and then afterwards, it goes through the lexicals' current values and saves them away. Which means -
$ my $foo = 3;
3
$ $foo + 1;
4
$
- that the variables we've declared are now persistent between lines executed within the repl, even though each line becomes a separate subref with its own independent lexical environment and namespace.
Which begs the question, why did the second line compile? Without the plugin we'd simply get an error saying $foo isn't declared, since the code is still compiled under 'use strict'. The magic for this is in the first around, in this line:
return join('', map { "my $_;\n" } keys %{$lp->get_context('_')}).$line;
which pulls out all the keys from the default context (which currently is the only one we're using, and called '_' since other contexts in Lexical::Persistence are keyed by variable name prefix) and constructs declarations from them which are prepended onto the line during mangling and before the subref is compiled. To see what I mean by this a bit more clearly, I'll declare a couple more variables and then replicate the behaviour (getting $lp directly to save space)
$ my $foo = 3;
3
$ my @bar = (1, 2, 3);
1 2 3
$ my $baz = 'spoon!';
spoon!
$ join('', map { "my $_;\n" } keys %{$_REPL->lexical_environment->get_context('_')});
my $foo;
my $baz;
my @bar;$
and also, here we see one other useful behaviour - variable names starting with '_' aren't persisted by default, thus avoiding us needing to worry about $_REPL being accidentally stomped on by Lexical::Persistence. You should probably have a read of the full documentation to get the big picture, though, along with PadWalker and Devel::LexAlias if you want to understand how all this is implemented under the hood. And, of course, the complete final code to the plugin.
Here endeth part 3, two days late due to a combination of extreme tiredness one night and having to wait for my laptop to dry out after being caught in heavy rain on the walk home the next. I'm not entirely sure what I'll be writing about next time round, but hopefully a few of you will turn up to find out anyway. Later ...
So, last time we got from scratch to a basically working simple perl REPL. So now it's time to start writing plugins, since currently it's disturbingly basic and only really useful for ... well, for making a blog post about how it works at all :)
First, we should probably provide some history so we take advantage of the readline goodness available. This is, fortunately, going to be trivially simple, but I'm going to digress slightly before that and explain how our plugin system works. When you call the ->load_plugin method on the repl object with a plugin name, it searches for an installed module called Devel::REPL::Plugin::NameOfPlugin, which it expects to be a Moose Role. The basic concept of a role is 'a mixin on steroids' - not only can you provide methods but method modifiers, which are AOP-style before/after/around advice functions. If none of that made any sense, you'll see it in action in a moment.
Once the plugin has been loaded, it's composed into the current -object-, not the current class. Devel::REPL itself is unchanged, what MooseX::Object::Pluggable does is automatically create an anonymous subclass and move the Devel::REPL object into that (similar to ruby's singleton objects, I believe), then adds the role to that. This means we can have more than one object of our class with different plugins attached - probably not that useful just yet for a REPL but a nice feature to have, especially since I plan to start doing Devel::REPL development from within Devel::REPL as soon as I can.
Now, Term::Readline gives you automagic history handling, but it'd be really nice to have a way to display history lines ourselves and to fire previous commands quickly without hammering the up key. So, without further ado, it's time for Devel::REPL::Plugin::History. Here's lib/Devel/REPL/Plugin/History.pm -
package Devel::REPL::Plugin::History;
use Moose::Role;
has 'history' => (
isa => 'ArrayRef', is => 'rw', required => 1, lazy => 1,
default => sub { [] }
);sub push_history {
my ($self, $line) = @_;
push(@{$self->history}, $line);
}
Again, 'use Moose::Role' saves us needing strict and warnings. Roles can add attributes too, so we define our history attribute here to hold the history line list.
This time we -do- set a type constraint because it's important that everything can assume the history is an array reference - although there's no need to specify what the array contains. The one interesting part is specifying lazy on the attribute - were this a normal class or role there'd be no reason for it since the default value isn't dependent on anything else, but since this is a plugin designed to be added to an object -after- new() (construction) time the default wouldn't fire, so instead we make it lazy so it gets called on first get, just like out_fh in Devel::REPL itself.
The push_history method is pretty trivial, the @{$self->history} deferences the arrayref so 'push' can shove the line onto the end. The principle reason for making this a standalone method is to provide a hook for other plugin authors to use to wrap the process - maybe they'll want to log the history to a file as well, or throw away lines once there's 500 in the history (which bash does in most default configs).
Now we need to hook this method into the processing, and add a little magic for some extra syntax -
around 'read' => sub {
my $orig = shift;
my ($self, @args) = @_;
my $line = $self->$orig(@args);
The 'around' type of method modifier wraps all calls to the method of the same name in the class itself (there's also an 'override' type which provides some extra sugar but is designed for classes, not roles). It's special in that it provides an extra argument on the front which is a reference to the original method, so we shift that off the argument list first (list operations in perl default to @_ to save us typing for this sort of thing). Then, we grab the object and any arguments, pulling them into an @args list since we neither know nor care what they are, and call the original method to get the line. The $self->$orig trick is one of the beauties of perl - you can provide a variable containing -either- a method name or a reference to the sub implementing the method and perl Does The Right Thing (tm).
if (defined $line) {
if ($line =~ m/^!(.*)$/) {
my $call = $1;
$line = $self->history_call($call);
if (defined $line) {
$self->print($line."\n");
} else {
return "'Unable to find ${call} in history'";
}
}
if ($line =~ m/\S/) {
$self->push_history($line);
}
}
if ($line =~ m/\S/) {
$self->push_history($line);
}
}
return $line;
};
If the line's undef we're about to leave the repl, so we skip our processing. Then we check for !-style history requests and if so we filter the line -without- the ! through the history_call method and print the line that's been substituted from the history for reference if present, or return an error string if not. Next, if the line contains anything but whitespace (\S inverts the \s whitespace regexp match group) we add it to the history before returning it.
Note the semicolon after the closing } - this is required with method modifier declarations to end the statement since they aren't normal perl sub definitions - in fact they're implemented in terms of a sub call, so theoretically could take extra arguments albeit those provided by Moose don't.
Finally, we need to implement history_call -
sub history_call {
my ($self, $call) = @_;
if ($call =~ m/^(-?\d+)$/) { # handle !1 or !-1
my $idx = $1;
$idx-- if ($idx > 0); # !1 gets history element 0
my $line = $self->history->[$idx];
return $line;
}
my $re = qr/^\Q${call}\E/;
foreach my $line (reverse @{$self->history}) {
return $line if ($line =~ $re);
}
return;
}
The only interesting bit in the first part is the $idx-- if ($idx > 0) statement, since perl arrays can already handle
$x[2]; # third element
$x[-2]; # last but one element
but of course the tradition in shells is !1 should be the first history line rather than !0. $self->history->[$idx] then uses this, taking advantage of arrayref derefence via -> along the way.
Then the second part handles the !search style (I could have put this in an else but I prefer to have something for a sub to fall through to in case of weirdness and at least one return to be at top level - stylistic only); qr is 'quote regex' which makes $re a re-usable compiled regex object, done here for clarity rather than the usual performance reasons, \Q and \E disable and then enable regex metacharacters so a . or * in $call isn't misinterpreted.
So, not bothering to document the final 1; for a second time, here's a session with the plugin -
cain$ perl -Ilib -MDevel::REPL -e 'my $repl = Devel::REPL->new; $repl->load_plugin("History"); $repl->run;'
$ 1;
1
$ int(1.5);
1
$ (1..3);
1 2 3
$ !-1
(1..3);
1 2 3
$ !1
1;
1
$ !-30
Unable to find -30 in history
$ !int
int(1.5);
1
$ !foo
Unable to find foo in history
... so, check in the completed code, and I think we're about done for this iteration. Next time we'll look at adding lexical persistence so 'my' variables persist between calls. And maybe I'll finally get as far as sorting out my e-mail :)
=== update, April 25
For the impatient, part 2 and part 3 are already out and I'm aiming to publish a part per week until I run out of ideas and change to a different topic.
=== end update, original post follows
So, I need to sort out my personal e-mail - I've left it alone for a few weeks and it's accumulated >10k messages. Forwarding it to gmail sort of works, but I'm a die-hard mutt user. I also prefer doing mail access over IMAP, so things like procmail aren't spectacularly useful. Plus I find procmail even uglier than the worst perl I've ever seen, so ... no.
Which means I need to script mail classification and filtering over IMAP. Which means I need an easy way to experiment with the various CPAN IMAP modules without repeatedly fetching the header list. Which means a repl - read-eval-print loop, basically an interactive shell for $language_of_choice, ideally, so I can prat around interactively and make my mistakes in an environment where it's not going to screw me entirely when I get it wrong.
Now, ok, there are two already on CPAN. Great. Except Shell::Perl uses package variables to persist data between lines (think namespaced globals) and App::REPL is somewhat baroque and really, really wants to use bright colours everywhere. And I'm an old fvwm2-loving curmudgeon who really hates colourisation. Plus I really want to make something that's nice and easy to extend, which means I want to use the meta-object-orientation goodness of Moose and the runtime plugin facilities of the MooseX::Object::Pluggable role. Soo, sod it. I'll write a new one, and explain what I'm doing and why as I go along as an attempt to justify the level to which this is now yak shaving
Name first. Easy. Devel::REPL, because (a) it isn't taken and (b) a REPL to me is very very much a development tool, so it's a reasonably sane namespace. The actual script is going to be called re.pl, mostly because I can and because it amuses me. Last tools to mention before I start - Term::Readline, which will do the heavy lifting of handling readline capabilities, and namespace::clean which will let me clear out any helper functions I import from my classes so I can inherit methods of the same name without breaking anything.
First, setup the dist directory and open the module file -
cain$ mkdir Devel-REPL
cain$ cd Devel-REPL
cain$ mkdir -p lib/Devel
cain$ vi lib/Devel/REPL.pm
Declare the package (class) name and load the tools I need -
package Devel::REPL;
use Term::ReadLine;
use Moose;
use namespace::clean;
Note that I don't need to explicitly ask for 'strict' and 'warnings' as is normal at the top of a perl file - Moose does this automatically. namespace::clean comes last because it examines the package's namespace at the point it's use'd to figure out what to clean out afterwards - so far, just the stuff that came from Moose but there could easily be more later. I don't need to declare a base class because I get the standard Moose::Object but I do need to load the Pluggable role to get the load_plugin goodness -
with 'MooseX::Object::Pluggable';
Now, according to the Term::ReadLine synopsis, which handily is an extremely primitive REPL in and of itself, I'm going to need at least a term object, a prompt string and an output filehandle, so let's declare those as attributes -
has 'term' => (
is => 'rw', required => 1,
default => sub { Term::ReadLine->new('Perl REPL') }
);has 'prompt' => (
is => 'rw', required => 1,
default => sub { '$ ' }
);has 'out_fh' => (
is => 'rw', required => 1, lazy => 1,
default => sub { shift->term->OUT || \*STDOUT; }
);
The 'rw' means I'll get a getter/setter accessor type for each of these, required prevents them accidentally being set to something undefined, and I've made the 'out_fh' attribute lazy so that it can rely on being defaulted -after- the object's constructed so the call to 'term' will work. I could have set restrictions on what types the values provided to these attributes are by providing the 'isa' option to the has calls but I can't see any advantage to it right now and I might want to pass something odd in for interesting purposes later.
Next step is to create an initial runloop that calls on read, execute and print steps (why execute and not eval? we'll get to that in part 2 :) -
sub run {
my ($self) = @_;
while ($self->run_once) {
# keep looping
}
}sub run_once {
my ($self) = @_;
my $line = $self->read;
return unless defined($line); # undefined value == EOF
my @ret = $self->execute($line);
$self->print(@ret);
return 1;
}
Separating out run and run_once may seem largely pointless at this stage, but later on we may want to hook some sort of action to happen before every step - say incrementing a counter in the prompt (or something more interesting once I think of it :). I imagine a bunch of you are probably muttering 'yagni yagni yagni' under your breath, so in turn I'd like those of you who -are- to imagine me sticking my tongue out at you. Ok, we done now? Good.
run_once itself does, pretty much literally, read then execute then print. The only wrinkle is the return if $line is undefined; traditionally perl filehandles of any sort return the special value undef to indicate EOF, since '' or '0' both evaluate to false but are perfectly valid lines to read (even if they make no sense to the app reading them), and Term::ReadLine behaves just the same. Then at the end if we got that far, we return 1 to indicate success to run() so execution continues.
Of course, we still haven't actually defined the read, execute and print steps, so let's do that now.
sub read {
my ($self) = @_;
return $self->term->readline($self->prompt);
}
Simple enough; term and prompt are both stock accessors so calling them with no arguments returns the value - to set we'd call $self->term($new_term) or similar. Moose will happily let you create separate get_term and set_term methods via the 'reader' and 'writer' options to has, but it's not usual and it's more typing so I'm not going to.
sub execute {
my ($self, $to_exec) = @_;
my @ret = eval $to_exec;
@ret = ("ERROR: $@") if $@;
return @ret;
}
eval is used here in string mode to compile+execute at the same time - this currently means that all code is executing in the Devel::REPL namespace, which we don't really want but it'll do for a start. The return is made in list context in case the code's returning multiple values - it's unlikely to do any harm and having to put [] round code returning more than one thing would be -annoying-. A quick check of $@ afterwards for compile or execution errors and we're good to go.
sub print {
my ($self, @ret) = @_;
my $fh = $self->out_fh;
print $fh "@ret";
}1;
And now we can grab the appropriate filehandle (which will call the lazy default => sub the first time we ask for it) and print the output. Yay. The 1; at the end of the file indicates to perl that the .pm loading ok. So, check syntax -
cain$ perl -c lib/Devel/REPL.pm
lib/Devel/REPL.pm syntax OK
and try running the code (-Ilib tells perl to search 'lib' in the local dir, the -M loads the module and -e provides the code to execute since we don't have a script yet) -
cain$ perl -Ilib -MDevel::REPL -e 'Devel::REPL->new->run;'
$ 2 + 4
6
$ (1 .. 3)
1 2 3
$
cain$
And it lives, it evaluates, and Ctrl-D sends EOF and brings me back to the shell prompt. Lovely. So, a quick svk add + commit later, the first working code is in the repository.
Next time round I'll sort out history and add the first plugin - one to provide a persistant lexical environment so we can carry variables between lines without polluting the Devel::REPL namespace or giving up the joys of compile-time typo checking from 'use strict'. I'll see you there.
=== update, April 25
History handling turned out to be more interesting than I first expected so it and the plugin approach got part 2 all to themselves. Lexical environment handling is now covered by part 3
=== end update, original post follows
