Perlwikibot/clean sandbox.pl

clean_sandbox.pl is a simple script using the unreleased 3.0 version of perlwikibot. It cleans a wiki's sandbox by submitting an edit with some standard text, overwriting whatever was there previously.

Preamble
Here, we use two pragmas that should be considered mandatory: strict and warnings. These are tools to make you a better programmer by forcing you to follow some rules.

We also use Getopt::Long to get command line options from the user. This greatly simplifies parsing @ARGV - you should never do it yourself. Always use a Perl module - Getopt::Long is one, but there are others.

Pod::Usage allows us to give the user man-style documentation using Perl's POD markup embedded in the source code.

Config::General allows us to easily parse configuration files. Again, there are others, but this one provides several features we'll make use of, like heredocs.

Finally, we need MediaWiki::Bot version 3.0.0 or higher. Version 3.0.0 is still under development, so method calls are subject to change.

The $VERSION string will be used later, and other scripts that use this file can check it to require a minimum version, just as we did for MediaWiki::Bot.
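A minimal sketch of what such a preamble might look like. The version number is a placeholder rather than the script's real value, and Config::General and MediaWiki::Bot are CPAN modules that are assumed to be installed.

```perl
#!/usr/bin/perl
use strict;      # refuse undeclared variables, symbolic references and barewords
use warnings;    # report likely mistakes, such as using an undefined value

use Getopt::Long;            # command line parsing
use Pod::Usage;              # --help output generated from the embedded POD
use Config::General;         # configuration file parsing
use MediaWiki::Bot 3.0.0;    # fails at compile time if the installed version is older

# Placeholder version string; the real value is not shown on this page.
our $VERSION = '0.0.1';
```

Giving a version number after the module name asks Perl to check the installed module's version while the script is being compiled, so an outdated MediaWiki::Bot is reported before any of the script runs.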

POD
Here, we provide some documentation for the user. This is written in POD (Plain Old Documentation), and will be parsed and shown to the user if they provide the --help option. The =head1 command inserts a heading made up of the rest of the text on the line. More POD is included in the rest of the file, but isn't included in this page.
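As an illustration, POD embedded in a script might look something like the sketch below. The section contents and the example user name are invented for this page; only the =head1 command and the --help option come from the script described here.

```perl
use strict;
use warnings;
use Pod::Usage;

# In the real script this flag is set by Getopt::Long; it is off here so the
# sketch runs straight through.
my $help = 0;

# With -verbose => 1, pod2usage prints the SYNOPSIS, OPTIONS and ARGUMENTS
# sections of the POD below, then exits with the given status.
pod2usage(-verbose => 1, -exitval => 0) if $help;

=head1 NAME

clean_sandbox.pl - reset a wiki's sandbox to its standard text

=head1 SYNOPSIS

clean_sandbox.pl --wiki=enwikibooks --username="Example bot"

=head1 OPTIONS

=over 4

=item B<--help>, B<-h>, B<-?>

Show this help message and exit.

=back

=cut

print "The interpreter skips everything between =head1 and =cut\n";
```

Perl ignores the POD when running the script, so documentation can sit right next to the code it describes.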

Parsing the command line
This declares variables for all our command line options, and gets Getopt::Long to parse @ARGV and assign to those variables for us. This is much better than attempting to do so manually.

On the left is the option name and any aliases. For example, help|h|? provides a name (help) and two aliases (h and ?). Whether that option is present on the command line is recorded in $help. Others, like dry-run, have only the canonical name. Still others take a mandatory parameter, like wiki=s. The = indicates that the parameter is mandatory; the s indicates that it is a string. Lastly, password has an optional string parameter, indicated by :s. This option's parameter is optional because we prefer to prompt for the password interactively. Command line arguments are visible to all users on the system, so passing the password there should be avoided.
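The option specifications described above can be sketched as follows. To keep the example self-contained it parses a sample argument list with GetOptionsFromArray; the real script would parse @ARGV with GetOptions. The exact specifications for version, debug and username are assumptions.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# A sample command line, standing in for @ARGV.
my @args = ('--wiki=enwikibooks', '--dry-run', '-h', '--password');

my ($help, $version, $dry_run, $debug, $username, $password, $wiki);
GetOptionsFromArray(
    \@args,
    'help|h|?'   => \$help,        # a name plus two aliases
    'version'    => \$version,
    'dry-run'    => \$dry_run,     # plain flag, canonical name only
    'debug'      => \$debug,
    'username=s' => \$username,    # "=s": mandatory string parameter
    'wiki=s'     => \$wiki,
    'password:s' => \$password,    # ":s": optional string parameter
) or die "Could not parse the command line\n";

# --password given without a value is stored as an empty string, so it has to
# be tested with defined() rather than for truth.
print "wiki: $wiki\n";
print "will prompt for a password\n" if defined $password and $password eq '';
```

Options that do not appear on the command line, such as username in this sample, are left undefined.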

Version data
If the user specified --version on the command line, $version will be true. We print a simple message containing the version string we declared above, then exit.
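A short sketch of that check, with a placeholder version string and the flag switched off so the example runs to completion:

```perl
use strict;
use warnings;

our $VERSION = '0.0.1';    # placeholder, not the script's real version
my $version  = 0;          # would be set to 1 by --version

if ($version) {
    print "$0 version $VERSION\n";    # $0 holds the name the script was run as
    exit 0;
}
```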

Prompt for password
If --password was specified on the command line, we interactively prompt the user for the password. To do this, we can use Term::ReadKey, which provides several functions useful for this task. First, note that require is evaluated at runtime, whereas use is evaluated at compile time, even if the surrounding code would never run. use also imports the module's default exports into the current package, whereas require doesn't. We could call import ourselves, but it is just as easy not to in this case.

We'll use a standard method of reading in the password. First, we show the prompt, then set the terminal to 'noecho' readmode. This means the user's keystrokes won't display anything on-screen. Next, we read in the user's input and assign it to $password. Previously, this variable simply told us whether --password was specified on the command line - now it holds the text of the password. Finally, we restore the original characteristics of the user's terminal. If we don't do that, it continues operating in 'noecho' mode, which they won't like.
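The steps above can be sketched roughly as follows. The subroutine name and the prompt text are made up for this example; Term::ReadKey is a CPAN module and is only loaded if the subroutine is actually called.

```perl
use strict;
use warnings;

# Prompt for a password without echoing the user's keystrokes.
sub prompt_for_password {
    require Term::ReadKey;    # loaded at runtime, nothing is imported

    print STDERR 'Password: ';
    Term::ReadKey::ReadMode('noecho');         # stop echoing keystrokes
    my $input = Term::ReadKey::ReadLine(0);    # blocking read of one line
    Term::ReadKey::ReadMode('restore');        # restore the original settings
    print STDERR "\n";                         # the user's Enter was not echoed

    $input = '' unless defined $input;         # end of input, nothing typed
    chomp $input;
    return $input;
}

# In the real script $password comes from Getopt::Long. It is left undefined
# here so that running the sketch does not block waiting for input.
my $password;
$password = prompt_for_password() if defined $password;
```

Because nothing is imported, the functions are called by their full names, such as Term::ReadKey::ReadMode.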

Reading configuration
If we don't have all of username, password, and wiki already, then we should read them in from a config file. Config::General provides the ParseConfig method to do this.

We give it the filename (relative to the current file), and a few options. UTF8 is important because this file can and will include UTF8 characters under many circumstances. An example config file:

default = Mike's bot account

 password   = fake password
 wiki       = enwikibooks

When Config::General reads this in, it creates a hash which represents the data. Once we have that hash, we try to get the data we're missing, and warn the user if we can't accomplish that.

Notice that some of the warn statements are conditional on $debug. This is another command line flag that asks the script to output additional information about what it is doing to make debugging easier. We ask MediaWiki::Bot to do the same.
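A minimal sketch of this lookup, assuming a made-up file layout with one named block per account. The block name bot and the account name ExampleBot are hypothetical. The configuration is parsed from a string so the example is self-contained; the real script would instead call ParseConfig with -ConfigFile and -UTF8.

```perl
use strict;
use warnings;
use Config::General qw(ParseConfig);

# Hypothetical layout: a default account name, then one block per account.
my $config_text = <<'END';
default = ExampleBot
<bot ExampleBot>
    password = fake password
    wiki     = enwikibooks
</bot>
END

# ParseConfig returns a hash representing the whole file.
my %conf = ParseConfig(-String => $config_text);

my ($username, $password, $wiki);    # nothing was given on the command line
my $debug = 1;

# Fill in only what is still missing.
$username = $conf{default} unless defined $username;
my $account = defined $username ? $conf{bot}{$username} : undef;
if ($account) {
    $password = $account->{password} unless defined $password;
    $wiki     = $account->{wiki}     unless defined $wiki;
}

if (defined $password and defined $wiki) {
    warn "Using account '$username' on $wiki\n" if $debug;
}
else {
    warn "Could not find a password and wiki in the config file\n";
}
```

Values already supplied on the command line are never overwritten, so the config file only acts as a fallback.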

Here, we use Config::General::ParseConfig to parse another config file. This one contains data for many wikis about where their sandbox is located, what standard text should be put on it, and what edit summary they want to be used. This is useful because some bot operators might not speak the language of the wiki whose sandbox their bot cleans. It also means they don't have to specify those values on every run; the data can be stored in the config file instead.

Because this file contains data for many wikis, and we don't need all of it, we throw away most of it. ParseConfig creates a large hash, but we keep only the part about the wiki we actually want to edit. Then, we find any data we're missing.
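As a rough illustration, the per-wiki file and the step that discards the unneeded parts might look like the sketch below. The key names (page, summary, text), the page titles and the texts are all invented for this example. The standard text for the first wiki uses Config::General's heredoc syntax.

```perl
use strict;
use warnings;
use Config::General qw(ParseConfig);

# Hypothetical layout: one block per wiki.
my $config_text = <<'END_OF_CONFIG';
<enwikibooks>
    page    = Project:Sandbox
    summary = Cleaning the sandbox
    text <<EOT
This is the sandbox. Feel free to experiment here.
EOT
</enwikibooks>
<examplewiki>
    page    = Project:Sandbox
    summary = Example summary
    text    = Example text
</examplewiki>
END_OF_CONFIG

my $wiki = 'enwikibooks';

# Parse everything, then keep only the block for the wiki we want to edit.
my %all  = ParseConfig(-String => $config_text);
my $conf = $all{$wiki} or die "No sandbox data for $wiki\n";
undef %all;

my ($page, $text, $summary);    # nothing was given on the command line
$page    = $conf->{page}    unless defined $page;
$text    = $conf->{text}    unless defined $text;
$summary = $conf->{summary} unless defined $summary;

print "Would clean $page with summary '$summary'\n";
```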

Create a bot object
Unlike earlier versions of Perlwikibot, 3.0 handles many things automatically to make writing scripts easier. Here, we create a new bot object, which will automatically be logged in and configured for us. Check the POD documentation for MediaWiki::Bot for details about the new constructor.
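A sketch of what creating the bot might look like. The make_bot helper and its parameters are invented for this example, and the constructor keys follow the hash-reference form documented for released MediaWiki::Bot versions, so treat them as assumptions for the in-development 3.0 API described here.

```perl
use strict;
use warnings;

# Create a bot object that is logged in as part of construction.
sub make_bot {
    my ($host, $username, $password, $debug) = @_;

    # Loaded at runtime so this sketch compiles without the module installed;
    # a real script would load it with "use" in the preamble.
    require MediaWiki::Bot;

    my $bot = MediaWiki::Bot->new({
        host       => $host,    # for example 'en.wikibooks.org'
        login_data => { username => $username, password => $password },
        debug      => $debug ? 1 : 0,
    });
    die "Could not create the bot or log in\n" unless $bot;
    return $bot;
}
```

A call would then look something like make_bot('en.wikibooks.org', $username, $password, $debug). How a short wiki name such as enwikibooks maps to a host name is not shown on this page.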

Make the edit
This is a here-document (heredoc), which allows us to print a multi-line string easily. If the user specified --dry-run on the command line, we want to do everything up to this point, but not actually edit. So, if $dry_run is set, we print out this multi-line string showing what would have happened, and die. The << in <<"END" tells Perl that a heredoc follows; END is the delimiter that marks where the heredoc ends; the double quotes tell Perl to interpolate variables in the string.
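A small sketch of an interpolated heredoc used for a dry-run report. The values and the wording of the report are placeholders.

```perl
use strict;
use warnings;

# Placeholder values; the real script takes these from its configuration.
my ($page, $summary) = ('Project:Sandbox', 'Cleaning the sandbox');
my $text    = 'This is the sandbox.';
my $dry_run = 1;

# The double quotes around END make the heredoc interpolate variables.
my $report = <<"END";
Dry run: no edit was made.
Page:    $page
Summary: $summary
Text:    $text
END

# The script described above dies at this point; this sketch only prints.
print $report if $dry_run;
```

With single quotes around the delimiter instead, the variable names would be printed literally.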

Actually make the edit
This makes the edit by calling MediaWiki::Bot's edit method, passing it the page name, the text to put on the page, the edit summary, and a flag marking it as a minor edit. See the POD for details on edit.
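A sketch of the final step, wrapped in a hypothetical clean_sandbox helper. $bot is assumed to be a logged-in MediaWiki::Bot object. The hash-reference arguments follow the edit method as documented for released versions, and the in-development 3.0 API may differ.

```perl
use strict;
use warnings;

# Overwrite the sandbox page with the standard text as a minor edit.
sub clean_sandbox {
    my ($bot, $page, $text, $summary) = @_;

    my $ok = $bot->edit({
        page    => $page,
        text    => $text,
        summary => $summary,
        minor   => 1,
    });
    warn "Edit to $page failed\n" unless $ok;
    return $ok ? 1 : 0;
}
```

The helper returns 1 when edit reports success and 0 otherwise, so the caller can choose an exit status.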