Chapter 16:

CGI Scripting

Perl is the duct tape of the Internet.

-Hassan Schroeder (Sun Microsystems' first webmaster)

The advent of the World Wide Web and Perl's natural fit into web server
programs have provided a tremendous boost to Perl's popularity. Perl has
what most web server programs need: quick development time, superior text
and data manipulation, cross-platform availability, interprocess
communication, and comprehensive operating system interaction.

The majority of web server programs use the Common Gateway Interface
(CGI). CGI programs can be written in any language, not just Perl. In fact,
many Mac-based web sites never employ Perl at all (a situation we hope to
remedy! :-). Remember, in any case, that there is more to Perl than CGI, and
more to CGI than Perl.

This is not a chapter on how to write CGIs;1 please consult a book on CGI
programming for that. This chapter covers only the use of MacPerl for CGI
programming, and assumes a prior familiarity with CGI programming. See
Part V, Resources, for several good books on CGI programming.

1 "CGI programs" are often called, simply, "CGIs".

Web Servers

If you want to serve CGIs from your Mac OS computer, you will need a web
server. Even if you plan to use MacPerl only for developing CGIs for use on
other servers, you should still install a local web server for testing. Just
remember that a Mac-based server will act differently in some respects
than a Unix- or Windows-based server:

  • The Mac is case-preserving, but not case-sensitive in handling file
    names (it sees readme, Readme, and README as the same file). Unix
    is case-sensitive.
  • The Mac allows almost any character (including ones with the high
    order bit set) in file names. Unix does not deal well with the presence
    of spaces and/or "wild card" characters (e.g., * and ?) in file names,
    though any US-ASCII (7-bit ASCII) character (except /) is allowed.
    Microsoft Windows defines a large set of illegal characters.
  • The Mac uses carriage returns to terminate lines of text. Unix uses line
    feeds; Windows uses the sequence "carriage return / line feed". Most
    Web browsers are willing to accept any reasonable line termination,
    but the issue may come up when a MacPerl script attempts to read a
    file that was generated on another machine.
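
The line-termination issue in the last item can be handled inside the
script. The following is a minimal sketch, not a MacPerl facility:
split_lines is a hypothetical helper that accepts any of the three
conventions. It uses explicit octal escapes because the value of "\n" is
platform-dependent in Perl (see the perlport documentation).

```perl
#!perl -w
use strict;

# Hypothetical helper (not a MacPerl built-in): split text into lines,
# accepting CRLF (Windows), CR (Mac OS), or LF (Unix) as the terminator.
sub split_lines {
    my ($text) = @_;
    # Try CRLF first so that it is not treated as two separate line breaks.
    return split(/\015\012|\015|\012/, $text);
}

my @lines = split_lines("one\015two\012three\015\012four");
print join("|", @lines), "\n";    # one|two|three|four
```

Reading a foreign file in one piece and splitting it this way avoids
depending on the input record separator matching the file's origin.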

Not all web servers support CGI, though most do.2 We will not advocate
the use of any server over another, though we do suggest that you look over
the servers which we have included on the CD-ROM. For instructions on
how to use a given server and for complete details of CGI support
administration for each server, please refer to the server's documentation.

Note: Check with your local network administrator before you put up a
web page. This may prevent a wide range of unwanted technical and/or
political problems. For instance, your company may not want its
telephone directory visible to outside parties.

The choice of a server can be tricky. Some good servers may not work on your
machine or may not provide the performance you need. Some servers may
not support CGI, opting instead to provide other interfaces.3 For our
purposes, of course, Perl support is mandatory. Here are some popular servers
that work under Mac OS:

Server Name                   Current Version   Supports CGIs
AppleShare IP                 5.0
Apple Personal Web Sharing    1.1
MacHTTP                       2.2.2
MS Personal Web Server        1.0c
NetPresenz                    4.1
Pictorius Net Servers         1.17
QuidProQuo                    2.0
WebSTAR                       2.1
WebTen 4                      1.1               NA

2 CGIs on Mac OS are normally implemented through the WebSTAR CGI
specification, and MacPerl CGIs are no exception. When we discuss Mac OS CGIs
or server capabilities, we mean the WebSTAR specification unless otherwise
stated.
3 If a server's limitations on CGIs are too severe, it might be necessary to
write WWW programs using a server's own API, such as WebSTAR's W*API.

If you are testing your CGIs on an unconnected computer, feel free to use the
string localhost in place of an Internet address in your web browser's URL
line (http://localhost/test.html).

If your machine is not connected to other machines on a permanent basis, it
probably does not have an assigned IP (Internet Protocol) address or DNS
(Domain Name System) name. The lack of an assigned IP address (let alone
the lack of a permanent network connection) will prevent other machines
from "finding" your web pages.

We cannot hope to give you sufficient information to understand the
operation of a web server here, let alone HTML, CGI, or web server
administration style and technique. Read your server documentation
thoroughly and consult books, web resources, your ISP, and your network
administrator about CGI support, DNS, HTML, web and Mac OS security
issues, etc.

The CGI Script Extension

MacPerl comes with a file called CGI Script which resides in the MacPerl
Extensions folder. This appears in the popup menu in MacPerl's Save As ...
dialog box, and saves your program in a CGI format.

A MacPerl CGI script is actually an application of its own. The web server
passes data to the script, which runs under MacPerl. The script processes
the data and generates results which are sent back to the server. All of the
communication in the Macintosh is handled through Apple Events.

4 WebTen comes with its own version of Perl integrated into the server's
environment, and does not use MacPerl for CGIs.

After getting your server software set up, try a sample script, such as:

#!perl -w
print "Content-type: text/html\n\n";
print "Hello, world!\n";

The first print statement generates the Content-type header, followed by two
line breaks. This is the first output that all CGIs must make. MacPerl CGIs
do not require this line; if it is omitted, it is assumed, using text/html as
the content type. You should include it, however, for clarity and
portability.

Now save this script to your server's CGI directory (usually called
cgi-bin 5) as a CGI script and access it through a web browser. You should
see the text Hello, world! in the browser.

If you want to see exactly what Apple Events are being sent where, bring
the CGI script to the front once it has been run (you can select it under the
Mac OS application menu) and hit the Command-L key sequence on your
keyboard. This will create a text file called MPCGI Log on your desktop.

Because your CGI script communicates via Apple Events, it has the same
limitations that all Apple Events have. Prior to Mac OS 7.6, an Apple
Event could only contain a limited amount of data (64 KB or less).
Fortunately, this limitation has been lifted in Mac OS 7.6 and 8.0.

The CGI script application stays open for five minutes of inactivity, then
closes automatically. If some user calls the CGI at least once every five
minutes, the script will never close. This eliminates a significant amount of
overhead (in launching the MacPerl application). If the script stays open,
it can respond more quickly to subsequent requests.

5 See your server's documentation.

If you are proficient with ResEdit, you might want to edit the amount of
time the CGI script stays open. The resource will revert to the five-minute
default any time the CGI script is opened and saved again from MacPerl;
however, if you make the ResEdit change to the CGI Script extension itself,
the new value will replace five minutes as the default time-out period.6

To make the change, open the file you want to change (either the saved
script or the CGI Script extension) with ResEdit. Open the time resource (ID
128). The middle column will contain the current time-out, in minutes. If it
has not been edited before, this value will be 05, the hexadecimal
representation of the decimal numeral 5. Change this to any hexadecimal
number from 01 to 7F (one minute to 127 minutes). A value of 00 will keep the
script open indefinitely (until a reboot or some manual close takes place).
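
If you would rather not convert the value by hand, the arithmetic can be
sketched in Perl. This is only an illustration of the decimal-to-hexadecimal
step; minutes_to_hex is a hypothetical helper, and it does not touch the
resource itself.

```perl
#!perl -w
use strict;

# Hypothetical helper: convert a time-out in minutes (0 to 127) to the
# two-digit hexadecimal string that would be typed into ResEdit.
sub minutes_to_hex {
    my ($minutes) = @_;
    die "minutes must be a whole number from 0 to 127\n"
        unless $minutes =~ /^\d+$/ && $minutes <= 127;
    return sprintf("%02X", $minutes);
}

foreach my $minutes (5, 30, 127) {
    printf("%3d minutes => %s\n", $minutes, minutes_to_hex($minutes));
}
```

For the three sample values, the helper returns 05, 1E, and 7F.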

There is also a droplet on the CD-ROM (setCGImins.dp),7 which can make
this change for you, without having to use ResEdit (but the same
disclaimers apply). The droplet will take files that are dropped on it and
set the time resource of each file to the number of minutes (0 to 127)
specified.

CGI vs. ACGI

Most advanced web servers have the ability to respond to more than one
request simultaneously. Unfortunately, most Mac OS web servers will wait
for a CGI to finish running before responding to any other requests, whether
for an HTML page, an image, or another CGI. CGIs can take a while to run,
so a CGI can appear to slow down the entire server significantly.

This is where Asynchronous CGI (ACGI) comes in. Web servers that can use
ACGIs (most can!) will respond to other requests while the ACGI is
processing, instead of waiting for it to finish.

Making a CGI into an ACGI is very simple: instead of using the suffix .cgi,
use .acgi. Actually, you should always use the .acgi suffix for your CGIs, as
there is really no reason not to (unless you want to slow down the server :-).

Note: ACGI has nothing to do with how many simultaneous requests
MacPerl can handle. A given instance of MacPerl can only execute one
script at a time. So, if you are running MacPerl for your own purposes,
you may well get in the way of your CGIs (and vice versa!). Running
multiple copies of MacPerl is, however, a possible workaround.

6 As always, use ResEdit at your own risk. Make backups!
7 Part of the code for the script is in Ch. 12, under Mac::Resources.

Taint Checking

Perl has an advanced security framework that allows the programmer to
check for possibly "tainted" data. Basically, this is data that is imported
into the program from an outside source, and is explicitly untrusted.8

Tainted data can still be used for most purposes; you can print it, use it for
addition, and whatnot. What you cannot do with tainted data, if taint
checks are on, is use it in any sort of system interaction. For instance, taint
checks will prevent you from opening a file whose name was typed into a
web page (the user might have specified a file that you do not want him to
see and/or overwrite).

my $data = get_form_data();     # tainted!
open(F, ">$data") or die($!);   # refused when taint checks are on

If the contents of $data happened to be "::index.html", then your CGI
would go to the directory above the one containing the CGI and create a file
called index.html, overwriting any existing file that might have been there.
With taint checking on, MacPerl would have quit with an error, which is
exactly what we would have wanted it to do.
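
One common way to make such data usable is to validate it against a strict
pattern and keep only the captured text, as described in the perlsec
documentation. The following is a minimal sketch under our own assumptions:
safe_filename is a hypothetical helper, and the set of allowed characters is
just one conservative choice.

```perl
#!perl -w
use strict;

# Hypothetical helper: return the file name if it is a plain name
# (letters, digits, underscore, hyphen, one optional extension);
# return undef for anything else, such as "::index.html".
sub safe_filename {
    my ($name) = @_;
    if (defined($name)
        && $name =~ /^([A-Za-z0-9_-]+(?:\.[A-Za-z0-9]+)?)$/) {
        return $1;    # text captured by a pattern match is untainted
    }
    return undef;
}

foreach my $input ("guestbook.txt", "::index.html") {
    my $file = safe_filename($input);
    printf("%s: %s\n", $input, defined($file) ? "accepted" : "rejected");
}
```

Because colons are not in the allowed set, the helper rejects any name that
tries to move to another directory; only the returned value, never the
original input, should be passed to open.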

As of Perl version 5.004, taint checks can only be turned on with the
command-line argument -T. That is, if a script called myscript.pl has a first
line #!/usr/bin/perl -T and is executed by its name on the Unix command
line, taint checks will be enabled. But, if the same script is executed
from the Unix command line as perl myscript.pl, the script will generate
an error, because the -T switch was not on the command line that called
the script. It must be called with perl -T myscript.pl.

This causes problems for MacPerl, because the first line of a script does not
actually call the Perl program as it does on Unix. So, trying to enable taint
checks by putting #!perl -T in a script will always generate an error.

At present, there are only two methods of turning on taint checking in
MacPerl. The primary way is with the menu option Taint Checks, under the
Script menu. This causes problems, however, because generally you won't
want taint checks on for all your scripts.

8 Taint checks are not exclusive to CGI programming, but they are especially
useful for CGIs, so we are discussing them here. For more detailed
information about taint checks and Perl security, including how to deal with
tainted data, see the perlsec man page.

The other way is only used when sending a script to MacPerl from an outside
source using the Do Script Apple Event.9 The parameter TAIN with a true
value will turn taint checks on for that event.

Because CGI Scripts also execute their contents via the Do Script
mechanism, it is possible to send the same TAIN parameter via the CGI
Script. There is a special version of the CGI Script extension called CGI
Script (Taint Check) on the CD-ROM. This version does the same things as the
regular version, save that it has MacPerl do the taint checks.

If you are using modules or required files that depend on your library
preferences, you will have to take another MacPerl difference into account.
The library paths are hard-coded into the Unix Perl binary, but they are not
hard-coded into MacPerl.

Taint checks - for security's sake - wipe out your path preferences, so you
will have to restore them. Put the following at the top of your script:10

BEGIN {    # restore lib paths
    $ENV{MACPERL} =~ /^(.+)$/;    # the capture untaints the path
    my($f) = $1;
    unshift(@INC,
        "${f}lib:$MacPerl::Architecture:",
        "${f}lib:");
}

It is a bit cumbersome to put this at the top of any CGI Script that will use
taint checks, but it is probably worth it. Considering the inherent
differences (security, processes, etc.) between Mac OS and Unix, there could
be changes to the MacPerl security model to make it different (and more
usable) in the future.

We recommend that you always use the taint checking version of the CGI
Script extension, unless you have a specific reason not to, and can
completely trust the incoming data.11 Finally, remember that these taint
checks are only as secure as your computer and related files; the taint
checks are useless to protect against anyone who has physical access to
your computer.

9 See Chapter 18, AppleScript, Etc.
10 This only works in MacPerl 5.1.6 or later, as $MacPerl::Architecture was
introduced in that release.

Environment Variables

CGIs make heavy use of environment variables, which are accessed
through %ENV. MacPerl provides basically all the same variables that
any other CGI environment provides.

One primary difference is in $ENV{PATH_INFO}. Consider this call to a
Unix CGI:

http://www.host.com/cgi-bin/my.cgi/path/info?foo=bar

The CGI my.cgi would be called, with an $ENV{PATH_INFO} of /path/info
and an $ENV{QUERY_STRING} of foo=bar. But, in order for this to
work with a Mac OS CGI, a $ must be added immediately after the CGI
name, so the server can know exactly where the CGI is:

http://www.host.com/cgi-bin/my.cgi$/path/info?foo=bar
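
To make the role of the $ concrete, here is a small sketch that splits the
path portion of such a URL into its three parts. It is for illustration only:
split_cgi_path is a hypothetical helper, and this is not the parsing code
that any particular server uses.

```perl
#!perl -w
use strict;

# Hypothetical helper, for illustration only: split a Mac OS style CGI
# URL path into the script name, the extra path information, and the
# query string.
sub split_cgi_path {
    my ($path) = @_;
    my ($script, $path_info, $query) =
        $path =~ /^([^\$?]+)(?:\$([^?]*))?(?:\?(.*))?$/;
    $path_info = '' unless defined($path_info);
    $query     = '' unless defined($query);
    return ($script, $path_info, $query);
}

my ($script, $path_info, $query) =
    split_cgi_path('/cgi-bin/my.cgi$/path/info?foo=bar');
print "script:       $script\n";
print "PATH_INFO:    $path_info\n";
print "QUERY_STRING: $query\n";
```

Without the $, a pattern like this cannot tell where the CGI name ends and
the extra path begins, so the whole of /cgi-bin/my.cgi/path/info would be
taken as the script name.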

Different web clients supply different environment variables, too. All good
CGI books contain a list of CGI environment variables, but if you want to see
exactly what environment variables are available to your CGI, try the
following CGI:

#!perl -w
my($key);

print "Content-type: text/plain\n\n";
foreach $key (keys(%ENV)) {
    print "$key => $ENV{$key}\n";
}
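
Once a variable such as QUERY_STRING is available, its contents still have to
be decoded. The following is a minimal sketch of that step, written by hand
so the mechanics are visible; parse_query is a hypothetical helper, and it
deliberately ignores details such as repeated parameter names.

```perl
#!perl -w
use strict;

# Hypothetical helper: decode a query string such as
# "foo=bar&name=Mac+Perl%21" into a hash of names and values.
sub parse_query {
    my ($query) = @_;
    my %param;
    foreach my $pair (split(/&/, $query)) {
        next unless length($pair);
        my ($name, $value) = split(/=/, $pair, 2);
        $value = '' unless defined($value);
        foreach ($name, $value) {
            tr/+/ /;                                # "+" encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # %XX encodes one byte
        }
        $param{$name} = $value;    # a repeated name keeps its last value
    }
    return %param;
}

my %param = parse_query('foo=bar&name=Mac+Perl%21');
foreach my $name (sort(keys(%param))) {
    print "$name => $param{$name}\n";
}
```

In a real CGI, the argument would be $ENV{QUERY_STRING} rather than a
literal string.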

CGI.pm

CGI.pm has become the de facto standard module for writing CGIs with
Perl. It is so widely used that it has been included in the standard
distribution and is the recommended method for writing CGIs in Perl.

11 You might think that you have all your bases covered and have no need for
taint checking; remember, though, that some crackers out there are often
smarter than you are (or at least very persistent!). No offense.

CGI.pm eases much of the "grunt work" of doing CGIs, including entity
translation, input processing, address redirection, and header
manipulation. It also includes methods for producing HTML and forms more
easily.12

There are no known significant differences between CGI.pm on MacPerl and
on any other platform, with one exception: the file upload feature that
some browsers support. CGI.pm could likely be modified to work with
MacPerl in this regard, but as of this writing it does not.

CGI.pm includes a convenient facility for debugging without a web server.
When the script is run without the web server (from the MPW command
line or with the Run Script menu command), it will open up the MacPerl
window and ask for user input. You can then type in your parameters as
name/value pairs, as below, hitting return after each one. After you input
all the pairs, hit Control-D on your keyboard.

#!perl -w
use CGI;
my($cgi) = new CGI;

print $cgi->header();
printf("'%s'<BR>\n", $cgi->param('foo'));
printf("'%s'<BR>\n", $cgi->param('bar'));

Displays:

(offline mode: enter name=value pairs on the keyboard)
foo=fooval
bar=barval
Content-type: text/html

'fooval'<BR>
'barval'<BR>

Note the apparent extra space before 'fooval'. This is not really a space.
Networking applications normally use CRLF (\015\012) as new lines, and
that is what CGI.pm returns. The MacPerl application renders the CR as a
line break and the LF as a character that looks like a space (or some funny
character, depending on your font). Since this output is only used for
debugging purposes, you can ignore it here.

12 The HTML modules on CPAN may be more appropriate for advanced HTML
production, and they can complement CGI.pm nicely.
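
If you want to confirm that the stray character in the header output is an
LF, you can make the line terminators visible. This is a small sketch for
debugging only; show_line_endings is a hypothetical helper, and the header
string is written out by hand rather than taken from CGI.pm.

```perl
#!perl -w
use strict;

# Hypothetical helper: replace CR and LF with visible markers so that
# line terminators can be inspected in debugging output.
sub show_line_endings {
    my ($text) = @_;
    $text =~ s/\015/<CR>/g;
    $text =~ s/\012/<LF>/g;
    return $text;
}

my $header = "Content-type: text/html\015\012\015\012";
print show_line_endings($header), "\n";
```

For this header, the output is the text followed by <CR><LF><CR><LF>.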

CGI.pm is a very large module, and has a ton of features available. Read
its documentation to find out more about it.

Copyright © 1997-1998 by Prime Time Freeware. All Rights Reserved.