DOS_PERL.TXT
                        John Dallman
                  jgd@cix.compulink.co.uk
                        December 1994

Some general notes on programming with Perl under MS-DOS. They're
based on my experience with these ports of Perl 3 and 4:

  dds    Didi Spenellis' 16-bit Perl 3.041
  eva    Eelco Van Asparen's 16-bit Perl 4.036
  big    Daryl Orzra's 32-bit Perl 4.036, aka BigPerl M03
         This (or a later version) is the one to get.

To learn Perl, you'll need a Perl system and a book. Books are listed below; 
copies of Perl are available on the Internet (see below), or from shareware 
suppliers.  Usenet access is very helpful if you use Perl: the Usenet 
newsgroup comp.lang.perl is the world's leading source of help and advice on 
Perl.  Please *try* to find answers to questions in this document, the 
comp.lang.perl Frequently Asked Questions (FAQ) files or a book before 
asking questions - Dallman's first law of Usenet* applies strongly on 
comp.lang.perl.

[*Well, since you ask: 'Asking questions you don't understand produces 
answers you don't understand']


The Standard Questions:
=======================

1)      As of December 1994, there isn't a version of Perl for MS-Windows.
        BigPerl can run in a Windows DOS box, but has no 'visual' user 
        interface under these conditions.

        <<<< Does the NT version run under WIN32S?

2)      Perl version 5 for DOS is being prepared as I write this; it should
        be available Real Soon Now.


Books About Perl
================

    Learning Perl
        by Randal L. Schwartz aka "The Llama Book" (engraving of a llama on
        cover) published by O'Reilly & Associates ISBN 1-56592-042-2
        Tutorial-style introduction to Perl.

    Programming Perl
        by Larry Wall and Randal L. Schwartz aka "The Camel Book"
        (engraving of a camel on cover) published by O'Reilly & Associates
        ISBN 0-937175-61-1 Tutorial, reference, cookbook, examples, ...

    Perl by Example
        by Ellie Quigley;  published by Prentice Hall.  ISBN 0-13-122839-0
        A perl tutorial: every feature is presented via an annotated
        example, and sample output.  Includes a reference section ---
        fairly similar to the quickref, below.

    Perl Quick Reference Guide
        by Johan Vromans.  This is bundled with the camel book,
        and available in postscript on the Internet as:

        [] ftp://ftp.uu.net/languages/perl/perlref-4.036.1.tar.gz
        [] ftp://ftp.cs.ruu.nl/pub/DOC/perlref-4.036.1.tar.gz
        [] ftp://ftp.khoros.unm.edu/pub/perl/perlref-4.036.1.tar.gz


The Perl Meta-FAQ (short answers to common questions) is available on 
the Internet as:

     http://www.khoros.unm.edu/staff/neilb/perl/metaFAQ/metaFAQ.html
     ftp://ftp.khoros.unm.edu/pub/perl/metaFAQ.ps (PostScript)
     ftp://ftp.khoros.unm.edu/pub/perl/metaFAQ.txt (Plain ASCII)

Parts of this document, including all the Internet addresses, are lifted 
from the December 1994 version of the Meta-FAQ.

The full Perl FAQ is available as:

    North America:
        [] ftp://ftp.cis.ufl.edu/pub/perl/doc/FAQ
        [] ftp://rtfm.mit.edu/pub/usenet/news.answers/perl-faq/
        [] ftp://ftp.uu.net/usenet/news.answers/perl-faq/

    Europe:
        [] ftp://ftp.cs.ruu.nl/pub/NEWS.ANSWERS/perl-faq/
        [] ftp://ftp.funet.fi/pub/languages/perl/doc/faq
        [] ftp://src.doc.ic.ac.uk/packages/perl/FAQ


Where can I get Perl 4.036?
===========================
    [] ftp://ftp.uu.net/languages/perl/perl.tar.gz
    [] ftp://ftp.cis.ufl.edu/pub/perl/src/4.0
    [] ftp://ftp.khoros.unm.edu/pub/perl/perl-4.0.36.tar.gz
    [] ftp://ftp.cs.ruu.nl/pub/PERL/perl.4.0.36.tar.gz
    [] ftp://ftp.demon.co.uk/pub/perl/perl-4.036.tar.gz
    [] ftp://ftp.funet.fi/pub/languages/perl/ports/perl4/unix/perl-4.036.tar.Z
    [] ftp://src.doc.ic.ac.uk/packages/perl/perl-4.036.tar.gz


Perl 4.036 for MS-DOS and related operating systems
===================================================

    MS-DOS:
        [] ftp://ftp.ee.umanitoba.ca/pub/msdos/perl
        [] ftp://ftp.cis.ufl.edu/pub/perl/src/msdos
        [] ftp://ftp.funet.fi/pub/languages/perl/ports/perl4/msdos
        [] ftp://src.doc.ic.ac.uk/packages/ibmpc/simtel/perl/

    NT:
        [] ftp://ftp.cis.ufl.edu/pub/perl/src/ntperl
        [] ftp://ftp.funet.fi/pub/languages/perl/ports/perl4/nt

    OS/2:
        [] part of the standard perl distribution, in directory $PERLSRC/os2
        [] ftp://ftp.funet.fi/pub/languages/perl/ports/perl4/os2

    Netware:
        [] ftp://ftp.funet.fi/pub/languages/perl/ports/perl4/netware


About Perl
==========
The Perl programming language resembles 'C', with many added 
features from programs provided with the UNIX operating system. 
Perl programs ('scripts') aren't compiled to COM or EXE files, but are run 
by the program PERL.EXE. The process is similar to a DOS batch 
file. However, PERL.EXE reads and checks an entire script 
before starting to run it, which makes the procedure faster and 
more reliable.

A copy of Perl consists, at a minimum, of PERL.EXE and a text file
containing the GNU licence agreement.  Perl is 'copyleft' free
software, and you should read and understand the licence agreement. Perl
systems usually contain documentation, sample programs and additional
programs: there is often a PERLGLOB.EXE program which expands wildcard
filenames for Perl.  The standard Perl library of scripts containing
subroutines is often supplied, but often isn't much use under DOS: many
of its components rely on UNIX features.

Perl scripts are simple text files and can be written with any 
normal text editor. Under DOS, their filenames normally have an 
extension of .PL, but this is a convention, not a standard.

The differences between UNIX Perl and DOS Perl are important. 
The basic programming language is the same, but many functions used 
for UNIX system management, or intended for operation in a multitasking 
environment, aren't implemented. See "DOS Perl", below, for 
details.


Memory
======
16-bit Perls run as conventional MS-DOS programs within the
normal 640Kb of memory managed by MS-DOS. As PERL.EXE is at least
300Kb, there isn't much memory available for Perl program variables.
BigPerl runs as a 32-bit DOS-extended program and avoids this
problem; it can use all available memory (it needs a 386 with at
least 4Mb of RAM), plus virtual memory if required. It's also much faster
than a 16-bit Perl, and is the only "real" implementation for
modern PCs.



Installation
============
It's conventional practice to install Perl and its support files
in their own directory (eg, C:\PERL). Add this directory to the PATH
statement in your AUTOEXEC.BAT file.  Environment variables are commonly
used to configure an MS-DOS Perl:

  - PERL: Holds the pathname of the directory where the Perl executable is
    installed. Not used by PERL.EXE, but useful for MS-DOS setup.
    Example:

         set perl=c:\perl

  - PERLLIB: Holds a list of directories which PERL.EXE will search for
    library files (it adds them to the Perl array variable @INC, which
    is a list of directories to search).  Example:

        set perllib=c:\perl;c:\perl\lib;c:\myscript
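
To see what a PERLLIB setting amounts to, a script can split the value
itself and print @INC. This is only a sketch: split_perllib is a name
I've made up for illustration, and the fallback value is the example
setting shown above.

```perl
# Split a PERLLIB-style value into its directories.
# The entries are separated with ';', as in the SET example.
sub split_perllib
{
        local( $value) = @_ ;
        return split( /;/, $value) ;
}

# Fall back to the example value if PERLLIB isn't set.
$perllib = $ENV{'PERLLIB'} || 'c:\perl;c:\perl\lib;c:\myscript' ;

foreach $dir (&split_perllib( $perllib))
{
        print "PERLLIB entry: $dir\n" ;
}

# @INC is the list Perl actually searches
print "\@INC holds: ", join( ' ', @INC), "\n" ;
```

If a directory you expect is missing from the @INC line, check the
spelling of the SET command in AUTOEXEC.BAT first.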

If you use #!PERL (see below) to start Perl scripts under DOS, it can
use two further environment variables: PERL_EE and PERLSCRIPT.


Developing with Perl
====================
The write-code/test-code cycle is very short when working with Perl; 
when I'm tinkering with a short, but troublesome program, the cycle is 
often under a minute. If you don't normally use a DOS command line 
editing system (e.g., the DOSKEY program supplied with DOS 5.0), try 
it for Perl work.


Octal numbers
=============
By convention, UNIX usually uses octal, rather than hexadecimal, numbers.
Perl is supposed to support both, but it's easier to work with Perl
if you think octal.
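
A short sketch of the notations involved, assuming nothing beyond
standard Perl literals, oct() and printf(). The variable names are
mine.

```perl
$octal = 0755 ;         # a leading 0 makes this an octal literal
$hex   = 0x1ED ;        # the same number written in hexadecimal
$text  = oct( "755") ;  # oct() reads a string of octal digits

printf( "%d %d %d\n", $octal, $hex, $text) ;    # prints 493 493 493
printf( "%o\n", $octal) ;                       # prints 755
```

The trap is the leading zero: 0755 and 755 are different numbers.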


DOS Perl
========
This section covers the special limitations and changes that DOS 
imposes on Perl.

#!/usr/bin/perl
---------------
The Perl documentation makes a great fuss about this piece of UNIX 
shell syntax for running Perl programs. This doesn't work under DOS; 
several tricks using batch files have been produced, but I don't
find any of them satisfactory. I've created a utility program, 
#!PERL.EXE, which implements #! functionality via a simple trick.
It is available from the umanitoba FTP site as hbp_30.zip (later
versions will be hbp40.zip, hbp50.zip, etc).

Wall & Schwartz, page 12
------------------------
The first substantial example program in Wall & Schwartz won't 
work under DOS Perl, as it relies on the UNIX program FIND. 
A modified version, which uses the enhanced DIR command 
of DOS 5.0 onwards as a substitute, is at the end of this file.
Some comments have been added and the script has been modified 
to a 'C'-style layout for clarity.

In general, Perl scripts that use external UNIX commands won't work on
DOS.  The MKS Toolkit is a package of UNIX utilities ported to DOS that
some people find useful.  Personally, I'm used to DOS and prefer not to
modify it too much.

Regular expressions
-------------------
If you have trouble with the concept or syntax of "regular expressions"
consult a book on UNIX, specifically the UNIX text-processing utilities.  
Unfortunately, regular expressions are so common (and so standard) in
work with UNIX that many books assume a basic understanding of them. 
One very complete explanation is given in Sed and Awk, by Dale Dougherty,
published by O'Reilly & Associates in 1990. ISBN 0-937175-59-5.  This
describes the Sed and Awk programs, ancestors of Perl, and has a
thorough introduction to regular expressions.

Handling \n and \r\n
--------------------
By convention, UNIX programs use a 'new line' character to mark the 
end of a line of text. This character is written '\n'; on most UNIX 
machines, it is the ASCII line feed character (LF, 0x0A). DOS normally 
uses two characters to mark the end of a line: Carriage Return and 
Line Feed (CR & LF, 0x0D, 0x0A). In Perl, as with most programs, these 
are written '\r' and '\n'.

As the inner workings of Perl often assume that "end-of-line" 
is one character, DOS Perl treats '\r\n' as end-of-line, and then 
"eats" the '\r' before handing the line over to the Perl script. 
It then replaces '\n' with '\r\n' in its output. If you want to handle 
'\r' yourself, you can turn off Perl's handling of it with the binmode() 
function.
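
A minimal sketch of handling '\r' yourself, assuming the file to read
is named on the command line. strip_eol is a made-up helper, not part
of Perl; it removes either kind of line ending so the same script
copes with DOS and UNIX files.

```perl
# Remove a DOS (CR LF) or UNIX (LF) line ending from a line.
sub strip_eol
{
        local( $line) = @_ ;
        $line =~ s/\r?\n$// ;
        return $line ;
}

# Read the file named on the command line with translation turned off.
if (@ARGV)
{
        open( RAW, $ARGV[0]) || die "Can't open $ARGV[0]: $!\n" ;
        binmode( RAW) ;         # turn off DOS Perl's handling of '\r'
        while ($raw = <RAW>)
        {
                print &strip_eol( $raw), "\n" ;
        }
        close( RAW) ;
}
```

Without the binmode() call the '\r' would already be gone by the time
the script saw each line, and the substitution would only remove '\n'.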

Unimplemented words
-------------------
The following Perl words and concepts are described in Wall & Schwartz, 
but aren't available in (some versions of) DOS Perl:

Word            Reason(s)
Accept          UNIX IPC function.
Alarm           UNIX IPC, not in Perl 3.0.
Bind            UNIX IPC, not in Perl 3.0.
Caller          Not in Perl 3.0.
Chown           UNIX permission system.
Chroot          UNIX superuser function.
Connect         UNIX IPC.
Crypt           "Unimplemented due to excessive paranoia".
Dbm...          Only available in BigPerl.
Exec            Doesn't work in BigPerl, since the DOS extender doesn't
                support it.  In 16-bit Perls, it terminates Perl and runs 
                the command specified. Remember to distinguish it from 
                eval(), which runs a Perl script, inside Perl.
Fcntl           UNIX facility, not available on DOS.
Flock           UNIX facility, not available on DOS.
Fork            UNIX facility, not available on DOS.
Get...          (except getc) UNIX facilities, not available on DOS, 
                many not in Perl 3.0.
Kill            UNIX IPC.
Link            UNIX facility, not available on DOS.
Listen          UNIX IPC.
Msg...          UNIX IPC, not in Perl 3.0.
Ndbm...         Only available in BigPerl.
Pipe            UNIX facility, not available on DOS (but the pipe 
                functions of open() are implemented).
Readlink        UNIX facility, not available on DOS.
Recv            UNIX facility, not available on DOS.
Require         Not in Perl 3.0.
Scalar          The scalar() function is not available in Perl 3.0.
Select          The Select(RBITS, WBITS, EBITS, TIMEOUT) form of 
                select() is a UNIX function, not available on DOS.
Sem...          UNIX facility, not available on DOS, not in Perl 3.0.
Send            UNIX facility, not available on DOS.
Set...          UNIX facility, not available on DOS.
Shm...          UNIX facility, not available on DOS, not in Perl 3.0.
Shutdown        UNIX facility, not available on DOS.
Socket...       UNIX facility, not available on DOS.
Symlink         UNIX facility, not available on DOS.
Syscall         UNIX facility, not available on DOS.
Sysread         Not in Perl 3.0.
Syswrite        Not in Perl 3.0.
Truncate        Not in Perl 3.0.
Umask           UNIX facility, not available on DOS.
Wait...         UNIX facility, not available on DOS.


Command line arguments
----------------------

-D      This isn't implemented in any DOS Perl that I've seen; 
        debugging support code would increase the size of PERL.EXE 
        considerably.

-P      Don't expect this to work: most DOS machines don't 
        have a C compiler installed, and many DOS C compilers 
        don't allow you to use the preprocessor without compiling 
        the file as a C program.  

-u      This isn't implemented: see Dump() below.

Different words
===============

Some Perl commands operate differently on DOS. This section gives 
some of the important differences.

If you specify pathnames with '\', Perl treats the backslash as a 
special character, just like C. Put pathnames in single quotes, or use 
'\\', or '/', which Perl file operations can use as a pathname element 
separator.
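
The three spellings side by side, as a sketch using the C:\PERL
directory from the installation notes. The last few lines show the
trap with double quotes; the variable names are mine.

```perl
$single  = 'c:\perl\lib' ;      # single quotes: backslashes kept as typed
$doubled = "c:\\perl\\lib" ;    # double quotes: each backslash doubled
$slashed = 'c:/perl/lib' ;      # forward slashes instead

print "same string\n" if $single eq $doubled ;

# Inside double quotes, \t is a tab and \n is a new line
$wrong = "c:\temp\new" ;
$right = 'c:\temp\new' ;
printf( "%d characters, not %d\n", length( $wrong), length( $right)) ;
```

The last line prints "9 characters, not 11". One further catch: a
single-quoted string can't end in a lone backslash, so a trailing
separator has to be written as 'c:\perl\\'.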

Chdir() has been bodged in eva and BigPerl to let it change the 
current disk drive as well as the directory.

Chmod() can only be used to set or clear the read-only attribute 
of DOS files. 'File attributes', below, gives details.

Delete() does *not* remove files (use unlink()), it removes elements
from arrays.

Dump() doesn't work: it produces an "Abnormal program termination" 
message. DOS is one of the systems where dump() (and the -u 
command line argument) can't be implemented.

Localtime() converts a time() value (see below) into date and time
fields based on the current DOS clock time. Gmtime() has a fixed offset
from localtime(); since DOS doesn't support the UNIX time zone system,
the offset depends on the assumptions compiled into your copy of Perl.

Ioctl() was partially implemented in the dds and eva implementations, 
but not in BigPerl. The MS-DOS IOCTL mechanism isn't nearly as 
general or useful as the UNIX implementation and isn't used for much.

Lstat() operates as stat(), as DOS doesn't have symbolic links.

mkdir(): Ignore the notes about subprocesses and mkdir(); it isn't an 
issue under DOS. Use a MODE value of 0; under UNIX, MODE can be used 
to create private subdirectories (see 'File attributes'), but this 
isn't possible under DOS, and 0 is handled as a special case.

Rename() can rename files or directories, and can move them 
to different directories. However, it can't move a file or directory 
to a different DOS drive letter, even if they're on the same physical 
disk drive.
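
One way to guard against the drive-letter limit is to compare the
drive prefixes before calling rename(). This is a sketch only:
drive_of and move_file are hypothetical helpers, and a pathname with
no drive prefix is treated as a drive of its own, so mixing prefixed
and unprefixed names will be refused even when both are on the
current drive.

```perl
# Return the drive letter of a DOS pathname in lower case,
# or '' if the pathname has no drive prefix.
sub drive_of
{
        local( $path) = @_ ;
        local( $drive) = '' ;
        if ($path =~ /^([A-Za-z]):/)
        {
                ($drive = $1) =~ tr/A-Z/a-z/ ;
        }
        return $drive ;
}

# rename() only if both names carry the same drive prefix.
sub move_file
{
        local( $from, $to) = @_ ;
        if (&drive_of( $from) ne &drive_of( $to))
        {
                print STDERR "$from and $to are on different drives\n" ;
                return 0 ;
        }
        return rename( $from, $to) ;
}
```

Moving a file between drives then has to be done by copying it and
unlinking the original.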

Stat() returns 11 values. Using the notation of page 188 of 
Wall & Schwartz, these are:

$dev            Drive number: 0 for A:, 2 for C:, etc.
$ino            Always zero: the "inode number" under UNIX.
$mode           The file's attributes, in an emulation of the UNIX format.
                See 'File attributes' for details.
$nlink          Always 1: the number of links to the file under UNIX.
$uid            Always 0: the user ID of the file under UNIX.
$gid            Always 0: the group ID of the file under UNIX.
$rdev           Always equal to $dev: under UNIX, the 'real' device 
                holding the file.
$size           File size in bytes.
$atime          Date/time stamp of file, seconds since 1970 (see time()).
$mtime          Always equal to $atime
$ctime          Always equal to $atime

System() runs a program 'inside' Perl, in the same way as the 
Run Other system runs a program inside RoboCAD. Perl remains in memory 
(leaving about 200Kb free for the system()'ed command on a 16-bit Perl, 
or 500Kb on BigPerl) and the Perl script continues when the program 
terminates.

Time() returns the current time in seconds since 00:00:00, 1st 
Jan 1970. This is the standard format for date/time values when managing 
files under UNIX; the DOS format is different, but Perl converts them.
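
As an illustration, this sketch turns a time() value into something
readable with localtime(). stamp is a made-up helper; the only facts
it relies on are that localtime() counts months from 0 and years
from 1900.

```perl
# Format the list returned by localtime() or gmtime()
# as YYYY-MM-DD HH:MM:SS.
sub stamp
{
        local( $sec, $min, $hour, $mday, $mon, $year) = @_ ;
        return sprintf( "%04d-%02d-%02d %02d:%02d:%02d",
                        $year + 1900, $mon + 1, $mday,
                        $hour, $min, $sec) ;
}

$now = time ;
print "Seconds since 1970: $now\n" ;
print "Local time: ", &stamp( localtime( $now)), "\n" ;
```

The same helper works on the $atime value returned by stat().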

Utime() sets the date and time stamps of the files it is given 
to process. UNIX has several stamps for each file; even though DOS 
only records one, you must give utime() two times (preferably 
identical ones). Note that the times are in seconds since 1970, as 
returned by time() and stat().
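
A minimal "touch" along these lines, assuming the files to stamp are
named on the command line. set_stamp is a made-up helper that passes
the same time twice, as recommended above.

```perl
# Give a file the same access and modification stamp.
sub set_stamp
{
        local( $file, $when) = @_ ;
        return utime( $when, $when, $file) ;    # number of files changed
}

foreach $name (@ARGV)
{
        &set_stamp( $name, time) || print STDERR "Couldn't stamp $name\n" ;
}
```

utime() returns the number of files it changed, so a result of 0
means the file was missing or couldn't be altered.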

Unlink() deletes files. Delete() removes elements from arrays, not files
from directories.
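
The two side by side, as a sketch. The array contents and the scratch
file name are made up for the example; note that delete() is applied
to an element of an associative array.

```perl
# delete() works on associative arrays
%extension = ('perl', '.PL', 'batch', '.BAT') ;
delete $extension{'batch'} ;
print join( ' ', sort keys( %extension)), "\n" ;        # prints: perl

# unlink() works on files, and returns the number it removed
$scratch = 'scratch.tmp' ;
open( SCRATCH, ">$scratch") || die "Can't create $scratch: $!\n" ;
close( SCRATCH) ;
$removed = unlink( $scratch) ;
print "Removed $removed file(s)\n" ;
```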


File attributes
===============
Both DOS and UNIX have the concept of "File attributes": various 
properties of a file, stored in the file's directory entry, and capable 
of being set On or Off. However, the attributes used by the two operating 
systems are very different. Perl, as a UNIX-based program, retains 
a UNIX viewpoint even when running under DOS.

DOS has attributes to denote directory entries as Read-Only, Hidden 
or "System" files, as being disk volume labels or subdirectories, 
and as files which have been changed since they were backed up (the 
"Archive" flag).

UNIX has a more complex system. Nine flags exist to control access 
or use of the file. These are the "permissions"; three further 
flags control the execution of programs held in the file. These flags 
are collectively known as the "mode" of the file, which is 
usually expressed as an octal number. You can use the Perl function
oct() to convert a string of octal digits to its numeric value
- see page 69 of Wall & Schwartz.

The UNIX file mode is made up of the bitwise OR of the following octal 
values, most of which aren't meaningful under DOS.

0400            Owner can read the file.
0200            Owner can write to the file.
0100            Owner can execute the file (or search the directory).

0040            Members of owner's group can read.
0020            Members of owner's group can write.
0010            Members of owner's group can run/search.

0004            Anybody can read.
0002            Anybody can write.
0001            Anybody can run/search.

4000            Program runs under its owner's user ID
2000            Program runs under its owner's group ID
1000            Program will be run by many users at once
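
To show how the nine permission values combine, here is a sketch that
tests each bit of a mode in turn and builds the "rwx" string that UNIX
directory listings use. mode_string is a made-up helper, not a Perl
function.

```perl
# Turn the nine permission bits of a mode into a string like rwxr-xr-x.
sub mode_string
{
        local( $mode) = @_ ;
        local( $text) = '' ;
        local( @bits) = (0400, 0200, 0100, 0040, 0020, 0010,
                         0004, 0002, 0001) ;
        local( @flags) = ('r', 'w', 'x') ;
        local( $i) ;
        for ($i = 0 ; $i < 9 ; $i++)
        {
                $text .= ($mode & $bits[$i]) ? $flags[$i % 3] : '-' ;
        }
        return $text ;
}

printf( "%o is %s\n", 0755, &mode_string( 0755)) ;
```

This prints "755 is rwxr-xr-x": 0755 is 0400, 0200, 0100, 0040, 0010,
0004 and 0001 ORed together.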

Chmod attributes
================

DOS Perl uses the owner's write permission bit in a UNIX-style file 
mode value to control the DOS Read-Only flag for each file being processed 
by chmod(). For example:

chmod 0755, "test.doc" ;

Under UNIX, this gives the file's owner read, write and execute permission 
and all other users read and execute permission. DOS Perl sees that 
the "owner" is allowed to write to the file, so it doesn't 
flag the file as Read-Only.

chmod 0555, "test.doc" ;

When handling this command, DOS Perl sees that the user doesn't have 
write permission, and sets the file to Read-Only. As far as I've 
established, none of the other DOS flags can be set by chmod(), 
and none of the other UNIX mode bits are significant.
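
A sketch that wraps this up, assuming only that chmod() and stat()
behave as described here. set_read_only and is_read_only are made-up
helpers; the test is on the owner's write bit (0200), the one bit DOS
Perl pays attention to.

```perl
# True if the owner's write bit is clear, i.e. the file is Read-Only.
sub is_read_only
{
        local( $file) = @_ ;
        local( $mode) = (stat( $file)) [2] ;
        return (($mode & 0200) == 0) ;
}

# Set ($lock true) or clear ($lock false) Read-Only on a file.
sub set_read_only
{
        local( $file, $lock) = @_ ;
        return chmod( $lock ? 0444 : 0666, $file) ;
}

foreach $name (@ARGV)
{
        print "$name is ",
              (&is_read_only( $name) ? "read-only" : "writable"), "\n" ;
}
```

Calling &set_read_only( "test.doc", 1) should have the same effect as
the chmod 0555 example above, since only the write bit matters.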

Stat attributes
===============
The $mode field of stat()'s output produces UNIX-style 
attributes. Only a few values can be produced, some of 
which are non-standard:

Octal value     Meaning

0020666         Console input or output: default for STDIN, 
                STDOUT and STDERR
0100666         Disk file; read and write permitted. Redirected 
                STDIN or STDOUT produce this value.
0100444         Read-only disk file.
0100777         COM, EXE or BAT file.
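
These values can be told apart with a few bit tests. The sketch below
is built from the four values in the table; describe_mode is a made-up
helper, and the 0170000 file-type mask is the usual UNIX convention
rather than anything I've confirmed for every DOS Perl.

```perl
# Classify a stat() $mode value using the values in the table above.
sub describe_mode
{
        local( $mode) = @_ ;
        local( $type) = $mode & 0170000 ;   # assumed UNIX file-type mask

        if ($type == 0020000)
        {
                return 'console' ;
        }
        if ($type != 0100000)
        {
                return 'unknown' ;
        }
        return 'program'   if $mode & 0111 ;    # any execute bit set
        return 'read-only' if !($mode & 0200) ; # owner can't write
        return 'disk file' ;
}

foreach $name (@ARGV)
{
        $value = (stat( $name)) [2] ;
        printf( "%s: mode %o, %s\n", $name, $value,
                &describe_mode( $value)) ;
}
```

Anything that isn't in the table comes back as 'unknown', which is a
prompt to investigate rather than a verdict.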

STATTEST.PL, below, may help with exploring stat() results.


########################## STATTEST.PL #########################

# STATTEST.PL: Perl script to explore stat() mode values under MS-DOS.

sub add_stats
{
        $stats{$_[0]} = (stat( $_[0])) [2] ;
}

foreach $arg (@ARGV)
{
        &add_stats( $arg) ;
        print "$arg is executable\n" if -x $arg ;
}

$stats{ STDERR} = (stat( STDERR)) [2] ;
$stats{ STDIN} = (stat( STDIN)) [2] ;
$stats{ STDOUT} = (stat( STDOUT)) [2] ;

foreach $dev (sort keys( %stats))
{
        printf( "%s has mode %lo\n", $dev, $stats{$dev}) ;
        printf( "%s has mode %o\n", $dev, $stats{$dev}) ;
}

########################## PAGE_12.PL #########################

# Example from page 12 of Programming Perl, for MS-DOS Perl,
# amplified in places and with some notes

# As there is no FIND, we use DIR. The pipe-handling shown here
# resorts to some outrageous fakery under MS-DOS, but works.
# Note, however, that the error message never gets used:
# if dir fails, it reports its error message to its own stdout
# and an empty file is returned.  This needs MS-DOS 5.00 for the
# /b parameter to DIR.

open( FIND, "dir *.* /b |") || die "Couldn't run dir: $!\n" ;

FILE:
while( $filename = <FIND>) 
{
        chop( $filename) ;
        print "File $filename... " ;
        if (! -T $filename)
        {
                print "not a text file\n" ;
                next FILE ;
        }

        # We use -T to eliminate subdirectories, EXE files and maybe others?

        if (!open( TEXTFILE, $filename))
        {
                print STDERR "Can't open $filename -- continuing...\n"  ;
                next FILE ;
        }

        print "searching... " ;

        while (<TEXTFILE>)
        {
                foreach $word ( @ARGV)
                {
                        if (index( $_, $word) >= 0)
                        {
                                print "found \"", $word, "\" \n" ;
                                # We alter this message to be a little clearer

                                next FILE ;     
                                # So we don't find multiple hits on one file
                        }       
                }
        }
        print "processed \n"  ;
}