DOS_PERL.TXT        John Dallman  jgd@cix.compulink.co.uk     December 1994

Some general notes on programming with Perl under MS-DOS. They're based
on my experience with these ports of Perl 3 and 4:

dds     Didi Spenellis' 16-bit Perl 3.041
eva     Eelco Van Asparen's 16-bit Perl 4.036
big     Daryl Orzra's 32-bit Perl 4.036, aka BigPerl M03
        This (or a later version) is the one to get.

To learn Perl, you'll need a Perl system and a book. Books are listed
below; copies of Perl are available on the Internet (see below), or from
shareware suppliers.

Usenet access is very helpful if you use Perl: the Usenet newsgroup
comp.lang.perl is the world's leading source of help and advice on Perl.
Please *try* to find answers to questions in this document, the
comp.lang.perl Frequently Asked Questions (FAQ) files or a book before
asking questions - Dallman's first law of Usenet* applies strongly on
comp.lang.perl.

[*Well, since you ask: 'Asking questions you don't understand produces
answers you don't understand']

The Standard Questions:
=======================

1) As of December 1994, there isn't a version of Perl for MS-Windows.
   BigPerl can run in a Windows DOS box, but has no 'visual' user
   interface under these conditions.
   <<<< Does the NT version run under WIN32S?

2) Perl version 5 for DOS is being prepared as I write this; it should
   be available Real Soon Now.

Books About Perl
================

Learning Perl by Randal L. Schwartz
        aka "The Llama Book" (engraving of a llama on cover)
        published by O'Reilly & Associates
        ISBN 1-56592-042-2
        Tutorial-style introduction to Perl.

Programming Perl by Larry Wall and Randal L. Schwartz
        aka "The Camel Book" (engraving of a camel on cover)
        published by O'Reilly & Associates
        ISBN 0-937175-61-1
        Tutorial, reference, cookbook, examples, ...

Perl by Example by Ellie Quigley; published by Prentice Hall.
        ISBN 0-13-122839-0
        A Perl tutorial: every feature is presented via an annotated
        example, and sample output.
        Includes a reference section, fairly similar to the quickref,
        below.

Perl Quick Reference Guide by Johan Vromans.
        This is bundled with the camel book, and available in PostScript
        on the Internet as:
        [] ftp://ftp.uu.net/languages/perl/perlref-4.036.1.tar.gz
        [] ftp://ftp.cs.ruu.nl/pub/DOC/perlref-4.036.1.tar.gz
        [] ftp://ftp.khoros.unm.edu/pub/perl/perlref-4.036.1.tar.gz

The Perl Meta-FAQ (short answers to common questions) is available on
the Internet as:

        http://www.khoros.unm.edu/staff/neilb/perl/metaFAQ/metaFAQ.html
        ftp://ftp.khoros.unm.edu/pub/perl/metaFAQ.ps    (PostScript)
        ftp://ftp.khoros.unm.edu/pub/perl/metaFAQ.txt   (Plain ASCII)

Parts of this document, including all the Internet addresses, are lifted
from the December 1994 version of the Meta-FAQ.

The full Perl FAQ is available as:

North America:
        [] ftp://ftp.cis.ufl.edu/pub/perl/doc/FAQ
        [] ftp://rtfm.mit.edu/pub/usenet/news.answers/perl-faq/
        [] ftp://ftp.uu.net/usenet/news.answers/perl-faq/
Europe:
        [] ftp://ftp.cs.ruu.nl/pub/NEWS.ANSWERS/perl-faq/
        [] ftp://ftp.funet.fi/pub/languages/perl/doc/faq
        [] ftp://src.doc.ic.ac.uk/packages/perl/FAQ

Where can I get Perl 4.036?
===========================

        [] ftp://ftp.uu.net/languages/perl/perl.tar.gz
        [] ftp://ftp.cis.ufl.edu/pub/perl/src/4.0
        [] ftp://ftp.khoros.unm.edu/pub/perl/perl-4.0.36.tar.gz
        [] ftp://ftp.cs.ruu.nl/pub/PERL/perl.4.0.36.tar.gz
        [] ftp://ftp.demon.co.uk/pub/perl/perl-4.036.tar.gz
        [] ftp://ftp.funet.fi/pub/languages/perl/ports/perl4/unix/perl-4.036.tar.Z
        [] ftp://src.doc.ic.ac.uk/packages/perl/perl-4.036.tar.gz

Perl 4.036 for MS-DOS and related operating systems
===================================================

MS-DOS:
        [] ftp://ftp.ee.umanitoba.ca/pub/msdos/perl
        [] ftp://ftp.cis.ufl.edu/pub/perl/src/msdos
        [] ftp://ftp.funet.fi/pub/languages/perl/ports/perl4/msdos
        [] ftp://src.doc.ic.ac.uk/packages/ibmpc/simtel/perl/
NT:
        [] ftp://ftp.cis.ufl.edu/pub/perl/src/ntperl
        [] ftp://ftp.funet.fi/pub/languages/perl/ports/perl4/nt
OS/2:
        [] part of the standard perl distribution, in directory
           $PERLSRC/os2
        [] ftp://ftp.funet.fi/pub/languages/perl/ports/perl4/os2
Netware:
        [] ftp://ftp.funet.fi/pub/languages/perl/ports/perl4/netware

About Perl
==========

The Perl programming language resembles 'C', with many added features
from programs provided with the UNIX operating system. Perl programs
('scripts') aren't compiled to COM or EXE files, but are run by the
program PERL.EXE. The process is similar to a DOS batch file. However,
PERL.EXE reads and checks an entire script before starting to run it,
which makes the procedure faster and more reliable.

A copy of Perl consists, at a minimum, of PERL.EXE and a text file
containing the GNU licence agreement. Perl is 'copyleft' free software,
and you should read and understand the licence agreement.

Perl systems usually contain documentation, sample programs and
additional programs: there is often a PERLGLOB.EXE program which expands
wildcard filenames for Perl. The standard Perl library of scripts
containing subroutines is often supplied, but often isn't much use under
DOS: many of its components rely on UNIX features.

Perl scripts are simple text files and can be written with any normal
text editor. Under DOS, their filenames normally have an extension of
.PL, but this is a convention, not a standard.

The differences between UNIX Perl and DOS Perl are important. The basic
programming language is the same, but many functions used for UNIX
system management, or intended for operation in a multitasking
environment, aren't implemented. See "DOS Perl", below, for details.

Memory
======

16-bit Perls run as conventional MS-DOS programs within the normal 640Kb
of memory managed by MS-DOS. As PERL.EXE is at least 300Kb, there isn't
much memory available for Perl program variables.

BigPerl runs as a 32-bit DOS-extended program and obviates this problem;
it can use all available memory (it needs at least a 386 with 4Mb of
RAM), plus virtual memory if required. It's also much faster than a
16-bit Perl, and is the only "real" implementation for modern PCs.

Installation
============

It's conventional practice to install Perl and its support files in
their own directory (eg, C:\PERL). Add this directory to the PATH
statement in your AUTOEXEC.BAT file.

Environment variables are commonly used to configure an MS-DOS Perl:

- PERL: Holds the pathname of the directory where the Perl executable
  is installed. Not used by PERL.EXE, but useful for MS-DOS setup.
  Example:
        set perl=c:\perl

- PERLLIB: Holds a list of directories which PERL.EXE will search for
  library files (it adds them to the Perl array variable @INC, which is
  a list of directories to search).
  Example:
        set perllib=c:\perl;c:\perl\lib;c:\myscript

If you use #!PERL (see below) to start Perl scripts under DOS, it can
use two further environment variables: PERL_EE and PERLSCRIPT.

Developing with Perl
====================

The write-code/test-code cycle is very short when working with Perl;
when I'm tinkering with a short, but troublesome program, the cycle is
often under a minute.

If you don't normally use a DOS command line editing system (e.g., the
DOSKEY program supplied with DOS 5.0), try it for Perl work.

Octal numbers
=============

By convention, UNIX usually uses octal, rather than hexadecimal numbers.
Perl is supposed to support both, but it's easier to work with Perl if
you think octal.

DOS Perl
========

This section covers the special limitations and changes that DOS imposes
on Perl.

#!/usr/bin/perl
---------------

The Perl documentation makes a great fuss about this piece of UNIX shell
syntax for running Perl programs. This doesn't work under DOS; several
tricks using batch files have been produced, but I don't find any of
them satisfactory.

I've created a utility program, #!PERL.EXE, which implements #!
functionality via a simple trick. It is available from the umanitoba FTP
site as hbp_30.zip (later versions will be hbp40.zip, hbp50.zip, etc).

Wall & Schwartz, page 12
------------------------

The first substantial example program in Wall & Schwartz won't work
under DOS Perl, as it relies on the UNIX program FIND. A modified
version, which uses the enhanced DIR command of DOS 5.0 onwards as a
substitute, is at the end of this file. Some comments have been added
and the script has been modified to a 'C'-style layout for clarity.

In general, Perl scripts that use external UNIX commands won't work on
DOS. The MKS Toolkit is a package of UNIX utilities ported to DOS that
some people find useful. Personally, I'm used to DOS and prefer not to
modify it too much.

Regular expressions
-------------------

If you have trouble with the concept or syntax of "regular expressions"
consult a book on UNIX, specifically the UNIX text-processing utilities.
Unfortunately, regular expressions are so common (and so standard) in
work with UNIX that many books assume a basic understanding of them. One
very complete explanation is given in Sed and Awk, by Dale Dougherty,
published by O'Reilly & Associates in 1990. ISBN 0-937175-59-5.
This describes the Sed and Awk programs, ancestors of Perl, and has a
very complete introduction to regular expressions.

Handling \n and \r\n
--------------------

By convention, UNIX programs use a 'new line' character to mark the end
of a line of text. This character is written '\n'; on most UNIX
machines, it is the ASCII line feed character (LF, 0x0A).

DOS normally uses two characters to mark the end of a line: Carriage
Return and Line Feed (CR & LF, 0x0D, 0x0A). In Perl, as with most
programs, these are written '\r' and '\n'.

As the inner workings of Perl often assume that "end-of-line" is one
character, DOS Perl treats '\r\n' as end-of-line, and then "eats" the
'\r' before handing the line over to the Perl script. It then replaces
'\n' with '\r\n' in its output. If you want to handle '\r' yourself, you
can turn off Perl's handling of it with the binmode() function.

Unimplemented words
-------------------

The following Perl words and concepts are described in Wall & Schwartz,
but aren't available in (some versions of) DOS Perl:

Word            Reason(s)

Accept          UNIX IPC function.
Alarm           UNIX IPC, not in Perl 3.0.
Bind            UNIX IPC, not in Perl 3.0.
Caller          Not in Perl 3.0.
Chown           UNIX permission system.
Chroot          UNIX superuser function.
Connect         UNIX IPC.
Crypt           "Unimplemented due to excessive paranoia".
Dbm...          Only available in BigPerl.
Exec            Doesn't work in BigPerl, since the DOS extender doesn't
                support it. In 16-bit Perls, it terminates Perl and runs
                the command specified. Remember to distinguish it from
                eval(), which runs a Perl script, inside Perl.
Fcntl           UNIX facility, not available on DOS.
Flock           UNIX facility, not available on DOS.
Fork            UNIX facility, not available on DOS.
Get...          (except getc) UNIX facilities, not available on DOS,
                many not in Perl 3.0.
Kill            UNIX IPC.
Link            UNIX facility, not available on DOS.
Listen          UNIX IPC.
Msg...          UNIX IPC, not in Perl 3.0.
Ndbm...         Only available in BigPerl.
Pipe            UNIX facility, not available on DOS (but the pipe
                functions of open() are implemented).
Readlink        UNIX facility, not available on DOS.
Recv            UNIX facility, not available on DOS.
Require         Not in Perl 3.0.
Scalar          The scalar() function is not available in Perl 3.0.
Select          The select(RBITS, WBITS, EBITS, TIMEOUT) form of
                select() is a UNIX function, not available on DOS.
Sem...          UNIX facility, not available on DOS, not in Perl 3.0.
Send            UNIX facility, not available on DOS.
Set...          UNIX facility, not available on DOS.
Shm...          UNIX facility, not available on DOS, not in Perl 3.0.
Shutdown        UNIX facility, not available on DOS.
Socket...       UNIX facility, not available on DOS.
Symlink         UNIX facility, not available on DOS.
Syscall         UNIX facility, not available on DOS.
Sysread         Not in Perl 3.0.
Syswrite        Not in Perl 3.0.
Truncate        Not in Perl 3.0.
Umask           UNIX facility, not available on DOS.
Wait...         UNIX facility, not available on DOS.

Command line arguments
----------------------

-D      This isn't implemented in any DOS Perl that I've seen; debugging
        support code would increase the size of PERL.EXE considerably.

-P      Don't expect this to work: most DOS machines don't have a C
        compiler installed, and many DOS C compilers don't allow you to
        use the preprocessor without compiling the file as a C program.

-u      This isn't implemented: see Dump() below.

Different words
===============

Some Perl commands operate differently on DOS. This section gives some
of the important differences.

If you specify pathnames with '\' inside double quotes, Perl treats the
backslash as a special character, just like C. Put pathnames in single
quotes, or use '\\', or '/', which Perl file operations can use as a
pathname element separator.

Chdir() has been bodged in eva and BigPerl to let it change the current
disk drive as well as the directory.

Chmod() can only be used to set or clear the read-only attribute of DOS
files. 'File attributes', below, gives details.
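
A minimal sketch of the pathname advice given earlier in this section.
The path c:\temp\notes.txt is a made-up example, not a file this
document assumes you have; the point is only how each quoting style
treats the backslashes.

```perl
# Three safe ways to write the same (hypothetical) DOS pathname.
$single  = 'c:\temp\notes.txt' ;      # single quotes keep each '\'
$doubled = "c:\\temp\\notes.txt" ;    # double quotes need '\\'
$slashed = "c:/temp/notes.txt" ;      # '/' works as a separator too

# The trap: in double quotes, "\t" is a tab and "\n" is a newline,
# so this string no longer contains any backslashes at all.
$broken  = "c:\temp\notes.txt" ;

print "single and doubled agree\n" if $single eq $doubled ;
print "broken differs from single\n" if $broken ne $single ;
printf( "lengths: %d, %d, %d, %d\n", length( $single),
    length( $doubled), length( $slashed), length( $broken)) ;
```

One caveat with single quotes: a backslash directly before the closing
quote escapes that quote, so a directory name that ends in '\' still
needs '\\' at that point.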

Delete() does *not* remove files (use unlink()), it removes elements
from arrays.

Dump() doesn't work: it produces an "Abnormal program termination"
message. DOS is one of the systems where dump() (and the -u command line
argument) can't be implemented.

Localtime() gives a time() value (see below) based on the current DOS
clock time. Gmtime() has a fixed offset from localtime(); since DOS
doesn't support the UNIX time zone system, the offset depends on the
assumptions compiled into your copy of Perl.

Ioctl() was partially implemented in the dds and eva implementations,
but not in BigPerl. The MS-DOS IOCTL mechanism isn't nearly as general
or useful as the UNIX implementation and isn't used for much.

Lstat() operates as stat(), as DOS doesn't have symbolic links.

mkdir(): Ignore the notes about subprocesses and mkdir(); it isn't an
issue under DOS. Use a MODE value of 0; under UNIX, MODE can be used to
create private subdirectories (see 'File attributes'), but this isn't
possible under DOS, and 0 is handled as a special case.

Rename() can rename files or directories, and can move them to different
directories. However, it can't move a file or directory to a different
DOS drive letter, even if they're on the same physical disk drive.

Stat() returns 11 values. Using the notation of page 188 of Wall &
Schwartz, these are:

$dev    Drive number: 0 for A:, 2 for C:, etc.
$ino    Always zero: the "inode number" under UNIX.
$mode   The file's attributes, in an emulation of the UNIX format.
        See 'File attributes' for details.
$nlink  Always 1: the number of links to the file under UNIX.
$uid    Always 0: the user ID of the file under UNIX.
$gid    Always 0: the group ID of the file under UNIX.
$rdev   Always equal to $dev: under UNIX, the 'real' device holding
        the file.
$size   File size in bytes.
$atime  Date/time stamp of file, seconds since 1970 (see time()).
$mtime  Always equal to $atime
$ctime  Always equal to $atime

System() runs a program 'inside' Perl, in the same way as the Run Other
system runs a program inside RoboCAD. Perl remains in memory (leaving
about 200Kb free for the system()'ed command on a 16-bit Perl, or 500Kb
on BigPerl) and the Perl script continues when the program terminates.

Time() returns the current time in seconds since 00:00:00, 1st Jan 1970.
This is the standard format for date/time values when managing files
under UNIX; the DOS format is different, but Perl converts them.

Utime() sets the date and time stamps of the files it is given to
process. UNIX has several stamps for each file; even though DOS only
records one, you must give utime() two times (preferably identical
ones). Note that the times are in seconds since 1970, as returned by
time() and stat().

Unlink() deletes files. Delete() removes elements from arrays, not files
from directories.

File attributes
===============

Both DOS and UNIX have the concept of "File attributes": various
properties of a file, stored in the file's directory entry, and capable
of being set On or Off. However, the attributes used by the two
operating systems are very different. Perl, as a UNIX-based program,
retains a UNIX viewpoint even when running under DOS.

DOS has attributes to denote directory entries as Read-Only, Hidden or
"System" files, as being disk volume labels or subdirectories, and as
files which have been changed since they were backed up (the "Archive"
flag).

UNIX has a more complex system. Nine flags exist to control access or
use of the file. These are the "permissions"; three further flags
control the execution of programs held in the file. These flags are
collectively known as the "mode" of the file, which is usually expressed
as an octal number. You can use the Perl function oct() to convert a
numeric character string to an octal value - see page 69 of Wall &
Schwartz.
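
To make the oct() remark concrete, here is a small sketch; the mode
value 755 is just an illustrative choice. It shows that oct() applied to
a string of digits and an octal literal (written with a leading zero)
give the same number, while a literal without the leading zero is read
as decimal.

```perl
$from_string = oct( "755") ;    # octal digits held in a string
$literal     = 0755 ;           # leading zero: octal literal
$decimal     = 755 ;            # no leading zero: plain decimal

printf( "oct(\"755\") = %d decimal, %o octal\n", $from_string, $from_string) ;
printf( "0755       = %d decimal, %o octal\n", $literal, $literal) ;
printf( "755        = %d decimal, %o octal\n", $decimal, $decimal) ;
```

The first two lines report 493 decimal (755 octal); the third reports
755 decimal, which is 1363 octal and not the mode that was intended.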

The UNIX file mode is made up of the bitwise OR of the following octal
values, most of which aren't meaningful under DOS.

0400    Owner can read the file.
0200    Owner can write to the file.
0100    Owner can execute the file (or search the directory).
0040    Members of owner's group can read.
0020    Members of owner's group can write.
0010    Members of owner's group can run/search.
0004    Anybody can read.
0002    Anybody can write.
0001    Anybody can run/search.
4000    Program runs under its owner's user ID
2000    Program runs under its owner's group ID
1000    Program will be run by many users at once

Chmod attributes
================

DOS Perl uses the owner's write permission bit in a UNIX-style file mode
value to control the DOS Read-Only flag for each file being processed by
chmod(). For example:

        chmod 0755, "test.doc" ;

Under UNIX, this gives the file's owner read, write and execute
permission and all other users read and execute permission. DOS Perl
sees that the "owner" is allowed to write to the file, so it doesn't
flag the file as Read-Only.

        chmod 0555, "test.doc" ;

When handling this command, DOS Perl sees that the user doesn't have
write permission, and sets the file to Read-Only.

As far as I've established, none of the other DOS flags can be set by
chmod(), and none of the other UNIX mode bits are significant.

Stat attributes
===============

The $mode field of stat()'s output produces UNIX-style attributes. Only
a few values can be produced, some of which are non-standard:

Octal value     Meaning

0020666         Console input or output: default for STDIN, STDOUT
                and STDERR
0100666         Disk file; read and write permitted. Redirected STDIN
                or STDOUT produce this value.
0100444         Read-only disk file.
0100777         COM, EXE or BAT file.

STATTEST.PL, below, may help with exploring stat() results.

########################## STATTEST.PL #########################

# STATTEST.PL: Perl script to explore stat() mode values under MS-DOS.

# Record the mode field (element 2 of stat()'s result) for one file.
sub add_stats
    {
    $stats{$_[0]} = (stat( $_[0])) [2] ;
    }

# Collect modes for every file named on the command line.
foreach $arg (@ARGV)
    {
    &add_stats( $arg) ;
    print "$arg is executable\n" if -x $arg ;
    }

# Add the three standard filehandles.
$stats{ STDERR} = (stat( STDERR)) [2] ;
$stats{ STDIN} = (stat( STDIN)) [2] ;
$stats{ STDOUT} = (stat( STDOUT)) [2] ;

# Print each mode in octal, using both the %lo and %o formats.
foreach $dev (sort keys( %stats))
    {
    printf( "%s has mode %lo\n", $dev, $stats{$dev}) ;
    printf( "%s has mode %o\n", $dev, $stats{$dev}) ;
    }

########################## PAGE_12.PL #########################

# Example from page 12 of Programming Perl, for MS-DOS Perl,
# amplified in places and with some notes

# As there is no FIND, we use DIR. The pipe-handling shown here
# resorts to some outrageous fakery under MS-DOS, but works.
# Note, however, that the error message never gets used:
# if dir fails, it reports its error message to its own stdout
# and an empty file is returned. This needs MS-DOS 5.00 for the
# /b parameter to DIR.

open( FIND, "dir *.* /b |") || die "Couldn't run dir: $!\n" ;

FILE:
while( $filename = <FIND>)
    {
    chop( $filename) ;
    print "File $filename... " ;
    if (! -T $filename)
        {
        print "not a text file\n" ;
        next FILE ;
        }
    # We use -T to eliminate subdirectories, EXE files and maybe others?
    if (!open( TEXTFILE, $filename))
        {
        print STDERR "Can't open $filename -- continuing...\n" ;
        next FILE ;
        }
    print "searching... " ;
    while (<TEXTFILE>)
        {
        foreach $word ( @ARGV)
            {
            if (index( $_, $word) >= 0)
                {
                # We alter this message to be a little clearer
                print "found \"", $word, "\" \n" ;
                # So we don't find multiple hits on one file
                next FILE ;
                }
            }
        }
    print "processed \n" ;
    }
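
One further sketch, not part of the original listings, illustrating the
utime() and stat() notes under 'Different words'. The file name
utimetst.tmp is hypothetical scratch space, and rounding the time down
to an even number of seconds is my assumption for DOS disks, whose time
stamps are stored in two-second steps.

```perl
# UTIMETST.PL: set a file's time stamp with utime(), read it with stat().

$file = "utimetst.tmp" ;                # hypothetical scratch file
open( TMP, ">$file") || die "Can't create $file: $!\n" ;
print TMP "scratch\n" ;
close( TMP) ;

$when = time() - 3600 ;                 # an hour ago, in seconds since 1970
$when -= $when % 2 ;                    # round down to an even second

# utime() wants two times even though DOS keeps one; pass the same twice.
$changed = utime( $when, $when, $file) ;
print "utime() changed $changed file(s)\n" ;

$mtime = (stat( $file)) [9] ;           # element 9 of stat() is $mtime
print "stamp ", ($mtime == $when ? "matches" : "differs"), "\n" ;

unlink( $file) ;                        # unlink(), not delete(), removes files
```

Passing the same value for both times follows the advice given for
utime() above; the final unlink() tidies up the scratch file.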