Modernish is a library for writing robust, portable, readable, and powerful programs for POSIX-based shells and utilities.

For code examples, see EXAMPLES.md and share/doc/modernish/examples

modernish – harness the shell

  • Sick of quoting hell and split/glob pitfalls?
  • Tired of brittle shell scripts going haywire and causing damage?
  • Mystified by line noise commands like [, [[, ((?
  • Is scripting basic things just too hard?
  • Ever wish that find were a built-in shell loop?
  • Do you want your script to work on nearly any shell on any Unix-like OS?

Modernish is a library for shell script programming which provides features like safer variable and command expansion, new language constructs for loop iteration, and much more. Modernish programs are shell programs; the new constructs are mixed with shell syntax so that the programmer can take advantage of the best of both.

There is no compiled code to install, as modernish is written entirely in the shell language. It can be deployed in embedded or multi-user systems in which new binary executables may not be introduced for security reasons, and is portable among numerous shell implementations. The installer can also bundle a reduced copy of the library with your scripts, so they can run portably with a known version of modernish without requiring prior installation.

Join us and help breathe some new life into the shell! We are looking for testers, early adopters, and developers to join us. Download the latest release or check out the very latest development code from the master branch. Read through the documentation below. Play with the example scripts and write your own. Try to break the library and send reports of breakage.

Getting started

Run install.sh and follow instructions, choosing your preferred shell and install location. After successful installation you can run modernish shell scripts and write your own. Run uninstall.sh to remove modernish.

Both the install and uninstall scripts are interactive by default, but support fully automated (non-interactive) operation as well. Command line options are as follows:

install.sh [-n] [-s shell] [-f] [-P pathspec] [-d installroot] [-D prefix] [-B scriptfile ...]

  • -n: non-interactive operation
  • -s: specify default shell to execute modernish
  • -f: force unconditional installation on specified shell
  • -P: specify an alternative DEFPATH for the installation (be careful; usually not recommended)
  • -d: specify root directory for installation
  • -D: extra destination directory prefix (for packagers)
  • -B: bundle modernish with your scripts (-D required, -n implied), see Appendix F

uninstall.sh [-n] [-f] [-d installroot]

  • -n: non-interactive operation
  • -f: delete */modernish directories even if files left
  • -d: specify root directory of modernish installation to uninstall

Two basic forms of a modernish program

In the simple form, modernish is added to a script written for a specific shell. In the portable form, your script is shell-agnostic and may run on any shell that can run modernish.

Simple form

The simplest way to write a modernish program is to source modernish as a dot script. For example, if you write for bash:

#!/bin/bash
. modernish
use safe
use sys/base
...your program starts here...

The modernish use command loads modules with optional functionality. The safe module initialises the safe mode. The sys/base module contains modernish versions of certain basic but non-standardised utilities (e.g. readlink, mktemp, which), guaranteeing that modernish programs all have a known version at their disposal. There are many other modules as well. See Modules for more information.

The above method makes the program dependent on one particular shell (in this case, bash). So it is okay to mix and match functionality specific to that particular shell with modernish functionality.

(On zsh, there is a way to integrate modernish with native zsh scripts. See Appendix E.)

Portable form

The most portable way to write a modernish program is to use the special generic hashbang path for modernish programs. For example:

#!/usr/bin/env modernish
#!use safe
#!use sys/base
...your program begins here...

For portability, it is important there is no space after env modernish; NetBSD and OpenBSD consider trailing spaces part of the name, so env will fail to find modernish.

A program in this form is executed by whatever shell the user who installed modernish on the local system chose as the default shell. Since you as the programmer can't know what shell this is (other than the fact that it passed some rigorous POSIX compliance testing executed by modernish), a program in this form must be strictly POSIX compliant – except, of course, that it should also make full use of the rich functionality offered by modernish.

Note that modules are loaded in a different way: the use commands are part of a hashbang comment (starting with #! like the initial hashbang path). Only such lines that immediately follow the initial hashbang path are evaluated; even an empty line in between causes the rest to be ignored. This special way of pre-loading modules is needed to make any aliases they define work reliably on all shells.

Interactive use

Modernish is primarily designed to enhance shell programs/scripts, but also offers features for use in interactive shells. For instance, the new repeat loop construct from the var/loop module can be quite practical to repeat an action x times, and the safe module on interactive shells provides convenience functions for manipulating, saving and restoring the state of field splitting and globbing.

To use modernish on your favourite interactive shell, you have to add it to your .profile, .bashrc or similar init file.

Important: Upon initialising, modernish adapts itself to other settings, such as the locale. It also removes certain aliases that may keep modernish from initialising properly. So you have to organise your .profile or similar file in the following order:

  • first, define general system settings (PATH, locale, etc.);
  • then, . modernish and use any modules you want;
  • then define anything that may depend on modernish, and set your aliases.
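
As a sketch, the ordering above might look as follows in a ~/.profile. The PATH value, the choice of the var/loop module and the ll alias are illustrative assumptions, and the command -v guard is an addition (not something modernish requires) so that the file still loads on a machine where modernish is not installed.

```shell
# ~/.profile (illustrative sketch; all concrete values are assumptions)

# 1. General system settings first: PATH, locale, etc.
PATH=${PATH:-/usr/bin:/bin}
export PATH

# 2. Then load modernish and any modules you want.
#    The guard is an addition so this file also works without modernish.
if command -v modernish >/dev/null 2>&1; then
    . modernish
    use var/loop
fi

# 3. Finally, anything that may depend on modernish, and your aliases.
alias ll='ls -l'
```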

Non-interactive command line use

After installation, the modernish command can be invoked as if it were a shell, with the standard command line options from other shells (such as -c to specify a command or script directly on the command line), plus some enhancements. The effect is that the shell chosen at installation time will be run enhanced with modernish functionality. It is not possible to use modernish as an interactive shell in this way.

Usage:

  1. modernish [--use=module | shelloption ...] [scriptfile] [arguments]
  2. modernish [--use=module | shelloption ...] -c [script [me-name [arguments]]]
  3. modernish --test [testoption ...]
  4. modernish [--version | --help]

In the first form, the script in the file scriptfile is loaded and executed with any arguments assigned to the positional parameters.

In the second form, -c executes the specified modernish script, optionally with the me-name assigned to $ME and the arguments assigned to the positional parameters.

The --use option pre-loads any given modernish modules before executing the script. The module argument to each specified --use option is split using standard shell field splitting. The first field is the module name and any further fields become arguments to that module's initialisation routine.

Any given short-form or long-form shelloptions are set or unset before executing the script. Both POSIX shell options and shell-specific options are supported, depending on the shell executing modernish. Using the shell option -e or -o errexit is an error, because modernish does not support it and would break.

The --test option runs the regression test suite and exits. This verifies that the modernish installation is functioning correctly. See Appendix B for more information.

The --version and --help options output the relevant information and exit.

Non-interactive usage examples

  • Count to 10 using a basic loop:
    modernish --use=var/loop -c 'LOOP for i=1 to 10; DO putln "$i"; DONE'
  • Run a portable-form modernish program using zsh and enhanced-prompt xtrace:
    zsh /usr/local/bin/modernish -o xtrace /path/to/program.sh

Shell capability detection

Modernish includes a battery of shell feature, quirk and bug detection tests, each of which is given a special capability ID. See Appendix A for a list of shell capabilities that modernish currently detects, as well as further general information on the capability detection framework.

thisshellhas is the central function of the capability detection framework. It not only tests for the presence of shell features/quirks/bugs, but can also detect specific shell built-in commands, shell reserved words, shell options (short or long form), and signals.

Modernish itself extensively uses capability detection to adapt itself to the shell it's running on. This is how it works around shell bugs and takes advantage of efficient features not all shells have. But any script using the library can do this in the same way, with the help of this function.

Test results are cached in memory, so repeated checks using thisshellhas are efficient and there is no need to avoid calling it to optimise performance.

Usage:

thisshellhas item ...

  • If item contains only ASCII capital letters A-Z, digits 0-9 or _, return the result status of the associated modernish capability detection test.
  • If item is any other ASCII word, check if it is a shell reserved word or built-in command on the current shell.
  • If item is -- (end-of-options delimiter), disable the recognition of operators starting with - for subsequent items.
  • If item starts with --rw= or --kw=, check if the identifier immediately following these characters is a shell reserved word (a.k.a. shell keyword).
  • If item starts with --bi=, similarly check for a shell built-in command.
  • If item starts with --sig=, check if the shell knows about a signal (usable by kill, trap, etc.) by the name or number following the =. If a number > 128 is given, the remainder of its division by 128 is checked. If the signal is found, its canonicalised signal name is left in the REPLY variable, otherwise REPLY is unset. (If multiple --sig= items are given and all are found, REPLY contains only the last one.)
  • If item is -o followed by a separate word, check if this shell has a long-form shell option by that name.
  • If item is any other letter or digit preceded by a single -, check if this shell has a short-form shell option by that character.
  • item can also be one of the following two operators.
    • --cache runs all external modernish shell capability tests that have not yet been run, causing the cache to be complete.
    • --show performs a --cache and then outputs all the IDs of positive results, one per line.

thisshellhas continues to process items until one of them produces a negative result or is found invalid, at which point any further items are ignored. So the function only returns successfully if all the items specified were found on the current shell. (To check if either one item or another is present, use separate thisshellhas invocations separated by the || shell operator.)

Exit status: 0 if this shell has all the items in question; 1 if not; 2 if an item was encountered that is not recognised as a valid identifier.

Note: The tests for the presence of reserved words, built-in commands, shell options, and signals are different from capability detection tests in an important way: they only check if an item by that name exists on this shell, and don't verify that it does the same thing as on another shell.
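
The modulo rule for numeric --sig= arguments mirrors the way shells report death by signal as an exit status above 128. The arithmetic can be sketched in plain POSIX sh without modernish; sig_name is a hypothetical helper written for this illustration and is not how thisshellhas is implemented.

```shell
# sig_name: hypothetical helper (not part of modernish).
# Reduce a number above 128 to the remainder of its division by 128,
# then ask the standard 'kill -l' for the corresponding signal name.
sig_name() {
    if [ "$1" -gt 128 ]; then
        kill -l "$(( $1 % 128 ))"
    else
        kill -l "$1"
    fi
}

sig_name 15     # a plain signal number
sig_name 143    # 143 % 128 = 15, so this names the same signal
```

A typical use would be to translate the exit status of a command that was killed by a signal back into a signal name.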

Names and identifiers

All modernish functions require portable variable and shell function names, that is, ones consisting of ASCII uppercase and lowercase letters, digits, and the underscore character _, and that don't begin with a digit. For shell option names, the constraints are the same except a dash - is also accepted. An invalid identifier is generally treated as a fatal error.

Internal namespace

Function-local variables are not supported by the standard POSIX shell; only global variables are provided for. Modernish needs a way to store its internal state without interfering with the program using it. So most of the modernish functionality uses an internal namespace _Msh_* for variables, functions and aliases. All these names may change at any time without notice. Any names starting with _Msh_ should be considered sacrosanct and untouchable; modernish programs should never directly use them in any way. Of course this is not enforceable, but names starting with _Msh_ should be uncommon enough that no unintentional conflict is likely to occur.

Modernish system constants

Modernish provides certain constants (read-only variables) to make life easier. These include:

  • $MSH_VERSION: The version of modernish.
  • $MSH_PREFIX: Installation prefix for this modernish installation (e.g. /usr/local).
  • $MSH_MDL: Main modules directory.
  • $MSH_AUX: Main helper scripts directory.
  • $MSH_CONFIG: Path to modernish user configuration directory.
  • $ME: Path to the current program. Replacement for $0. This is necessary if the hashbang path #!/usr/bin/env modernish is used, or if the program is launched like sh /path/to/bin/modernish /path/to/script.sh, as these set $0 to the path to bin/modernish and not your program's path.
  • $MSH_SHELL: Path to the default shell for this modernish installation, chosen at install time (e.g. /bin/sh). This is a shell that is known to have passed all the modernish tests for fatal bugs. Cross-platform scripts should use it instead of hard-coding /bin/sh, because on some operating systems (NetBSD, OpenBSD, Solaris) /bin/sh is not POSIX compliant.
  • $SIGPIPESTATUS: The exit status of a command killed by SIGPIPE (a broken pipe). For instance, if you use grep something somefile.txt | more and you quit more before grep is finished, grep is killed by SIGPIPE and exits with that particular status. Hardened commands or functions may need to handle such a SIGPIPE exit specially to avoid unduly killing the program. The exact value of this exit status is shell-specific, so modernish runs a quick test to determine it at initialisation time.
    If SIGPIPE was set to ignore by the process that invoked the current shell, $SIGPIPESTATUS can't be detected and is set to the special value 99999. See also the description of the WRN_NOSIGPIPE ID for thisshellhas.
  • $DEFPATH: The default system path guaranteed to find compliant POSIX utilities, as given by getconf PATH.
  • $ERROR: A guaranteed unset variable that can be used to trigger an error that exits the (sub)shell, for instance: : "${4+${ERROR:?excess arguments}}" (error on 4 or more arguments)
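
The $ERROR idiom relies only on standard parameter expansion, so its effect can be sketched in plain POSIX sh. Here ERROR is unset by hand to stand in for the guarantee modernish provides, and check_args is a hypothetical function for illustration; the subshell keeps the failed expansion from exiting the calling script.

```shell
unset -v ERROR   # stand-in for the modernish guarantee that ERROR is unset

# check_args: hypothetical example function.
# ${4+...} expands its word only if a fourth argument is set; the nested
# ${ERROR:?...} then fails because ERROR is unset, exiting the subshell.
check_args() {
    ( : "${4+${ERROR:?excess arguments}}" ) 2>/dev/null
}

if check_args a b c; then echo "3 arguments: accepted"; fi
if ! check_args a b c d; then echo "4 arguments: rejected"; fi
```

Without the surrounding parentheses, the failed expansion would exit a non-interactive script outright, which is the behaviour the original one-liner is after.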

Control character, whitespace and shell-safe character constants

POSIX does not provide for the quoted C-style escape codes commonly used in bash, ksh and zsh (such as $'\n' to represent a newline character), leaving the standard shell without a convenient way to refer to control characters. Modernish provides control character constants (read-only variables) with hexadecimal suffixes $CC01 .. $CC1F and $CC7F, as well as $CCe, $CCa, $CCb, $CCf, $CCn, $CCr, $CCt, $CCv (corresponding with printf backslash escape codes). This makes it easy to insert control characters in double-quoted strings.
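
For comparison, this is roughly what a script has to do in plain POSIX sh to obtain such characters without modernish. The variable names nl, tab and msg are arbitrary choices for this sketch; the trailing X protects the newline from being stripped by command substitution.

```shell
# Command substitution strips trailing newlines, so append a guard
# character and remove it again afterwards.
nl=$(printf '\nX')
nl=${nl%X}
tab=$(printf '\t')

# The variables can now be used inside double-quoted strings.
msg="first line${nl}second${tab}line"
printf '%s\n' "$msg"
```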

More convenience constants, handy for use in bracket glob patterns for use with case or modernish match:

  • $CONTROLCHARS: All ASCII control characters.
  • $WHITESPACE: All ASCII whitespace characters.
  • $ASCIIUPPER: The ASCII uppercase letters A to Z.
  • $ASCIILOWER: The ASCII lowercase letters a to z.
  • $ASCIIALNUM: The ASCII alphanumeric characters 0-9, A-Z and a-z.
  • $SHELLSAFECHARS: Safe-list for shell-quoting.
  • $ASCIICHARS: The complete set of ASCII characters (minus NUL).

Usage examples:

# Use a glob pattern to check against control characters in a string:
if str match "$var" "*[$CONTROLCHARS]*"; then
    putln "\$var contains at least one control character"
fi
# Use '!' (not '^') to check for characters *not* part of a particular set:
if str match "$var" "*[!$ASCIICHARS]*"; then
    putln "\$var contains at least one non-ASCII character"
fi
# Safely split fields at any whitespace, comma or slash (requires safe mode):
use safe
LOOP for --split=$WHITESPACE,/ field in $my_items; DO
    putln "Item: $field"
DONE

Reliable emergency halt

The die function reliably halts program execution, even from within subshells, optionally printing an error message. Note that die is meant for an emergency program halt only, i.e. in situations where continuing would mean the program is in an inconsistent or undefined state. Shell scripts running in an inconsistent or undefined state may wreak all sorts of havoc. They are also notoriously difficult to terminate correctly, especially if the fatal error occurs within a subshell: exit won't work then. That's why die is optimised for killing all the program's processes (including subshells and external commands launched by it) as quickly as possible. It should never be used for exiting the program normally.

On interactive shells, die behaves differently. It does not kill or exit your shell; instead, it issues SIGINT to the shell to abort the execution of your running command(s), which is equivalent to pressing Ctrl+C. In addition, if die is invoked from a subshell such as a background job, it kills all processes belonging to that job, but leaves other running jobs alone.

Usage: die [message]

If the trap stack module is active, a special DIE pseudosignal can be trapped (using plain old trap or pushtrap) to perform emergency cleanup commands upon invoking die.

If the MSH_HAVE_MERCY variable is set in a script and die is invoked from a subshell, then die will only terminate the current subshell and its subprocesses and will not execute DIE traps, allowing the script to resume execution in the parent process. This is for use in special cases, such as regression tests, and is strongly discouraged for general use. Modernish unsets the variable on init so it cannot be inherited from the environment.
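
The remark that exit won't work inside a subshell can be seen in plain POSIX sh. In this minimal sketch (which does not use modernish; the variable names are arbitrary), the exit only ends the command substitution, and the main script carries on regardless.

```shell
# 'exit' inside a command substitution only leaves that subshell.
status=0
output=$(
    echo "fatal error detected"
    exit 1
    echo "never reached"
) || status=$?

# The main shell is still running here and must check the status itself.
echo "subshell said: $output (exit status $status)"
```

Every call site has to remember this kind of check, which is the burden die is meant to remove for genuine emergencies.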

Low-level shell utilities

Outputting strings

The POSIX shell lacks a simple, straightforward and portable way to output arbitrary strings of text, so modernish adds two commands for this.

  • put prints each argument separated by a space, without a trailing newline.
  • putln prints each argument, terminating each with a newline character.

There is no processing of options or escape codes. (Modernish constants $CCn, etc. can be used to insert control characters in double-quoted strings. To process escape codes, use printf instead.)

The echo command is notoriously unportable and kind of broken, so is deprecated in favour of put and putln. Modernish does provide its own version of echo, but it is only activated for portable-form scripts. Otherwise, the shell-specific version of echo is left intact. The modernish version of echo does not interpret any escape codes and supports only one option, -n, which, like BSD echo, suppresses the final newline. However, unlike BSD echo, if -n is the only argument, it is not interpreted as an option and the string -n is printed instead. This makes it safe to output arbitrary data using this version of echo as long as it is given as a single argument (using quoting if needed).
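
For readers who know printf, the behaviour of put and putln described above can be approximated in plain POSIX sh as follows. This is only a sketch: sketch_put and sketch_putln are hypothetical names, and the real put and putln may differ in details such as the handling of zero arguments.

```shell
# Approximations for illustration only; not the modernish implementation.
sketch_put() {
    # "$*" joins the arguments with the first character of IFS
    # (a space by default); no trailing newline is added.
    printf '%s' "$*"
}
sketch_putln() {
    # The format is reused for every argument, so each ends with a newline.
    printf '%s\n' "$@"
}

sketch_put "no" "trailing" "newline"
sketch_putln "" "-n" 'back\slash'   # printed literally: no options, no escapes
```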

Legibility aliases: not, so, forever

Modernish sets three aliases that can help to make the shell language look slightly friendlier. Their use is optional.

not is a new synonym for !. They can be used interchangeably.

so is a command that tests if the previous command exited with a status of zero, so you can test the preceding command's success with if so or if not so.

forever is a new synonym for while :;. This allows simple infinite loops of the form: forever do stuff; done.

Enhanced exit

The exit command can be used as normal, but has gained capabilities.

Extended usage: exit [-u] [status [message]]

  • As per standard, if status is not specified, it defaults to the exit status of the command executed immediately prior to exit. Otherwise, it is evaluated as a shell arithmetic expression. If it is invalid as such, the shell exits immediately with an arithmetic error.
  • Any remaining arguments after status are combined, separated by spaces, and taken as a message to print on exit. The message shown is preceded by the name of the current program ($ME minus directories). Note that it is not possible to skip status while specifying a message.
  • If the -u option is given, and the shell function showusage is defined, that function is run in a subshell before exiting. It is intended to print a message showing how the command should be invoked. The -u option has no effect if the script has not defined a showusage function.
  • If status is non-zero, the message and the output of the showusage function are redirected to standard error.

chdir

chdir is a robust cd replacement for use in scripts.

The standard cd command is designed for interactive shells and appropriate to use there. However, for scripts, its features create serious pitfalls:

  • The $CDPATH variable is searched. A script may inherit a user's exported $CDPATH, so cd may change to an unintended directory.
  • cd cannot be used with arbitrary directory names (such as untrusted user input), as some operands have special meanings, even after --. POSIX specifies that - changes directory to $OLDPWD. On zsh (even in sh mode on zsh <= 5.7.1), numeric operands such as +12 or -345 represent directory stack entries. All such paths need escaping by prefixing ./.
  • Symbolic links in directory path components are not resolved by default, leaving a potential symlink attack vector.

Thus, robust and portable use of cd in scripts is unreasonably difficult. The modernish chdir function calls cd in a way that takes care of all these issues automatically: it disables $CDPATH and special operand meanings, and resolves symbolic links by default.
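
The $CDPATH pitfall can be reproduced with plain cd in a throwaway directory. The sketch below assumes a mktemp that supports -d (common, but not standardised); the directory names project, elsewhere and build are made up for the illustration.

```shell
tmp=$(mktemp -d) || exit 1
mkdir -p "$tmp/project/build" "$tmp/elsewhere/build"

CDPATH=$tmp/elsewhere            # imagine this was inherited from the user

cd "$tmp/project" || exit 1
cd build >/dev/null || exit 1    # relative name: CDPATH is searched first
unsafe=$(pwd)

cd "$tmp/project" || exit 1
cd ./build || exit 1             # the ./ prefix bypasses the CDPATH search
safe=$(pwd)

unset -v CDPATH
cd / && rm -rf "$tmp"

echo "cd build   -> ${unsafe#"$tmp"/}"
echo "cd ./build -> ${safe#"$tmp"/}"
```

The first cd lands in elsewhere/build even though a build directory exists right below the working directory; only the ./ form reaches project/build.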

Usage: chdir [-f] [-L] [-P] [--] directorypath

Normally, failure to change the present working directory to directorypath is a fatal error that ends the program. To tolerate failure, add the -f option; in that case, exit status 0 signifies success and exit status 1 signifies failure, and scripts should always check and handle exceptions.

The options -L (logical: don't resolve symlinks) and -P (physical: resolve symlinks) are the same as in cd, except that -P is the default. Note that on a shell with BUG_CDNOLOGIC (NetBSD sh), the -L option to chdir does nothing.

To use arbitrary directory names (e.g. directory names input by the user or other untrusted input) always use the -- separator that signals the end of options, or paths starting with - may be misinterpreted as options.

insubshell

The insubshell function checks if you're currently running in a subshell environment (usually called simply subshell).

A subshell is a copy of the parent shell that starts out as an exact duplicate (including non-exported variables, functions, etc.), except for traps. A new subshell is invoked by constructs like (parentheses), $(command substitutions), pipe|lines, and & (to launch a background subshell). Upon exiting a subshell, all changes to its state are lost.

This is not to be confused with a newly initialised shell that is merely a child process of the current shell, which is sometimes (confusingly and wrongly) called a "subshell" as well. This documentation avoids such a misleading use of the term.
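
That state changes made in a subshell are lost can be checked in plain POSIX sh, without modernish. In this small sketch, counter and unused are arbitrary variable names.

```shell
counter=0

( counter=1 )              # parentheses: the assignment happens in a subshell
unused=$( counter=2 )      # command substitution: also a subshell

# Neither assignment reached the main shell.
echo "counter is still $counter"
```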

Usage: insubshell [-p | -u]

This function returns success (0) if it was called from within a subshell and non-success (1) if not. One of two options can be given:

  • -p: Store the process ID (PID) of the current subshell or main shell in REPLY.
  • -u: Store an identifier in REPLY that is useful for determining if you've entered a subshell relative to a previously stored identifier. The content and format are unspecified and shell-dependent.

isset

isset checks if a variable, shell function or option is set, or has certain attributes. Usage:

  • isset varname: Check if a variable is set.
  • isset -v varname: Same as above.
  • isset -x varname: Check if variable is exported.
  • isset -r varname: Check if variable is read-only.
  • isset -f funcname: Check if a shell function is set.
  • isset -optionletter (e.g. isset -C): Check if shell option is set.
  • isset -o optionname: Check if shell option is set by long name.

Exit status: 0 if the item is set; 1 if not; 2 if the argument is not recognised as a valid identifier. Unlike most other modernish commands, isset does not treat an invalid identifier as a fatal error.

When checking a shell option, a nonexistent shell option is not an error, but returns the same result as an unset shell option. (To check if a shell option exists, use thisshellhas.)

Note: just isset -f checks if shell option -f (a.k.a. -o noglob) is set, but with an extra argument, it checks if a shell function is set. Similarly, isset -x checks if shell option -x (a.k.a. -o xtrace) is set, but isset -x varname checks if a variable is exported. If you use unquoted variable expansions here, make sure they're not empty, or the shell's empty removal mechanism will cause the wrong thing to be checked (even in the safe mode).

setstatus

setstatus manually sets the exit status $? to the desired value. The function exits with the status indicated. This is useful in conditional constructs if you want to prepare a particular exit status for a subsequent exit or return command to inherit under certain circumstances. The status argument is parsed as a shell arithmetic expression. A negative value is treated as a fatal error. The behaviour of values greater than 255 is not standardised and depends on your particular shell.
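
The idea can be sketched in plain POSIX sh with a one-line function. sketch_setstatus and check_readable are hypothetical names for this illustration; the sketch does not reproduce the modernish checks (such as rejecting negative values) and is not the library's implementation.

```shell
# sketch_setstatus: hypothetical stand-in that returns its argument,
# evaluated as an arithmetic expression, as the exit status.
sketch_setstatus() {
    return "$(( $1 ))"
}

# check_readable: hypothetical function using the prepared status.
check_readable() {
    if [ ! -r "$1" ]; then
        sketch_setstatus 2+1     # prepare a specific status...
        return                   # ...which a bare 'return' then inherits
    fi
}

status=0
check_readable /nonexistent/file || status=$?
echo "status: $status"
```

The bare return passes on the status of the last command executed, which is what makes preparing $? in this way useful.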

Testing numbers, strings and files

The test/[ command is the bane of casual shell scripters. Even advanced shell programmers are frequently caught unaware by one of the many pitfalls of its arcane, hackish syntax. It attempts to look like shell grammar without being shell grammar, causing myriad problems (1, 2). Its -a, -o, ( and ) operators are inherently and fatally broken as there is no way to reliably distinguish operators from operands, so POSIX deprecates their use; however, most manual pages do not include this essential information, and even the few that do will not tell you what to do instead.

Ksh, zsh and bash offer a [[ alternative that fixes many of these problems, as it is integrated into the shell grammar. Nevertheless, it increases confusion, as entirely different grammar and quoting rules apply within [[...]] than outside it, yet many scripts end up using them interchangeably. It is also not available on all POSIX shells. (To make matters worse, Busybox ash has a false-friend [[ that is just an alias of [, with none of the shell grammar integration!)

Finally, the POSIX test/[ command is incompatible with the modernish "safe mode" which aims to eliminate most of the need to quote variables. See use safe for more information.

Modernish deprecates test/[ and [[ completely. Instead, it offers a comprehensive alternative command design that works with the usual shell grammar in a safer way while offering various feature enhancements. The following replacements are available:

Integer number arithmetic tests and operations

To test if a string is a valid number in shell syntax, str isint is available. See String tests.

The arithmetic command let

An implementation of let as in ksh, bash and zsh is now available to all POSIX shells. This makes C-style signed integer arithmetic evaluation available to every supported shell, with the exception of the unary ++ and -- operators (which are a nonstandard shell capability detected by modernish under the ID of ARITHPP).

This means let should be used for operations and tests, e.g. both let "x=5" and if let "x==5"; then ... are supported (note: single = for assignment, double == for comparison). See POSIX 2.6.4 Arithmetic Expansion for more information on the supported operators.

Multiple expressions are supported, one per argument. The exit status of let is zero (the shell's idea of success/true) if the last expression argument evaluates to non-zero (the arithmetic idea of true), and 1 otherwise.

It is recommended to adopt the habit of quoting each let expression with "double quotes", as this consistently makes everything work as expected: double quotes protect operators that would otherwise be misinterpreted as shell grammar, while shell expansions starting with $ continue to work.
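
The exit status rule (success when the last expression is non-zero) can be illustrated with POSIX arithmetic expansion alone. sketch_let is a hypothetical stand-in written for this explanation, with helper variables _expr and _result chosen arbitrarily; it is not the modernish implementation and does no validation of its arguments.

```shell
# sketch_let: hypothetical stand-in for illustration.
# Evaluates each argument as an arithmetic expression and succeeds only
# if the last one evaluated to non-zero.
sketch_let() {
    _result=0
    for _expr in "$@"; do
        _result=$(( $_expr ))
    done
    [ "$_result" -ne 0 ]
}

sketch_let "x=5"                                   # assignment; value is 5
if sketch_let "x==5"; then echo "x equals 5"; fi
if ! sketch_let "x==6"; then echo "x does not equal 6"; fi
if ! sketch_let "x-5"; then echo "a result of 0 gives exit status 1"; fi
```

Note how every expression is passed in double quotes, following the recommendation above.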

Arithmetic shortcuts

Various handy functions that make common arithmetic operations and comparisons easier to program are available from the var/arith module.

String and file tests

The following notes apply to all commands described in the subsections of this section:

  1. "True" is understood to mean exit status 0, and "false" is understood to mean a non-zero exit status – specifically 1.
  2. Passing more than the number of arguments specified for each command is a fatal error. (If the safe mode is not used, excessive arguments may be generated accidentally if you forget to quote a variable. The test result would have been wrong anyway, so modernish kills the program immediately, which makes the problem much easier to trace.)
  3. Passing fewer than the number of arguments specified to the command is assumed to be the result of removal of an empty unquoted expansion. Where possible, this is not treated as an error, and an exit status corresponding to the omitted argument(s) being empty is returned instead. (This helps make the safe mode possible; unlike with test/[, paranoid quoting to avoid empty removal is not needed.)
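
The empty removal problem that note 3 refers to is easy to reproduce with the standard [ command. This plain POSIX sh sketch (no modernish involved; variable names are arbitrary) shows why unquoted expansions are dangerous there.

```shell
var=''

# Unquoted: the empty expansion is removed, leaving '[ -n ]', which tests
# whether the string "-n" is non-empty and is therefore true.
if [ -n $var ]; then unquoted=true; else unquoted=false; fi

# Quoted: the empty argument is preserved and the test is correctly false.
if [ -n "$var" ]; then quoted=true; else quoted=false; fi

echo "unquoted: $unquoted, quoted: $quoted"
```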

String tests

The str function offers various operators for tests on strings. For example, str in $foo "bar" tests if the variable foo contains "bar".

The str function takes unary (one-argument) operators that check a property of a single word, binary (two-argument) operators that check a word against a pattern, as well as an option that makes binary operators check multiple words against a pattern.

Unary string tests

Usage: str operator [word]

The word is checked for the property indicated by operator; if the result is true, str returns status 0, otherwise it returns status 1.

The available unary string test operators are:

  • empty: The word is empty.
  • isint: The word is a decimal, octal or hexadecimal integer number in valid POSIX shell syntax, safe to use with let, $((...)) and other arithmetic contexts on all POSIX-derived shells. This operator ignores leading (but not trailing) spaces and tabs.
  • isvarname: The word is a valid portable shell variable or function name.

If word is omitted, it is treated as empty, on the assumption that it is an unquoted empty variable. Passing more than one argument after the operator is a fatal error.

Binary string matching tests

Usage: str operator [[word] pattern]

The word is compared to the pattern according to the operator; if it matches, str returns status 0, otherwise it returns status 1. The available binary matching operators are:

  • eq: word is equal to pattern.
  • ne: word is not equal to pattern.
  • in: word includes pattern.
  • begin: word begins with pattern.
  • end: word ends with pattern.
  • match: word matches pattern as a shell glob pattern (as in the shell's native case construct). A pattern that ends in an unescaped backslash is considered invalid and causes str to return status 2.
  • ematch: word matches pattern as a POSIX extended regular expression. An empty pattern is a fatal error. (In UTF-8 locales, check if thisshellhas WRN_EREMBYTE before matching multi-byte characters.)
  • lt: word lexically sorts before (is 'less than') pattern.
  • le: word is lexically 'less than or equal to' pattern.
  • gt: word lexically sorts after (is 'greater than') pattern.
  • ge: word is lexically 'greater than or equal to' pattern.

If word is omitted, it is treated as empty on the assumption that it is an unquoted empty variable, and the single remaining argument is assumed to be the pattern. Similarly, if both word and pattern are omitted, an empty word is matched against an empty pattern. Passing more than two arguments after the operator is a fatal error.

Multi-matching option

Usage: str -M operator [ [ word ... ] pattern ]

The -M option causes str to compare any number of words to the pattern. The available operators are the same as the binary string matching operators listed above.

All matching words are stored in the REPLY variable, separated by newline characters ($CCn) if there is more than one match. If no words match, REPLY is unset.

The exit status returned by str -M is as follows:

  • If no words match, the exit status is 1.
  • If one word matches, the exit status is 0.
  • If between two and 254 words match, the exit status is the number of matches.
  • If 255 or more words match, the exit status is 255.

Usage example: the following matches a given GNU-style long-form command line option $1 against a series of available options. To make it possible for the options to be abbreviated, we check if any of the options begin with the given argument $1.

if str -M begin --fee --fi --fo --fum --foo --bar --baz --quux "$1"; then
	putln "OK. The given option $1 matched $REPLY"
else
	case $? in
	( 1 )	putln "No such option: $1" >&2 ;;
	( * )	putln "Ambiguous option: $1" "Did you mean:" "$REPLY" >&2 ;;
	esac
fi

File type tests

These avoid the snags with symlinks you get with [ and [[. By default, symlinks are not followed. Add -L to operate on files pointed to by symlinks instead of symlinks themselves (the -L makes no difference if the operands are not symlinks).

These commands all take one argument. If the argument is absent, they return false. More than one argument is a fatal error. See notes 1-3 in the parent section.

is present file: Returns true if the file is present in the file system (even if it is a broken symlink).

is -L present file: Returns true if the file is present in the file system and is not a broken symlink.

is sym file: Returns true if the file is a symbolic link (symlink).

is -L sym file: Returns true if the file is a non-broken symlink, i.e. a symlink that points (either directly or indirectly via other symlinks) to a non-symlink file that is present in the file system.

is reg file: Returns true if file is a regular data file.

is -L reg file: Returns true if file is either a regular data file or a symlink pointing (either directly or indirectly via other symlinks) to a regular data file.

Other commands are available that work exactly like is reg and is -L reg but test for other file types. To test for them, replace reg with one of:

  • dir for a directory
  • fifo for a named pipe (FIFO)
  • socket for a socket
  • blockspecial for a block special file
  • charspecial for a character special file
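
The symlink snag mentioned at the top of this section can be reproduced with the standard test command alone. The following is a minimal sketch in plain POSIX shell (no modernish needed); it assumes the widely available mktemp utility, and the names tmpdir, follows and itself are only illustrative.

```shell
# Plain POSIX shell, no modernish required. All names are illustrative.
tmpdir=$(mktemp -d) || exit 1
ln -s "$tmpdir/no_such_target" "$tmpdir/broken"   # a symlink pointing nowhere

# [ -e ] follows the symlink, so a broken link looks like a missing file.
if [ -e "$tmpdir/broken" ]; then follows=present; else follows=absent; fi

# [ -L ] inspects the link itself, so the same path is reported as a symlink.
if [ -L "$tmpdir/broken" ]; then itself=symlink; else itself=other; fi

echo "-e says: $follows; -L says: $itself"
rm -rf "$tmpdir"
```

Going by the descriptions above, is present would return true for such a path, and is -L present would return false.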

File comparison tests

The following notes apply to these commands:

  • Symlinks are not resolved/followed by default. To operate on files pointed to by symlinks, add -L before the operator argument, e.g. is -L newer.
  • Omitting any argument is a fatal error, because no empty argument (removed or otherwise) would make sense for these commands.

is newer file1 file2: Compares file timestamps, returning true if file1 is newer than file2. Also returns true if file1 exists, but file2 does not; this is consistent for all shells (unlike test file1 -nt file2).

is older file1 file2: Compares file timestamps, returning true if file1 is older than file2. Also returns true if file1 does not exist, but file2 does; this is consistent for all shells (unlike test file1 -ot file2).

is samefile file1 file2: Returns true if file1 and file2 are the same file (hardlinks).

is onsamefs file1 file2: Returns true if file1 and file2 are on the same file system. If any non-regular, non-directory files are specified, their parent directory is tested instead of the file itself.

File status tests

These always follow symlinks.

is nonempty file: Returns true if the file exists, is not a broken symlink, and is not empty. Unlike [ -s file ], this also works for directories, as long as you have read permission in them.

is setuid file: Returns true if the file has its set-user-ID flag set.

is setgid file: Returns true if the file has its set-group-ID flag set.

I/O tests

is onterminal FD: Returns true if file descriptor FD is associated with a terminal. The FD may be a non-negative integer number or one of the special identifiers stdin, stdout and stderr which are equivalent to 0, 1, and 2. For instance, is onterminal stdout returns true if commands that write to standard output (FD 1), such as putln, would write to the terminal, and false if the output is redirected to a file or pipeline.

File permission tests

Any symlinks given are resolved, as these tests would be meaningless for a symlink itself.

can read file: True if the file's permission bits indicate that you can read the file - i.e., if an r bit is set and applies to your user.

can write file: True if the file's permission bits indicate that you can write to the file: for non-directories, if a w bit is set and applies to your user; for directories, both w and x.

can exec file: True if the file's type and permission bits indicate that you can execute the file: for regular files, if an x bit is set and applies to your user; for other file types, never.

can traverse file: True if the file is a directory and its permission bits indicate that a path can traverse through it to reach its subdirectories: for directories, if an x bit is set and applies to your user; for other file types, never.

The stack

In modernish, every variable and shell option gets its own stack. Arbitrary values/states can be pushed onto the stack and popped off it in reverse order. For variables, both the value and the set/unset state are (re)stored.

Usage:

  • push [ --key=value ] item [ item ... ]
  • pop [ --keepstatus ] [ --key=value ] item [ item ... ]

where item is a valid portable variable name, a short-form shell option (dash plus letter), or a long-form shell option (-o followed by an option name, as two arguments).

Before pushing or popping anything, both functions check if all the given arguments are valid and pop checks all items have a non-empty stack. This allows pushing and popping groups of items with a check for the integrity of the entire group. pop exits with status 0 if all items were popped successfully, and with status 1 if one or more of the given items could not be popped (and no action was taken at all).

The --key= option is an advanced feature that can help different modules or functions to use the same variable stack safely. If a key is given to push, then for each item, the given key value is stored along with the variable's value for that position in the stack. Subsequently, restoring that value with pop will only succeed if the key option with the same key value is given to the pop invocation. Similarly, popping a keyless value only succeeds if no key is given to pop. If there is any key mismatch, no changes are made and pop returns status 2. Note that this is a robustness/convenience feature, not a security feature; the keys are not hidden in any way.

If the --keepstatus option is given, pop will exit with the exit status of the command executed immediately prior to calling pop. This can avoid the need for awkward workarounds when restoring variables or shell options at the end of a function. However, note that this makes failure to pop (stack empty or key mismatch) a fatal error that kills the program, as pop no longer has a way to communicate this through its exit status.
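
To see what the stack saves you from, here is a rough sketch of saving and restoring a single variable by hand in plain POSIX shell, including its set/unset state. This is not how modernish implements the stack; the names myvar, saved_isset, saved_value and state are hypothetical.

```shell
# Plain POSIX shell: save and restore one variable by hand, including
# whether it was set at all. All variable names are hypothetical.
unset -v myvar                  # start with myvar unset

saved_isset=${myvar+yes}        # empty if unset, "yes" if set (even if empty)
saved_value=${myvar-}           # the value, or empty if unset

myvar='temporary value'         # ... work with a changed value ...

if [ -n "$saved_isset" ]; then
	myvar=$saved_value          # it was set: put the old value back
else
	unset -v myvar              # it was unset: make it unset again
fi

state=${myvar+set}              # empty again, because myvar is back to unset
echo "myvar is ${state:-unset}"
```

With the stack, the same round trip is push myvar before the change and pop myvar after it, and it nests without extra bookkeeping variables.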

The shell options stack

push and pop allow saving and restoring the state of any shell option available to the set builtin. The precise shell options supported (other than the ones guaranteed by POSIX) depend on the shell modernish is running on. To facilitate portability, nonexistent shell options are treated as unset.

Long-form shell options are matched to their equivalent short-form shell options, if they exist. For instance, on all POSIX shells, -f is equivalent to -o noglob, and push -o noglob followed by pop -f works correctly. This also works for shell-specific short & long option equivalents.

On shells with a dynamic no option name prefix, that is on ksh, zsh and yash (where, for example, noglob is the opposite of glob), the no prefix is ignored, so something like push -o glob followed by pop -o noglob does the right thing. But this depends on the shell and should never be used in portable scripts.

The trap stack

Modernish can also make traps stack-based, so that each program component or library module can set its own trap commands without interfering with others. This functionality is provided by the var/stack/trap module.

Modules

As modularity is one of modernish's design principles, much of its essential functionality is provided in the form of loadable modules, so the core library is kept lean. Modules are organised hierarchically, with names such as safe, var/loop and sys/cmd/harden. The use command loads and initialises a module or a combined directory of modules.

Internally, modules exist in files with the name extension .mm in subdirectories of lib/modernish/mdl – for example, the module var/stack/trap corresponds to the file lib/modernish/mdl/var/stack/trap.mm.

Usage:

  • use modulename [ argument ... ]
  • use [ -q | -e ] modulename
  • use -l

The first form loads and initialises a module. All arguments, including the module name, are passed on to the dot script unmodified, so modules know their own name and can implement option parsing to influence their initialisation. See also Two basic forms of a modernish program for information on how to use modules in portable-form scripts.

In the second form, the -q option queries if a module is loaded, and the -e option queries if a module exists. use returns status 0 for yes, 1 for no, and 2 if the module name is invalid.

The -l option lists all currently loaded modules in the order in which they were originally loaded. Just add | sort for alphabetical order.

If a directory of modules, such as sys/cmd or even just sys, is given as the modulename, then all the modules in that directory and any subdirectories are loaded recursively. In this case, passing extra arguments is a fatal error.

If a module file X.mm exists along with a directory X, resolving to the same modulename, then use will load the X.mm module file without automatically loading any modules in the X directory, because it is expected that X.mm handles the submodules in X manually. (This is currently the case for var/loop which auto-loads submodules containing loop types on first use).

The complete lib/modernish/mdl directory path, which depends on where modernish is installed, is stored in the system constant $MSH_MDL.

The following subchapters document the modules that come with modernish.

use safe

The safe module sets the 'safe mode' for the shell. It removes most of the need to quote variables, parameter expansions, command substitutions, or glob patterns. It uses shell settings and modernish library functionality to secure and demystify split and glob mechanisms. This creates a new and safer way of shell script programming, essentially building a new shell language dialect while still running on all POSIX-compliant shells.

Why the safe mode?

One of the most common headaches with shell scripting is caused by a fundamental flaw in the shell as a scripting language: constantly active field splitting (a.k.a. word splitting) and pathname expansion (a.k.a. globbing). To cope with this situation, it is hammered into programmers of shell scripts to be absolutely paranoid about properly quoting nearly everything, including variable and parameter expansions, command substitutions, and patterns passed to commands like find.

These mechanisms were designed for interactive command line usage, where they do come in very handy. But when the shell language is used as a programming language, splitting and globbing often ends up being applied unexpectedly to unquoted expansions and command substitutions, helping cause thousands of buggy, brittle, or outright dangerous shell scripts.

One could blame the programmer for forgetting to quote an expansion properly, or one could blame a pitfall-ridden scripting language design where hammering punctilious and counterintuitive habits into casual shell script programmers is necessary. Modernish does the latter, then fixes it.

How the safe mode works

Every POSIX shell comes with a little-used ability to disable global field splitting and pathname expansion: IFS=''; set -f. An empty IFS variable disables split; the -f (or -o noglob) shell option disables pathname expansion. The safe mode sets these, and two others (see below).
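
The effect of these two settings can be observed in any POSIX shell without loading modernish. The sketch below counts the arguments that an unquoted expansion produces, first with default settings and then with IFS=''; set -f applied in a subshell. The helper count_args and the variable names are illustrative only.

```shell
# Plain POSIX shell, no modernish required. Names are illustrative.
count_args() { echo "$#"; }

var='one two *'

# Default settings: $var is split on spaces and the * is expanded to the
# file names in the current directory, so this yields at least 3 arguments.
default_count=$(count_args $var)

# Split and glob disabled (in a subshell, so the settings do not leak):
# the unquoted $var stays exactly one argument.
safe_count=$(IFS=''; set -f; count_args $var)

echo "default: $default_count argument(s); safe settings: $safe_count"
```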

The reason these safer settings are hardly ever used is that they are not practical to use with the standard shell language. For instance, for textfile in *.txt, or for item in $(some command) which both (!) field-splits and pathname-expands the output of a command, all break.

However, that is where modernish comes in. It introduces several powerful new loop constructs, as well as arbitrary code blocks with local settings, each of which has straightforward, intuitive operators for safely applying field splitting or pathname expansion – to specific command arguments only. By default, they are not both applied to the arguments, which is much safer. And your script code as a whole is kept safe from them at all times.

With global field splitting and pathname expansion removed, a third issue still affects the safe mode: the shell's empty removal mechanism. If the value of an unquoted expansion like $var is empty, it will not expand to an empty argument, but will be removed altogether, as if it were never there. This behaviour cannot be disabled.
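
Empty removal is easy to observe by counting arguments. A minimal sketch in plain POSIX shell follows; count_args and the variable names are illustrative.

```shell
# Plain POSIX shell, no modernish required. Names are illustrative.
count_args() { echo "$#"; }

empty=''

# Unquoted: the empty expansion is removed, leaving two arguments.
unquoted=$(count_args before $empty after)

# Quoted: the expansion yields one empty argument, so there are three.
quoted=$(count_args before "$empty" after)

echo "unquoted: $unquoted arguments; quoted: $quoted arguments"
```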

Thankfully, the vast majority of shell and Un*x commands order their arguments in a way that is actually designed with empty removal in mind, making it a good thing. For instance, when doing ls $option some_dir, if $option is -l the listing will be long-format and if it is empty it will be removed, which is the desired behaviour. (An empty argument there would cause an error.)

However, one command that is used in almost all shell scripts, test/[, is completely unable to cope with empty removal due to its idiosyncratic and counterintuitive syntax. Potentially empty operands come before options, so operands removed as empty expansions cause errors or, worse, false positives. Thus, the safe mode does not remove the need for paranoid quoting of expansions used with test/[ commands. Modernish fixes this issue by deprecating test/[ completely and offering a safe command design to use instead, which correctly deals with empty removal.
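
The false positive can be shown with a single unquoted expansion. In this plain POSIX shell sketch (variable names are illustrative), the empty $var is removed, so the shell actually runs [ -n ], a one-argument test that merely checks whether the string "-n" is non-empty.

```shell
# Plain POSIX shell, no modernish required. Names are illustrative.
var=''

# Unquoted: after empty removal this is [ -n ], which is true because the
# single remaining argument "-n" is a non-empty string.
if [ -n $var ]; then unquoted=true; else unquoted=false; fi

# Quoted: this is [ -n "" ], which is correctly false.
if [ -n "$var" ]; then quoted=true; else quoted=false; fi

echo "unquoted test says: $unquoted; quoted test says: $quoted"
```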

With the 'safe mode' shell settings, plus the safe, explicit and readable split and glob operators and test/[ replacements, the only quoting requirements left are:

  1. a very occasional need to stop empty removal from happening;
  2. to quote "$@" and "$*" until shell bugs are fixed (see notes below).

In addition to the above, the safe mode also sets these shell options:

  • set -C (set -o noclobber) to prevent accidentally overwriting files using output redirection. To force overwrite, use >| instead of >.
  • set -u (set -o nounset) to make it an error to use unset (that is, uninitialised) variables by default. You'll notice this will catch many typos before they cause you hard-to-trace problems. To bypass the check for a specific variable, use ${var-} instead of $var (be careful).
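
Both options are standard and can be tried in plain POSIX shell. The sketch below runs each case in a subshell so the options do not leak; it assumes the widely available mktemp utility, and all variable names are illustrative.

```shell
# Plain POSIX shell, no modernish required. Names are illustrative.
tmpfile=$(mktemp) || exit 1     # an existing (empty) regular file

# With noclobber, a plain > refuses to overwrite the existing file.
if (set -C; echo new > "$tmpfile") 2>/dev/null; then
	clobber=allowed
else
	clobber=refused
fi

# The >| operator overrides noclobber for this one redirection.
(set -C; echo forced >| "$tmpfile")
content=$(cat "$tmpfile")

# With nounset, expanding an unset variable aborts the (sub)shell...
if (set -u; unset -v maybe; : "$maybe") 2>/dev/null; then
	strict=passed
else
	strict=aborted
fi

# ...unless the check is bypassed with ${var-}.
if (set -u; unset -v maybe; : "${maybe-}") 2>/dev/null; then
	bypass=passed
else
	bypass=aborted
fi

rm -f "$tmpfile"
echo "clobber: $clobber, content: $content, strict: $strict, bypass: $bypass"
```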

Important notes for safe mode

  • The safe mode is not compatible with existing conventional shell scripts, written in what we could now call the 'legacy mode'. Essentially, the safe mode is a new way of shell script programming. That is why it is not enabled by default, but activated by loading the safe module. It is highly recommended that new modernish scripts start out with use safe.
  • The shell applies entirely different quoting rules to string matching glob patterns within case constructs. The safe mode changes nothing here.
  • Due to shell bugs ID'ed as BUG_PP_*, the positional parameters expansions $@ and $* should still always be quoted. As of late 2018, these bugs have been fixed in the latest or upcoming release versions of all supported shells. But, until buggy versions fall out of use and modernish no longer supports any BUG_PP_* shell bugs, quoting "$@" and "$*" remains mandatory even in safe mode (unless you know with certainty that your script will be used on a shell with none of these bugs).
  • The behaviour of "$*" changes in safe mode. It uses the first character of $IFS as the separator for combining all positional parameters into one string. Since IFS is emptied in safe mode, there is no separator, so it will string them together unseparated. You can use something like push IFS; IFS=' '; var="$*"; pop IFS or LOCAL IFS=' '; BEGIN var="$*"; END to use the space character as a separator. (If you're outputting the positional parameters, note that the put command always separates its arguments by spaces, so you can safely pass it multiple arguments with "$@" instead.)
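
The change in "$*" described in the last note is standard shell behaviour and can be checked without modernish. A small sketch in plain POSIX shell; the function join_args and the variable names are illustrative.

```shell
# Plain POSIX shell, no modernish required. Names are illustrative.
join_args() {
	# "$*" joins the positional parameters using the first character of IFS
	# at the time of expansion. Each variant runs in a command substitution
	# subshell, so the IFS change does not leak.
	joined_empty=$(IFS=''; printf '%s\n' "$*")    # empty IFS: no separator
	joined_space=$(IFS=' '; printf '%s\n' "$*")   # space as the separator
}

join_args one two three
echo "empty IFS: $joined_empty"
echo "space IFS: $joined_space"
```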

Extra options for the safe mode

Usage: use safe [ -k | -K ] [ -i ]

The -k and -K module options install an extra handler that reliably kills the program if it tries to execute a command that is not found, on shells that have the ability to catch and handle 'command not found' errors (currently bash, yash, and zsh). This helps catch typos, forgetting to load a module, etc., and stops your program from continuing in an inconsistent state and potentially causing damage. The MSH_NOT_FOUND_OK variable may be set to temporarily disable this check. The uppercase -K module option aborts the program on shells that cannot handle 'command not found' errors (so should not be used for portable scripts), whereas the lowercase -k variant is ignored on such shells.

If the -i option is given, or the shell is interactive, two extra one-letter functions are loaded, s and g. These are pre-command modifiers for use when split and glob are globally disabled; they allow running a single command with local split and glob applied to that command's arguments only. They also have some options designed to manipulate, examine, save, restore, and generally experiment with the global split and glob state on interactive shells. Type s --help and g --help for more information. In general, the safe mode is designed for scripts and is not recommended for interactive shells.

use var/loop

The var/loop module provides an innovative, robust and extensible shell loop construct. Several powerful loop types are provided, while advanced shell programmers may find it easy and fun to create their own. This construct is also ideal for the safe mode: the for, select and find loop types allow you to selectively apply field splitting and/or pathname expansion to specific arguments without subjecting a single line of your code to them.

The basic form is a bit different from native shell loops. Note the caps:

LOOP looptype arguments; DO
	your commands here
DONE

The familiar do...done block syntax cannot be used because the shell will not allow modernish to add its own functionality to it. The DO...DONE block does behave in the same way as do...done: you can append redirections at the end, pipe commands into a loop, etc. as usual. The break and continue shell builtin commands also work as normal.

Remember: using lowercase do...done with modernish LOOP will cause the shell to throw a misleading syntax error. So will using uppercase DO...DONE with the shell's native loops. To help you remember to use the uppercase variants for modernish loops, the LOOP keyword itself is also in capitals.

Loops exist in submodules of var/loop named after the loop type; for instance, the find loop lives in the var/loop/find module. However, the core var/loop module will automatically load a loop type's module when that loop is first used, so use-ing individual loop submodules at your script's startup time is optional.

The LOOP block internally uses file descriptor 8 to do its thing. If your script happens to use FD 8 for other purposes, you should know that FD 8 is made local to each loop block, and always appears initially closed within DO...DONE.

Simple repeat loop

This simply iterates the loop the number of times indicated. Before the first iteration, the argument is evaluated as a shell integer arithmetic expression as in let and its value used as the number of iterations.

LOOP repeat 3; DO
	putln "This line is repeated 3 times."
DONE

BASIC-style arithmetic for loop

This is a slightly enhanced version of the FOR loop in BASIC. It is more versatile than the repeat loop but still very easy to use.

LOOP for varname=initial to limit [ step increment ]; DO
	some commands
DONE

To count from 1 to 20 in steps of 2:

LOOP for i=1 to 20 step 2; DO
	putln "$i"
DONE

Note the varname=initial needs to be one argument as in a shell assignment (so no spaces around the =).

If "step increment" is omitted, increment defaults to 1 if limit is equal to or greater than initial, or to -1 if limit is less than initial (so counting backwards 'just works').

Technically precise description: On entry, the initial, limit and increment values are evaluated once as shell arithmetic expressions as in let, the value of initial is assigned to varname, and the loop iterates. Before every subsequent iteration, the value of increment (as determined on the first iteration) is added to the value of varname, then the limit expression is re-evaluated; as long as the current value of varname is less (if increment is non-negative) or greater (if increment is negative) than or equal to the current value of limit, the loop reiterates.
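
For comparison, the count from 1 to 20 in steps of 2 can be written as a native while loop in plain POSIX shell. This is only a sketch of the semantics described above, not modernish's implementation. It assumes that the limit check also applies before the first iteration, which the description does not spell out; the variable names limit, increment and out are illustrative.

```shell
# Plain POSIX shell sketch of: LOOP for i=1 to 20 step 2
# Assumption: the limit is also checked before the first iteration.
i=$((1))                # initial, evaluated once
increment=$((2))        # increment, evaluated once
out=''
while
	limit=$((20))       # the limit expression is re-evaluated every time
	if [ "$increment" -ge 0 ]; then
		[ "$i" -le "$limit" ]     # counting up: continue while i <= limit
	else
		[ "$i" -ge "$limit" ]     # counting down: continue while i >= limit
	fi
do
	out="$out$i "
	i=$((i + increment))
done
echo "$out"
```

The modernish form expresses the same intent in one line and picks the direction of the comparison for you.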

C-style arithmetic for loop

A C-style for loop akin to for (( )) in ksh93, bash and zsh is now available on all POSIX-compliant shells, with a slightly different syntax. The one loop argument contains three arithmetic expressions (as in let), separated by semicolons within that argument. The first is only evaluated before the first iteration, so is typically used to assign an initial value. The second is evaluated before each iteration to check whether to continue the loop, so it typically contains some comparison operator. The third is evaluated before the second and further iterations, and typically increases or decreases a value. For example, to count from 1 to 10:

LOOP for "i=1; i<=10; i+=1"; DO
	putln "$i"
DONE

However, using complex expressions allows doing much more powerful things. Any or all of the three expressions may also be left empty (with their separating ; character remaining). If the second expression is empty, it defaults to 1, creating an infinite loop.

(Note that ++i and i++ can only be used on shells with ARITHPP, but i+=1 or i=i+1 can be used on all POSIX-compliant shells.)
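
The role of the three expressions can be made explicit by writing the count from 1 to 10 as a native while loop with POSIX arithmetic expansion. This is an illustrative sketch of the evaluation order described above, not modernish's implementation; the variable out is only there to collect the output.

```shell
# Plain POSIX shell sketch of: LOOP for "i=1; i<=10; i+=1"
out=''
: $((i = 1))                          # first expression: once, before the loop
while [ "$((i <= 10))" -ne 0 ]; do    # second expression: before each iteration
	out="$out$i "
	: $((i += 1))                     # third expression: before the next check
done
echo "$out"
```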

Enumerative for/select loop with safe split/glob

The enumerative for and select loop types mirror those already present in native shell implementations. However, the modernish versions provide safe field splitting and globbing (pathname expansion) functionality that can be used without globally enabling split or glob for any of your code – ideal for the safe mode. They also add a unique operator for processing text in fixed-size slices. The select loop type brings select functionality to all POSIX shells and not just ksh, zsh and bash.

Usage:

LOOP [ for | select ] [ operators ] varname in argument ... ; DO commands ; DONE

Simple usage example:

LOOP select --glob textfile in *.txt; DO
	putln "You chose text file $textfile."
DONE

If the loop type is for, the loop iterates once for each argument, storing it in the variable named varname.

If the loop type is select, the loop presents before each iteration a numbered menu that allows the user to select one of the arguments. The prompt from the PS3 variable is displayed and a reply read from standard input. The literal reply is stored in the REPLY variable. If the reply was a number corresponding to an argument in the menu, that argument is stored in the variable named varname. Then the loop iterates. If the user enters ^D (end of file), REPLY is cleared and the loop breaks with an exit status of 1. (To break the menu loop under other conditions, use the break command.)

The following operators are supported. Note that the split and glob operators are only for use in the safe mode.

  • One of --split or --split=characters. This operator safely applies the shell's field splitting mechanism to the arguments given. The simple --split operator applies the shell's default field splitting by space, tab, and newline. If you supply one or more of your own characters to split by, each of these characters will be taken as a field separator if it is whitespace, or field terminator if it is non-whitespace. (Note that shells with QRK_IFSFINAL treat both whitespace and non-whitespace characters as separators.)
  • One of --glob or --fglob. These operators safely apply shell pathname expansion (globbing) to the arguments given. Each argument is taken as a pattern, whether or not it contains any wildcard characters. For any resulting pathname that starts with - or + or is identical to ! or (, ./ is prefixed to keep various commands from misparsing it as an option or operand. Non-matching patterns are treated as follows:
    • --glob: Any non-matching patterns are quietly removed. If none match, the loop will not iterate but break with exit status 103.
    • --fglob: All patterns must match. Any nonexistent path terminates the program. Use this if your program would not work after a non-match.
  • --base=string. This operator prefixes the given string to each of the arguments, after first applying field splitting and/or pathname expansion if specified. If --glob or --fglob are given, then the string is used as a base directory path for pathname expansion, without expanding any wildcard characters in that base directory path itself. If such base directory can't be entered, then if --glob was given, the loop breaks with status 98, or if --fglob was given, the program terminates.
  • One of --slice or --slice=number. This operator divides the arguments in slices of up to number characters. The default slice size is 1 character, allowing for easy character-by-character processing. (Note that shells with WRN_MULTIBYTE will not slice multi-byte characters correctly.)

If multiple operators are given, their mechanisms are applied in the following order: split, glob, base, slice.
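
The difference between a field separator and a field terminator, mentioned for --split=characters, comes from the shell's own IFS rules. The following plain POSIX shell sketch shows it with native field splitting; it does not use modernish, and the helper count_fields and the variable names are hypothetical.

```shell
# Plain POSIX shell, no modernish required. Names are hypothetical.
count_fields() {
	# Split "$2" by the characters in "$1" with globbing disabled, inside a
	# subshell so that the IFS and -f settings do not leak.
	( IFS=$1; set -f; set -- $2; echo "$#" )
}

# Non-whitespace acts as a terminator: the final colon ends the field "b"
# and does not start an empty third field (2 fields on shells without
# QRK_IFSFINAL).
colon_fields=$(count_fields ':' 'a:b:')

# Two adjacent non-whitespace characters do delimit an empty field.
colon_empty=$(count_fields ':' 'a::b')

# Whitespace acts as a separator: leading, trailing and repeated spaces
# never produce empty fields.
space_fields=$(count_fields ' ' ' a  b ')

echo "a:b: -> $colon_fields, a::b -> $colon_empty, ' a  b ' -> $space_fields"
```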

The find loop

This powerful loop type turns your local POSIX-compliant find utility into a shell loop, safely integrating both find and xargs functionality into the POSIX shell. The infamous pitfalls and limitations of using find and xargs as external commands are gone, as all the results from find are readily available to your main shell script. Any "dangerous" characters in file names (including whitespace and even newlines) "just work", especially if the safe mode is also active. This gives you the flexibility to use either the find expression syntax, or shell commands (including your own shell functions), or some combination of both, to decide whether and how to handle each file found.

Usage:

LOOP find [ options ] varname [ in path ... ] [ find-expression ] ; DO commands ; DONE

LOOP find [ options ] --xargs[=arrayname] [ in path ... ] [ find-expression ] ; DO commands ; DONE

The loop recursively walks down the directory tree for each path given. For each file encountered, it uses the find-expression to decide whether to iterate the loop with the path to the file stored in the variable referenced by varname. The find-expression is a standard find utility expression except as described below.

Any number of paths to search may be specified after the in keyword. By default, a nonexistent path is a fatal error. The entire in clause may be omitted, in which case it defaults to in . so the current working directory will be searched. Any argument that starts with a -, or is identical to ! or (, indicates the end of the paths and the beginning of the find-expression; if you need to explicitly specify a path with such a name, prefix ./ to it.

Except for syntax errors, any errors or warnings issued by find are considered non-fatal and will cause the exit status of the loop to be non-zero, so your script has the opportunity to handle the exception.

Available options

  • Any single-letter options supported by your local find utility. Note that POSIX specifies -H and -L only, so portable scripts should only use these. Options that require arguments (-f on BSD find) are not supported.
  • --xargs. This operator is specified instead of the varname; it is a syntax error to have both. Instead of one iteration per found item, as many items as possible per iteration are stored into the positional parameters (PPs), so your program can access them in the usual way (using "$@" and friends). Note that --xargs therefore overwrites the current PPs (however, a shell function or LOCAL block will give you local PPs). Modernish clears the PPs upon completion of the loop, but if the loop is exited prematurely (such as by break), the last chunk survives.
    • On shells with the KSHARRAY capability, an extra variant is available: --xargs=arrayname which uses the named array instead of the PPs. It otherwise works identically.
  • --try. If this option is specified, then if one of the primaries used in the find-expression is not supported by either the find utility used by the loop or by modernish itself, LOOP find will not throw a fatal error but will instead quietly abort the loop without iterating it, set the loop's exit status to 128, and leave the invalid primary in the REPLY variable. (Expression errors other than 'unknown primary' remain fatal errors.)
  • One of --split or --split=characters. This operator, which is only accepted in the safe mode, safely applies the shell's field splitting mechanism to the pathname(s) given (but not to any patterns in the find-expression, which are passed on to the find utility as given). The simple --split operator applies the shell's default field splitting by space, tab, and newline. Alternatively, you can supply one or more characters to split by. If any pathname resulting from the split starts with - or + or is identical to ! or (, ./ is prefixed.
  • One of --glob or --fglob. These operators are only accepted in the safe mode. They safely apply shell pathname expansion (globbing) to the pathname(s) given (but not to any patterns in the find-expression, which are passed on to the find utility as given). All pathnames are taken as patterns, whether or not they contain any wildcard characters. If any pathname resulting from the expansion starts with - or + or is identical to ! or (, ./ is prefixed. Non-matching patterns are treated as follows:
    • --glob: Any pattern not matching an existing path will output a warning to standard error and set the loop's exit status to 103 upon normal completion, even if other existing paths are processed successfully. If none match, the loop will not iterate.
    • --fglob: Any pattern not matching an existing path is a fatal error.
  • --base=basedirectory. This operator prefixes the given basedirectory to each of the pathnames (and thus to each path found by find), after first applying field splitting and/or pathname expansion if specified. If --glob or --fglob are given, then wildcard characters are only expanded in the pathnames and not in the prefixed basedirectory. If the basedirectory can't be entered, then either the loop breaks with status 98, or if --fglob was given, the program terminates.
Availablefind-expressionoperands

LOOP findcan use all expression operands supported by your localfind utility; see its manual page. However, portable scripts should use only operands specified by POSIX along with the modernish additions described below.

The modernish-iterateexpression primary evaluates as true and causes the loop to iterate, executing yourcommandsfor each matching file. It may be used any number of times in thefind-expressionto start a corresponding series of loop iterations. If it is not given, the loop acts as if the entire find-expressionis enclosed in parentheses with-iterateappended. If the entirefind-expressionis omitted, it defaults to-iterate.

The modernish-askprimary asks confirmation of the user. The text of the prompt may be specified in one optional argument (which cannot start with- or be equal to!or(). Any occurrences of the characters{}within the prompt text are replaced with the current pathname. If not specified, the default prompt is:"{}"?If the answer is affirmative (yorYin the POSIX locale),-askyields true, otherwise false. This can be used to make any part of the expression conditional upon user input, and (unlike commands in the shell loop body) is capable of influencing directory traversal mid-run.

The standard -exec and -ok primaries are integrated into the main shell environment. When used with LOOP find, they can call a shell builtin command or your own shell function directly in the main shell (no subshell). The command's exit status is used in the find expression as a true/false value capable of influencing directory traversal (for example, when combined with -prune), just as if it were an external command -exec'ed with the standard utility.

Some familiar, easy-to-use but non-standard find operands from GNU and/or BSD may be used with LOOP find on all systems. Before invoking the find utility, modernish translates them internally to portable equivalents. The following expression operands are made portable:

  • The -or, -and and -not operators: same as -o, -a, !.
  • The -true and -false primaries, which always yield true/false.
  • The BSD-style -depth n primary, e.g. -depth +4 yields true on depth greater than 4 (minimum 5), -depth -4 yields true on depth less than 4 (maximum 3), and -depth 4 yields true on a depth of exactly 4.
  • The GNU-style -mindepth and -maxdepth global options. Unlike BSD -depth, these GNU-isms are pseudo-primaries that always yield true and affect the entire LOOP find operation.

Expression primaries that write output (-print and friends) may be used for debugging or logging the loop. Their output is redirected to standard error.

Picking a find utility

Upon initialisation, the var/loop/find module searches for a POSIX-compliant find utility under various names in $DEFPATH and then in $PATH. To see a trace of the full command lines of utility invocations when the loop runs, set the _loop_DEBUG variable to any value.

For debugging or system-specific usage, it is possible to use a certain find utility in preference to any others on the system. To do this, add an argument to a use var/loop/find command before the first use of the loop. For example:

  • use var/loop/find bsdfind (prefer utility by this name)
  • use var/loop/find /opt/local/bin (look for a utility here first)
  • use var/loop/find /opt/local/bin/gfind (try this one first)
Compatibility mode for obsolete find utilities

Some systems come with obsolete or broken find utilities that don't fully support the -exec ... {} + aggregating functionality as specified by POSIX. Normally, this is a fatal error, but passing the -b/-B option to the use command, e.g. use var/loop/find -b, enables a compatibility mode that tolerates this defect. If no compliant find is found, then an obsolete or broken find is used as a last resort, a warning is printed to standard error, and the variable _loop_find_broken is set. The -B option is equivalent to -b but does not print a warning. Loop performance may suffer as modernish adapts to using the older -exec ... {} \; form, which is very inefficient.

Scripts using this compatibility mode should handle their logic using shell code in the loop body as much as possible (after DO) and use only simple find expressions (before DO), as obsolete utilities are often buggy and breakage is likely if complex expressions or advanced features are used.
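
The performance difference between the two -exec forms can be seen with the standard find utility alone. The following is a minimal sketch in plain POSIX shell, not modernish code; the function name count_invocations and the scratch directory name are made up for this illustration. It counts how many times find launches the command for three files.

```sh
#!/bin/sh
# Count how often find runs the -exec'ed command for three files.
count_invocations() {
    # $1 is the -exec terminator to test: '+' (aggregate) or ';' (one per file)
    demo_dir=${TMPDIR:-/tmp}/execdemo.$$
    mkdir "$demo_dir" || return 1
    : > "$demo_dir/a"; : > "$demo_dir/b"; : > "$demo_dir/c"
    # Each invocation of the inner sh prints one line, so counting lines
    # counts invocations.
    find "$demo_dir" -type f -exec sh -c 'echo run' sh {} "$1" | wc -l | tr -d ' '
    rm -r "$demo_dir"
}

printf 'with {} +:  %s invocation(s)\n' "$(count_invocations +)"
printf 'with {} \\;: %s invocation(s)\n' "$(count_invocations ';')"
```

With a compliant find, the first form starts a single process for all three files, while the second starts one process per file.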

find loop usage examples

Simple example script: without the safe mode, the *.txt pattern must be quoted to prevent it from being expanded by the shell.

. modernish
use var/loop
LOOP find TextFile in ~/Documents -name '*.txt'
DO
    putln "Found my text file: $TextFile"
DONE

Example script with safe mode: the --glob option expands the patterns of the in clause, but not the expression, so it is not necessary to quote any pattern.

. modernish
use safe
use var/loop
LOOP find --glob lsProg in /*bin /*/*bin -type f -name ls*
DO
    putln "This command may list something: $lsProg"
DONE

Example use of the modernish -ask primary: ask the user if they want to descend into each directory found. The shell loop body could skip unwanted results, but cannot physically influence directory traversal, so skipping large directories would take long. A find expression can prevent directory traversal using the standard -prune primary, which can be combined with -ask, so that unwanted directories never iterate the loop in the first place.

. modernish
use safe
use var/loop
LOOP find file in ~/Documents \
    -type d \( -ask 'Descend into "{}" directory?' -or -prune \) \
    -or -iterate
DO
    put "File found: "
    ls -li $file
DONE

Creating your own loop

The modernish loop construct is extensible. To define a new loop type, you only need to define a shell function called _loopgen_type where type is the loop type. This function, called the loop iteration generator, is expected to output lines of text to file descriptor 8, containing properly shell-quoted iteration commands for the shell to run, one line per iteration.

The internal commands expanded from LOOP, DO and DONE (which are defined as aliases) launch that loop iteration generator function in the background with safe mode enabled, while causing the main shell to read lines from that background process through a pipe, running each line as a command with eval before iterating the loop. As long as that iteration command finishes with an exit status of zero, the loop keeps iterating. If it has a nonzero exit status or if there are no more commands to read, iteration terminates and execution continues beyond the loop.

Instead of the normal internal namespace which is considered off-limits for modernish scripts, var/loop and its submodules use a _loop_* internal namespace for variables, which is also for use by user-implemented loop iteration generator functions.

The above is just the general principle. For the details, study the comments and the code in lib/modernish/mdl/var/loop.mm and the loop generators in lib/modernish/mdl/var/loop/*.mm.

use var/local

This module defines a new LOCAL ... BEGIN ... END shell code block construct with local variables, local positional parameters and local shell options. The local positional parameters can be filled using safe field splitting and pathname expansion operators similar to those in the LOOP construct described above.

Usage: LOCAL [ localitem | operator ... ] [ -- [ word ... ] ]; BEGIN commands; END

The commands are executed once, with the specified localitems applied. Each localitem can be:

  • A variable name with or without a = immediately followed by a value. This renders that variable local to the block, initially either unsetting it or assigning the value, which may be empty.
  • A shell option letter immediately preceded by a - or + sign. This locally turns that shell option on or off, respectively. This follows the counterintuitive syntax of set. Long-form shell options like -o optionname and +o optionname are also supported. It depends on the shell what options are supported. Specifying a nonexistent option is a fatal error. Use thisshellhas to check for a non-POSIX option's existence on the current shell before using it.

Modernish implements LOCAL blocks as one-time shell functions that use the stack to save and restore variables and settings. So the return command exits the block, causing the global variables and settings to be restored and resuming execution at the point immediately following END. Like any shell function, a LOCAL block exits with the exit status of the last command executed within it, or with the status passed on by or given as an argument to return.

The positional parameters ($@, $1, etc.) are always local to the block, but a copy is inherited from outside the block by default. Any changes to the positional parameters made within the block will be discarded upon exiting it.

However, if a double-dash -- argument is given in the LOCAL command line, the positional parameters outside the block are ignored and the set of words after -- (which may be empty) becomes the positional parameters instead.

These words can be modified prior to entering the LOCAL block using the following operators. The safe glob and split operators are only accepted in the safe mode. The operators are:

  • One of --split or --split=characters. This operator safely applies the shell's field splitting mechanism to the words given. The simple --split operator applies the shell's default field splitting by space, tab, and newline. If you supply one or more of your own characters to split by, each of these characters will be taken as a field separator if it is whitespace, or field terminator if it is non-whitespace. (Note that shells with QRK_IFSFINAL treat both whitespace and non-whitespace characters as separators.)
  • One of --glob or --fglob. These operators safely apply shell pathname expansion (globbing) to the words given. Each word is taken as a pattern, whether or not it contains any wildcard characters. For any resulting pathname that starts with - or + or is identical to ! or (, the prefix ./ is added to keep various commands from misparsing it as an option or operand. Non-matching patterns are treated as follows:
    • --glob: Any non-matching patterns are quietly removed.
    • --fglob: All patterns must match. Any nonexistent path terminates the program. Use this if your program would not work after a non-match.
  • --base=string. This operator prefixes the given string to each of the words, after first applying field splitting and/or pathname expansion if specified. If --glob or --fglob is given, then the string is used as a base directory path for pathname expansion, without expanding any wildcard characters in that base directory path itself. If that base directory can't be entered, then if --glob was given, all words are removed, or if --fglob was given, the program terminates.
  • One of --slice or --slice=number. This operator divides the words in slices of up to number characters. The default slice size is 1 character, allowing for easy character-by-character processing. (Note that shells with WRN_MULTIBYTE will not slice multi-byte characters correctly.)

If multiple operators are given, their mechanisms are applied in the following order: split, glob, base, slice.
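
The difference between whitespace and non-whitespace split characters comes from the shell's own field splitting rules, which --split builds on. The following is a minimal sketch in plain POSIX shell, not modernish code; the helper name count_fields is made up for this illustration. It shows that a run of whitespace counts as one delimiter, while every single non-whitespace character delimits a field, so two adjacent colons enclose an empty field.

```sh
#!/bin/sh
count_fields() {
    # $1 = split characters, $2 = string to split
    (
        set -f              # disable globbing so that only splitting happens
        IFS=$1
        set -- $2           # unquoted on purpose: this triggers field splitting
        echo "$#"
    )
}

count_fields ' ' 'a  b'     # prints 2: the two spaces act as one delimiter
count_fields ':' 'a::b'     # prints 3: the middle field is empty
```

The subshell keeps the changes to IFS and the noglob option from leaking into the rest of the script, which is the kind of bookkeeping the LOCAL block performs for you.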

Important var/local usage notes

  • Due to the limitations of aliases and shell reserved words, LOCAL has to use its own BEGIN ... END block instead of the shell's do ... done. Using the latter results in a misleading shell syntax error.
  • LOCAL blocks do not mix well with use of the shell capability LOCALVARS (shell-native functionality for local variables), especially not on shells with QRK_LOCALUNS or QRK_LOCALUNS2. Using both with the same variables causes unpredictable behaviour, depending on the shell.
  • Warning! Never use break or continue within a LOCAL block to resume or break from enclosing loops outside the block! Shells with QRK_BCDANGER allow this, preventing END from restoring the global settings and corrupting the stack; shells without this quirk will throw an error if you try this. A proper way to do what you want is to exit the block with a nonzero status using something like return 1, then append something like || break or || continue to END. Note that this caveat only applies when crossing BEGIN ... END boundaries. Using continue and break to continue or break loops entirely within the block is fine.

use var/arith

These shortcut functions are alternatives for using let.

Arithmetic operator shortcuts

inc, dec, mult, div, mod: simple integer arithmetic shortcuts. The first argument is a variable name. The optional second argument is an arithmetic expression, but a sane default value is assumed (1 for inc and dec, 2 for mult and div, 256 for mod). For instance, inc X is equivalent to X=$((X+1)) and mult X Y-2 is equivalent to X=$((X*(Y-2))).

ndiv is like div but with correct rounding down for negative numbers. Standard shell integer division simply chops off any digits after the decimal point, which has the effect of rounding down for positive numbers and rounding up for negative numbers. ndiv consistently rounds down.
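
The rounding difference can be reproduced with plain shell arithmetic. The following is a minimal sketch in POSIX shell, not the modernish implementation; the function name floor_div is made up for this illustration, and unlike ndiv it prints its result instead of storing it in a variable.

```sh
#!/bin/sh
floor_div() {
    # $1 = dividend, $2 = divisor (plain integers)
    q=$(( $1 / $2 ))
    # If the division was inexact and the operands have opposite signs,
    # truncation rounded toward zero, so step one further down.
    if [ $(( $1 % $2 )) -ne 0 ] && [ $(( ($1 < 0) != ($2 < 0) )) -eq 1 ]; then
        q=$(( q - 1 ))
    fi
    echo "$q"
}

echo "shell division: $(( -7 / 2 ))"        # -3, truncated toward zero
echo "floor division: $(floor_div -7 2)"    # -4, rounded down
```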

Arithmetic comparison shortcuts

These have the same name as their test/[ option equivalents. Unlike with test, the arguments are shell integer arith expressions, which can be anything from simple numbers to complex expressions. As with $(( )), variable names are expanded to their values even without the $.

Function:         Returns successfully if:
eq <expr> <expr>  the two expressions evaluate to the same number
ne <expr> <expr>  the two expressions evaluate to different numbers
lt <expr> <expr>  the 1st expr evaluates to a smaller number than the 2nd
le <expr> <expr>  the 1st expr evaluates to a number smaller than or equal to the 2nd
gt <expr> <expr>  the 1st expr evaluates to a greater number than the 2nd
ge <expr> <expr>  the 1st expr evaluates to a number greater than or equal to the 2nd
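
To illustrate what it means for the arguments to be arithmetic expressions rather than plain numbers, here is a minimal sketch in plain POSIX shell. It is not the modernish implementation; lt_sketch is a made-up name, and how modernish defines lt internally is not assumed here.

```sh
#!/bin/sh
lt_sketch() {
    # Succeeds if arithmetic expression $1 evaluates to less than expression $2.
    # Both arguments are evaluated by the shell, so never pass untrusted input.
    [ "$(( ($1) < ($2) ))" -eq 1 ]
}

x=5
lt_sketch x 'x * 2'     && echo "x is less than x * 2"
lt_sketch 'x + 10' 12   || echo "x + 10 is not less than 12"
```

With test, the same comparisons would need the expressions to be computed first, as in [ "$x" -lt "$((x * 2))" ].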

use var/assign

This module is provided to solve a common POSIX shell language annoyance: in a normal shell variable assignment, only literal variable names are accepted, so it is impossible to use a variable whose name is stored in another variable. The only way around this is to use eval which is too difficult to use safely. Instead, you can now use the assign command.

Usage: assign [ [ +r ] variable=value ... ] | [ -r variable=variable2 ... ] ...

assign safely processes assignment-arguments in the same form as customarily given to the readonly and export commands, but it only assigns values to variables without setting any attributes. Each argument is grammatically an ordinary shell word, so any part or all of it may result from an expansion. The absence of a = character in any argument is a fatal error. The text preceding the first = is taken as the variable name in which to store the value; an invalid variable name is a fatal error. No whitespace is accepted before the = and any whitespace after the = is part of the value to be assigned.

The -r (reference) option causes the part to the right of the = to be taken as a second variable name variable2, and its value is assigned to variable instead. +r turns this option back off.

Examples: Each of the lines below assigns the value 'hello world' to the variable greeting.

var=greeting; assign $var='hello world'
var=greeting; assign "$var=hello world"
tag='greeting=hello world'; assign "$tag"
var=greeting; gvar=myinput; myinput='hello world'; assign -r $var=$gvar
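
For comparison, this is roughly what a script has to do by hand to make eval safe for a single indirect assignment. It is a minimal sketch in plain POSIX shell, not the modernish implementation; assign_sketch is a made-up name, it handles one argument only, and it returns status 2 on error where assign treats errors as fatal.

```sh
#!/bin/sh
assign_sketch() {
    # $1 has the form name=value
    case $1 in
    *=*) ;;
    *)  echo "assign_sketch: no '=' in argument" >&2; return 2 ;;
    esac
    _as_name=${1%%=*}       # text before the first =
    _as_value=${1#*=}       # everything after the first =
    # Reject anything that is not a portable variable name before it reaches eval.
    case $_as_name in
    '' | [0-9]* | *[!A-Za-z0-9_]*)
        echo "assign_sketch: invalid variable name" >&2; return 2 ;;
    esac
    # Only the validated name is expanded into the eval'ed text; the value stays
    # a variable reference, so it is never parsed as shell code.
    eval "$_as_name=\$_as_value"
}

var=greeting
assign_sketch "$var=hello world"
echo "$greeting"            # prints: hello world
```

Forgetting either the name check or the escaped \$ turns the eval into a code injection hole, which is the difficulty the paragraph above refers to.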

use var/readf

readf reads arbitrary data from standard input into a variable until end of file, converting it into a format suitable for passing to the printf utility. For example, readf var <foo; printf "$var" >bar will copy foo to bar. Thus, readf allows storing both text and binary files into shell variables in a textual format suitable for manipulation with standard shell facilities.

All non-printable, non-ASCII characters are converted to printf octal or one-letter escape codes, except newlines. Not encoding newline characters allows for better processing by line-based utilities such as grep, sed, awk, etc. However, if the file ends in a newline, that final newline is encoded to \n to protect it from being stripped by command substitutions.
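
The effect of such an encoding can be seen with printf alone. The following is a minimal sketch in plain POSIX shell that does not use readf; the encoded string is written by hand for this illustration and only approximates what readf might store.

```sh
#!/bin/sh
# A hand-encoded value: TAB as octal \011, the final newline as \n.
encoded='name\011value\n'

# Raw data passed through a command substitution loses its final newline...
raw_copy=$(printf 'name\tvalue\n')
# ...but the encoded text passes through unchanged.
enc_copy=$(printf '%s' "$encoded")

# Decode by using the value as a printf format string. A sentinel 'x' is
# appended and removed again so the decoded newline survives the substitution.
decoded=$(printf "$encoded"; echo x)
decoded=${decoded%x}

printf 'raw copy: %s chars, decoded: %s chars\n' "${#raw_copy}" "${#decoded}"
# prints: raw copy: 10 chars, decoded: 11 chars
```

Because the value is used as a format string, any literal % in the data has to be stored as %% for the round trip to work.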

Usage: readf [ -h ] varname

The -h option disables conversion of high-byte characters (accented letters, non-Latin scripts). Do not use for binary files; this is only guaranteed to work for text files in an encoding compatible with the current locale.

Caveats:

  • Best for small-ish files. The encoded file is stored in memory (a shell variable). For a binary file, encoding in printf format typically about doubles the size, though it could be up to four times as large.
  • If the shell executing your program does not have printf as a builtin command, the external printf command will fail if the encoded file size exceeds the maximum length of arguments to external commands (getconf ARG_MAX will obtain this limit for your system). Shell builtin commands do not have this limit. Check for a printf builtin using thisshellhas if you need to be sure, and always harden printf!

use var/shellquote

This module provides an efficient, fast, safe and portable shell-quoting algorithm for quoting arbitrary data in such a way that the quoted values are safe to pass to the shell for parsing as string literals. This is essential for any context where the shell must grammatically parse untrusted input, such as when supplying arbitrary values to trap or eval.

The shell-quoting algorithm is optimised to minimise exponential growth when quoting repeatedly. By default, it also ensures that quoted strings are always one single printable line, making them safe for terminal output and processing by line-oriented utilities.

shellquote

Usage: shellquote [ -f | +f | -P | +P ] varname[=value] ...

The values of the variables specified by name are shell-quoted and stored back into those variables. Repeating a variable name will add another level of shell-quoting. If a = plus a value (which may be empty) is appended to the varname, that value is shell-quoted and assigned to the variable.

Options modify the algorithm for variable names following them, as follows:

  • By default, newlines and any control characters are converted into ${CC*} expansions and quoted with double quotes, ensuring that the quoted string consists of a single line of printable text. The -P option forces pure POSIX quoted strings that may span multiple lines; +P turns this back off.

  • By default, a value is only quoted if it contains characters not present in $SHELLSAFECHARS. The -f option forces unconditional quoting, disabling optimisations that may leave shell-safe characters unquoted; +f turns this back off.

shellquote will die if you attempt to quote an unset variable (because there is no value to quote).
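
For readers unfamiliar with shell-quoting, the basic idea can be sketched in plain POSIX shell using the classic single-quote method. This is not the modernish algorithm: the made-up function quote_sketch always quotes, may produce multi-line output, and makes no attempt to limit growth on repeated quoting.

```sh
#!/bin/sh
quote_sketch() {
    # Wrap $1 in single quotes, rewriting each embedded ' as '\''
    printf '%s\n' "$1" |
        sed -e "s/'/'\\\\''/g" -e "1s/^/'/" -e "\$s/\$/'/"
}

danger="it's \$(echo hacked); rm -rf \"*\""
quoted=$(quote_sketch "$danger")
printf '%s\n' "$quoted"

# The quoted form can be handed to eval and comes back as the original string.
eval "restored=$quoted"
[ "$restored" = "$danger" ] && echo "round trip OK"
```

Inside single quotes the shell treats every character literally, so the command substitution and the semicolon in the sample value are never executed.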

shellquoteparams

The shellquoteparams command shell-quotes the current positional parameters in place using the default quoting method of shellquote. No options are supported and any attempt to add arguments results in a syntax error.

use var/stack

Modules that extend the stack.

use var/stack/extra

This module contains stack query and maintenance functions.

If you only need one or two of these functions, they can also be loaded as individual submodules of var/stack/extra.

For the four functions below,itemcan be:

  • a valid portable variable name
  • a short-form shell option: dash plus letter
  • a long-form shell option: -o followed by an option name (two arguments)
  • --trap=SIGNAME to refer to the trap stack for the indicated signal (as set by pushtrap from var/stack/trap)

stackempty [ --key=value ] [ --force ] item: Tests if the stack for an item is empty. Returns status 0 if it is, 1 if it is not. The key feature works as in pop: by default, a key mismatch is considered equivalent to an empty stack. If --force is given, this function ignores keys altogether.

clearstack [ --key=value ] [ --force ] item [ item ... ]: Clears one or more stacks, discarding all items on them. If (part of) the stack is keyed or a --key is given, only clears until a key mismatch is encountered. The --force option overrides this and always clears the entire stack (be careful, e.g. don't use within LOCAL ... BEGIN ... END). Returns status 0 on success, 1 if that stack was already empty, 2 if there was nothing to clear due to a key mismatch.

stacksize [ --silent | --quiet ] item: Leaves the size of a stack in the REPLY variable and, if option --silent or --quiet is not given, writes it to standard output. The size of the complete stack is returned, even if some values are keyed.

printstack [ --quote ] item: Outputs a stack's content. Option --quote shell-quotes each stack value before printing it, allowing for parsing multi-line or otherwise complicated values. Columns 1 to 7 of the output contain the number of the item (down to 0). If the item is set, columns 8 and 9 contain a colon and a space, and if the value is non-empty or quoted, columns 10 and up contain the value. Sets of values that were pushed with a key are started with a special line containing --- key: value. A subsequent set pushed with no key is started with a line containing --- (key off). Returns status 0 on success, 1 if that stack is empty.

use var/stack/trap

This module provides pushtrap and poptrap. These functions integrate with the main modernish stack to make traps stack-based, so that each program component or library module can set its own trap commands without interfering with others.

This module also provides a new DIE pseudosignal that allows pushing traps to execute when die is called.

Note an important difference between the trap stack and stacks for variables and shell options: pushing traps does not save them for restoring later, but adds them alongside other traps on the same signal. All pushed traps are active at the same time and are executed from last-pushed to first-pushed when the respective signal is triggered. Traps cannot be pushed and popped using push and pop but use dedicated commands as follows.

Usage:

  • pushtrap [ --key=value ] [ --nosubshell ] [ -- ] command sigspec [ sigspec ... ]
  • poptrap [ --key=value ] [ -R ] [ -- ] sigspec [ sigspec ... ]

pushtrap works like regular trap, with the following exceptions:

  • Adds traps for a signal without overwriting previous ones.
  • An invalid signal is a fatal error. When using non-standard signals, check if thisshellhas --sig=yoursignal before using it.
  • Unlike regular traps, a stack-based trap does not cause a signal to be ignored. Setting one will cause it to be executed upon the shell receiving that signal, but after the stack traps complete execution, modernish re-sends the signal to the main shell, causing it to behave as if no trap were set (unless a regular POSIX trap is also active). Thus, pushtrap does not accept an empty command as it would be pointless.
  • Each stack trap is executed in a new subshell to keep it from interfering with others. This means a stack trap cannot change variables except within its own environment, and exit will only exit the trap and not the program. The --nosubshell option overrides this behaviour, causing that particular trap to be executed in the main shell environment instead. This is not recommended if not absolutely needed, as you have to be extra careful to avoid exiting the shell or otherwise interfering with other stack traps. This option cannot be used with DIE traps.
  • Each stack trap is executed with $? initially set to the exit status that was active at the time the signal was triggered.
  • Stack traps do not have access to the positional parameters.
  • pushtrap stores the current $IFS (field splitting) and $- (shell options) along with the pushed trap. Within the subshell executing each stack trap, modernish restores IFS and the shell options f (noglob), u (nounset) and C (noclobber) to the values in effect during the corresponding pushtrap. This is to avoid unexpected effects in case a trap is triggered while temporary settings are in effect. The --nosubshell option disables this functionality for the trap pushed.
  • The --key option applies the keying functionality inherited from plain push to the trap stack. It works the same way, so the description is not repeated here.

poptrap takes just signal names or numbers as arguments. It takes the last-pushed trap for each signal off the stack. By default, it discards the trap commands. If the -R option is given, it stores commands to restore those traps into the REPLY variable, in a format suitable for re-entry into the shell. Again, the --key option works as in plain pop.
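
The two central ideas, last-pushed-first execution and one subshell per handler, can be illustrated without modernish. The following is a minimal sketch in plain POSIX shell; push_sketch and run_stack_sketch are made-up names, and signal handling, keys, and the saving of IFS and shell options are all left out. In a real script, a function like run_stack_sketch would be installed as the action of a trap command.

```sh
#!/bin/sh
trap_count=0

push_sketch() {
    # Store the command in a numbered variable: trap_cmd_1, trap_cmd_2, ...
    trap_count=$(( trap_count + 1 ))
    eval "trap_cmd_$trap_count=\$1"
}

run_stack_sketch() {
    i=$trap_count
    while [ "$i" -gt 0 ]; do
        eval "handler=\$trap_cmd_$i"
        # Own subshell per handler: it cannot change variables seen by the
        # others, and 'exit' only leaves the handler.
        ( eval "$handler" )
        i=$(( i - 1 ))
    done
}

push_sketch 'echo "pushed first"'
push_sketch 'echo "pushed last"; exit 1'
run_stack_sketch
```

This prints "pushed last" followed by "pushed first"; the exit in the second handler does not prevent the first one from running.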

With the sole exception of DIE traps, all stack-based traps, like native shell traps, are reset upon entering a subshell. However, commands for printing traps will print the traps for the parent shell, until another trap, pushtrap or poptrap command is invoked, at which point all memory of the parent shell's traps is erased.

Trap stack compatibility considerations

Modernish tries hard to avoid incompatibilities with existing trap practice. To that end, it intercepts the regular POSIX trap command using an alias, reimplementing and interfacing it with the shell's builtin trap facility so that plain old regular traps play nicely with the trap stack. You should not notice any changes in the POSIX trap command's behaviour, except for the following:

  • The regular trap command does not overwrite stack traps (but still overwrites existing regular traps).
  • Unlike zsh's native trap command, signal names are case insensitive.
  • Unlike dash's native trap command, signal names may have the SIG prefix; that prefix is quietly accepted and discarded.
  • Setting an empty trap action to ignore a signal only works fully (passing the ignoring on to child processes) if there are no stack traps associated with the signal; otherwise, an empty trap action merely suppresses the signal's default action for the current process (e.g., after executing the stack traps, it keeps the shell from exiting).
  • The trap command with no arguments, which prints the traps that are set in a format suitable for re-entry into the shell, now also prints the stack traps as pushtrap commands. (bash users might notice the SIG prefix is not included in the signal names written.)
  • The bash/yash-style -p option, including its yash-style --print equivalent, is now supported on all shells. If further arguments are given after that option, they are taken as signal specifications and only the commands to recreate the traps for those signals are printed.
  • Saving the traps to a variable using command substitution (as in: var=$(trap)) now works on every shell supported by modernish, including (d)ash, mksh and zsh which don't support this natively.
  • To reset (unset) a trap, the modernish trap command accepts both valid POSIX syntax and legacy bash/(d)ash/zsh syntax, like trap INT to unset a SIGINT trap (which only works if the trap command is given exactly one argument). Note that this is for compatibility with existing scripts only.
  • Bypassing the trap alias to set a trap using the shell builtin command will cause an inconsistent state. This may be repaired with a simple trap command; as modernish prints the traps, it will quietly detect ones it doesn't yet know about and make them work nicely with the trap stack.

POSIX traps for each signal are always executed after that signal's stack-based traps; this means they should not rely on modernish modules that use the trap stack to clean up after themselves on exit, as those cleanups would already have been done.

The new DIE pseudosignal

The var/stack/trap module adds a new DIE pseudosignal whose traps are executed upon invoking die. This allows for emergency cleanup operations upon fatal program failure, as EXIT traps cannot be executed after die is invoked.

  • On non-interactive shells (as well as subshells of interactive shells), DIE is its own pseudosignal with its own trap stack and POSIX trap. In order to kill the malfunctioning program as quickly as possible (hopefully before it has a chance to delete all your data), die doesn't wait for those traps to complete before killing the program. Instead, it executes each DIE trap simultaneously as a background job, then gathers the process IDs of the main shell and all its subprocesses, sending SIGKILL to all of them except any DIE trap processes. Unlike other traps, DIE traps are inherited by and survive in subshell processes, and pushtrap may add to them within the subshell. Whatever shell process invokes die will fork all DIE trap actions before being SIGKILLed itself. (Note that any DIE traps pushed or set within a subshell will still be forgotten upon exiting the subshell.)
  • On an interactive shell (not including its subshells), DIE is simply an alias for INT, and INT traps (both POSIX and stack) are cleared out after executing them once. This is because die uses SIGINT for command interruption on interactive shells, and it would not make sense to execute emergency cleanup commands repeatedly. As a side effect of this special handling, INT traps on interactive shells do not have access to the positional parameters and cannot return from functions.

use var/string

String comparison and manipulation functions.

use var/string/touplow

toupper and tolower: convert case in variables.

Usage:

  • toupper varname [ varname ... ]
  • tolower varname [ varname ... ]

Arguments are taken as variable names (note: they should be given without the $) and case is converted in the contents of the specified variables, without reading input or writing output.

toupper and tolower try hard to use the fastest available method on the particular shell your program is running on. They use built-in shell functionality where available and working correctly, otherwise they fall back on running an external utility.

Which external utility is chosen depends on whether the current locale uses the Unicode UTF-8 character set or not. For non-UTF-8 locales, modernish assumes the POSIX/C locale and tr is always used. For UTF-8 locales, modernish tries hard to find a way to correctly convert case even for non-Latin alphabets. A few shells have this functionality built in with typeset. The rest need an external utility. Modernish initialisation tries tr, awk, GNU awk and GNU sed before giving up and setting the variable MSH_2UP2LOW_NOUTF8. If isset MSH_2UP2LOW_NOUTF8, it means modernish is in a UTF-8 locale but has not found a way to convert case for non-ASCII characters, so toupper and tolower will convert only ASCII characters and leave any other characters in the string alone.

use var/string/trim

trim: strip whitespace from the beginning and end of a variable's value. Whitespace is defined by the [:space:] character class. In the POSIX locale, this is tab, newline, vertical tab, form feed, carriage return, and space, but in other locales it may be different. (On shells with BUG_NOCHCLASS, $WHITESPACE is used to define whitespace instead.) Optionally, a string of literal characters can be provided in the second argument. Any characters appearing in that string will then be trimmed instead of whitespace. Usage: trim varname [ characters ]
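
The whitespace case can be approximated with standard parameter expansions. The following is a minimal sketch in plain POSIX shell, not the modernish implementation; trim_sketch is a made-up name, it supports no second argument, and it assumes a shell whose patterns support the [:space:] character class.

```sh
#!/bin/sh
trim_sketch() {
    # $1 = name of the variable to trim in place
    case $1 in
    '' | [0-9]* | *[!A-Za-z0-9_]*)
        echo "trim_sketch: invalid variable name" >&2; return 2 ;;
    esac
    eval "_ts_val=\$$1"
    # Remove the longest leading run of whitespace...
    _ts_val=${_ts_val#"${_ts_val%%[![:space:]]*}"}
    # ...then the longest trailing run.
    _ts_val=${_ts_val%"${_ts_val##*[![:space:]]}"}
    eval "$1=\$_ts_val"
}

text='   hello world   '
trim_sketch text
echo "[$text]"              # prints: [hello world]
```

Whitespace inside the value is left alone; only the leading and trailing runs are removed.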

use var/string/replacein

replacein: Replace the leading, trailing (-t) or all (-a) occurrences of a string by another string in a variable.
Usage: replacein [ -t | -a ] varname oldstring newstring

use var/string/append

append and prepend: Append or prepend zero or more strings to a variable, separated by a string of zero or more characters, avoiding the hairy problem of dangling separators. Usage: append | prepend [ --sep=separator ] [ -Q ] varname [ string ... ]
If the separator is not specified, it defaults to a space character. If the -Q option is given, each string is shell-quoted before appending or prepending.
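
The dangling separator problem is easy to show in plain shell. The following is a minimal sketch in POSIX shell, not the modernish implementation; append_sketch is a made-up name that takes the separator as its second argument instead of a --sep option, and it treats an empty variable the same as an unset one, which may differ from what append does.

```sh
#!/bin/sh
append_sketch() {
    # $1 = variable name, $2 = separator, remaining arguments = strings to append
    case $1 in
    '' | [0-9]* | *[!A-Za-z0-9_]*)
        echo "append_sketch: invalid variable name" >&2; return 2 ;;
    esac
    _ap_name=$1
    _ap_sep=$2
    shift 2
    eval "_ap_val=\${$_ap_name-}"
    for _ap_str do
        # Insert the separator only if there is already something to separate from.
        _ap_val=${_ap_val:+$_ap_val$_ap_sep}$_ap_str
    done
    eval "$_ap_name=\$_ap_val"
}

list=''
append_sketch list , apple
append_sketch list , banana cherry
echo "$list"                # prints: apple,banana,cherry
```

The naive list="$list,apple" would have produced a leading comma on the first call.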

use var/unexport

The unexport function clears the "export" bit of a variable, conserving its value, and/or assigns values to variables without setting the export bit. This works even if set -a (allexport) is active, allowing an "export all variables, except these" way of working.

Usage is like export, with the caveat that variable assignment arguments containing non-shell-safe characters or expansions must be quoted as appropriate, unlike in some specific shell implementations of export. (To get rid of that headache, use safe.)

Unlike export, unexport does not work for read-only variables.

use var/genoptparser

As the getopts builtin is not portable when used in functions, this module provides a command that generates modernish code to parse options for your shell function in a standards-compliant manner. The generated parser supports short-form (one-character) options which can be stacked/combined.

Usage: generateoptionparser [ -o ] [ -f func ] [ -v varprefix ] [ -n options ] [ -a options ] [ varname ]

  • -o: Write parser to standard output.
  • -f: Function name to prefix to error messages. Default: none.
  • -v: Variable name prefix for options. Default: opt_.
  • -n: String of options that do not take arguments.
  • -a: String of options that require arguments.
  • varname: Store parser in specified variable. Default: REPLY.

At least one of -n and -a is required. All other arguments are optional. Option characters must be valid components of portable variable names, so they must be ASCII upper- or lowercase letters, digits, or the underscore.

generateoptionparserstores the generated parser code in a variable: either REPLYor thevarnamespecified as the first non-option argument. This makes it possible to generate and use the parser on the fly with a command like eval "$REPLY"immediately following thegenerateoptionparserinvocation.

For better efficiency and readability, it will often be preferable to insert the option parser code directly into your shell function instead. The-o option writes the parser code to standard output, so it can be redirected to a file, inserted into your editor, etc.

Parsed options are shifted out of the positional parameters while setting or unsetting corresponding variables, until a non-option argument, a-- end-of-options delimiter argument, or the end of arguments is encountered. Unlike withgetopts,no additionalshiftcommand is required.

Each specified option gets a corresponding variable with a name consisting of thevarprefix(default:opt_) plus the option character. If an option is not passed to your function, the parser unsets its variable; otherwise it sets it to either the empty value or its option-argument if it requires one. Thus, your function can check if any optionxwas given using isset, for example,if isset opt_x; then...
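To make the behaviour described above concrete, here is a minimal hand-written sketch in plain POSIX shell of a parser with the same observable semantics. It is not the code that generateoptionparser emits. The function name demo, its options (-a and -b without argument, -f with argument), the error messages and the helper variables _opts, _opt and _rest are all assumptions made for illustration.

```shell
# Hand-written illustration only; NOT the output of generateoptionparser.
# Options: -a, -b (no argument), -f (requires an argument).
demo() {
    unset -v opt_a opt_b opt_f
    while [ "$#" -gt 0 ]; do
        case $1 in
        (--)    shift; break ;;     # end-of-options delimiter
        (-?*)   ;;                  # one or more stacked options
        (*)     break ;;            # first non-option argument
        esac
        _opts=${1#-}
        shift
        while [ -n "$_opts" ]; do
            _rest=${_opts#?}            # all but the first character
            _opt=${_opts%"$_rest"}      # the first character
            _opts=$_rest
            case $_opt in
            (a) opt_a='' ;;
            (b) opt_b='' ;;
            (f) if [ -n "$_opts" ]; then
                    opt_f=$_opts; _opts=''      # argument stacked: -ffile
                elif [ "$#" -gt 0 ]; then
                    opt_f=$1; shift             # argument separate: -f file
                else
                    echo "demo: -f: option requires an argument" >&2
                    return 1
                fi ;;
            (*) echo "demo: invalid option: -$_opt" >&2
                return 1 ;;
            esac
        done
    done
    # The options are shifted out; "$@" now holds only the remaining arguments.
    echo "a=${opt_a+set} b=${opt_b+set} f=${opt_f-} args=$*"
}

demo -ab -f out.txt one two
```

An option that was given has its variable set (possibly to the empty value), and one that was not given has its variable unset, which is why the sketch tests with ${opt_a+set} rather than comparing against the empty string.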

use sys/base

Some very common and essential utilities are not specified by POSIX, differ widely among systems, and are not always available. For instance, the which and readlink commands have incompatible options on various GNU and BSD variants and may be absent on other Unix-like systems. The sys/base module provides a complete re-implementation of such non-standard but basic utilities, written as modernish shell functions. Using the modernish version of these utilities can help a script to be fully portable. These versions also have various enhancements over the GNU and BSD originals, some of which are made possible by their integration into the modernish shell environment.

use sys/base/mktemp

A cross-platform shell implementation of mktemp that aims to be just as safe as native mktemp(1) implementations, while avoiding the problem of having various mutually incompatible versions and adding several unique features of its own.

Creates one or more unique temporary files, directories or named pipes, atomically (i.e. avoiding race conditions) and with safe permissions. The path name(s) are stored in REPLY and optionally written to stdout.

Usage: mktemp [-dFsQCt] [template ...]

  • -d: Create a directory instead of a regular file.
  • -F: Create a FIFO (named pipe) instead of a regular file.
  • -s: Silent. Store output in $REPLY, don't write any output or message.
  • -Q: Shell-quote each unit of output. Separate by spaces, not newlines.
  • -C: Automated cleanup. Pushes a trap to remove the files on exit. On an interactive shell, that's all this option does. On a non-interactive shell, the following applies: Clean up on receiving SIGPIPE and SIGTERM as well. On receiving SIGINT, clean up if the option was given at least twice, otherwise notify the user of files left. On the invocation of die, clean up if the option was given at least three times, otherwise notify the user of files left.
  • -t: Prefix one temporary files directory to all the templates: $XDG_RUNTIME_DIR or $TMPDIR if set, or /tmp. The templates may not contain any slashes. If the template has neither any trailing Xes nor a trailing dot, a dot is added before the random suffix.

The template defaults to “/tmp/temp.”. A suffix of random shell-safe ASCII characters is added to the template to create the file. For compatibility with other mktemp implementations, any optional trailing X characters in the template are removed. The length of the suffix will be equal to the amount of Xes removed, or 10, whichever is more. The longer the random suffix, the higher the security of using mktemp in a shared directory such as /tmp.

Since /tmp is a world-writable directory shared by other users, for best security it is recommended to create a private subdirectory using mktemp -d and work within that.

Option -C cannot be used without option -s when in a subshell. Modernish will detect this and treat it as a fatal error. The reason is that a typical command substitution like tmpfile=$(mktemp -C) is incompatible with auto-cleanup, as the cleanup EXIT trap would be triggered not upon exiting the program but upon exiting the command substitution subshell that just ran mktemp, thereby immediately undoing the creation of the file. Instead, do something like: mktemp -sC; tmpfile=$REPLY
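The subshell problem can be reproduced without modernish. The following sketch assumes a native mktemp(1) is installed and uses a hypothetical function make_tmp as a stand-in for any mktemp-like command that sets a cleanup trap. It is not modernish code.

```shell
# Illustration only: 'make_tmp' is a hypothetical stand-in for a mktemp-like
# function with automatic cleanup. It is not part of modernish.
make_tmp() {
    _file=$(mktemp) || return       # native mktemp(1), assumed to be installed
    trap 'rm -f "$_file"' EXIT      # clean up when the *current* shell exits
    printf '%s\n' "$_file"
}

# The command substitution runs make_tmp in a subshell. The EXIT trap set
# there fires as soon as that subshell exits, before tmpfile is even assigned.
tmpfile=$(make_tmp)

if [ -e "$tmpfile" ]; then
    result='file still exists'
    rm -f "$tmpfile"
else
    result='file was already removed'
fi
echo "$result"
```

On shells that run EXIT traps when a subshell exits (bash and dash among them), the file is gone by the time the main shell looks at it, which is exactly the failure that modernish turns into a fatal error.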

This module depends on the trap stack to do auto-cleanup (the -C option), so it will automatically use var/stack/trap on initialisation.

use sys/base/readlink

readlink reads the target of a symbolic link, robustly handling strange filenames such as those containing newline characters. It stores the result in the REPLY variable and optionally writes it on standard output.

Usage: readlink [-nsefmQ] path [path ...]

  • -n: If writing output, don't add a trailing newline. This does not remove the separating newlines if multiple paths are given.
  • -s: Silent operation: don't write output, only store it in REPLY.
  • -e, -f, -m: Canonicalise. Convert each path found into a canonical and absolute path that can be used starting from any working directory. Relative paths are resolved starting from the present working directory. Double slashes are removed. Any special pathname components . and .. are resolved. All symlinks encountered are followed, but a path does not need to contain any symlinks. UNC network paths (as on Cygwin) are supported. These options differ as follows:
    • -e: All pathname components must exist to produce a result.
    • -f: All but the last pathname component must exist to produce a result.
    • -m: No pathname component needs to exist; this always produces a result. Nonexistent pathname components are simulated as regular directories.
  • -Q: Shell-quote each unit of output. Separate by spaces instead of newlines. This generates a list of arguments in shell syntax, guaranteed to be suitable for safe parsing by the shell, even if the resulting pathnames should contain strange characters such as spaces or newlines and other control characters.

The exit status of readlink is 0 on success and 1 if the path either is not a symlink, or could not be canonicalised according to the option given.

use sys/base/rev

rev copies the specified files to the standard output, reversing the order of characters in every line. If no files are specified, the standard input is read.

Usage: like rev on Linux and BSD, which is like cat except that - is a filename and does not denote standard input. No options are supported.

use sys/base/seq

A cross-platform implementation of seq that is more powerful and versatile than native GNU and BSD seq(1) implementations. The core is written in bc, the POSIX arbitrary-precision calculator language. That means this seq inherits the capacity to handle numbers with a precision and size only limited by computer memory, as well as the ability to handle input numbers in any base from 1 to 16 and produce output in any base 1 and up.

Usage: seq [-w] [-L] [-f format] [-s string] [-S scale] [-B base] [-b base] [first [incr]] last

seq prints a sequence of arbitrary-precision floating point numbers, one per line, from first (default 1), to as near last as possible, in increments of incr (default 1). If first is larger than last, the default incr is -1. An incr of zero is treated as a fatal error.

  • -w: Equalise width by padding with leading zeros. The longest of the first, incr or last arguments is taken as the length that each output number should be padded to.
  • -L: Use the current locale's radix point in the output instead of the full stop (.).
  • -f: printf-style floating-point format. The format string is passed on (with an added \n) to awk's builtin printf function. Because of that, the -f option can only be used if the output base is 10. Note that awk's floating point precision is limited, so very large or long numbers will be rounded.
  • -s: Instead of writing one number per line, write all numbers on one line separated by string and terminated by a newline character.
  • -S: Explicitly set the scale (number of digits after the radix point). Defaults to the largest number of digits after the radix point among the first, incr or last arguments.
  • -B: Set input and output base from 1 to 16. Defaults to 10.
  • -b: Set arbitrary output base from 1. Defaults to input base. See the bc(1) manual for more information on the output format for bases greater than 16.

The -S, -B and -b options take shell integer numbers as operands. This means a leading 0X or 0x denotes a hexadecimal number and a leading 0 denotes an octal number.
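Plain shell arithmetic reads integer constants by the same prefix rules, so it offers a quick way to check how such an operand will be interpreted. The variable names below are only for illustration.

```shell
# Integer constants in POSIX shell arithmetic follow the same prefix rules.
hex=$((0x10))       # leading 0x: hexadecimal 10, i.e. decimal 16
oct=$((010))        # leading 0: octal 10, i.e. decimal 8
dec=$((10))         # no prefix: decimal 10
echo "$hex $oct $dec"
```

The practical consequence is that an operand written with a leading zero, such as 010, is read as octal 8 rather than decimal 10.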

For portability reasons, modernish seq uses a full stop (.) for the radix point, regardless of the system locale. This applies both to command arguments and to output. The -L option causes seq to use the current locale's radix point character for output only.

Differences with GNU and BSD seq

The -S, -B and -b options are modernish innovations. The -w, -f and -s options are inspired by GNU and BSD seq. The following differences apply:

  • Like GNU and unlike BSD, the separator specified by the -s option is not appended to the final number and there is no -t option to add a terminator character.
  • Like GNU and unlike BSD, the -s option-argument is taken as literal characters and is not parsed for backslash escape codes like \n.
  • Unlike GNU and like BSD, the output radix point defaults to a full stop, regardless of the current locale.
  • Unlike GNU and like BSD, if incr is not specified, it defaults to -1 if first > last, 1 otherwise. For example, seq 5 1 counts backwards from 5 to 1, and specifying seq 5 -1 1 as with GNU is not needed.
  • Unlike GNU and like BSD, an incr of zero is not accepted. To output the same number or string infinite times, use yes instead.
  • Unlike both GNU and BSD, the -f option accepts any format specifiers accepted by awk's printf() function.

The sys/base/seq module depends on, and automatically loads, var/string/touplow.

use sys/base/shuf

Shuffle lines of text. A portable reimplementation of a commonly used GNU utility.

Usage:

  • shuf [-n max] [-r rfile] file
  • shuf [-n max] [-r rfile] -i low-high
  • shuf [-n max] [-r rfile] -e argument...

By default, shuf reads lines of text from standard input, or from file (the file - signifies standard input). It writes the input lines to standard output in random order.

  • -i: Use sequence of non-negative integers low through high as input.
  • -e: Instead of reading input, use the arguments as lines of input.
  • -n: Output a maximum of max lines.
  • -r: Use rfile as the source of random bytes. Defaults to /dev/urandom.

Differences with GNU shuf:

  • Long option names are not supported.
  • The -o/--output-file option is not supported; use output redirection. Safely shuffling files in-place is not supported; use a temporary file.
  • --random-source=file is changed to -r file.
  • The -z/--zero-terminated option is not supported.

use sys/base/tac

tac (the reverse of cat) is a cross-platform reimplementation of the GNU tac utility, with some extra features.

Usage: tac [-rbBP] [-s separator] file [file ...]

tac outputs the files in reverse order of lines/records. If file is - or is not given, tac reads from standard input.

  • -s: Specify the record (line) separator. Default: linefeed.
  • -r: Interpret the record separator as an extended regular expression. This allows using separators that may vary. Each separator is preserved in the output as it is in the input.
  • -b: Assume the separator comes before each record in the input, and also output the separator before each record. Cannot be combined with -B.
  • -B: Assume the separator comes after each record in the input, but output the separator before each record. Cannot be combined with -b.
  • -P: Paragraph mode: output text last paragraph first. Input paragraphs are separated from each other by at least two linefeeds. Cannot be combined with any other option.

Differences between GNU tac and modernish tac:

  • The -B and -P options were added.
  • The -r option interprets the record separator as an extended regular expression. This is an incompatibility with GNU tac unless expressions are used that are valid as both basic and extended regular expressions.
  • In UTF-8 locales, multi-byte characters are recognised and reversed correctly.

use sys/base/which

The modernish which utility finds external programs and reports their absolute paths, offering several unique options for reporting, formatting and robust processing. The default operation is similar to GNU which.

Usage: which [-apqsnQ1f] [-P number] program [program ...]

By default, which finds the first available path to each given program. If program is itself a path name (contains a slash), only that path's base directory is searched; if it is a simple command name, the current $PATH is searched. Any relative paths found are converted to absolute paths. Symbolic links are not followed. The first path found for each program is written to standard output (one per line), and a warning is written to standard error for every program not found. The exit status is 0 (success) if all programs were found, 1 otherwise.

which also leaves its output in the REPLY variable. This may be useful if you run which in the main shell environment. The REPLY value will not survive a command substitution subshell as in ls_path=$(which ls).

The following options modify the default behaviour described above:

  • -a: List all programs that can be found in the directories searched, instead of just the first one. This is useful for finding duplicate commands that the shell would not normally find when searching its $PATH.
  • -p: Search in $DEFPATH (the default standard utility PATH provided by the operating system) instead of in the user's $PATH, which is vulnerable to manipulation.
  • -q: Be quiet: suppress all warnings.
  • -s: Silent operation: don't write output, only store it in the REPLY variable. Suppress warnings except, if you run which -s in a subshell, a warning that the REPLY variable will not survive the subshell.
  • -n: When writing to standard output, do not write a final newline.
  • -Q: Shell-quote each unit of output. Separate by spaces instead of newlines. This generates a one-line list of arguments in shell syntax, guaranteed to be suitable for safe parsing by the shell, even if the resulting pathnames should contain strange characters such as spaces or newlines and other control characters.
  • -1 (one): Output the results for at most one of the arguments in descending order of preference: once a search succeeds, ignore the rest. Suppress warnings except a subshell warning for -s. This is useful for finding a command that can exist under several names, for example: which -f -1 gnutar gtar tar
    This option modifies which's exit status behaviour: which -1 returns successfully if at least one command was found.
  • -f: Throw a fatal error in cases where which would otherwise return status 1 (non-success).
  • -P: Strip the indicated number of pathname elements from the output, starting from the right. -P1: strip /program; -P2: strip /*/program, etc. This is useful for determining the installation root directory for an installed package.
  • --help: Show brief usage information.
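The default lookup described above (first match in $PATH for a simple command name) can be approximated in a few lines of plain POSIX shell. The sketch below is a much simplified, hypothetical helper named first_in_path; it is not how modernish implements which, it does not convert relative paths to absolute ones, and it supports none of the options.

```shell
# Illustration only: a hypothetical, simplified lookup of the first match in
# $PATH. This is NOT the modernish implementation of 'which'.
first_in_path() (
    # The function body is a subshell, so the IFS and 'set -f' changes stay local.
    case $1 in
    (*/*)   echo "first_in_path: simple command names only" >&2; exit 2 ;;
    esac
    IFS=:       # split $PATH on colons...
    set -f      # ...without expanding glob characters
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.      # an empty element means the current directory
        if [ -f "$dir/$1" ] && [ -x "$dir/$1" ]; then
            printf '%s\n' "$dir/$1"
            exit 0
        fi
    done
    echo "first_in_path: $1: not found" >&2
    exit 1
)

first_in_path sh
```

Because the function body is a subshell, any variable it sets is lost on return, which is the same reason the REPLY value of which does not survive a command substitution.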

use sys/base/yes

yes very quickly outputs infinite lines of text, each consisting of its space-separated arguments, until terminated by a signal or by a failure to write output. If no argument is given, the default line is y. No options are supported.

This infinite-output command is useful for piping into commands that need an indefinite input data stream, or to automate a command requiring interactive confirmation.

Modernish yes is like GNU yes in that it outputs all its arguments, whereas BSD yes only outputs the first. It can output multiple gigabytes per second on modern systems.

use sys/cmd

Modules in this category contain functions for enhancing the invocation of commands.

use sys/cmd/extern

extern is like command but always runs an external command, without having to know or determine its location. This provides an easy way to bypass a builtin, alias or function. It does the same $PATH search the shell normally does when running an external command. For instance, to guarantee running external printf just do: extern printf ...

Usage: extern [-p] [-v] [-u varname ...] [varname=value ...] command [argument ...]

  • -p: The command, as well as any commands it further invokes, are searched in $DEFPATH (the default standard utility PATH provided by the operating system) instead of in the user's $PATH, which is vulnerable to manipulation.
    • extern -p is much more reliable than the shell's builtin command -p because: (a) many existing shell installations use a wrong search path for command -p; (b) command -p does not export the default PATH, so something like command -p sudo cp foo /bin/bar searches only sudo in the secure default path and not cp.
  • -v: Don't execute command but show the full path name of the command that would have been executed. Any extra arguments are taken as more command paths to show, one per line. extern exits with status 0 if all the commands were found, 1 otherwise. This option can be combined with -p.
  • -u: Temporary export override. Unset the given variable in the environment of the command executed, even if it is currently exported. Can be specified multiple times.
  • varname=value assignment-arguments: These variables/values are temporarily exported to the environment during the execution of the command.
    • This is provided because assignments preceding extern cause unwanted, shell-dependent side effects, as extern is a shell function. Be sure to provide assignment-arguments following extern instead.
    • Assignment-arguments after a -- end-of-options delimiter are not parsed; this allows commands containing a = sign to be executed.
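The distinction made above, between merely finding a command in the default path and exporting the default path so that further commands are found there too, can be sketched in plain POSIX shell. The helper name with_default_path and the variable default_path are hypothetical, and the sketch assumes that the getconf utility is installed; it is not modernish code.

```shell
# Illustration only: 'with_default_path' is a hypothetical helper.
# 'getconf PATH' prints the system's default utility search path.
default_path=$(getconf PATH) || exit

with_default_path() (
    # Subshell body: the exported PATH applies to the command and to everything
    # it invokes in turn, and is gone again once the command has finished.
    PATH=$default_path
    export PATH
    exec "$@"
)

with_default_path sh -c 'echo "PATH seen by the child: $PATH"'
```

With command -p alone, only the lookup of the first command uses the default path; the child process still inherits the caller's $PATH.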

use sys/cmd/harden

The harden function allows implementing emergency halt on error for any external commands and shell builtin utilities. It is modernish's replacement for set -e a.k.a. set -o errexit (which is fundamentally flawed, not supported and will break the library). It depends on, and auto-loads, the sys/cmd/extern module.

harden sets a shell function with the same name as the command hardened, so it can be used transparently. This function hardens the given command by checking its exit status against values indicating error or system failure. Exactly what exit statuses signify an error or failure depends on the command in question; this should be looked up in the POSIX specification (under "Utilities") or in the command's man page or other documentation.

If the command fails, the function installed by harden calls die, so it will reliably halt program execution, even if the failure occurred within a subshell.

Usage:

harden [-f funcname] [-[cSpXtPE]] [-e testexpr] [var=value ...] [-u var ...] command_name_or_path [command_argument ...]

The -f option hardens the command as the shell function funcname instead of defaulting to command_name_or_path as the function name. (If the latter is a path, that's always an invalid function name, so the use of -f is mandatory.) If command_name_or_path is itself a shell function, that function is bypassed and the builtin or external command by that name is hardened instead. If no such command is found, harden dies with the message that hardening shell functions is not supported. (Instead, you should invoke die directly from your shell function upon detecting a fatal error.)

The -c option causes command_name_or_path to be hardened and run immediately instead of setting a shell function for later use. This option is meant for commands that run once; it is not efficient for repeated use. It cannot be used together with the -f option.

The -S option allows specifying several possible names/paths for a command. It causes the command_name_or_path to be split by comma and interpreted as multiple names or paths to search. The first name or path found is used. Requires -f.

The -e option, which defaults to >0, indicates the exit statuses corresponding to a fatal error. It depends on the command what these are; consult the POSIX spec and the manual pages. The status test expression testexpr, argument to the -e option, is like a shell arithmetic expression, with the binary operators == != <= >= < > turned into unary operators referring to the exit status of the command in question. Assignment operators are disallowed. Everything else is the same, including && (logical and) and || (logical or) and parentheses. Note that the expression needs to be quoted as the characters used in it clash with shell grammar tokens.
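One way to read such a status expression is to put the exit status in front of every operator that lacks a left-hand operand. The sketch below spells out the expression '==1 || >2' as ordinary shell arithmetic; the function name is_fatal_status is hypothetical and the code only illustrates the meaning of the expression, not how harden evaluates it internally.

```shell
# Illustration only: the meaning of the status expression '==1 || >2',
# written out as ordinary shell arithmetic on an exit status.
is_fatal_status() {
    # $1 is an exit status; succeed if it counts as a fatal error
    [ "$(( $1 == 1 || $1 > 2 ))" -ne 0 ]
}

for status in 0 1 2 3; do
    if is_fatal_status "$status"; then
        echo "status $status: fatal"
    else
        echo "status $status: tolerated"
    fi
done
```

With this expression, statuses 1 and 3 count as fatal while 0 and 2 are tolerated.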

The -X option causes harden to always search for and harden an external command, even if a built-in command by that name exists.

The -E option causes the hardening function to consider it a fatal error if the hardened command writes anything to the standard error stream. This option allows hardening commands (such as bc) where you can't rely on the exit status to detect an error. The text written to standard error is passed on as part of the error message printed by die. Note that:

  • Intercepting standard error necessitates that the command be executed from a subshell. This means any builtins or shell functions hardened with -E cannot influence the calling shell (e.g. harden -E cd renders cd ineffective).
  • -E does not disable exit status checks; by default, any exit status greater than zero is still considered a fatal error as well. If your command does not even reliably return a 0 status upon success, then you may want to add -e '>125', limiting the exit status check to reserved values indicating errors launching the command and signals caught.

The -p option causes harden to search for commands using the system default path (as obtained with getconf PATH) as opposed to the current $PATH. This ensures that you're using a known-good external command that came with your operating system. By default, the system-default PATH search only applies to the command itself, and not to any commands that the command may search for in turn. But if the -p option is specified at least twice, the command is run in a subshell with PATH exported as the default path, which is equivalent to adding a PATH=$DEFPATH assignment argument (see below).

Examples:

harden make                            # simple check for status > 0
harden -f tar '/usr/local/bin/gnutar'  # id.; be sure to use this 'tar' version
harden -e '> 1' grep                   # for grep, status > 1 means error
harden -e '==1 || >2' gzip             # 1 and >2 are errors, but 2 isn't (see manual)

Important note on variable assignments

As far as the shell is concerned, hardened commands are shell functions and not external or builtin commands. This essentially changes one behaviour of the shell: variable assignments preceding the command will not be local to the command as usual, but will persist after the command completes. (POSIX technically makes that behaviour optional but all current shells behave the same in POSIX mode.)

For example, this means that something like

harden -e '>1' grep
# [...]
LC_ALL=C grep regex some_ascii_file.txt

should never be done, because the meant-to-be-temporary LC_ALL locale assignment will persist and is likely to cause problems further on.

To solve this problem, harden supports adding these assignments as part of the hardening command, so instead of the above you do:

harden -e '>1' LC_ALL=C grep
# [...]
grep regex some_ascii_file.txt

With the -u option, harden also supports unsetting variables for the duration of a command, e.g.:

harden -e '>1' -u LC_ALL grep

The -u option may be specified multiple times. It causes the hardened command to be invoked from a subshell with the specified variables unset.

Hardening while allowing for broken pipes

If you're piping a command's output into another command that may close the pipe before the first command is finished, you can use the -P option to allow for this:

harden -e '==1 || >2' -P gzip        # also tolerate gzip being killed by SIGPIPE
gzip -dc file.txt.gz | head -n 10    # show first 10 lines of decompressed file

head will close the pipe of gzip input after ten lines; the operating system kernel then kills gzip with the PIPE signal before it's finished, causing a particular exit status that is greater than 128. This exit status would normally make harden kill your entire program, which in the example above is clearly not the desired behaviour. If the exit status caused by a broken pipe were known, you could specifically allow for that exit status in the status expression. The trouble is that this exit status varies depending on the shell and the operating system. The -P option was made to solve this problem: it automatically detects and whitelists the correct exit status corresponding to SIGPIPE termination on the current system.

Tolerating SIGPIPE is an option and not the default, because in many contexts it may be entirely unexpected and a symptom of a severe error if a command is killed by a broken pipe. It is up to the programmer to decide which commands should expect SIGPIPE and which shouldn't.

Tip: It could happen that the same command should expect SIGPIPE in one context but not another. You can create two hardened versions of the same command, one that tolerates SIGPIPE and one that doesn't. For example:

harden -f hardGrep -e '>1' grep       # hardGrep does not tolerate being aborted
harden -f pipeGrep -e '>1' -P grep    # pipeGrep for use in pipes that may break

Note: If SIGPIPE was set to ignore by the process invoking the current shell, the -P option has no effect, because no process or subprocess of the current shell can ever be killed by SIGPIPE. However, this may cause various other problems and you may want to refuse to let your program run under that condition. thisshellhas WRN_NOSIGPIPE can help you easily detect that condition so your program can make a decision. See the WRN_NOSIGPIPE description for more information.
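Both situations described in this section can be observed from plain POSIX shell by letting a child shell send SIGPIPE to itself. This is only a rough probe written for illustration, with a hypothetical variable name; it is not how the -P option or WRN_NOSIGPIPE detection is implemented.

```shell
# Illustration only: probe how SIGPIPE termination shows up as an exit status.
sigpipe_status=0
sh -c 'kill -s PIPE "$$"' || sigpipe_status=$?

if [ "$sigpipe_status" -gt 128 ]; then
    echo "SIGPIPE termination is reported as exit status $sigpipe_status"
elif [ "$sigpipe_status" -eq 0 ]; then
    # The child survived the signal, so SIGPIPE is probably being ignored.
    echo "SIGPIPE appears to be ignored in this environment"
else
    echo "unexpected exit status $sigpipe_status"
fi
```

The exact number in the first case depends on the shell and the operating system, which is why hard-coding it in a status expression is not portable.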

Tracing the execution of hardened commands

The -t option will trace command output. Each execution of a command hardened with -t causes the command line to be output to standard error, in the following format:

[functionname]> commandline

where functionname is the name of the shell function used to harden the command and commandline is the actual command executed. The commandline is properly shell-quoted in a format suitable for re-entry into the shell; however, command lines longer than 512 bytes will be truncated and the unquoted string (TRUNCATED) will be appended to the trace. If standard error is on a terminal that supports ANSI colours, the tracing output will be colourised.

The -t option was added to harden because the commands that you harden are often the same ones you would be particularly interested in tracing. The advantage of using harden -t over the shell's builtin tracing facility (set -x or set -o xtrace) is that the output is a lot less noisy, especially when using a shell library such as modernish.

Note: Internally, -t uses the shell file descriptor 9, redirecting it to standard error (using exec 9>&2). This allows tracing to continue to work normally even for commands that redirect standard error to a file (which is another enhancement over set -x on most shells). However, this does mean harden -t conflicts with any other use of the file descriptor 9 in your shell program.

If file descriptor 9 is already open before harden is called, harden does not attempt to override this. This means tracing may be redirected elsewhere by doing something like exec 9>trace.out before calling harden. (Note that redirecting FD 9 on the harden command itself will not work as it won't survive the run of the command.)
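The file descriptor technique can be tried out in plain POSIX shell. In the sketch below, trace_run is a hypothetical helper that writes a trace line to FD 9 before running a command. Unlike harden -t, it does no shell-quoting, truncation or colouring, and it uses the command name where harden would print the hardening function's name.

```shell
# Illustration only: 'trace_run' is a hypothetical helper, not modernish code.
# Open FD 9 as a copy of standard error, unless the caller opened it already.
{ true >&9; } 2>/dev/null || exec 9>&2

trace_run() {
    printf '[%s]> %s\n' "$1" "$*" >&9   # the trace line goes to FD 9
    "$@"
}

# The command's own standard error is discarded here, but the trace line is
# still written to wherever FD 9 points (by default, the original stderr).
trace_run sh -c 'echo "this message is discarded" >&2' 2>/dev/null
```

Because the sketch leaves an already-open FD 9 alone, running exec 9>trace.out beforehand would send the trace lines to that file instead.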

Simple tracing of commands

Sometimes you just want to trace the execution of some specific commands as in harden -t (see above) without actually hardening them against command errors; you might prefer to do your own error handling. trace makes this easy. It is modernish's replacement or complement for set -x a.k.a. set -o xtrace. Unlike harden -t, it can also trace shell functions.

Usage 1: trace [-f funcname] [-[cSpXE]] [var=value ...] [-u var ...] command_name_or_path [command_argument ...]

For non-function commands, trace acts as a shortcut for harden -t -P -e '>125 && !=255' command_name_or_path. Any further options and arguments are passed on to harden as given. The result is that the indicated command is automatically traced upon execution. A bonus is that you still get minimal hardening against fatal system errors. Errors in the traced command itself are ignored, but your program is immediately halted with an informative error message if the traced command:

  • cannot be found (exit status 127);
  • was found but cannot be executed (exit status 126);
  • was killed by a signal other than SIGPIPE (exit status > 128, except the shell-specific exit status for SIGPIPE, and except 255 which is used by some utilities, such as ssh and rsync, to return an error).

Note: The caveat for command-local variable assignments for harden also applies to trace. See Important note on variable assignments above.

Usage 2: [#!]trace -f funcname

If no further arguments are given, trace -f will trace the shell function funcname without applying further hardening (except against nonexistence). trace -f can be used to trace the execution of modernish library functions as well as your own script's functions. The trace output for shell functions shows an extra () following the function name.

Internally, this involves setting an alias under the function's name, so the limitations of the shell's alias expansion mechanism apply: only function calls that the shell had not yet parsed before calling trace -f will be traced. So you should use trace -f at the beginning of your script, before defining your own functions. To facilitate this, trace -f does not check that the function funcname exists while setting up tracing, but only when attempting to execute the traced function.

In portable-form modernish scripts, trace -f should be used as a hashbang command to be compatible with alias expansion on all shells. Only the trace -f form may be used that way. For example:

#!/usr/bin/env modernish
#!use safe -k
#!use sys/cmd/harden
#!trace -f push
#!trace -f pop
...your program begins here...

use sys/cmd/mapr

mapr (map records) is an alternative to xargs that shares features with the mapfile command in bash 4.x. It is fully integrated into your script's main shell environment, so it can call your shell functions as well as builtin and external utilities. It depends on, and auto-loads, the sys/cmd/procsubst module.

Usage: mapr [-d delimiter | -P] [-s count] [-n number] [-m length] [-c quantum] callback

mapr reads delimited records from the standard input, invoking the specified callback command once or repeatedly as needed, with batches of input records as arguments. The callback may consist of multiple arguments. By default, an input record is one line of text.

Options:

  • -d delimiter: Use the single character delimiter to delimit input records, instead of the newline character. A NUL (0) character and multi-byte characters are not supported.
  • -P: Paragraph mode. Input records are delimited by sequences consisting of a newline plus one or more blank lines, and leading or trailing blank lines will not result in empty records at the beginning or end of the input. Cannot be used together with -d.
  • -s count: Skip and discard the first count records read.
  • -n number: Stop processing after passing a total of number records to invocation(s) of callback. If -n is not supplied or number is 0, all records are passed, except those skipped using -s.
  • -m length: Set the maximum argument length in bytes of each callback command call, including the callback command argument(s) and the current batch of up to quantum input records. The length of each argument is increased by 1 to account for the terminating null byte. The default maximum length depends on constraints set by the operating system for invoking external commands. If length is 0, this limit is disabled.
  • -c quantum: Pass at most quantum arguments at a time to each call to callback. If -c is not supplied or if quantum is 0, the number of arguments per invocation is not limited except by -m; whichever limit is reached first applies.

Arguments:

  • callback: Call the callback command with the collected arguments each time quantum lines are read. The callback command may be a shell function or any other kind of command, and is executed from the same shell environment that invoked mapr. If the callback command exits or returns with status 255 or is interrupted by the SIGPIPE signal, mapr will not process any further batches but immediately exit with the status of the callback command. If it exits with another exit status 126 or greater, a fatal error is thrown. Otherwise, mapr exits with the status of the last-executed callback command.
  • argument: If there are extra arguments supplied on the mapr command line, they will be added before the collected arguments on each invocation on the callback command.
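The core idea, reading records and handing them to a callback in batches from within the current shell, can be sketched in plain POSIX shell. The function map_lines and the callback tally below are hypothetical names; the sketch has no options, no argument length limit, no extra callback arguments and no special handling of status 255, so it only illustrates the batching, not mapr itself.

```shell
# Illustration only: 'map_lines' is a hypothetical, minimal stand-in for the
# batching idea. It is not the modernish implementation of mapr.
map_lines() {   # usage: map_lines quantum callback
    _quantum=$1 _callback=$2
    set --                              # reuse "$@" as the current batch
    while IFS= read -r _line || [ -n "$_line" ]; do
        set -- "$@" "$_line"            # one input line becomes one argument
        if [ "$#" -ge "$_quantum" ]; then
            "$_callback" "$@" || return
            set --
        fi
    done
    [ "$#" -eq 0 ] || "$_callback" "$@" # pass on the final, shorter batch
}

calls=0 count=0
tally() {       # a callback that changes variables of the calling shell
    calls=$((calls + 1))
    count=$((count + $#))
}

map_lines 2 tally <<'EOF'
one two
three
four five
EOF
echo "$calls calls, $count records"
```

The three input lines arrive as three arguments in two batches, spaces intact, and the callback's variable changes remain visible afterwards because it ran in the calling shell rather than in a child process.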
Differences frommapfile

maprwas inspired by the bash 4.x builtin commandmapfilea.k.a. readarray,and uses similar options, but there are important differences.

  • mapr passes all the records as arguments to the callback command.
  • mapr does not support assigning records directly to an array. Instead, all handling is done through the callback command (which could be a shell function that assigns its arguments to an array).
  • The callback command is specified directly instead of with a -C option, and it may consist of several arguments (as with xargs).
  • The record separator itself is never included in the arguments passed to the callback command (so there is no -t option to remove it).
  • mapr supports paragraph mode.
  • If the callback command exits with status 255, processing is aborted.
Differences from xargs

mapr shares important characteristics with xargs while avoiding its myriad pitfalls.

  • Instead of being an external utility, mapr is fully integrated into the shell. The callback command can be a shell function or builtin, which can directly modify the shell environment.
  • mapr is line-oriented by default, so it is safe to use for input arguments that contain spaces or tabs.
  • mapr does not parse or modify the input arguments in any way, e.g. it does not process and remove quotes from them like xargs does.
  • mapr supports paragraph mode.

use sys/cmd/procsubst

This module provides a portable process substitution construct, the advantage being that this is not limited to bash, ksh or zsh but works on all POSIX shells capable of running modernish. It is not possible for modernish to introduce the original ksh syntax into other shells. Instead, this module provides a % command for use within a $(command substitution).

The % command takes one simple command as its arguments, executes it in the background, and writes a file name from which to read its output. So if % is used within a command substitution as intended, that file name is passed on to the invoking command as an argument.

The % command supports one option, -o. If that option is given, then it is expected that, instead of reading input, the invoking command writes output to the file name passed on to it, so that the command invoked by % -o can read that data from its standard input.

Example syntax comparison:

| ksh/bash/zsh | modernish |
| --- | --- |
| diff -u <(ls) <(ls -a) | diff -u $(% ls) $(% ls -a) |
| IFS=' ' read -r user vsz args < <(ps -o 'user= vsz= args=' -p $$) | IFS=' ' read -r user vsz args < $(% ps -o 'user= vsz= args=' -p $$) |
| { some commands; } > >(tee stdout.log) 2> >(tee stderr.log) (both `tee` commands write terminal output to standard output) | { some commands; } > $(% -o tee stdout.log) 2> $(% -o tee stderr.log) (both `tee` commands write terminal output to standard error) |

Unlike the bash/ksh/zsh version, modernish process substitution only works with simple commands. This includes shell function calls, but not aliases or anything involving shell grammar or reserved words (such as redirections, pipelines, loops, etc.). To use such complex commands, enclose them in a shell function and call that function from the process substitution.

Also note that anything that a command invoked by % -o writes to its standard output is redirected to standard error. The main shell environment's standard output is not available because the command substitution subsumes it.
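
To picture how a background command can be reached through a file name, here is a minimal plain POSIX sh sketch that uses named pipes (FIFOs) directly. It is only an illustration of the general idea, not the module's implementation. compare_outputs is a hypothetical helper, and the sketch assumes that mktemp -d is available, which POSIX does not guarantee.

```sh
# Hypothetical helper (not part of modernish): run two commands in the
# background, each writing to its own FIFO, and let diff read both FIFOs
# by file name.
# Usage: compare_outputs command1 command2   (simple commands or functions)
compare_outputs() {
    _co_dir=$(mktemp -d) || return 2          # assumption: mktemp -d exists
    mkfifo "$_co_dir/left" "$_co_dir/right" || { rm -rf "$_co_dir"; return 2; }
    "$1" > "$_co_dir/left" &                  # writer 1, in the background
    "$2" > "$_co_dir/right" &                 # writer 2, in the background
    diff "$_co_dir/left" "$_co_dir/right"     # reader gets two file names
    _co_status=$?
    wait                                      # reap both background writers
    rm -rf "$_co_dir"
    return "$_co_status"
}

# Usage (list_plain and list_all are illustrative functions):
#   list_plain() { ls; }
#   list_all() { ls -a; }
#   compare_outputs list_plain list_all
```

As with the % command, each argument must be a simple command or a shell function; anything involving pipelines or redirections has to be wrapped in a function first.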

use sys/cmd/source

The source command sources a dot script like the . command, but additionally supports passing arguments to sourced scripts like you would pass them to a function. It mostly mimics the behaviour of the source command built in to bash and zsh.

If a filename without a directory path is given, then, unlike the . command, source looks for the dot script in the current directory by default, as well as searching $PATH.

It is a fatal error to attempt to source a directory, a file with no read permission, or a nonexistent file.
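
The argument-passing behaviour can be approximated in plain POSIX sh, because a dot script that is sourced inside a function sees that function's positional parameters. The sketch below is not the module's implementation and does none of its path searching or error checking; source_with_args is a hypothetical name.

```sh
# Hypothetical wrapper (not modernish's source command): source a dot script
# so that the remaining arguments become its positional parameters.
# Usage: source_with_args /path/to/script [argument ...]
source_with_args() {
    _swa_script=$1
    shift                 # the remaining arguments are now "$@"
    . "$_swa_script"      # the dot script sees the function's "$@"
}

# Usage (./settings.sh is an illustrative file name):
#   source_with_args ./settings.sh alpha 'beta gamma'
```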

use sys/dir

Functions for working with directories.

use sys/dir/countfiles

countfiles: Count the files in a directory using nothing but shell functionality, so without external commands. (It's amazing how many pitfalls this has, so a library function is needed to do it robustly.)

Usage: countfiles [ -s ] directory [ globpattern ... ]

Count the number of files in a directory, storing the number in REPLY and (unless -s is given) printing it to standard output. If any globpatterns are given, only count the files matching them.
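
One of the pitfalls alluded to above is that a glob pattern matching nothing is left unexpanded, which would be miscounted as one file. The following plain POSIX sh sketch shows that single check. It is not the countfiles implementation, it only counts names that do not start with a dot, and it takes no glob patterns; count_visible_files is a hypothetical name.

```sh
# Hypothetical helper (not modernish's countfiles): count the non-hidden
# entries in a directory using only pathname expansion.
# Usage: count_visible_files directory
count_visible_files() {
    set -- "$1"/*         # one positional parameter per matching name
    if [ "$#" -eq 1 ] && [ ! -e "$1" ] && [ ! -L "$1" ]; then
        REPLY=0           # nothing matched; the pattern was left as-is
    else
        REPLY=$#
    fi
    printf '%s\n' "$REPLY"
}
```

The extra -L test keeps a single broken symlink from being mistaken for an unexpanded pattern.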

use sys/dir/mkcd

The mkcd function makes one or more directories, then, upon success, changes into the last-mentioned one. mkcd inherits mkdir's usage, so options depend on your system's mkdir; only the POSIX options are guaranteed. When mkcd is run from a script, it uses cd -P to change the working directory, resolving any symlinks in the present working directory path.

use sys/term

Utilities for working with the terminal.

use sys/term/putr

This module provides commands to efficiently output a string repeatedly.

Usage:

  • putr [ number | - ] string
  • putrln [ number | - ] string

Output the string number times. When using putrln, add a newline at the end.

If a - is given instead of a number, then the string is repeated as many times as fit on one terminal line: the repeat count is the line length of the terminal divided by the length of the string, rounded down.

Note that, unlike with put and putln, only a single string argument is accepted.

Example: putrln - '=' prints a full terminal line of equals signs.
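
The repetition and the rounding rule for the - form can be sketched in plain POSIX sh as follows. This is a naive loop for illustration, not the module's (more efficient) implementation; repeat_string and fill_count are hypothetical names, and the line length is passed in explicitly rather than queried from the terminal.

```sh
# Hypothetical helper: output a string a given number of times, no newline.
# Usage: repeat_string number string
repeat_string() {
    _rs_count=$1
    _rs_out=
    while [ "$_rs_count" -gt 0 ]; do
        _rs_out=$_rs_out$2
        _rs_count=$((_rs_count - 1))
    done
    printf '%s' "$_rs_out"
}

# Hypothetical helper: repeat count for the "-" form, i.e. the line length
# divided by the string length, rounded down. The string must not be empty.
# Usage: fill_count line_length string
fill_count() {
    printf '%s\n' "$(( $1 / ${#2} ))"
}

# Fill a 10-column line with a 3-character string: 3 repetitions, 9 characters.
repeat_string "$(fill_count 10 'abc')" 'abc'
printf '\n'
```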

use sys/term/readkey

readkey: read a single character from the keyboard without echoing back to the terminal. Buffering is done so that multiple waiting characters are read one at a time.

Usage: readkey [ -E ERE ] [ -t timeout ] [ -r ] [ varname ]

-E: Only accept characters that match the extended regular expression ERE (the type of RE used by grep -E/egrep). readkey will silently ignore input not matching the ERE and wait for input matching it.

-t: Specify a timeout in seconds (one significant digit after the decimal point). After the timeout expires, no character is read and readkey returns status 1.

-r: Raw mode. Disables INTR (Ctrl+C), QUIT, and SUSP (Ctrl+Z) processing as well as translation of carriage return (13) to linefeed (10).

The character read is stored into the variable referenced by varname, which defaults to REPLY if not specified.

This module depends on the trap stack to save and restore the terminal state if the program is stopped while reading a key, so it will automatically use var/stack/trap on initialisation.


Appendix A: List of shell cap IDs

This appendix lists all the shell capabilities, quirks, and bugs that modernish can detect in the current shell, so that modernish scripts can easily query the results of these tests and decide what to do. Certain problematic system conditions are also detected this way and listed here.

The all-caps IDs below are all usable with the thisshellhas function. This makes it easy for a cross-platform modernish script to be aware of relevant conditions and decide what to do.

Each detection test has its own little test script in the lib/modernish/cap directory. These tests are executed on demand, the first time the capability or bug in question is queried using thisshellhas. See README.md in that directory for further information. The test scripts also document themselves in the comments.
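
To give a feel for what such a detection test does, here is a minimal plain POSIX sh sketch of a probe for additive assignment. It is not one of the actual test scripts in the cap directory and does not use thisshellhas; has_addassign_sketch is a hypothetical name. The probe runs in a subshell so that a syntax error or a leftover variable cannot affect the calling script.

```sh
# Hypothetical probe (not a modernish cap test): succeed if the current
# shell supports additive assignment of the form VAR+=string.
has_addassign_sketch() {
    ( eval 'v=a; v+=b; [ "$v" = ab ]' ) 2>/dev/null
}

if has_addassign_sketch; then
    echo "additive assignment is available"
else
    echo "additive assignment is not available"
fi
```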

Capabilities

Modernish currently identifies and supports the following non-standard shell capabilities:

  • ADDASSIGN: Add a string to a variable using additive assignment, e.g. VAR+=string
  • ANONFUNC: zsh anonymous functions (basically the native zsh equivalent of modernish's var/local module)
  • ARITHCMD: standalone arithmetic evaluation using a command like ((expression)).
  • ARITHFOR: ksh93/C-style arithmetic for loops of the form for ((exp1; exp2; exp3)) do commands; done.
  • ARITHPP: support for the ++ and -- unary operators in shell arithmetic.
  • CESCQUOT: Quoting with C-style escapes, like $'\n' for newline.
  • DBLBRACKET: The ksh88-style [[ double-bracket command ]], implemented as a reserved word, integrated into the main shell grammar, and with a different grammar applying within the double brackets. (ksh93, mksh, bash, zsh, yash >= 2.48)
  • DBLBRACKETERE: DBLBRACKET plus the =~ binary operator to match a string against an extended regular expression.
  • DBLBRACKETV: DBLBRACKET plus the -v unary operator to test if a variable is set. Named variables only. (Testing positional parameters (like [[ -v 1 ]]) does not work on bash or ksh93; check $# instead.)
  • DOTARG: Dot scripts support arguments.
  • HERESTR: Here-strings, an abbreviated kind of here-document.
  • KSH88FUNC: define ksh88-style shell functions with the function keyword, supporting dynamically scoped local variables with the typeset builtin. (mksh, bash, zsh, yash, et al)
  • KSH93FUNC: the same, but with static scoping for local variables. (ksh93 only) See Q28 at the ksh93 FAQ for an explanation of the difference.
  • KSHARRAY: ksh93-style arrays. Supported on bash, zsh (under emulate sh), mksh, and ksh93.
  • LEPIPEMAIN: execute last element of a pipe in the main shell, so that things like somecommand | read somevariable work. (zsh, AT&T ksh, bash 4.2+)
  • LINENO: the $LINENO variable contains the current shell script line number.
  • LOCALVARS: the local command creates dynamically scoped local variables within functions defined using standard POSIX syntax.
  • NONFORKSUBSH: as a performance optimisation, subshells are implemented without forking a new process, so they share a PID with the main shell. (AT&T ksh93; it has many bugs related to this, but there's a nice workaround: ulimit -t unlimited forces a subshell to fork, making those bugs disappear! See also BUG_FNSUBSH.)
  • PRINTFV: The shell's printf builtin has the -v option to print to a variable, which avoids forking a command substitution subshell.
  • PROCREDIR: the shell natively supports <(process redirection), a special kind of redirection that connects standard input (or standard output) to a background process running your command(s). This exists on yash. Note this is not combined with a redirection like < <(command). Contrast with bash/ksh/zsh's PROCSUBST where this <(syntax) substitutes a file name.
  • PROCSUBST: the shell natively supports <(process substitution), a special kind of command substitution that substitutes a file name, connecting it to a background process running your command(s). This exists on ksh93 and zsh. (Bash has it too, but its POSIX mode turns it off, so modernish can't use it.) Note this is usually combined with a redirection, like < <(command). Contrast this with yash's PROCREDIR where the same <(syntax) is itself a redirection.
  • PSREPLACE: Search and replace strings in variables using special parameter substitutions with a syntax vaguely resembling sed.
  • RANDOM: the $RANDOM pseudorandom generator. Modernish seeds it if detected. The variable is then set to read-only whether the generator is detected or not, in order to block it from losing its special properties by being unset or overwritten, and to stop it being used if there is no generator. This is because some of modernish depends on RANDOM either working properly or being unset.
    (The use case for non-readonly RANDOM is setting a known seed to get reproducible pseudorandom sequences. To get that in a modernish script, use awk's srand(yourseed) and int(rand()*32768).)
  • ROFUNC: Set functions to read-only with readonly -f. (bash, yash)
  • TESTERE: The regular test/[ builtin command supports the =~ binary operator to match a string against an extended regular expression.
  • TESTO: The test/[ builtin supports the -o unary operator to check if a shell option is set.
  • TRAPPRSUBSH: The ability to obtain a list of the current shell's native traps from a command substitution subshell, for example: var=$(trap), as long as no new traps have been set within that command substitution. Note that the var/stack/trap module transparently reimplements this feature on shells without this native capability.
  • TRAPZERR: This feature ID is detected if the ERR trap is an alias for the ZERR trap. According to the zsh manual, this is the case for zsh on most systems, i.e. those that don't have a SIGERR signal. (The trap stack uses this feature test.)
  • VARPREFIX: Expansions of type ${!prefix@} and ${!prefix*} yield all names of set variables beginning with prefix in the same way and with the same quoting effects as $@ and $*, respectively. This includes the name prefix itself, unless the shell has BUG_VARPREFIX. (bash; AT&T ksh93)

Quirks

Modernish currently identifies and supports the following shell quirks:

  • QRK_32BIT: mksh: the shell only has 32-bit arithmetic. Since every modern system these days supports 64-bit long integers even on 32-bit kernels, we can now count this as a quirk.
  • QRK_ANDORBG: On zsh, the & operator takes the last simple command as the background job and not an entire AND-OR list (if any). In other words, a && b || c & is interpreted as a && b || { c & } and not { a && b || c; } &.
  • QRK_ARITHEMPT: In yash, with POSIX mode turned off, a set but empty variable yields an empty string when used in an arithmetic expression, instead of 0. For example, foo=''; echo $((foo)) outputs an empty line.
  • QRK_ARITHWHSP: In yash and FreeBSD /bin/sh, trailing whitespace from variables is not trimmed in arithmetic expansion, causing the shell to exit with an 'invalid number' error. POSIX is silent on the issue. The modernish isint function (to determine if a string is a valid integer number in shell syntax) is QRK_ARITHWHSP compatible, tolerating only leading whitespace.
  • QRK_BCDANGER: break and continue can affect non-enclosing loops, even across shell function barriers (zsh, Busybox ash; older versions of bash, dash and yash). (This is especially dangerous when using var/local which internally uses a temporary shell function to try to protect against breaking out of the block without restoring global parameters and settings.)
  • QRK_EMPTPPFLD: Unquoted $@ and $* do not discard empty fields. POSIX says for both unquoted $@ and unquoted $* that empty positional parameters may be discarded from the expansion. AFAIK, just one shell (yash) doesn't.
  • QRK_EMPTPPWRD: POSIX says that empty "$@" generates zero fields but empty '' or "" or "$emptyvariable" generates one empty field. But it leaves unspecified whether something like "$@$emptyvariable" generates zero fields or one field. Zsh, pdksh/mksh and (d)ash generate one field, as seems logical. But bash, AT&T ksh and yash generate zero fields, which we consider a quirk. (See also BUG_PP_01)
  • QRK_EVALNOOPT: eval does not parse options, not even --, which makes it incompatible with other shells: on the one hand, (d)ash does not accept eval -- "$command" whereas on other shells this is necessary if the command starts with a -, or the command would be interpreted as an option to eval. A simple workaround is to prefix arbitrary commands with a space. Both situations are POSIX compliant, but since they are incompatible without a workaround, the minority situation is labeled here as a QuiRK.
  • QRK_EXECFNBI: In pdksh and zsh, exec looks up shell functions and builtins before external commands, and if it finds one it does the equivalent of running the function or builtin followed by exit. This is probably a bug in POSIX terms; exec is supposed to launch a program that overlays the current shell, implying the program launched by exec is always external to the shell. However, since the POSIX language is rather vague and possibly incorrect, this is labeled as a shell quirk instead of a shell bug.
  • QRK_FNRDREXIT: On FreeBSD sh and NetBSD sh, an error in a redirection attached to a function call causes the shell to exit. This affects redirections of all functions, including modernish library functions as well as functions set by harden.
  • QRK_GLOBDOTS: Pathname expansion of .* matches the pseudonames . and .. so that, e.g., cp -pr .* backup/ cannot be used to copy all your hidden files. (bash < 5.2, (d)ash, AT&T ksh != 93u+m, yash)
  • QRK_HDPARQUOT: Double quotes within certain parameter substitutions in here-documents aren't removed (FreeBSD sh; bosh). For instance, if var is set, ${var+"x"} in a here-document yields "x", not x. POSIX considers it undefined to use double quotes there, so they should be avoided for a script to be fully POSIX compatible. (Note this quirk does not apply for substitutions that remove patterns, such as ${var#"$x"} and ${var%"$x"}; those are defined by POSIX and double quotes are fine to use.) (Note 2: single quotes produce widely varying behaviour and should never be used within any form of parameter substitution in a here-document.)
  • QRK_IFSFINAL: in field splitting, a final non-whitespace IFS delimiter character is counted as an empty field (yash < 2.42, zsh, pdksh). This is a QRK (quirk), not a BUG, because POSIX is ambiguous on this.
  • QRK_LOCALINH: On a shell with LOCALVARS, local variables, when declared without assigning a value, inherit the state of their global namesake, if any. (dash, FreeBSD sh)
  • QRK_LOCALSET: On a shell with LOCALVARS, local variables are immediately set to the empty value upon being declared, instead of being initially without a value. (zsh)
  • QRK_LOCALSET2: Like QRK_LOCALSET, but only if the variable by the same name in the global/parent scope is unset. If the global variable is set, then the local variable starts out unset. (bash 2 and 3)
  • QRK_LOCALUNS: On a shell with LOCALVARS, local variables lose their local status when unset. Since the variable name reverts to global, this means that unset will not necessarily unset the variable! (yash, pdksh/mksh. Note: this is actually a behaviour of typeset, to which modernish aliases local on these shells.)
  • QRK_LOCALUNS2: This is a more treacherous version of QRK_LOCALUNS that is unique to bash. The unset command works as expected when used on a local variable in the same scope that variable was declared in; however, it makes local variables global again if they are unset in a subscope of that local scope, such as a function called by the function where it is local. (Note: since QRK_LOCALUNS2 is a special case of QRK_LOCALUNS, modernish will not detect both.) On bash >= 5.0, modernish eliminates this quirk upon initialisation by setting shopt -s localvar_unset.
  • QRK_OPTABBR: Long-form shell option names can be abbreviated down to a length where the abbreviation is not redundant with other long-form option names. (ksh93, yash)
  • QRK_OPTCASE: Long-form shell option names are case-insensitive. (yash, zsh)
  • QRK_OPTDASH: Long-form shell option names ignore the -. (ksh93, yash)
  • QRK_OPTNOPRFX: Long-form shell option names use a dynamic no prefix for all options (including POSIX ones). For instance, glob is the opposite of noglob, and nonotify is the opposite of notify. (ksh93, yash, zsh)
  • QRK_OPTULINE: Long-form shell option names ignore the _. (yash, zsh)
  • QRK_PPIPEMAIN: On zsh <= 5.5.1, in all elements of a pipeline, parameter expansions are evaluated in the current environment (with any changes they make surviving the pipeline), though the commands themselves of every element but the last are executed in a subshell. For instance, given unset or empty v, in the pipeline cmd1 ${v:=foo} | cmd2, the assignment to v survives, though cmd1 itself is executed in a subshell.
  • QRK_SPCBIXP: Variable assignments directly preceding special builtin commands are exported, and persist as exported. (bash; yash)
  • QRK_UNSETF: If 'unset' is invoked without any option flag (-v or -f), and no variable by the given name exists but a function does, the shell unsets the function. (bash)
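
As an illustration of the workaround mentioned under QRK_EVALNOOPT, the following plain POSIX sh sketch prefixes the command string with a space, so that it can never be taken for an option to eval and no -- is needed. run_command_string is a hypothetical name, not a modernish function.

```sh
# Hypothetical helper: evaluate an arbitrary command string portably.
# The leading space keeps a string that starts with "-" from being parsed
# as an option by shells whose eval accepts options.
run_command_string() {
    eval " $1"
}

run_command_string 'printf "%s\n" "evaluated portably"'
```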

Bugs

Modernish currently identifies and supports the following shell bugs:

  • BUG_ALIASCSHD: A spurious syntax error occurs if a here-document containing a command substitution is used within two aliases that define a block. The syntax error reporting a missing } occurs because the alias terminating the block is not correctly expanded. This bug affects var/local and var/loop as they define blocks this way. Workaround: make a shell function that handles the here-document and call that shell function from the block/loop instead. Bug found on: dash <= 0.5.10.2; Busybox ash <= 1.31.1.
  • BUG_ALIASPOSX: Running any command "foo" in POSIX mode like POSIXLY_CORRECT=y foo will globally disable alias expansion on a non-interactive shell (killing modernish), unless POSIX mode is globally enabled. Bug found on bash 4.2 through 5.0. Note: on bash versions with this bug, modernish automatically enables POSIX mode to avoid triggering it. A side effect is that process substitution (PROCSUBST) isn't available.
  • BUG_ARITHINIT: Using unset or empty variables (dash <= 0.5.9.1 on macOS) or unset variables (yash <= 2.44) in arithmetic expressions causes the shell to exit, instead of taking them as a value of zero.
  • BUG_ARITHLNNO: The shell supports $LINENO, but the variable is considered unset in arithmetic contexts, like $(( LINENO > 0 )). This makes it error out under set -u and default to zero otherwise. Workaround: use shell expansion like $(( $LINENO > 0 )). (FreeBSD sh)
  • BUG_ARITHNAN: The case-insensitive special floating point constants Inf and NaN are recognised in arithmetic evaluation, overriding any variables with the names Inf, NaN, INF, nan, etc. (AT&T ksh93; zsh 5.6 - 5.8)
  • BUG_ARITHSPLT: Unquoted $((arithmetic expressions)) are not subject to field splitting as expected. (zsh, mksh <= R49)
  • BUG_ASGNCC01: if IFS contains a $CC01 (^A) character, unquoted expansions in shell assignments discard that character (if present). Found on: bash 4.0-4.3
  • BUG_ASGNLOCAL: If you have a function-local variable (see LOCALVARS) with the same name as a global variable, and within the function you run a shell builtin command preceded by a temporary variable assignment, then the global variable is unset. (zsh <= 5.7.1)
  • BUG_BRACQUOT: shell quoting within bracket patterns has no effect (zsh < 5.3; ksh93). This bug means the - retains its special meaning of 'character range', and an initial ! (and, on some shells, ^) retains the meaning of negation, even in quoted strings within bracket patterns, including quoted variables.
  • BUG_CASEEMPT: An empty case list on a single line, as in case x in esac, is a syntax error. (AT&T ksh93)
  • BUG_CASELIT: If a case pattern doesn't match as a pattern, it's tried again as a literal string, even if the pattern isn't quoted. This can result in false positives when a pattern doesn't match itself, like with bracket patterns. This contravenes POSIX and breaks use cases such as input validation. (AT&T ksh93) Note: modernish match works around this.
  • BUG_CASEPAREN: case patterns without an opening parenthesis (i.e. with only an unbalanced closing parenthesis) are misparsed as a syntax error within command substitutions of the form $( ). Workaround: include the opening parenthesis. Found on: bash 3.2
  • BUG_CASESTAT: The case conditional construct prematurely clobbers the exit status $?. (found in zsh < 5.3, Busybox ash <= 1.25.0, dash < 0.5.9.1)
  • BUG_CDNOLOGIC: The cd built-in command lacks the POSIX-specified -L option and does not support logical traversal; it always acts as if the -P (physical traversal) option was passed. This also renders the -L option to modernish chdir ineffective. (NetBSD sh)
  • BUG_CDPCANON: cd -P (and hence also modernish chdir) does not correctly canonicalise/normalise a directory path that starts with three or more slashes; it reduces these to two initial slashes instead of one in $PWD. (zsh <= 5.7.1)
  • BUG_CMDEXEC: using command exec (to open a file descriptor, using command to avoid exiting the shell on failure) within a function causes bash <= 4.0 to fail to restore the global positional parameters when leaving that function. It also renders bash <= 4.0 prone to hanging.
  • BUG_CMDEXPAN: if the command command results from an expansion, it acts like command -v, showing the path of the command instead of executing it. For example: v=command; "$v" ls or set -- command ls; "$@" don't work. (AT&T ksh93)
  • BUG_CMDOPTEXP: the command builtin does not recognise options if they result from expansions. For instance, you cannot conditionally store -p in a variable like defaultpath and then do command $defaultpath someCommand. (found in zsh < 5.3)
  • BUG_CMDPV: command -pv does not find builtins ({pd,m}ksh), does not accept the -p and -v options together (zsh < 5.3) or ignores the -p option altogether (bash 3.2); in any case, it's not usable to find commands in the default system PATH.
  • BUG_CMDSETPP: using command set -- has no effect; it does not set the positional parameters. For compat, use set without command. (mksh <= R57)
  • BUG_CMDSPASGN: preceding a special builtin with command does not stop preceding invocation-local variable assignments from becoming global. (AT&T ksh93)
  • BUG_CMDSPEXIT: preceding a special builtin (other than eval, exec, return or exit) with command does not always stop it from exiting the shell if the builtin encounters an error. (bash <= 4.0; zsh <= 5.2; mksh; ksh93)
  • BUG_CSNHDBKSL: Backslashes within non-expanding here-documents within command substitutions are incorrectly expanded to perform newline joining, as opposed to left intact. (bash <= 4.4)
  • BUG_CSUBBTQUOT: A spurious syntax error is thrown when using double quotes within a backtick-style command substitution that is itself within double quotes. (AT&T ksh93 < 93u+m 2022-05-20)
  • BUG_CSUBLNCONT: Backslash line continuation is not processed correctly within modern-form $(command substitutions). (AT&T ksh93 < 93u+m 2022-05-21)
  • BUG_CSUBRMLF: A bug affecting the stripping of final linefeeds from command substitutions. If a command substitution does not produce any output to substitute and is concatenated in a string or here-document, then the shell removes any concurrent linefeeds occurring directly before the command substitution in that string or here-document. (dash <= 0.5.10.2, Busybox ash, FreeBSD sh)
  • BUG_CSUBSTDO: If standard output (file descriptor 1) is closed before entering a command substitution, and any other file descriptors are redirected within the command substitution, commands such as echo or putln will not work within the command substitution, acting as if standard output is still closed (AT&T ksh93 <= AJM 93u+ 2012-08-01). Workaround: see cap/BUG_CSUBSTDO.t.
  • BUG_DEVTTY: the shell can't redirect output to /dev/tty if set -C/set -o noclobber (part of safe mode) is active. Workaround: use >| /dev/tty instead of > /dev/tty. Bug found on: bash on certain systems (at least QNX and Interix).
  • BUG_DOLRCSUB: parsing problem where, inside a command substitution of the form $(...), the sequence $$'...' is treated as $'...' (i.e. as a use of CESCQUOT), and $$"..." as $"..." (bash-specific translatable string). (Found in bash up to 4.4)
  • BUG_DQGLOB: globbing is not properly deactivated within double-quoted strings. Within double quotes, a * or ? immediately following a backslash is interpreted as a globbing character. This applies to both pathname expansion and pattern matching in case. Found in: dash. (The bug is not triggered when using modernish match.)
  • BUG_EXPORTUNS: Setting the export flag on an otherwise unset variable causes a set and empty environment variable to be exported, though the variable continues to be considered unset within the current shell. (FreeBSD sh < 13.0)
  • BUG_FNSUBSH: Function definitions within subshells (including command substitutions) are ignored if a function by the same name exists in the main shell, so the wrong function is executed. unset -f is also silently ignored. ksh93 (all current versions as of November 2018) has this bug. It only applies to non-forked subshells. See NONFORKSUBSH.
  • BUG_FORLOCAL: a for loop in a function makes the iteration variable local to the function, so it won't survive the execution of the function. Found on: yash. This is intentional and documented behaviour on yash in non-POSIX mode, but in POSIX terms it's a bug, so we mark it as such.
  • BUG_GETOPTSMA: The getopts builtin leaves a : instead of a ? in the specified option variable if a given option that requires an argument lacks an argument, and the option string does not start with a :. (zsh)
  • BUG_HDOCBKSL: Line continuation using backslashes in expanding here-documents is handled incorrectly. (zsh up to 5.4.2)
  • BUG_HDOCMASK: Here-documents (and here-strings, see HERESTR) use temporary files. This fails if the current umask setting disallows the user to read, so the here-document can't read from the shell's temporary file. Workaround: ensure user-readable umask when using here-documents. (bash, mksh, zsh)
  • BUG_IFSCC01PP: If IFS contains a $CC01 (^A) control character, the expansion "$@" (even quoted) is gravely corrupted. Since many modernish functions use this to loop through the positional parameters, this breaks the library. (Found in bash < 4.4)
  • BUG_IFSGLOBC: In glob pattern matching (such as in case and [[), if a wildcard character is part of IFS, it is matched literally instead of as a matching character. This applies to glob characters *, ?, [ and ]. Since nearly all modernish functions use case for argument validation and other purposes, nearly every modernish function breaks on shells with this bug if IFS contains any of these characters! (Found in bash < 4.4)
  • BUG_IFSGLOBP: In pathname expansion (filename globbing), if a wildcard character is part of IFS, it is matched literally instead of as a matching character. This applies to glob characters *, ?, [ and ]. (Bug found in bash, all versions up to at least 4.4)
  • BUG_IFSGLOBS: in glob pattern matching (as in case or parameter substitution with # and %), if IFS starts with ? or * and the "$*" parameter expansion inserts any IFS separator characters, those characters are erroneously interpreted as wildcards when quoted "$*" is used as the glob pattern. (AT&T ksh93)
  • BUG_IFSISSET: AT&T ksh93 (2011/2012 versions): ${IFS+s} always yields s even if IFS is unset. This applies to IFS only.
  • BUG_ISSETLOOP: AT&T ksh93: Expansions like ${var+set} remain static when used within a for, while or until loop; the expansions don't change along with the state of the variable, so they cannot be used to check whether a variable is set within a loop if the state of that variable may change in the course of the loop.
  • BUG_KBGPID: AT&T ksh93: If a single command ending in & (i.e. a background job) is enclosed in a { braces; } block with an I/O redirection, the $! special parameter is not set to the background job's PID.
  • BUG_KUNSETIFS: AT&T ksh93: Unsetting IFS fails to activate default field splitting if the following conditions are met: 1. IFS is set and empty (i.e. split is disabled) in the main shell, and at least one expansion has been processed with that setting; 2. The code is currently executing in a non-forked subshell (see NONFORKSUBSH).
  • BUG_LNNONEG: $LINENO becomes wildly inaccurate, even negative, when dotting/sourcing scripts. Bug found on: dash with LINENO support compiled in.
  • BUG_LOOPRET1: If a return command is given with a status argument within the set of conditional commands in a while or until loop (i.e., between while/until and do), the status argument is ignored and the function returns with status 0 instead of the specified status. Found on: dash <= 0.5.8; zsh <= 5.2
  • BUG_LOOPRET2: If a return command is given without a status argument within the set of conditional commands in a while or until loop (i.e., between while/until and do), the exit status passed down from the previous command is ignored and the function returns with status 0 instead. Found on: dash <= 0.5.10.2; AT&T ksh93; zsh <= 5.2
  • BUG_LOOPRET3: If a return command is given within the set of conditional commands in a while or until loop (i.e., between while/until and do), and the return status (either the status argument to return or the exit status passed down from the previous command by return without a status argument) is non-zero, and the conditional command list itself yields false (for while) or true (for until), and the whole construct is executed in a dot script sourced from another script, then too many levels of loop are broken out of, causing program flow corruption or premature exit. Found on: zsh <= 5.7.1
  • BUG_MULTIBIFS: We're on a UTF-8 locale and the shell supports UTF-8 characters in general (i.e. we don't have WRN_MULTIBYTE), but using multi-byte characters as IFS field delimiters still doesn't work. For example, "$*" joins positional parameters on the first byte of IFS instead of the first character. (ksh93, mksh, FreeBSD sh, Busybox ash)
  • BUG_NOCHCLASS: POSIX-mandated character [:classes:] within bracket [expressions] are not supported in glob patterns. (mksh)
  • BUG_NOEXPRO: Cannot export read-only variables. (zsh <= 5.7.1 in sh mode)
  • BUG_OPTNOLOG: on dash, setting -o nolog causes $- to wreak havoc: trying to expand $- silently aborts parsing of an entire argument, so e.g. "one,$-,two" yields "one,". (Same applies to -o debug.)
  • BUG_PP_01: POSIX says that empty "$@" generates zero fields but empty '' or "" or "$emptyvariable" generates one empty field. This means concatenating "$@" with one or more other, separately quoted, empty strings (like "$@""$emptyvariable") should still produce one empty field. But on bash 3.x, this erroneously produces zero fields. (See also QRK_EMPTPPWRD)
  • BUG_PP_02: Like BUG_PP_01, but with unquoted $@ and only with "$emptyvariable"$@, not $@"$emptyvariable". (mksh <= R50f; FreeBSD sh <= 10.3)
  • BUG_PP_03: When IFS is unset or empty (zsh 5.3.x) or empty (mksh <= R50), assigning var=$* only assigns the first field, failing to join and discarding the rest of the fields. Workaround: var="$*" (POSIX leaves var=$@, etc. undefined, so we don't test for those.)
  • BUG_PP_03A: When IFS is unset, assignments like var=$* incorrectly remove leading and trailing spaces (but not tabs or newlines) from the result. Workaround: quote the expansion. Found on: bash 4.3 and 4.4.
  • BUG_PP_03B: When IFS is unset, assignments like var=${var+$*}, etc. incorrectly remove leading and trailing spaces (but not tabs or newlines) from the result. Workaround: quote the expansion. Found on: bash 4.3 and 4.4.
  • BUG_PP_03C: When IFS is unset, assigning var=${var-$*} only assigns the first field, failing to join and discarding the rest of the fields. (zsh 5.3, 5.3.1) Workaround: var=${var-"$*"}
  • BUG_PP_04A: Like BUG_PP_03A, but for conditional assignments within parameter substitutions, as in : ${var=$*} or : ${var:=$*}. Workaround: quote either $* within the expansion or the expansion itself. (bash <= 4.4)
  • BUG_PP_04E: When assigning the positional parameters ($*) to a variable using a conditional assignment within a parameter substitution, e.g. : ${var:=$*}, the fields are always joined and separated by spaces, except if IFS is set and empty. Workaround as in BUG_PP_04A. (bash 4.3)
  • BUG_PP_04_S: When IFS is null (empty), the result of a substitution like ${var=$*} is incorrectly field-split on spaces. The assignment itself succeeds normally. Found on: bash 4.2, 4.3
  • BUG_PP_05: POSIX says that empty $@ and $* generate zero fields, but with null IFS, empty unquoted $@ and $* yield one empty field. Found on: dash 0.5.9 and 0.5.9.1; Busybox ash.
  • BUG_PP_06A: POSIX says that unquoted $@ and $* initially generate as many fields as there are positional parameters, and then (because $@ or $* is unquoted) each field is split further according to IFS. With this bug, the latter step is not done if IFS is unset (i.e. default split). Found on: zsh < 5.4
  • BUG_PP_07: unquoted $* and $@ (including in substitutions like ${1+$@} or ${var-$*}) do not perform default field splitting if IFS is unset. Found on: zsh (up to 5.3.1) in sh mode
  • BUG_PP_07A: When IFS is unset, unquoted $* undergoes word splitting as if IFS=' ', and not the expected IFS=" ${CCt}${CCn}". Found on: bash 4.4
  • BUG_PP_08: When IFS is empty, unquoted $@ and $* do not generate one field for each positional parameter as expected, but instead join them into a single field without a separator. Found on: yash < 2.44 and dash < 0.5.9 and Busybox ash < 1.27.0
  • BUG_PP_08B: When IFS is empty, unquoted $* within a substitution (e.g. ${1+$*} or ${var-$*}) does not generate one field for each positional parameter as expected, but instead joins them into a single field without a separator. Found on: bash 3 and 4
  • BUG_PP_09: When IFS is non-empty but does not contain a space, unquoted $* within a substitution (e.g. ${1+$*} or ${var-$*}) does not generate one field for each positional parameter as expected, but instead joins them into a single field separated by spaces (even though, as said, IFS does not contain a space). Found on: bash 4.3
  • BUG_PP_10: When IFS is null (empty), assigning var=$* removes any $CC01 (^A) and $CC7F (DEL) characters. (bash 3, 4)
  • BUG_PP_10A: When IFS is non-empty, assigning var=$* prefixes each $CC01 (^A) and $CC7F (DEL) character with a $CC01 character. (bash 4.4)
  • BUG_PP_1ARG: When IFS is empty on bash <= 4.3 (i.e. field splitting is off), ${1+"$@"} or "${1+$@}" is counted as a single argument instead of each positional parameter as separate arguments. This also applies to prepending text only if there are positional parameters with something like "${1+foobar $@}".
  • BUG_PP_MDIGIT: Multiple-digit positional parameters don't require expansion braces, so e.g. $10 = ${10} (dash; Busybox ash). This is classed as a bug because it causes a straight-up incompatibility with POSIX scripts. POSIX says: "The parameter name or symbol can be enclosed in braces, which are optional except for positional parameters with more than one digit [...]".
  • BUG_PP_MDLEN: For ${#x} expansions where x >= 10, only the first digit of the positional parameter number is considered, e.g. ${#10}, ${#12}, ${#123} are all parsed as if they are ${#1}. Then, string parsing is aborted so that further characters or expansions, if any, are lost. Bug found in: dash 0.5.11 - 0.5.11.4 (fixed in dash 0.5.11.5)
  • BUG_PSUBASNCC: in an assignment parameter substitution of the form ${foo=value}, if the characters $CC01 (^A) or $CC7F (DEL) are in the value, all their occurrences are stripped from the expansion (although the assignment itself is done correctly). If the expansion is quoted, only $CC01 is stripped. This bug is independent of the state of IFS, except if IFS is null, the assignment in ${foo=$*} (unquoted) is buggy too: it strips $CC01 from the assigned value. (Found on bash 4.2, 4.3, 4.4)
  • BUG_PSUBBKSL1: A backslash-escaped } character within a quoted parameter substitution is not unescaped. (bash 3.2, dash <= 0.5.9.1, Busybox 1.27 ash)
  • BUG_PSUBEMIFS: if IFS is empty (no split, as in safe mode), then if a parameter substitution of the forms ${foo-$*}, ${foo+$*}, ${foo:-$*} or ${foo:+$*} occurs in a command argument, the characters $CC01 (^A) or $CC7F (DEL) are stripped from the expanded argument. (Found on: bash 4.4)
  • BUG_PSUBEMPT: Expansions of the form ${V-} and ${V:-} are not subject to normal shell empty removal if that parameter is unset, causing unexpected empty arguments to commands. Workaround: ${V+$V} and ${V:+$V} work as expected. (Found on FreeBSD 10.3 sh)
  • BUG_PSUBIFSNW: When field-splitting unquoted parameter substitutions like ${var#foo}, ${var##foo}, ${var%foo} or ${var%%foo} on non-whitespace IFS, if there is an initial empty field, a spurious extra initial empty field is generated. (mksh)
  • BUG_PSUBNEWLN: Due to a bug in the parser, parameter substitutions spread over more than one line cause a syntax error. Workaround: instead of a literal newline, use $CCn. (found in dash <= 0.5.9.1 and Busybox ash <= 1.28.1)
  • BUG_PSUBSQUOT: in pattern matching parameter substitutions (${param#pattern}, ${param%pattern}, ${param##pattern} and ${param%%pattern}), if the whole parameter substitution is quoted with double quotes, then single quotes in the pattern are not parsed. POSIX says they are to keep their special meaning, so that glob characters may be quoted. For example: x=foobar; echo "${x#'foo'}" should yield bar but with this bug yields foobar. (dash <= 0.5.9.1; Busybox 1.27 ash)
  • BUG_PSUBSQHD: Like BUG_PSUBSQUOT, but with the parameter substitution included in a here-document instead of quoted with double quotes. (dash <= 0.5.9.1; mksh)
  • BUG_PUTIOERR: Shell builtins that output strings (echo, printf, ksh/zsh print), and thus also modernish put and putln, do not check for I/O errors on output. This means a script cannot check for them, and a script process in a pipe can get stuck in an infinite loop if SIGPIPE is ignored.
  • BUG_READWHSP: If there is more than one field to read, read does not trim trailing IFS whitespace. (dash 0.5.7, 0.5.8)
  • BUG_REDIRIO: the I/O redirection operator <> (open a file descriptor for both read and write) defaults to opening standard output (i.e. is short for 1<>) instead of defaulting to opening standard input (0<>) as POSIX specifies. (AT&T ksh93)
  • BUG_REDIRPOS: Buggy behaviour occurs if a redirection is positioned in between two variable assignments in the same command. On zsh 5.0.x, a parse error is thrown. On zsh 5.1 to 5.4.2, anything following the redirection (other assignments or command arguments) is silently ignored.
  • BUG_SCLOSEDFD: bash < 5.0 and dash fail to establish a block-local scope for a file descriptor that is added to the end of the block as a redirection that closes that file descriptor (e.g. } 8<&- or done 7>&-). If that FD is already closed outside the block, the FD remains global, so you can't locally exec it. So with this bug, it is not straightforward to make a block-local FD appear initially closed within a block. Workaround: first open the FD, then close it – for example: done 7>/dev/null 7>&- will establish a local scope for FD 7 for the preceding do...done block while still making FD 7 appear initially closed within the block.
  • BUG_SETOUTVAR: The set builtin (with no arguments) only prints native function-local variables when called from a shell function. (yash <= 2.46)
  • BUG_SHIFTERR0: The shift builtin silently returns a successful exit status (0) when attempting to shift a number greater than the current amount of positional parameters. (Busybox ash <= 1.28.4)
  • BUG_SPCBILOC: Variable assignments preceding special builtins create a partially function-local variable if a variable by the same name already exists in the global scope. (bash < 5.0 in POSIX mode)
  • BUG_TESTERR1A: test/[ exits with a non-error false status (1) if an invalid argument is given to an operator. (AT&T ksh93)
  • BUG_TESTILNUM: On dash (up to 0.5.8), giving an illegal number to test -t or [ -t causes some kind of corruption so the next test/[ invocation fails with an "unexpected operator" error even if it's legit.
  • BUG_TESTONEG: The test/[ builtin supports a -o unary operator to check if a shell option is set, but it ignores the no prefix on shell option names, so something like [ -o noclobber ] gives a false positive. Bug found on yash up to 2.43. (The TESTO feature test implicitly checks against this bug and won't detect the feature if the bug is found.)
  • BUG_TRAPEMPT: The trap builtin does not quote empty traps in its output, rendering the output unsuitable for shell re-input. For instance, trap '' INT; trap outputs "trap -- INT" instead of "trap -- '' INT". (found in mksh <= R56c)
  • BUG_TRAPEXIT: the shell's trap builtin does not know the EXIT trap by name, but only by number (0). Using the name throws a "bad trap" error. Found in klibc 2.0.4 dash.
  • BUG_TRAPFNEXI: When a function issues a signal whose trap exits the shell, the shell is not exited immediately, but only on return from the function. (zsh)
  • BUG_TRAPRETIR: Using return within eval triggers infinite recursion if both a RETURN trap and the functrace shell option are active. This bug in bash-only functionality triggers a crash when using modernish, so to avoid this, modernish automatically disables the functrace shell option if a RETURN trap is set or pushed and this bug is detected. (bash 4.3, 4.4)
  • BUG_TRAPSUB0: Subshells in traps fail to pass down a nonzero exit status of the last command they execute, under certain conditions or consistently, depending on the shell. (bash <= 4.0; dash 0.5.9 - 0.5.10.2; yash <= 2.47)
  • BUG_TRAPUNSRE: When a trap unsets itself and then resends its own signal, the execution of the trap action (including functions called by it) is not interrupted by the now-untrapped signal; instead, the process terminates after completing the entire trap routine. (bash <= 4.2; zsh)
  • BUG_UNSETUNXP: If an unset variable is given the export flag using the export command, a subsequent unset command does not remove that export flag again. Workaround: assign to the variable first, then unset it to unexport it. (Found on AT&T ksh JM-93u-2011-02-08; Busybox 1.27.0 ash)
  • BUG_VARPREFIX: On a shell with the VARPREFIX feature, expansions of type ${!prefix@} and ${!prefix*} do not find the variable name prefix itself. (AT&T ksh93)
  • BUG_ZSHNAMES: A series of lowercase names, normally okay for script use as per POSIX convention, is reserved for special use. Unsetting these names is impossible in most cases, and changing them may corrupt important shell or system settings. This may conflict with simple-form modernish scripts. This bug is detected on zsh when it was not initially invoked in emulation mode, and emulation mode was enabled using emulate sh post invocation instead (which does not disable these conflicting parameters). As of zsh 5.6, the list of variable names affected is: aliases argv builtins cdpath commands dirstack dis_aliases dis_builtins dis_functions dis_functions_source dis_galiases dis_patchars dis_reswords dis_saliases fignore fpath funcfiletrace funcsourcetrace funcstack functions functions_source functrace galiases histchars history historywords jobdirs jobstates jobtexts keymaps mailpath manpath module_path modules nameddirs options parameters patchars path pipestatus prompt psvar reswords saliases signals status termcap terminfo userdirs usergroups watch widgets zsh_eval_context zsh_scheduled_events
  • BUG_ZSHNAMES2: Two lowercase variable names histchars and signals, normally okay for script use as per POSIX convention, are reserved for special use on zsh, even if zsh is initialised in sh mode (via a sh symlink or using the --emulate sh option at startup). Bug found on: zsh <= 5.7.1. The bug is only detected if BUG_ZSHNAMES is not detected, because this bug's effects are included in that one's.
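
Several of the positional-parameter bugs listed above (BUG_PP_03, BUG_PP_03A and relatives) name the same workaround: quote the expansion. The following is a minimal plain-sh sketch of that workaround, not code taken from modernish; the sample parameters and the variable name joined are made up for illustration.

```sh
#!/bin/sh
# Sample positional parameters (illustrative values only).
set -- "one two" three "  four  "

IFS=,          # join on commas so the field boundaries are visible
joined="$*"    # quoted: joined on the first character of IFS
unset IFS      # back to default field splitting

printf '%s\n' "$joined"
```

Because "$*" is quoted, the assignment does not depend on how a particular shell handles unquoted $* in an assignment, which is where the bugs above differ.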

Warning IDs

Warning IDs do not identify any characteristic of the shell, but instead warn about a potentially problematic system condition that was detected at initialisation time.

  • WRN_EREMBYTE: The current system locale setting supports Unicode UTF-8 multi-byte/variable-length characters, but the utility used by str ematch to match extended regular expressions (EREs) does not support them and treats all characters as single bytes. This means multi-byte characters will be matched as multiple characters, and character [:classes:] within bracket expressions will only match ASCII characters.
  • WRN_MULTIBYTE: The current system locale setting supports Unicode UTF-8 multi-byte/variable-length characters, but the current shell does not support them and treats all characters as single bytes. This means counting or processing multi-byte characters with the current shell will produce incorrect results. Scripts that need compatibility with this system condition should check if thisshellhas WRN_MULTIBYTE and resort to a workaround that uses external utilities where necessary.
  • WRN_NOSIGPIPE: Modernish has detected that the process that launched the current program has set SIGPIPE to ignore, an irreversible condition that is in turn inherited by any process started by the current shell, and their subprocesses, and so on. The system constant $SIGPIPESTATUS is set to the special value 99999 and neither the current shell nor any process it spawns is now capable of receiving SIGPIPE. The -P option to harden is also rendered ineffective. Depending on how a given command foo is implemented, it is now possible that a pipeline such as foo | head -n 10 never ends; if foo doesn't check for I/O errors, the only way it would ever stop trying to write lines is by receiving SIGPIPE as head terminates. Programs that use commands in this fashion should check if thisshellhas WRN_NOSIGPIPE and either employ workarounds or refuse to run if so.
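
A script that depends on SIGPIPE to end its pipelines could refuse to run along these lines. This is only a sketch: it assumes modernish has already been initialised so that thisshellhas is defined, and the function name require_sigpipe is hypothetical.

```sh
# Assumption: modernish is initialised, which provides thisshellhas.
# The 'command -v' guard only keeps this fragment harmless without modernish.
require_sigpipe() {
	if command -v thisshellhas >/dev/null 2>&1 && thisshellhas WRN_NOSIGPIPE; then
		echo "SIGPIPE is ignored; pipelines such as 'foo | head' may never end." >&2
		return 1
	fi
}

require_sigpipe || exit 1
```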

Appendix B: Regression test suite

Modernish comes with a suite of regression tests to detect bugs in modernish itself, which can be run using modernish --test after installation. By default, it will run all the tests verbosely but without tracing the command execution. The install.sh installer will run modernish --test -eqq on the selected shell before installation.

A few options are available to specify after --test:

  • -h: show help.
  • -e: disable or reduce expensive (i.e. slow or memory-hogging) tests.
  • -q: quieter operation; report expected fails [known shell bugs] and unexpected fails [bugs in modernish]. Add -q again for quietest operation (report unexpected fails only).
  • -s: entirely silent operation.
  • -t: run only specific test sets or tests. Test sets are those listed in the full default output of modernish --test. This option requires an option-argument in the following format:
    testset1:num1,num2,/testset2:num1,num2,/
    The colon followed by numbers is optional; if omitted, the entire set will be run, otherwise the given numbered tests will be run in the given order. Example: modernish --test -t match:2,4,7/arith/shellquote:1 runs tests 2, 4 and 7 from the match set, the entire arith set, and only test 1 from the shellquote set. A testset can also be given as the incomplete beginning of a name or as a shell glob pattern. In that case, all matching sets will be run.
  • -x: trace each test using the shell's xtrace facility. Each trace is stored in a separate file in a specially created temporary directory. By default, the trace is deleted if a test does not produce an unexpected fail. Add -x again to keep expected fails as well, and again to keep all traces regardless of result. If any traces were saved, modernish will tell you the location of the temporary directory at the end, otherwise it will silently remove the directory again.
  • -E: don't run any tests, but output a command to open the tests that would have been run in your editor. The editor from the VISUAL or EDITOR environment variable is used, with vi as a default. This option should be used together with -t to specify tests. All other options are ignored.
  • -F: takes an argument with the name or path to a find utility to prefer when testing LOOP find. More info here.

These short options can be combined so, for example, --test -qxx is the same as --test -q -x -x.
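
As an illustration of combining these options, the sketch below runs a selection of tests quietly while keeping traces of expected fails. The variable name tests is made up for this example, and the guard only prevents an error where modernish is not installed.

```sh
# Test selection in the testset:num,num/testset format described above.
tests='match:2,4,7/arith/shellquote:1'

if command -v modernish >/dev/null 2>&1; then
	# -q: quieter output; -xx: trace, keeping traces of expected fails too.
	modernish --test -qxx -t "$tests"
else
	echo "modernish not found; would run: modernish --test -qxx -t $tests" >&2
fi
```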

Difference between capability detection and regression tests

Note the difference between these regression tests and the cap tests listed in Appendix A. The latter are tests for whatever shell is executing modernish: they detect capabilities (features, quirks, bugs) of the current shell. They are meant to be run via thisshellhas and are designed to be taken advantage of in scripts. On the other hand, these tests run by modernish --test are regression tests for modernish itself. It does not make sense to use these in a script.

New/unknown shell bugs can still cause modernish regression tests to fail, of course. That's why some of the regression tests also check for consistency with the results of the capability detection tests: if there is a shell bug in a widespread release version that modernish doesn't know about yet, this in turn is considered to be a bug in modernish, because one of its goals is to know about all the shell bugs in all released shell versions currently seeing significant use.

Testing modernish on all your shells

The testshells.sh program in share/doc/modernish/examples can be used to run the regression test suite on all the shells installed on your system. You could put it as testshells in some convenient location in your $PATH, and then simply run:

testshells modernish --test

(adding any further options you like – for instance, you might like to add -q to avoid very long terminal output). On first run, testshells will generate a list of shells it can find on your system and it will give you a chance to edit it before proceeding.

Appendix C: Supported locales

modernish, like most shells, fully supports two system locales: POSIX (a.k.a. C, a.k.a. ASCII) and Unicode's UTF-8. It will work in other locales, but things like converting to upper/lower case, and matching single characters in patterns, are not guaranteed.

Caveat: some shells or operating systems have bugs that prevent (or lack features required for) full locale support. If portability is a concern, check for thisshellhas WRN_MULTIBYTE or thisshellhas BUG_NOCHCLASS where needed. See Appendix A.

Scripts/programs should not change the locale (LC_* or LANG) after initialising modernish. Doing this might break various functions, as modernish sets specific versions depending on your OS, shell and locale. (Temporarily changing the locale is fine as long as you don't use modernish features that depend on it – for example, setting a specific locale just for an external command. However, if you use harden, see the important note in its documentation!)
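
Setting a locale for a single external command, as mentioned above, can be done with an assignment prefix, which leaves the shell's own locale untouched. A minimal plain-sh sketch; the variable name sorted is made up for this example:

```sh
# LC_ALL=C applies to this one sort invocation only (byte-order collation).
sorted=$(printf '%s\n' b A a B | LC_ALL=C sort)
printf '%s\n' "$sorted"
```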

Appendix D: Supported shells

Modernish builds on the POSIX 2018 Edition standard, so it should run on any sufficiently POSIX-compliant shell and operating system. It uses both bug/feature detection and regression testing to determine whether it can run on any particular shell, so it does not block or support particular shell versions as such. However, modernish has been confirmed to run correctly on the following shells:

  • bash 3.2 or higher
  • Busybox ash 1.20.0 or higher, excluding 1.28.x (also possibly excluding anything older than 1.27.x on UTF-8 locales, depending on your operating system)
  • dash (Debian sh) 0.5.7 or higher, excluding 0.5.10, 0.5.10.1, 0.5.11-0.5.11.4
  • FreeBSD sh 11.0 or higher
  • gwsh
  • ksh 93u+ 2012-08-01, 93u+m
  • mksh version R55 or higher
  • NetBSD sh 9.0 or higher
  • yash 2.40 or higher (2.44+ for POSIX mode)
  • zsh 5.3 or higher

Currently known not to run modernish due to excessive bugs:

Appendix E: zsh: integration with native scripts

This appendix is specific to zsh.

While modernish duplicates some functionality already available natively on zsh, it still has plenty to add. However, writing a normal simple-form modernish script turns emulate sh on for the entire script, so you lose important aspects of the zsh language.

But there is another way – modernish functionality may be integrated with native zsh scripts using 'sticky emulation', as follows:

emulate -R sh -c '. modernish'

This causes modernish functions to run in sh mode while your script will still run in native zsh mode with all its advantages. The following notes apply:

  • Using the safe mode is not recommended, as zsh does not apply split/glob to variable expansions by default, and the modernish safe mode would defeat the ${~var} and ${=var} flags that apply these on a case by case basis. This does mean that:
    • The --split and --glob operators to constructs such as LOOP find are not available. Use zsh expansion flags instead.
    • Quoting literal glob patterns to commands like find remains necessary.
  • Using LOCAL is not recommended. Anonymous functions are the native zsh equivalent.
  • Native zsh loops should be preferred over modernish loops, except where modernish adds functionality not available in zsh (such as LOOP find or user-programmed loops).

See man zshbuiltins under emulate, option -c, for more information.

Appendix F: Bundling modernish with your script

The modernish installer install.sh can bundle one or more scripts with a stripped-down version of the modernish library. This allows the bundled scripts to run with a known version of modernish, whether or not modernish is installed on the user's system. Like modernish itself, bundling is cross-platform and portable (or as portable as your script is).

Bundled scripts are not modified. Instead, for each script, a wrapper script is installed under the same name in the installation root directory. This wrapper automatically looks for a suitable POSIX-compliant shell that passes the modernish battery of fatal bug tests, then sets up the environment to run the real script with modernish on that shell. Your modernish script can be run through the supplied wrapper script from any directory location on any POSIX-compliant operating system, as long as all files remain in the same location relative to each other.

Bundling is always a non-interactive installer operation, with options specified on the command line. The installer usage for bundling is as follows:

install.sh -B -D rootdir [ -d subdir ] [ -s shell ] scriptfile [ scriptfile ... ]

The -B option enables bundling mode. The option does not itself take an option-argument. Instead, any number of scriptfiles to bundle can be given as arguments following all other options. All scripts are bundled with a single copy of modernish. The bundling operation does not deal with any auxiliary files the scripts may require (other than modernish modules); any such need to be added manually after bundling is complete.

The -D option specifies the path to the bundled installation's root directory, where wrapper scripts are installed. This option is mandatory. If the directory doesn't exist, it is created.

The -d option specifies the subdirectory of the -D root directory where the bundled scripts and modernish are installed. It can contain slashes to install the bundle at a deeper directory level. The default subdirectory is bndl. The option-argument can be empty or /, in which case the bundle is installed directly into the installation root directory.

The -s option specifies a preferred shell for the bundled scripts. A shell name or a full path to a shell can be given. Wrapper scripts try the full path first (if any), then try to find a shell with its basename, and then try to find a shell with that basename minus any version number (e.g. bash instead of bash-5.0 or ksh instead of ksh93). If all that doesn't produce a shell that passes the fatal bug tests, it continues with the normal shell search.

This means the script won't fail to launch if the preferred shell can't be found. Instead, it is up to the script itself to refuse to run if required shell-specific conditions are not met. Scripts should use the thisshellhas function to check for any nonstandard capabilities required, or any bugs or quirks that the script is incompatible with (or indeed requires!).

Bundling is supported for both portable-form and simple-form modernish scripts. The installer automatically adapts the wrapper scripts to the form used. For simple-form scripts, the directory containing the bundled modernish core library (by default, .../bndl/bin/modernish) is prefixed to $PATH so that . modernish works. Since simple-form scripts are often more shell-specific, you may want to specify a preferred shell with the -s option.

To save space, the bundled copy of the modernish library is reduced such that all comments are stripped from the code, interactive use is not supported, the regression test suite is not included, thisshellhas does not have the --cache and --show operators, and the cap/*.t capability detection scripts are "statically linked" (directly included) into bin/modernish instead of shipped as separate files. A README.modernish file is added with a short explanation, the licence, and a link for people to get the complete version of modernish. Please do not remove this when distributing bundled scripts.
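
Putting the bundling options together, an invocation might look like the sketch below. The root directory and script name are hypothetical placeholders, and the guard simply avoids an error when the fragment is run outside the modernish source directory.

```sh
# Hypothetical names: adjust both to your own project.
rootdir=$HOME/myapp-bundle
script=./myscript.sh

# Run from the directory that contains install.sh.
if [ -f ./install.sh ] && [ -f "$script" ]; then
	# -B: bundling mode; -D: installation root; -s: preferred shell.
	./install.sh -B -D "$rootdir" -s bash "$script"
else
	echo "install.sh or $script not found; nothing was bundled" >&2
fi
```

Going by the description above, the wrapper would then be installed under the script's name in the root directory, with the script and the reduced library under the default bndl subdirectory.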


EOF