\iffalse meta-comment File l3sys-query-tool.tex Copyright (C) 2024 The LaTeX Project Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ----------------------------------------------------------------------- The development version of the bundle can be found at https://github.com/latex3/l3sys-query for those people who are interested. ----------------------------------------------------------------------- \fi \documentclass{l3doc} \begin{document} \title{The \pkg{l3sys-query} script: System queries for LaTeX using Lua} \author{% The \LaTeX{} Project\thanks{% Email: \href{mailto:latex-team@latex-project.org} {latex-team@latex-project.org}% }% } \date{Release 2024-04-08} \maketitle \tableofcontents \begin{documentation} \section{Introduction} \TeX{} engines provide only very limited access to information about the system they are used on: using primitives, one can for example get the size of a single file, but not a list of files in a given location. For most documents, this is not an issue as they are self-contained. However, for cases where \enquote{dynamic} construction of parts of a document is needed based on file lists or other system-dependent data, methods to obtain this from (restricted) shell escape are desirable. Security considerations mean that directly querying the system shell is problematic for general use. Instead, \emph{restricted} shell escape may be used to get many details, provided a suitable tool is available to provide the information in a platform-neutral and security-conscious way. The Java program \texttt{texosquery}, written by Nicola Talbot, has been available for a number of years to provide this facility. As well as file system insight, \texttt{texosquery} also provides for example locale data and other system information. However, the requirement for Java means that the script is not automatically usable when a \TeX{} system is installed. The \LaTeX{} team have therefore provided a Lua-based script, \texttt{l3sys-query}, which conforms to the security requirements of \TeX{} Live using Lua to obtain the system information. This means that it can be used \enquote{out of the box} across platforms. The facilities provided by \texttt{l3sys-query} are more limited than \texttt{texosquery}, partly as some information is available in modern \TeX{} systems using primitives, and partly as the aim of \texttt{l3sys-query} is to provide information where there are defined use cases. Requests for additional data interfaces are welcome. \section{The command line interface} The command line interface to \begin{center} \ttfamily l3sys-query \meta{cmd} [\meta{option(s)}] [\meta{args}] \end{center} where \texttt{\meta{cmd}} can be one of the following: \begin{itemize}[noitemsep]\ttfamily \item ls \item ls \meta{args} \item pwd \end{itemize} The \meta{cmd} are described below. The result of the \meta{cmd} will be printed to the terminal in an interactive run; in normal usage, this will be piped to the calling \TeX{} process. Results containing path separators \emph{always} use~|/|, irrespective of the platform in use. As well as these targets, the script recognizes the options \begin{itemize} \item |--exclude| Specification for directory entries to exclude \item |--ignore-case| Ignores case when sorting directory listings \item |--pattern| (|-p|) Treat the \meta{args} as Lua patterns rather than converting from wildcards \item |--recursive| (|-r|) Enables recursive searching during directory listings \item |--reverse| Causes sorting to go from highest to lowest rather than lowest to highest \item |--sort| Sets the method used to sort entries returned by |ls| \item |--type| Selects the type of entry returned by |ls| \end{itemize} The action of these options on the appropriate \meta{cmd(s)} is detailed below. \subsection{\texttt{ls [\meta{args}]}} Lists the contents of one or more directories, in a manner somewhat reminiscent of the Unix command |ls| or the Windows command |dir|. The exact nature of the output will depend on the \meta{args}, if given, along with the prevailing options. Note that the options names are inspired by ideas from the Unix commands |ls| and |find| as well as the Windows command |dir|: they therefore do not map directly to those of any one of the command line tools that they somewhat mirror. When no \meta{args} are given, all entries in the current directory will be listed, one per line in the output. This will include both files and subdirectories. Each entry will include a path relative to the current directory: for files \emph{in} the current directory, this will be |./|. The order of results will be determined by the underlying operating system process: unless requested \emph{via} an option, no sorting takes place. As standard, the \meta{args} are treated as a file/path name potentially including |?| and |*| as wildcards, for example |*.png| or |file?.txt|. \begin{verbatim} l3sys-query ls '*.png' \end{verbatim} Some care is needed in preventing expansion of such wildcards by the shell or \texttt{texlua} process: these are detailed in Section~\ref{sec:wildcard}. In this section, |'| is used to indicate a character being used to suppress expansion: this is for example normal on macOS and Linux. Removal of entries from the listing can be achieved using the |--exclude| option, which should be given with a \meta{xarg}, for example \begin{verbatim} l3sys-query ls --exclude '*.bak' 'graphics/*' \end{verbatim} Directory entries starting |.| are traditionally hidden on Linux and macOS systems: these \enquote{dot} entries are excluded from the output of \texttt{l3sys-query}. The entries |.| and |..| for the current and parent directory are also excluded from the results returned by \texttt{l3sys-query} as they can always be assumed. For more complex matching, the \meta{args} can be treated as a Lua pattern using the |--pattern| (|-p|) option; this also applies to the \meta{xarg} argument to the |--exclude| option. For example, the equivalent to wildcard |*.png| could be obtained using \begin{verbatim} l3sys-query ls --pattern '^.*%.png$' \end{verbatim} The results returned by |ls| can be sorted using the |--sort| option. This can be set to |none| (use the order from the file system: the default), |name| (sort by file name) or |date| (sort by date last modified). The sorting order can be reversed using |--reverse|. Sorting normally takes account of case: this can be suppressed with the |--ignore-case| option. The listing can be filtered based on the type of entry using the |--type| option. This takes a string argument, one of |d| (directory) or |f| (file). As standard, only the path specified as part of the \meta{args} is queried. However, if the |--recursive| (|-r|) option is set, the query is applied within all subdirectories. Subdirectories starting with~|.| (macOS and Linux hidden) are excluded from recursion. For security reasons, only paths within the current working directory can be queried, this for example |graphics/*.png| will list all |png| files in the |graphics| subdirectory, but |../graphics/*.png| will yield no output. \subsection{\texttt{pwd}} Returns the absolute path to the current working directory from which \texttt{l3sys-query} is run. From within a \TeX{} run, this will (usually) be the directory containing the main file, assuming a command such as \begin{verbatim} pdflatex main.tex \end{verbatim} The \texttt{pwd} command is unaffected by any options. \section{Spaces in arguments} Since \texttt{l3sys-query} is intended primarily for use with restricted shell escape calls from \TeX{} processes, handling of spaces is unusual. It is not possible to quote spaces in such a call, so for example whilst \begin{verbatim} l3sys-query ls "foo *" \end{verbatim} does work from the command prompt to find all files with names starting \verb*|foo |, it would not work \emph{via} restricted shell escape. To circumvent this, \texttt{l3sys-query} will collect all command line arguments after any \meta{options}, and combine these as a space-separated \meta{args}, for example allowing \begin{verbatim} l3sys-query ls foo '*' \end{verbatim} to achieve the same result as the first example. The result is that the \meta{args} will only every be interpreted by \texttt{l3sys-query} as a single argument. It also means that spaces cannot be used at the start or end of the argument, nor can multiple spaces appear between non-space arguments. \section{Wildcard expansion handling\label{sec:wildcard}} The handling of wildcards needs some further comment for those using \texttt{l3sys-query} from the command line: the \pkg{expl3} interface described in Section~\ref{sec:expl3} handles this aspect automatically for the user. On macOS and Linux, the shell normally expands globs, which include the wildcards |*| and |?|, before passing arguments to the appropriate command. This can be suppressed by surrounding the argument with |'| characters, hence the formulation \begin{verbatim} l3sys-query ls '*.png' \end{verbatim} earlier. On Windows, the shell does no expansion, and thus arguments are passed as-is to the relevant command. As such, |'| has no special meaning here. However, to allow quoting of wildcards from the shell in a platform-neutral manner, \texttt{l3sys-query} will strip exactly one set of |'| characters around each argument before further processing. It is not possible to use |"| quotes at all in the argument passed to \texttt{l3sys-query} from \TeX{}, as the \TeX{} system removes all |"| in |\input| while handling space quoting. Restricted shell escape prevents shell expansion of wildcards entirely. On non-Windows systems, it does this by ensuring that each argument is |'| quoted to ensure further expansion. Thus a \TeX{} call such as \begin{verbatim} \input|"l3sys-query ls '*.png'" \end{verbatim} will work if |--shell-escape| is used as the argument is passed directly to the shell, but in restricted shell escape will give an error such as: \begin{verbatim} ! I can't find file `"|l3sys-query ls '*.png'"'. \end{verbatim} The \LaTeX{} interfaces described below adust the quoting used depending on the |shell-escape| status. \section{The \LaTeX{} interface\label{sec:expl3}} Using \texttt{l3sys-query} is not tied to access \emph{via} \pkg{expl3}, but this is the preferred approach for the \LaTeX{} Team. Details of how to use \texttt{l3sys-query} as an \pkg{expl3} programmer are covered in \texttt{interface3.pdf}. A document level interface is also be provided via a \pkg{l3sys-query} package which is based on the \pkg{expl3} interface; it is described in \texttt{l3sys-query.pdf}. \end{documentation} \PrintIndex \end{document}