[vset VERSION 1.3.1] [manpage_begin docstrip_util n [vset VERSION]] [see_also docstrip] [see_also doctools] [see_also doctools_fmt] [keywords .ddt] [keywords .dtx] [keywords catalogue] [keywords diff] [keywords docstrip] [keywords doctools] [keywords documentation] [keywords LaTeX] [keywords {literate programming}] [keywords module] [keywords {package indexing}] [keywords patch] [keywords source] [keywords {Tcl module}] [copyright "2003\u20132010 Lars Hellstr\u00F6m\ "] [moddesc {Literate programming tool}] [titledesc {Docstrip-related utilities}] [category {Documentation tools}] [require Tcl 8.4] [require docstrip [opt 1.2]] [require docstrip::util [opt [vset VERSION]]] [vset emdash \u2014] [description] The [package docstrip::util] package is meant for collecting various utility procedures that are mainly useful at installation or development time. It is separate from the base package to avoid overhead when the latter is used to [cmd source] code. [para] [section {Package indexing commands}] Like raw [file .tcl] files, code lines in docstrip source files can be searched for package declarations and corresponding indices constructed. A complication is however that one cannot tell from the code blocks themselves which will fit together to make a working package; normally that information would be found in an accompanying [file .ins] file, but parsing one of those is not an easy task. Therefore [package docstrip::util] introduces an alternative encoding of such information, in the form of a declarative Tcl script: the [term catalogue] (of the contents in a source file). [para] The special commands which are available inside a catalogue are: [list_begin definitions] [call [cmd pkgProvide] [arg name] [arg version] [arg terminals]] Declares that the code for a package with name [arg name] and version [arg version] is made up from those modules in the source file which are selected by the [arg terminals] list of guard expression terminals. This code should preferably not contain a [cmd {package}] [method {provide}] command for the package, as one will be provided by the package loading mechanisms. [call [cmd pkgIndex] [opt "[arg terminal] ..."]] Declares that the code for a package is made up from those modules in the source file which are selected by the listed guard expression [arg terminal]s. The name and version of this package is determined from [cmd {package}] [method {provide}] command(s) found in that code (hence there must be such a command in there). [call [cmd fileoptions] [opt "[arg option] [arg value] ..."]] Declares the [cmd fconfigure] options that should be in force when reading the source; this can usually be ignored for pure ASCII files, but if the file needs to be interpreted according to some other [option -encoding] then this is how to specify it. The command should normally appear first in the catalogue, as it takes effect only for commands following it. [list_end] Other Tcl commands are supported too [vset emdash] a catalogue is parsed by being evaluated in a safe interpreter [vset emdash] but they are rarely needed. To allow for future extensions, unknown commands in the catalogue are silently ignored. [para] To simplify distribution of catalogues together with their source files, the catalogue is stored [emph {in the source file itself}] as a module selected by the terminal '[const docstrip.tcl::catalogue]'. This supports both the style of collecting all catalogue lines in one place and the style of putting each catalogue line in close proximity of the code that it declares. [para] Putting catalogue entries next to the code they declare may look as follows [example { % First there's the catalogue entry % \begin{tcl} %pkgProvide foo::bar 1.0 {foobar load} % \end{tcl} % second a metacomment used to include a copyright message % \begin{macrocode} %<*foobar> %% This file is placed in the public domain. % \end{macrocode} % third the package implementation % \begin{tcl} namespace eval foo::bar { # ... some clever piece of Tcl code elided ... % \end{tcl} % which at some point may have variant code to make use of a % |load|able extension % \begin{tcl} %<*load> load [file rootname [info script]][info sharedlibextension] % %<*!load> # ... even more clever scripted counterpart of the extension # also elided ... % } % % \end{tcl} % and that's it! }] The corresponding set-up with [cmd pkgIndex] would be [example { % First there's the catalogue entry % \begin{tcl} %pkgIndex foobar load % \end{tcl} % second a metacomment used to include a copyright message % \begin{tcl} %<*foobar> %% This file is placed in the public domain. % \end{tcl} % third the package implementation % \begin{tcl} package provide foo::bar 1.0 namespace eval foo::bar { # ... some clever piece of Tcl code elided ... % \end{tcl} % which at some point may have variant code to make use of a % |load|able extension % \begin{tcl} %<*load> load [file rootname [info script]][info sharedlibextension] % %<*!load> # ... even more clever scripted counterpart of the extension # also elided ... % } % % \end{tcl} % and that's it! }] [list_begin definitions] [call [cmd docstrip::util::index_from_catalogue] [arg dir]\ [arg pattern] [opt "[arg option] [arg value] ..."]] This command is a sibling of the standard [cmd pkg_mkIndex] command, in that it adds package entries to [file pkgIndex.tcl] files. The difference is that it indexes [syscmd docstrip]-style source files rather than raw [file .tcl] or loadable library files. Only packages listed in the catalogue of a file are considered. [para] The [arg dir] argument is the directory in which to look for files (and whose [file pkgIndex.tcl] file should be amended). The [arg pattern] argument is a [cmd glob] pattern of files to look into; a typical value would be [const *.dtx] or [const *.{dtx,ddt}]. Remaining arguments are option-value pairs, where the supported options are: [list_begin options] [opt_def -recursein [arg dirpattern]] If this option is given, then the [cmd index_from_catalogue] operation will be repeated in each subdirectory whose name matches the [arg dirpattern]. [option -recursein] [const *] will cause the entire subtree rooted at [arg dir] to be indexed. [opt_def -sourceconf [arg dictionary]] Specify [cmd fileoptions] to use when reading the catalogues of files (and also for reading the packages if the catalogue does not contain a [cmd fileoptions] command). Defaults to being empty. Primarily useful if your system encoding is very different from that of the source file (e.g., one is a two-byte encoding and the other is a one-byte encoding). [const ascii] and [const utf-8] are not very different in that sense. [opt_def -options [arg terminals]] The [arg terminals] is a list of terminals in addition to [const docstrip.tcl::catalogue] that should be held as true when extracting the catalogue. Defaults to being empty. This makes it possible to make use of "variant sections" in the catalogue itself, e.g. gaurd some entries with an extra "experimental" and thus prevent them from appearing in the index unless that is generated with "experimental" among the [option -options]. [opt_def -report [arg boolean]] If the [arg boolean] is true then the return value will be a textual, probably multiline, report on what was done. Defaults to false, in which case there is no particular return value. [opt_def -reportcmd [arg commandPrefix]] Every item in the report is handed as an extra argument to the command prefix. Since [cmd index_from_catalogue] would typically be used at a rather high level in installation scripts and the like, the [arg commandPrefix] defaults to "[cmd puts] [const stdout]". Use [cmd list] to effectively disable this feature. The return values from the prefix are ignored. [list_end] The [cmd {package ifneeded}] scripts that are generated contain one [cmd {package require docstrip}] command and one [cmd docstrip::sourcefrom] command. If the catalogue entry was of the [cmd pkgProvide] kind then the [cmd {package ifneeded}] script also contains the [cmd {package provide}] command. [para] Note that [cmd index_from_catalogue] never removes anything from an existing [file pkgIndex.tcl] file. Hence you may need to delete it (or have [cmd pkg_mkIndex] recreate it from scratch) before running [cmd index_from_catalogue] to update some piece of information, such as a package version number. [para] [call [cmd docstrip::util::modules_from_catalogue] [arg target]\ [arg source] [opt "[arg option] [arg value] ..."]] This command is an alternative to [cmd index_from_catalogue] which creates Tcl Module ([file .tm]) files rather than [file pkgIndex.tcl] entries. Since this action is more similar to what [syscmd docstrip] classically does, it has features for putting pre- and postambles on the generated files. [para] The [arg source] argument is the name of the source file to generate [file .tm] files from. The [arg target] argument is the directory which should count as a module path, i.e., this is what the relative paths derived from package names are joined to. The supported options are: [list_begin options] [opt_def -preamble [arg message]] A message to put in the preamble (initial block of comments) of generated files. Defaults to a space. May be several lines, which are then separated by newlines. Traditionally used for copyright notices or the like, but metacomment lines provide an alternative to that. [opt_def -postamble [arg message]] Like [option -preamble], but the message is put at the end of the file instead of the beginning. Defaults to being empty. [opt_def -sourceconf [arg dictionary]] Specify [cmd fileoptions] to use when reading the catalogue of the [arg source] (and also for reading the packages if the catalogue does not contain a [cmd fileoptions] command). Defaults to being empty. Primarily useful if your system encoding is very different from that of the source file (e.g., one is a two-byte encoding and the other is a one-byte encoding). [const ascii] and [const utf-8] are not very different in that sense. [opt_def -options [arg terminals]] The [arg terminals] is a list of terminals in addition to [const docstrip.tcl::catalogue] that should be held as true when extracting the catalogue. Defaults to being empty. This makes it possible to make use of "variant sections" in the catalogue itself, e.g. gaurd some entries with an extra "experimental" guard and thus prevent them from contributing packages unless those are generated with "experimental" among the [option -options]. [opt_def -formatpreamble [arg commandPrefix]] Command prefix used to actually format the preamble. Takes four additional arguments [arg message], [arg targetFilename], [arg sourceFilename], and [arg terminalList] and returns a fully formatted preamble. Defaults to using [cmd classical_preamble] with a [arg metaprefix] of '##'. [opt_def -formatpostamble [arg commandPrefix]] Command prefix used to actually format the postamble. Takes four additional arguments [arg message], [arg targetFilename], [arg sourceFilename], and [arg terminalList] and returns a fully formatted postamble. Defaults to using [cmd classical_postamble] with a [arg metaprefix] of '##'. [opt_def -report [arg boolean]] If the [arg boolean] is true (which is the default) then the return value will be a textual, probably multiline, report on what was done. If it is false then there is no particular return value. [opt_def -reportcmd [arg commandPrefix]] Every item in the report is handed as an extra argument to this command prefix. Defaults to [cmd list], which effectively disables this feature. The return values from the prefix are ignored. Use for example "[cmd puts] [const stdout]" to get report items written immediately to the terminal. [list_end] An existing file of the same name as one to be created will be overwritten. [call [cmd docstrip::util::classical_preamble] [arg metaprefix]\ [arg message] [arg target] [opt "[arg source] [arg terminals] ..."]] This command returns a preamble in the classical [syscmd docstrip] style [example { ## ## This is `TARGET', ## generated by the docstrip::util package. ## ## The original source files were: ## ## SOURCE (with options: `foo,bar') ## ## Some message line 1 ## line2 ## line3 }] if called as [example_begin] docstrip::util::classical_preamble {##}\ "\nSome message line 1\nline2\nline3" TARGET SOURCE {foo bar} [example_end] The command supports preambles for files generated from multiple sources, even though [cmd modules_from_catalogue] at present does not need that. [call [cmd docstrip::util::classical_postamble] [arg metaprefix]\ [arg message] [arg target] [opt "[arg source] [arg terminals] ..."]] This command returns a postamble in the classical [syscmd docstrip] style [example { ## Some message line 1 ## line2 ## line3 ## ## End of file `TARGET'. }] if called as [example_begin] docstrip::util::classical_postamble {##}\ "Some message line 1\nline2\nline3" TARGET SOURCE {foo bar} [example_end] In other words, the [arg source] and [arg terminals] arguments are ignored, but supported for symmetry with [cmd classical_preamble]. [call [cmd docstrip::util::packages_provided] [arg text]\ [opt [arg setup-script]]] This command returns a list where every even index element is the name of a package [cmd provide]d by [arg text] when that is evaluated as a Tcl script, and the following odd index element is the corresponding version. It is used to do package indexing of extracted pieces of code, in the manner of [cmd pkg_mkIndex]. [para] One difference to [cmd pkg_mkIndex] is that the [arg text] gets evaluated in a safe interpreter. [cmd {package require}] commands are silently ignored, as are unknown commands (which includes [cmd source] and [cmd load]). Other errors cause processing of the [arg text] to stop, in which case only those package declarations that had been encountered before the error will be included in the return value. [para] The [arg setup-script] argument can be used to customise the evaluation environment, if the code in [arg text] has some very special needs. The [arg setup-script] is evaluated in the local context of the [cmd packages_provided] procedure just before the [arg text] is processed. At that time, the name of the slave command for the safe interpreter that will do this processing is kept in the local variable [var c]. To for example copy the contents of the [var ::env] array to the safe interpreter, one might use a [arg setup-script] of [example { $c eval [list array set env [array get ::env]]}] [list_end] [section {Source processing commands}] Unlike the previous group of commands, which would use [cmd docstrip::extract] to extract some code lines and then process those further, the following commands operate on text consisting of all types of lines. [list_begin definitions] [call [cmd docstrip::util::ddt2man] [arg text]] The [cmd ddt2man] command reformats [arg text] from the general [syscmd docstrip] format to [package doctools] [file .man] format (Tcl Markup Language for Manpages). The different line types are treated as follows: [list_begin definitions] [def {comment and metacomment lines}] The '%' and '%%' prefixes are removed, the rest of the text is kept as it is. [def {empty lines}] These are kept as they are. (Effectively this means that they will count as comment lines after a comment line and as code lines after a code line.) [def {code lines}] [cmd example_begin] and [cmd example_end] commands are placed at the beginning and end of every block of consecutive code lines. Brackets in a code line are converted to [cmd lb] and [cmd rb] commands. [def {verbatim guards}] These are processed as usual, so they do not show up in the result but every line in a verbatim block is treated as a code line. [def {other guards}] These are treated as code lines, except that the actual guard is [cmd emph]asised. [list_end] At the time of writing, no project has employed [package doctools] markup in master source files, so experience of what works well is not available. A source file could however look as follows [example { % [manpage_begin gcd n 1.0] % [keywords divisor] % [keywords math] % [moddesc {Greatest Common Divisor}] % [require gcd [opt 1.0]] % [description] % % [list_begin definitions] % [call [cmd gcd] [arg a] [arg b]] % The [cmd gcd] procedure takes two arguments [arg a] and [arg b] which % must be integers and returns their greatest common divisor. proc gcd {a b} { % The first step is to take the absolute values of the arguments. % This relieves us of having to worry about how signs will be treated % by the remainder operation. set a [expr {abs($a)}] set b [expr {abs($b)}] % The next line does all of Euclid's algorithm! We can make do % without a temporary variable, since $a is substituted before the % [lb]set a $b[rb] and thus continues to hold a reference to the % "old" value of [var a]. while {$b>0} { set b [expr { $a % [set a $b] }] } % In Tcl 8.3 we might want to use [cmd set] instead of [cmd return] % to get the slight advantage of byte-compilation. % set a % return $a } % [list_end] % % [manpage_end] }] If the above text is fed through [cmd docstrip::util::ddt2man] then the result will be a syntactically correct [package doctools] manpage, even though its purpose is a bit different. [para] It is suggested that master source code files with [package doctools] markup are given the suffix [file .ddt], hence the "ddt" in [cmd ddt2man]. [call [cmd docstrip::util::guards] [arg subcmd] [arg text]] The [cmd guards] command returns information (mostly of a statistical nature) about the ordinary docstrip guards that occur in the [arg text]. The [arg subcmd] selects what is returned. [list_begin definitions] [def [method counts]] List the guard expression terminals with counts. The format of the return value is a dictionary which maps the terminal name to the number of occurencies of it in the file. [def [method exprcount]] List the guard expressions with counts. The format of the return value is a dictionary which maps the expression to the number of occurencies of it in the file. [def [method exprerr]] List the syntactically incorrect guard expressions (e.g. parentheses do not match, or a terminal is missing). The return value is a list, with the elements in no particular order. [def [method expressions]] List the guard expressions. The return value is a list, with the elements in no particular order. [def [method exprmods]] List the guard expressions with modifiers. The format of the return value is a dictionary where each index is a guard expression and each entry is a string with one character for every guard line that has this expression. The characters in the entry specify what modifier was used in that line: +, -, *, /, or (for guard without modifier:) space. This is the most primitive form of the information gathered by [cmd guards]. [def [method names]] List the guard expression terminals. The return value is a list, with the elements in no particular order. [def [method rotten]] List the malformed guard lines (this does not include lines where only the expression is malformed, though). The format of the return value is a dictionary which maps line numbers to their contents. [list_end] [call [cmd docstrip::util::patch] [arg source-var] [arg terminals]\ [arg fromtext] [arg diff] [opt "[arg option] [arg value] ..."]] This command tries to apply a [syscmd diff] file (for example a contributed patch) that was computed for a generated file to the [syscmd docstrip] source. This can be useful if someone has edited a generated file, thus mistaking it for being the source. This command makes no presumptions which are specific for the case that the generated file is a Tcl script. [para] [cmd patch] requires that the source file to patch is kept as a list of lines in a variable, and the name of that variable in the calling context is what goes into the [arg source-var] argument. The [arg terminals] is the list of terminals used to extract the file that has been patched. The [arg diff] is the actual diff to apply (in a format as explained below) and the [arg fromtext] is the contents of the file which served as "from" when the diff was computed. Options can be used to further control the process. [para] The process works by "lifting" the hunks in the [arg diff] from generated to source file, and then applying them to the elements of the [arg source-var]. In order to do this lifting, it is necessary to determine how lines in the [arg fromtext] correspond to elements of the [arg source-var], and that is where the [arg terminals] come in; the source is first [cmd extract]ed under the given [arg terminals], and the result of that is then matched against the [arg fromtext]. This produces a map which translates line numbers stated in the [arg diff] to element numbers in [arg source-var], which is what is needed to lift the hunks. [para] The reason that both the [arg terminals] and the [arg fromtext] must be given is twofold. First, it is very difficult to keep track of how many lines of preamble are supplied some other way than by copying lines from source files. Second, a generated file might contain material from several source files. Both make it impossible to predict what line number an extracted file would have in the generated file, so instead the algorithm for computing the line number map looks for a block of lines in the [arg fromtext] which matches what can be extracted from the source. This matching is affected by the following options: [list_begin options] [opt_def -matching [arg mode]] How equal must two lines be in order to match? The supported [arg mode]s are: [list_begin definitions] [def [const exact]] Lines must be equal as strings. This is the default. [def [const anyspace]] All sequences of whitespace characters are converted to single spaces before comparing. [def [const nonspace]] Only non-whitespace characters are considered when comparing. [def [const none]] Any two lines are considered to be equal. [list_end] [opt_def -metaprefix [arg string]] The [option -metaprefix] value to use when extracting. Defaults to "%%", but for Tcl code it is more likely that "#" or "##" had been used for the generated file. [opt_def -trimlines [arg boolean]] The [option -trimlines] value to use when extracting. Defaults to true. [list_end] The return value is in the form of a unified diff, containing only those hunks which were not applied or were only partially applied; a comment in the header of each hunk specifies which case is at hand. It is normally necessary to manually review both the return value from [cmd patch] and the patched text itself, as this command cannot adjust comment lines to match new content. [para] An example use would look like [example_begin] set sourceL [lb]split [lb]docstrip::util::thefile from.dtx[rb] \n[rb] set terminals {foo bar baz} set fromtext [lb]docstrip::util::thefile from.tcl[rb] set difftext [lb]exec diff --unified from.tcl to.tcl[rb] set leftover [lb]docstrip::util::patch sourceL $terminals $fromtext\ [lb]docstrip::util::import_unidiff $difftext[rb] -metaprefix {#}[rb] set F [lb]open to.dtx w[rb]; puts $F [lb]join $sourceL \n[rb]; close $F return $leftover [example_end] Here, [file from.dtx] was used as source for [file from.tcl], which someone modified into [file to.tcl]. We're trying to construct a [file to.dtx] which can be used as source for [file to.tcl]. [call [cmd docstrip::util::thefile] [arg filename] [ opt "[arg option] [arg value] ..." ]] The [cmd thefile] command opens the file [arg filename], reads it to end, closes it, and returns the contents (dropping a final newline if there is one). The option-value pairs are passed on to [cmd fconfigure] to configure the open file channel before anything is read from it. [call [cmd docstrip::util::import_unidiff] [arg diff-text]\ [opt [arg warning-var]]] This command parses a unified ([syscmd diff] flags [option -U] and [option --unified]) format diff into the list-of-hunks format expected by [cmd docstrip::util::patch]. The [arg diff-text] argument is the text to parse and the [arg warning-var] is, if specified, the name in the calling context of a variable to which any warnings about parsing problems will be [cmd append]ed. [para] The return value is a list of [term hunks]. Each hunk is a list of five elements "[arg start1] [arg end1] [arg start2] [arg end2] [arg lines]". [arg start1] and [arg end1] are line numbers in the "from" file of the first and last respectively lines of the hunk. [arg start2] and [arg end2] are the corresponding line numbers in the "to" file. Line numbers start at 1. The [arg lines] is a list with two elements for each line in the hunk; the first specifies the type of a line and the second is the actual line contents. The type is [const -] for lines only in the "from" file, [const +] for lines that are only in the "to" file, and [const 0] for lines that are in both. [list_end] [manpage_end]