% webguide.tex Guide to LaTeX, STEP and the Web, etc
\documentclass[11pt]{article}
% a11pream.tex generic preamble
\usepackage{url}
%\usepackage{ltx2html}
\setlength{\textheight}{8.0in}
\setlength{\textwidth}{6.0in}
\setlength{\oddsidemargin}{0.25in}
\setlength{\evensidemargin}{0.25in}
\setlength{\marginparwidth}{0.6in}
\setcounter{secnumdepth}{4}
\setcounter{tocdepth}{4}
%% \usepackage{times}
\newif\ifpdf
\ifx\pdfoutput\undefined
  \pdffalse
\else
  \pdftrue
\fi
\ifpdf
  \pdfoutput=1
  \usepackage[pdftex]{graphicx}
\else
  \usepackage{graphicx}
\fi
\newcommand{\file}[1]{\textsf{#1}}
\newcommand{\program}[1]{\texttt{#1}}
\newcommand{\package}[1]{\texttt{#1}}
\newcommand{\mpost}{\textsc{MetaPost}}
\newcommand{\mfont}{\textsc{metafont}}
\newcommand{\Han}{H\`{a}n Th\^{e} Th\`{a}nh}
\newcommand{\tex}{TeX}
\newcommand{\latex}{LaTeX}

\title{A Brief Guide to \latex{} Tools for Web Publishing}
\author{Peter R. Wilson\thanks{With helpful critiques by Eitan Gurari
  (\texttt{gurari@cis.ohio-state.edu}) and David Wilson
  (\texttt{davidw@utopiatype.com.au}).} \\
  \texttt{peter.r.wilson@boeing.com}}
\date{11 March 2000}

\begin{document}
\pagestyle{headings}
\pagenumbering{roman}
\maketitle
\begin{abstract}
This document provides a brief guide to converting \latex{} documents to forms more suitable for dissemination via the Web.
\end{abstract}
\tableofcontents
\listoffigures
\clearpage
\pagenumbering{arabic}

\section{Introduction}

Publishing on the Web has rapidly achieved significant importance. For example, the International Organization for Standardization (ISO) is moving towards electronic forms of International Standard documents that are suitable for publishing on the Web, in particular PDF or HTML files, rather than its traditional request for camera-ready paper copy. Documents written using \latex~\cite{LAMPORT94} tagging can be easily converted to PostScript, PDF and HTML, all from a single electronic source.
This guide briefly notes some of the ways that this can be accomplished. Most of the programs and systems mentioned here are described in more detail in~\cite{GOOSSENS99}.

I have made no attempt to design this document for Web publication. The typographical rules for printing on paper are well founded, having been developed over hundreds of years. Display on computer screens is a very different matter and requires a different set of rules, most of which, as yet, are either in a state of flux or unavailable. For LaTeXers who are interested in this topic I suggest a look at D.~P.~Story's work on AcroTeX (\url{http://www.math.uakron.edu/~dpstory/acrotex.html}).

Further, for the example conversions I have used only the minimal tool options necessary. Many of the tools have extensive capabilities which are well documented in their accompanying user manuals; these should be consulted for further information.

\subsection{URLs}

I have tried to provide URLs for the programs and systems mentioned here. Most \latex-related software is available from the Comprehensive TeX Archive Network (CTAN). There are three sites,
\url{ftp://ctan.tug.org/tex-archive} in the USA,
\url{ftp://ftp.tex.ac.uk/tex-archive} in the UK, and
\url{ftp://ftp.dante.de/tex-archive} in Germany,
as well as several mirror sites. Usefully, the CTAN sites (but not necessarily a mirror site) support on-the-fly zipping of files and entire directories, which makes downloading a group of files less tedious than having to get them one by one. Below, I have used \url{ftp://ctan.tug.org/tex-archive} to stand for any of the three CTAN sites.

\subsection{Disclaimer}

Nothing that is said in this document is meant to imply any endorsement or recommendation, either positive or negative, concerning any systems or programs mentioned herein. Many of the systems or programs are `free' in the sense that they are either public domain or their licenses are roughly equivalent to the GNU General Public License.
Others are commercial, have more restrictive licenses, or may require payment. Where known, programs and systems that are not `free' are noted.

\section{PDF}

The traditional output from a \latex{} (e.g., \file{*.tex}) file is a `device independent' \file{*.dvi} file. The \file{*.dvi} file is then processed further to convert it to a format suitable for printing on a particular printing device. In the vast majority of cases the final printable format has been PostScript, obtained by running the \file{*.dvi} file through a program like \program{dvips} to generate a \file{*.ps} file.

PostScript was developed by Adobe Systems. The Portable Document Format (PDF) has since also been developed by Adobe, and seems to be overtaking PostScript as the format of choice for printing, and especially for display via the Web.

DVI and PDF are somewhat similar in that they both describe where (electronic) ink is to be put on (electronic) paper. PostScript also does this but at the same time it is a complete programming language. This means that it is inherently more difficult, time consuming, and computer intensive to process PostScript than either DVI or PDF. This is probably the reason behind the popularity of PDF on the Web.

There are now several methods of producing a PDF (e.g., \file{*.pdf}) file from \file{*.tex}. These include:
\begin{itemize}
\item Converting from PostScript to PDF; from \file{*.ps} to \file{*.pdf}.
\item Generating PDF from the device independent file; from \file{*.dvi} to \file{*.pdf}.
\item Generating PDF directly from the \latex{} source; from \file{*.tex} to \file{*.pdf}.
\end{itemize}

\subsection{From PostScript to PDF}

There are basically two routes to getting from PostScript to PDF. The first of these is to use Acrobat software from Adobe Systems, which essentially means the commercial \program{Distiller} program.
\program{Distiller} can read in a PostScript file and output a PDF file where the visual results of printing the two files are identical. This, or any other, PDF file can be viewed and/or printed via the charge-free Acrobat \program{Reader} program. Note that when using \program{Reader} the `fit to paper' option may alter the page layout, for example by changing the height of the text block.

The second route is to use a non-Adobe converter program, like \program{Ghostscript}, which runs on nearly all operating systems and is obtainable from \url{http://www.cs.wisc.edu/~ghost}. The \program{Ghostscript} distribution comes with a script called \program{ps2pdf} which performs the conversion. The distribution also provides the popular \program{Ghostview} program, which is a viewer for both PostScript and PDF files.

Another converter program, which does have some licensing conditions that may not be suitable for all users, is \program{PStill}; it is available from \url{http://www.this.net/~frank/pstill.html}.

\subsection{From DVI to PDF}

Mark Wicks' \program{dvipdfm} program (\url{http://odo.kettering.edu/dvipdfm}) converts a \file{*.dvi} file to a \file{*.pdf} file. The program is used in the same manner as \program{dvips} and provides similar capabilities.

PostScript illustrations are handled in one of two ways. Simple PostScript generated by the \mpost{} program~\cite{HOBBY92} is included natively. Any other PostScript file is first converted to PDF by using an external program like \program{Ghostscript} and then inserted into the output file. Illustrations in PDF, PNG and JPEG formats require no external aids.

\program{dvipdfm} is written in C but there are some binaries for Linux systems.

\subsection{From LaTeX to PDF}

The \program{pdfLaTeX} program being developed by \Han{} is a modified version of \tex{} that generates \file{*.pdf} instead of \file{*.dvi} output files.
\program{pdfLaTeX} is distributed with many of the free \latex{} distributions, and is also obtainable from \url{ftp://ftp.cstug.cz/pub/tex/local/cstug/thanh}, although it may be better to try \url{ftp://ctan.tug.org/tex-archive/systems/pdftex}.

Running \program{pdfLaTeX} is very similar to running \latex, but some minor changes are required to the \file{*.tex} file. For example:
\label{code:example}
\begin{verbatim}
% example.tex   example latex file
\documentclass[...]{...}
\newif\ifpdf
\ifx\pdfoutput\undefined
  \pdffalse
\else
  \pdftrue
\fi
\ifpdf
  \pdfoutput=1
%  \usepackage[pdftex]{graphicx} % uncomment if using graphicx
%  \usepackage[pdftex]{hyperref} % uncomment if using hyperref
\else
%  \usepackage{graphicx}         % uncomment if using graphicx
%  \usepackage{hyperref}         % uncomment if using hyperref
\fi
....
\end{verbatim}
Running \\
\texttt{latex example} \\
will produce \file{example.dvi}, while running \\
\texttt{pdflatex example} \\
will produce \file{example.pdf}. It is thus very easy to generate both \file{*.dvi} and \file{*.pdf} from the same \latex{} source file.

\program{pdflatex} will handle graphics files in the following formats: PDF, PNG, JPEG and TIFF, but notice that (Encapsulated) PostScript is missing from this list. However, it can handle directly the simple Encapsulated PostScript output by \mpost~\cite{HOBBY92}. It does, though, expect \mpost{} files to have a \file{.mps} extension. To include PostScript from other sources it is necessary to convert the PostScript to PDF.

\program{pdftex}, and hence \program{pdflatex}, has some extra primitive commands that are not available in \tex{} itself, specifically for accessing aspects of the PDF format, for example to create hypertext links, bookmarks or article threads. Consult the manual for details.
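
As a rough sketch of the graphics handling described above, the following file reuses the \verb|\ifpdf| switch and leaves the extension off the argument of \verb|\includegraphics|, so that the \package{graphicx} driver in use can pick a format it supports. The names \file{figs.tex} and \file{diagram} are invented for illustration, and the sketch assumes that both \file{diagram.eps} (for \program{latex}) and \file{diagram.pdf} (for \program{pdflatex}) exist in the working directory.
\begin{verbatim}
% figs.tex   hypothetical example file
\documentclass{article}
\newif\ifpdf
\ifx\pdfoutput\undefined
  \pdffalse
\else
  \pdftrue
\fi
\ifpdf
  \pdfoutput=1
  \usepackage[pdftex]{graphicx}
\else
  \usepackage{graphicx}
\fi
\begin{document}
\begin{figure}
\centering
% No extension: graphicx searches the extensions known to the
% current driver, so latex should find diagram.eps and
% pdflatex should find diagram.pdf (assumed to exist).
\includegraphics{diagram}
\caption{An illustration included from the same source line}
\end{figure}
\end{document}
\end{verbatim}
Keeping two copies of each illustration is one possible arrangement, not the only one; the PDF copy could be produced from the PostScript original by any of the converters mentioned earlier.
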
Independently of \program{pdflatex}, the \package{hyperref} package (\url{ftp://ctan.tug.org/tex-archive/macros/latex/contrib/supported/hyperref}) extends the functionality of the \latex{} cross-referencing commands to include hypertext links, and also ad hoc hypertext links to, for example, external documents and URLs.

\subsection{Fonts}

The normal fonts used with \latex{} are the Computer Modern family developed by Knuth using \mfont~\cite{KNUTH86b}. All \mfont{} fonts are in the form of bitmaps, which is unfortunate when it comes to PDF. Typically, PDF will only use one size of each font for a document, and will scale this if different font sizes are required. This normally works well as fonts used with PDF are typically `Type~1' fonts (e.g., PostScript fonts) which are designed to be scalable. Bitmap fonts look terrible when scaled or printed at a resolution that they were not designed for. In other words, expect bad results if you generate a PDF file with the original Computer Modern fonts.

Perhaps the easiest method of dealing with this is to use the most common PostScript fonts, namely Times, Courier and Helvetica. All that is necessary is to add \verb|\usepackage{times}| to the document's preamble.

Alternatively, if you need to use the CM fonts, perhaps because a lot of mathematics is involved, many \latex{} distributions include Type~1 versions of the CM fonts. If you don't have them they can be found at \url{ftp://ctan.tug.org/tex-archive/fonts/cm/ps-type1/bluesky} and at \url{ftp://ctan.tug.org/tex-archive/fonts/amsfonts/ps-type1} for the AMS fonts.

Goossens \textit{et al.} provide useful and general information on installing and using different fonts with \latex~\cite{GOOSSENS94}, while for the fontophile, Alan Hoenig~\cite{HOENIG98} delves much more deeply into the installation of PostScript fonts.
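
The simplest of the font choices above can be sketched as a complete file. This is only a minimal illustration, assuming that the \package{times} package is installed; the file name \file{fonts.tex} is invented. Note that \package{times} changes only the text fonts, so any mathematics is still set in Computer Modern, which is one reason the Type~1 CM fonts remain useful.
\begin{verbatim}
% fonts.tex   hypothetical example file
\documentclass{article}
\usepackage{times} % Times (roman), Helvetica (sans), Courier (typewriter)
\begin{document}
Text in Times, \textsf{sans serif in Helvetica} and
\texttt{typewriter in Courier}; math such as $a^2 + b^2 = c^2$
still uses the Computer Modern math fonts.
\end{document}
\end{verbatim}
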
\tex{} doesn't care about the particular shape of any glyph, nor how it is constructed or represented; it only cares about the space occupied by each character (i.e., the \file{*.tfm} files). It is the DVI processor that needs to know in detail about the fonts in a document. So, the DVI processor has to be told to use Type~1 CM PostScript fonts. The following is for the \program{dvips} program.

For convenience, let \path{$TEXMF} stand for the root of the \path{texmf} tree (e.g., \path{/usr/teTeX/texmf}). \program{dvips} looks in the file \path{$TEXMF/dvips/base/psfonts.map} to see if it can use any PostScript fonts. This file starts off something like:
\begin{verbatim}
bchb8r CharterBT-Bold "TeXBase1Encoding ReEncodeFont" <8r.enc