\documentclass{ltxdoc} \usepackage{fontspec} \setmainfont{TeX Gyre Schola} % \usepackage[utf8]{luainputenc} % \usepackage[T1]{fontenc} % \usepackage{odsfile,tgschola,metalogo,hyperref,xspace,showexpl,booktabs,url} \usepackage{odsfile,metalogo,hyperref,xspace,microtype,showexpl,booktabs,url} \author{Michal Hoftich (\url{michal.h21@gmail.com})} \title{The \textsf{odsfile} package:\\ accessing of the \textsf{opendocument spreadsheet} from \LaTeX{} documents\thanks{Version 0.8, last revisited 2023-09-07.} } \usepackage[english]{babel} \lstloadlanguages{[LaTeX]Tex} \lstset{% basicstyle=\ttfamily, commentstyle=\itshape\ttfamily, showspaces=false, showstringspaces=false, breaklines=true, breakautoindent=true, captionpos=t } \newcommand\ods{\textsf{ods}\xspace} \begin{document} \maketitle \tableofcontents \section{Introduction} This is \LuaLaTeX{} package and \textsf{lua} library for working with \textsf{opendocument spreadsheet} (\ods) documents from Open/Libre Office Calc. Their contents can be read as \LaTeX{} tables, can be pass to macros, you can also add new data to existing spreadsheets. The \ods format consist of number of |xml| files packed in the |zip| file. This package uses \LuaTeX's zip library, LuaXML library\footnote{Pure |lua| library for working with |xml| files, it is available form CTAN or \url{https://github.com/michal-h21/LuaXML}} and lua scripting to read |xml| content from this archive, which means that it is not possible to use this package with pdf\LaTeX{} or \XeLaTeX. On the other side, |odsfile.lua| library can be used from Plain\TeX, Con\TeX{}t or pure |lua| scripts. Creation of this package was motivated by question\footnote{\url{http://tex.stackexchange.com/questions/60378/insert-libreoffice-table-as-input}} on site \url{http://tex.stackexchange.com/}. Development version of the package can be found at \url{https://github.com/michal-h21/odsfile}, all contributions and comments are welcome. \section{Usage} You can load |odsfile| classically with \begin{verbatim} \usepackage{odsfile} \end{verbatim} There are macros: \begin{itemize} \item \cmd{\includespread} \item \cmd{\includespread*} \item \cmd{\OdsNl} \item \cmd{\OdsLastNl} \item \cmd{\tabletemplate} \item \cmd{\loadodsfile} \item \cmd{\savespreadsheet} \item |AddRow| environment \end{itemize} \noindent Main command is\marginpar{\cmd{\includespread}} \cmd{\includespread} and it's starred variant \cmd{\includespread*}. It's syntax is:\\ \cmd{\includespread}\oarg{key-value list} Options are: \begin{description} \item[file] Filename of file to be loaded. You should specify this only on first use of \cmd{\includespread}. \item[sheet] Name of sheet to be loaded. If it's not specified on first use of \cmd{\includespread}, then first sheet from the file is loaded. The sheet remains selected until another use of |sheet|. \begin{LTXexample} \begin{tabular}{l l} \includespread[file=pokus.ods,sheet=List2] \end{tabular} \end{LTXexample} \item[range] Selects range from table to be inserted. Range is specified in format similar to spreadsheet processors, like |a2:c4|, selecting cells starting at first column, second row and ending and third column, fourth row. Other variant of supported ranges are \textit{named ranges}, which can be saved in the |ods file|. \begin{LTXexample} \begin{tabular}{lll} \includespread[sheet=List1,range=a2:c4] \end{tabular} \end{LTXexample} You can omit some or both of the numbers: \begin{LTXexample} \begin{tabular}{lll} \includespread[range=a:c4] \end{tabular} \end{LTXexample} \begin{LTXexample} \begin{tabular}{ll} \includespread[range=a:b] \end{tabular} \end{LTXexample} \begin{LTXexample} \begin{tabular}{ll} \includespread[range=b2:c] \end{tabular} \end{LTXexample} \begin{LTXexample} \begin{tabular}{lll} \includespread[range=newrangetest] \end{tabular} \end{LTXexample} \item[columns] Column heading specification. It can be either |head|, |top|, or comma separated list of values. \begin{description} \item[top] Use as headers first line from the table. \begin{LTXexample} \begin{tabular}{ll} \includespread[range=b3:c5,columns=top] \end{tabular} \end{LTXexample} Note that if you include whole table, first line is included twice: \begin{LTXexample} \begin{tabular}{lll} \includespread[columns=top] \end{tabular} \end{LTXexample} in this case you can use \item[head] use first row from selection as headings. \begin{LTXexample} \begin{tabular}{lll} \includespread[columns=head,range=a:c3] \end{tabular} \end{LTXexample} \item[manually specified list] If column headings are not specified in the file, you can set them manually. \begin{LTXexample} \begin{tabular}{ll} \includespread[columns={title,amount},sheet=List2] \end{tabular} \end{LTXexample} \end{description} \item[columnbreak] Command inserted in manual linebreaks in cells. Default value is |\linebreak| \item[rowseparator] Rows are normally separated with newlines, if you really want, you can separate them with hlines. Possible values: \begin{description} \item[tableline (default)] Inserts |\\| character \item[hline] Inserts |\\ \\hline| \item[newline] Inserts blank line \item[user specified separator] useful in conjunction with \ref{it:rowtemplate} (p. \pageref{it:rowtemplate}), for example if you want to include sheets as plaintext or input for plotting functions. \end{description} \begin{LTXexample} \begin{tabular}{lll} \includespread[columns=top,sheet=List1,rowseparator=hline,range=a2:b5] \end{tabular} \end{LTXexample} \item[template] Templates are simple mechanism to insert whole tabular environment with column specification. All columns are aligned to the left, if you want to do more advanced stuff with column specifications, you must enter them manually as in all previous examples. \begin{LTXexample} \includespread[columns=top,template=booktabs,range=a3] \end{LTXexample} For more info about templates, see next section \ref{sec:tpl} \item[coltypes] When using template, column types are inferred automatically. If that doesn't work well in your case, you can specify them manually with |coltypes| option. \begin{LTXexample} \includespread[columns=head, template=booktabs, coltypes=lrr] \end{LTXexample} \item[rowtemplate] \makeatletter \edef\@currentlabel{rowtemplate} \label{it:rowtemplate} Enables to convert tabular data to something different than \LaTeX\ tables. Syntax for rowtemplates is similar to the table templates, variables are inserted with |-{number}|, where number is the position of the cell from beginning of the selection. \makeatother \begin{LTXexample} \includespread[range=a2:b,rowseparator={,\ },rowtemplate={\textit{-{1}}:-{2}}] \end{LTXexample} \item[celltemplate] This template is applied to every cell in the included table. \begin{LTXexample} \begin{tabular}{lll} \includespread[range=a1:c1,celltemplate={\textit{-{value}}}] \includespread[range=a2:c] \end{tabular} \end{LTXexample} \item[multicoltemplate] supports merged cells. Default template uses left aligned \verb|\multicolumn| command \begin{LTXexample} \begin{tabular}{lll} \includespread[sheet=Sheet3, multicoltemplate={\multicolumn{-{count}}{r}{-{value}}}] \end{tabular} \end{LTXexample} \end{description} \subsection{Character escaping} To prevent compilation errors, \textsf{odsfile} escapes characters that have special meaning in \LaTeX, for example backslashes or dollar characters. In some cases, you may want to execute the \TeX\ code that is saved in the spreadsheet. In that case, use the \texttt{escape} option: \begin{description} \item[escape] Set the value to false to ignore special characters in the cell data. \end{description} The following example prints the left cell in the last row in table in bold, because the original spreadsheet contains the \verb|\textbf{bold}| command. \begin{LTXexample} \begin{tabular}{lll} \includespread[sheet=List1,columns=head,escape=false] \end{tabular} \end{LTXexample} \section{Templates}\label{sec:tpl} If you don't want to specify tabular environment by hand, you can use simple templating mechanism to insert tabular environment by hand. Templates are defined with macro\marginpar{\cmd{\tabletemplate}}\\ \cmd{\tabletemplate}\marg{template name}\marg{template code}\\ there is default template: \begin{verbatim} \tabletemplate{default}{-{colheading}-{rowsep}-{content}} \end{verbatim} Code |-{variable name}| inserts one of the following variables: \begin{description} \item[coltypes] This is code to be inserted in |\begin{tabular}{coltypes}|. %In current version, it inserts |l| for left alignment column, for all columns of inserted table. %It should be possible to use more intelligent method based on types of column content, or \ods styles, maybe in future versions some of them will be used. If you want other alignment of columns now, you have to specify |\begin{tabular}{column types}| manually. The |p| column specifier is used for each column, where cell with manual line break occurs, |l| is used otherwise. \item[colheading] Column headings. \item[rowsep] It inserts row separator defined with |rowsepartor| key of |\includespread|. It is used in default style to delimit column headings and table contents. \item[content] Tabular data. \end{description} \paragraph{More powerful template for the \textsc{booktabs} package} \begin{verbatim} \tabletemplate{booktabs}{% \begin{tabular}{-{coltypes}} \toprule -{colheading} \midrule -{content} \bottomrule \end{tabular} } \end{verbatim} \section{Adding data} There is simple interface for adding new rows to the spreadsheet. |AddRow|\oarg{row number}\marginpar{|AddRow|} environment for adding new row to the current sheet. Optional argument \oarg{row number} specifies where it should be inserted, if blank, it will be inserted at end. Inside |AddRow|, you can use \begin{itemize} \item \cmd{\AddString}\marg{text}\marg{position} \item \cmd{\AddNumber}\marg{number}\marg{position} \end{itemize} Position specifies cell, where data should be added, if you leave it blank, it will be laced next to the previous one. \begin{LTXexample} \includespread[columns=head,template=booktabs] \begin{AddRow} \AddString{last row}{} \end{AddRow} \begin{AddRow}[3] \AddString{third row}{} \AddNumber{22}{2} \end{AddRow} \includespread[columns=head,template=booktabs] \end{LTXexample} \section{Loading and saving of the \ods file} You can explicitly load \ods file with \marginpar{\cmd{\loadodsfile}}\cmd{\loadodsfile}\oarg{key val list}\marg{filename}. This can be useful, if you only want to write some data to the file, otherwise it is better to use \cmd{\includespread}. For saving spreadsheets modified with |AddRow|, you can use \cmd{\savespreadsheet}\marginpar{\cmd{\savespreadsheet}}. This command uses call to external |zip| utility, so you should have installed it and you have to call Lua\LaTeX\ with |lualatex --shell-escape filename|. Lua\LaTeX\ also must have write permissions for accessing the \ods file. This command creates file |content.xml| in the current directory, so if externall call fails, you can run \begin{verbatim} zip -r filename.ods content.xml \end{verbatim} by hand. \section{Lua library} The |lua| library uses |luazip| library integrated to \LuaTeX{} and |LuaXML|\footnote{\url{https://github.com/michal-h21/LuaXML}}, pure |lua| library for working with |XML| files. To use library in your code, you can use: \begin{verbatim} require("odsfile") \end{verbatim} Function |odsfile.load(filename)| returns |odsfile| object, with |loadContent()| method, which returns |lua| table representing |content.xml| file. We can select sheet from the spreadsheet with |odsfile.getTable(xmlobject,sheet_name)|. If we omit |sheet_name|, first sheet from spreadsheet is selected. Data from sheet can be read with |odsfile.tableValues(sheet, x1, y1, x2, y2)|. |x1 - y2| are range to be selected, they can be |nil|, in which case whole rows and cells are selected. For converting of standard range expressions of form |a1:b2| to this representation, function |odsfile.getRange(range)| can be used. \paragraph{Example usage -- file \textsf{odsexample.lua}} \begin{verbatim} require "odsfile" -- Helper function to print structure of the table function printable(tb, level) level = level or 1 local spaces = string.rep(' ', level*2) for k,v in pairs(tb) do if type(v) ~= "table" then print(spaces .. k..'='..v) else print(spaces .. k) level = level + 1 printable(v, level) end end end local ods = odsfile.load("filename.ods") local f, e = ods:loadContent() -- Get First sheet from the table body= odsfile.getTable(f) -- Print structure of the range a4:b5 printable(odsfile.tableValues(body,odsfile.getRange("a4:b5"))) \end{verbatim} Run the example with |texlua odsexample.lua| from the command line, you should get following result: \begin{verbatim} 1 1 value=AA attr office:value-type=string 2 value=3 attr office:value-type=float office:value=3 2 1 value=BB attr office:value-type=string 2 value=4 attr office:value-type=float office:value=4 \end{verbatim} To convert this structure to \LaTeX{} tabular code, you can use following function: \begin{verbatim} function tableToTabular(values) local rowValues = function(row) local t={} for _,column in pairs(row) do table.insert(t,column.value) end return t end content = {} for i,row in pairs(values) do table.insert(content,table.concat(rowValues(row)," & ")) end return table.concat(content,"\\\\\n") end -- Now use it with objects from previous example print(tableToTabular(odsfile.tableValues(body))) \end{verbatim} This example yields \begin{verbatim} Label & Position & Count\\ Hello & 1 & 3\\ World & 2 & 4\\ AA & 3 & 5\\ BB & 4 & 6\\ CC & 5 & 7 \end{verbatim} \section{Changes} \begin{description} \item[v0.8] \begin{itemize} \item added the escape option for enabling or disabling of the character escaping \item fixed handling of merged cells in ranges \item fixed escaping of backslashes \item fixed row insertion \end{itemize} \item[v0.7] \begin{itemize} \item Fixed character escaping. Unescaped ``\%,\#,\$, \_ and \&'' characters will be escaped to prevent compilation errors. Characters that are already escaped in the ODS file will be left untouched.\footnote{Thanks to Wybo Dekker} \item Added support for LuaTeX -recorder command line option \item Added \texttt{celltemplate} option \item Fixed named range handling\footnote{Thanks to Angelforest} \end{itemize} \item[v0.6] \begin{itemize} \item Fixed bug in handling of rows with only one cell\footnote{Thanks to Ulrike Fisher} \item Added support for named ranges \item Fixed text:s handling. Beware that using multiple formatting styles in one cell may result in discarding whole cell, or wrong order. This is caused by underlying LuaXML library and it is likely to remain. \end{itemize} \item[v0.5] \begin{itemize} \item Fixed bug: cell attributes weren't saved \item Added support for merged cells \item Added \texttt{multicoltemplate} option \end{itemize} \item[v0.4] \begin{itemize} \item Fixed bugs in loading sheets without ranges \item Fixed bugs in behaviour of empty cells\footnote{Thanks to \href{https://github.com/TripleWhy}{TrippleWhy}} \item Fixed bug in row counting\footnote{Thanks to \href{https://github.com/yamsu}{yamsu}} \item Added support for children element in column paragraphs \item Added cell value escaping \end{itemize} \item[v0.3] \begin{itemize} \item Added support for multiline cells \item Improved automatic column types generation \item Added new options, |coltypes| and |columnbreak| \end{itemize} \item[v0.2] \begin{itemize} \item LuaXML is now distributed as separate library, so other projects can use it. \item New |AddRow| environment for adding data to the \ods file \item New command |\savespreadsheet| for saving \ods file \item Bug fixes: corrected loading of the sheets, corrected behaviour of blank cell \end{itemize} \item[v0.1] First version \end{description} \end{document}