LaTeX Style Guide

LaTeX Style Guide for OSR

This guide has at least three parts. One part is how to use LaTeX in general. Another is how to use LaTeX in OSR. The third part, which is presented first, is how to write.

Style Guide for Technical Documents

The first thing every would-be writer should do is read Strunk and White’s The Elements of Style. It is a short book that contains invaluable lessons. One of the most important is to “omit needless words.” A few more lessons are provided below to supplement or highlight those in the book.

Golden Rule

A technical document is not a mystery story. The reader should know where the report is going at all times. One of the worst comments a reader can have is “I didn’t expect that.”

Organization

Technical reports almost always fit a simple, canonical form. There are generally six sections plus an abstract.

  1. Introduction
  2. Related work
  3. High-level discussion (e.g., model or architecture)
  4. Low-level discussion (e.g., implementation)
  5. Results
  6. Conclusion

The above list is merely a rule of thumb. A particular report may be best served by more than one section for items 3, 4, or 5. It is unlikely that the other sections should be divided in this manner.

The other important decision is whether the related work goes second or next to last. There is a tradeoff. The advantage of placing it second is that the paper can discuss your new work in the context of the work of others. The advantage of placing it next to last is that it is easier to contrast and distinguish your work from that of others. If related work is put off to the end, it is usually necessary to add some background to the introduction to orient the reader. However, when related work is second, more details of your work are needed in the introduction than otherwise.

Some Notes

Lists

There are two warnings with lists:

  • grammar rules apply to lists, and
  • it is easy to overuse lists.

One needs to develop good taste to determine when to use a list. The overuse of bulleted lists can hurt a paper. Remember: it is a paper, not a PowerPoint™ presentation.

To avoid grammatical mistakes in lists, read the text as if there were no list formatting. If the grammar is faulty in this reading, it is faulty with the formatting. Here is an example.

The farm had many different kinds of animals:
  * sheep
  * cow
  * horse
  * lizard

Without the list formatting this is

The farm had many different kinds of animals: sheep cow horse lizard

The better form is below.

The farm had many different kinds of animals:
  * sheep,
  * cow,
  * horse, and
  * lizard.

Observe the “and” on the next to last line and the period on the last. If any list item contains a comma, separate the items with semicolons instead.

Pick from the following dates:
  * February 2, 2005;
  * March 18, 2006; or
  * July 4, 1776.

If the list items contain one or more complete sentences, then punctuate appropriately.

Before you go to bed, do the following.
  - Put your dirty clothes in the hamper.
  - Brush your teeth and comb your hair.
  - Lay out your school clothes for the morning.

There is a period after “following.”

“Note that …”

Strunk and White say to “omit needless words.” There is probably no more common violation of this dictum than the “Note that …” phrase. Consider the following.

Note that the CPU utilization increases as a fremostatic hyperbolic.

It makes just as much sense (albeit none) without the first two words. If the sentence reads fine without them, do not use them.

Citations

  • Generally, put citations at the end of the sentence.
Poor: The Globus project [1] states that this way sucks.
Better: The Globus project states that this way sucks [1].
  • When you name authors while citing work, use “et al.” for works that have more than two authors.
Strunk and White [1] is good, while Freeh et al. [2] is not.
  • Use the last name of the first author and the date in the citation label, e.g., freeh:05. If necessary, add a suffix, such as freeh:05b. It is not necessary to follow this exactly. The key is for the label to be simple, short, and unique.
  • Put multiple, adjacent cites in the same cite command. For example, use \cite{xyz:12,abc:09}—which produces [5, 10]—instead of \cite{xyz:12}, \cite{abc:09}—which produces [5], [10].
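
The citation rules above might look as follows in a source file. This is only a sketch: the sentences are placeholders, and the labels are the example labels used above rather than entries in a real bibliography.

```latex
% Citation at the end of the sentence, with a short author:year label.
Frequency scaling saves energy on some workloads~\cite{freeh:05}.
% Adjacent citations share one \cite command.
Later studies report similar savings~\cite{xyz:12,abc:09}.
```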

Parenthetical phrases

A phrase is parenthetical if it interrupts the thought of a sentence. There are three ways to set off a parenthetical phrase: commas, em dashes, and parentheses. The list is ordered by increasing level of disruption. A minor disruption employs the comma. In LaTeX, the em dash is made with three adjacent hyphens (---). The en dash (two adjacent hyphens, --) is never used to set off parenthetical phrases.
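
As an illustration, here are the three forms described above in LaTeX source, in order of increasing disruption. The sentence itself is a made-up placeholder. The last line shows the usual role of the en dash, which is number ranges.

```latex
% Commas: the mildest interruption.
The scheduler, a user-level daemon, runs once per second.
% Em dash (three hyphens): a stronger interruption.
The scheduler---a user-level daemon---runs once per second.
% Parentheses: the strongest interruption.
The scheduler (a user-level daemon) runs once per second.
% En dash (two hyphens): for ranges, never for parenthetical phrases.
See pages 10--12.
```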

That and which

Use “that” for a restrictive clause.

Take the book that is red.

Assuming there is more than one book, the clause “that is red” is required to identify the book. Therefore, it is a restrictive clause.

Use “which” and a comma for a nonrestrictive clause.

Take the book, which is my favorite.

Assuming the identity of the book is not in doubt, the clause “which is my favorite” is additional information about the book. Therefore, it is a nonrestrictive clause.

Miscellaneous

  • Avoid contractions like the plague.
  • Avoid the apostrophe in possessives. If possible, rewrite the sentence.
Instead of “The object’s method …” write “The method of the object …”
  • Because versus since: Always use since when time is involved. E.g., “Disco has been his bane since the late 80s.” But it is preferable to use because to indicate consequence. E.g., “Because I think, I am.”
  • Adverbial phrases that begin a sentence require a comma.
    • “I.e.” is an adverbial. It means “that is” or “in other words.”
    • “E.g.” is an adverbial. It means “for example.”

General LaTeX Style Guide

The primary thing to understand about LaTeX is that it is very good at typesetting a document. You must let it. The vast majority of your time is spent editing the text—getting the words to come out right. With LaTeX you can use your favorite editor to create the text. Thus, LaTeX separates the two fundamentally different tasks of writing and typesetting. The latter is mundane—let the computer do it.

Floats

Floats, such as figures and tables, should float to the top of the page. There are exceptions, but they are rare. If it is not possible for the first reference to a float to be on the same page, the float should appear after the first reference.

\begin{table}[t]
  \centering
  \begin{tabular}{|c||c|}
    \hline
    & \textbf{Header} \\ 
    \hline\hline
    Row & 1 \\
    \hline
    Row & 2 \\
    \hline
    Row & 3 \\
    \hline
  \end{tabular}
  \caption{Try to keep it to one line.} 
  \label{tab:ex}
\end{table}

The general format for tables in LaTeX is shown above. Starting at the top we see the following.

  • Line 1: The [t] means float the table to the top.
  • Line 2: Generally center the table.
  • Line 3: The double bar (||) sets off the row labels from the data labels.
  • Line 5: Use bold face for headings.
  • Line 6: Use two horizontal lines to set off the column labels.
  • Line n-2: Long captions are good in some cases. However, most of the time the discussion of the table belongs in the body of the paper.
  • Line n-1: The label command has to occur after the caption command in order for the table numbers to work out.
  • Line n-1: The label for tables should begin with “tab:”.

\begin{figure}[t]
  \begin{center}
    \includegraphics[width=.4\textwidth]{kiss.eps}
    \caption{Keep it simple, stupid.}
    \label{fig:kiss}
  \end{center}
\end{figure}

The general format for figures in LaTeX is shown above. The appropriate comments about tables apply to figures. Below are some figure-specific comments.

  • Line 2: An alternative to the centering command shown above.
  • Line 3: This is the OSR-preferred way to include a picture. The width parameter is generally necessary. It is good to use either \textwidth or \columnwidth as the unit of length.
  • Line 5: Use the prefix “fig:” for figure labels.

Fonts and typeface

The preferred font for use in LaTeX is Times. To use it, include the following in the preamble of your main document.

\usepackage{times}
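
For context, a minimal main document with this line in its preamble might look like the sketch below. The article class and the placeholder body are assumptions, not OSR requirements. The graphicx package is loaded because it provides the \includegraphics command used in the figure example above.

```latex
\documentclass{article}  % assumed class; use whatever your venue requires
\usepackage{times}       % Times, as recommended above
\usepackage{graphicx}    % provides \includegraphics

\begin{document}
Body text goes here.
\end{document}
```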

LaTeX supports many typefaces, such as italics and bold. In general, the novice user switches typeface too often.

You must use italics for foreign words and titles. Use them for non-words, such as the Unix system call ioctl. Additionally, use italics for the first occurrence of a new term.

It is acceptable to use either italics or bold for emphasis or special effects. Both should be used sparingly, but especially so with bold.

Miscellaneous

      • Use the “~” (tilde) character with citations and references, as follows:
        Figure~\ref{fig:foo} is lame.

        The tilde puts a space in the document, but prevents a line break from occurring at that point.

      • Proper typesetting places extra space between sentences.
        LaTeX assumes that a period (“.”) ends a sentence. However, not every period does.
        When a period does not end a sentence, you must escape the following space. For example, use “etc.\ ”, which is six characters:

        e, t, c, ., \, [space]
      • Use the LaTeX2e style typeface macros. That is, use \textbf{this macro form} rather than {\bf the old style} macro.
      • Use the \emph{emphasis} macro when emphasizing text instead of explicitly formatting it in italics. This macro switches away from the surrounding font. Thus, if the body of the sentence is in roman, the specified text is italicized. But if the body is in italics, the text is set in roman.
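
The points in this list can be seen together in the short sketch below. The sentences are placeholders. The labels fig:kiss and tab:ex are the ones from the float examples earlier in this guide.

```latex
% Tilde: a space that does not allow a line break.
Figure~\ref{fig:kiss} and Table~\ref{tab:ex} summarize the setup.
% Escaped space after a period that does not end the sentence.
We measured time, energy, etc.\ on each node.
% LaTeX2e macro form instead of {\bf ...}.
Use \textbf{bold} sparingly.
% \emph italicizes in roman text ...
This result is \emph{not} what we expected.
% ... and switches back to roman inside italic text.
\textit{In italic text, \emph{emphasis} is set in roman.}
```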

OSR-Specific LaTeX Style Guide

No line should exceed 80 characters in width. Even though Word™ likes to wrap lines, do not create long lines. Put hard returns before the 80th character.

Sentences

The sentence is the basic object in a text document. The line is the fundamental unit of a text editor. Therefore, you should never have two sentences on the same line. Thus, the first word of every sentence should begin in column one.

The reasons for this are many. Here are two. First, in editing it is often desirable to rearrange sentences. This is much simpler when each line holds at most one sentence. Second, LaTeX comments extend to the end of the line. So it is again easier to “delete” a line (from the formatted document but not the source) using a comment when source documents are constructed this way.
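
A small sketch of this layout, with made-up sentences, is shown here. Each sentence starts in column one, and the second sentence has been removed from the formatted output with a single comment character while remaining in the source.

```latex
The benchmark runs on eight nodes.
% Each node has two processors.
We report the mean of ten runs.
```

Moving a sentence then means moving one line, or a few lines if the sentence had to be broken to stay under 80 characters.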

Never fill a LaTeX paragraph.
When doing the final few edits on a document it is useful to compare the current version with the most recent one. For example, the command

git diff abstract.tex

shows the differences between the local copy of the abstract and that in the repository. If someone deleted a single word and refilled the paragraph it was in, many lines of the old and new paragraph are displayed. It is almost impossible for a human to recognize the minor change in the document. Without filling, only the line that contained the word is displayed, albeit twice. Consequently, a human easily observes the change.

When in doubt use short sentences.
Run-on sentences, which are grammatically incorrect, are very bad. Long sentences, even when grammatically correct, are often hard to follow. Therefore, short sentences tend to be easier for both reader and writer. Consider rewriting long, complicated sentences as two or more short ones.

AUCTeX

As you no doubt know, only cretins evangelize one editor to the exclusion of all others. If you use Emacs, you can use its helper modes for editing various file types. The preferred helper mode for LaTeX documents is AUCTeX. This mode stores meta-info in comments at the end of the file. Because I use Emacs & AUCTeX, I insert these comments in source files. Do not remove them.

%%% Local Variables: 
%%% mode: latex
%%% TeX-master: "taco"
%%% End:

Labels

Above we see the label style for tables and figures. Label all sections, chapters, and parts, whether you expect to refer to them or not. It is generally not necessary to label anything lower than a subsection. Use the prefix “sec:” for all such “sections.” Do not distinguish between sections, subsections, or even paragraphs. Label them all the same way. There are two reasons for this. First, it is easier to do and to remember. Second, sections may be promoted or demoted during editing, so it is better if the label name carries no implicit context.
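
A sketch of this labeling scheme follows. The section titles and label names are invented for illustration. Both levels use the same “sec:” prefix, so promoting the subsection to a section later requires no change to its label or to any reference.

```latex
\section{Results}
\label{sec:results}

\subsection{Energy Consumption}
\label{sec:energy}

Section~\ref{sec:energy} reports the energy measurements.
```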