Author Archive

Typeset Your Curriculum Vitae – Part 3: Automatically Generate a List of Publications

 | December 2, 2009 11:19 am

Publications are the currency of ideas.  Through them the experts, thinkers and dreamers of this world can share their thoughts and insights.  A good publication is not only influential, but it’s even capable of shifting the course of a whole society, as Martin Luther King demonstrated with his “Letter from a Birmingham Jail”.

Since publications are so important to the dissemination of knowledge, there is a rather high expectation that an academic author should publish prolifically.  The mantra “Publish or Perish” is not just a clever quip, but a very serious way of life.

It is ironic, then, that the most prolific of academic writers can suffer from a surprising problem: it can be very difficult to keep track of all of their work.  Yet an up-to-date CV is very important.  After all, publishing your work in influential journals is an important first step toward earning tenure!

Members of a research team or those who collaborate outside of their institution experience this same problem, only more so.  Such a person may work on many projects at once, but only have direct responsibility for one or two of them.  This places the researcher in the unenviable position of trying to track the work of others.  This situation becomes even more complicated if the collaborator refuses to play by the rules of common decency.

It would be nice, for example, if the primary author of a publication would notify the co-authors of its progress, or when it has been submitted.  But … that doesn’t always happen.  Academic researchers are busy people and soliciting feedback from all of your collaborators can be difficult … and there is a tendency for difficult things to go undone.  Thus, if you don’t follow what your teammates are working on, it is quite possible that an abstract was submitted while your back was turned.

To stay on top of the “delightful chaos”, you need to have some kind of system.  Personally, I keep my list of projects and publications in three places. The first (and perhaps most important) is the hand-written list in my experimental notebook. Any time I hear about a new project, it gets added to this list. I keep track of what I’ve contributed, what papers or abstracts have been created from the data, and what their status is. When I know that an abstract or paper has been accepted, I then create an entry for the item in my bibliography manager. Once in the bibliography manager, I can cite the reference in other documents such as proposals or related papers.

About once a year, I go through the tedious process of updating my CV. This typically involves manually sorting through both my project list and my reference database to account for new items and reconcile differences. Every time I do this, it's painful; and because I’ve historically formatted the reference list by hand, it's not uncommon for a typo to sneak its way in or for an author to accidentally get left off of a citation. These mistakes are never intentional, but they do happen.

When I find such an error in the reference database, I fix it. But since I often import these references from websites, the errors tend to be few and far between. Moreover, my reference database is something that I use every day; as a result, it gets a lot of scrutiny. My CV, on the other hand, gets updated much less frequently and errors tend to persist longer.

For a very long time, I've wanted to automate the process. Instead of keeping three separate lists – active projects, reference database, and CV – I’d prefer to keep only one (or two). But I've never found a really satisfactory way of doing so.  Or at least I hadn’t found a system until quite recently.

In my last review of different ways to typeset a CV, I came across an interesting article by Dario Taraborelli.  In it, he described how to create a CV based on the standard “article” document class.  It was well designed, elegant, simple and attractive.  From his work, I created the xetexCV document class.  Additional research turned up an add-on module that makes it convenient to generate a list of publications automatically.  So, after a great while, I have finally found a simple way to generate a publications list without formatting it by hand.  In this article, I will demonstrate how that is done.


Typeset Your Curriculum Vitae – Part 2: Extending and Customizing an Existing Document Class

 | November 30, 2009 2:54 pm

Many first-time users of LaTeX mistakenly look at the language as a type of glorified word processing software – albeit a particularly complicated one.  While such an analogy may be apt in helping new users become acclimatized to the language, it suffers from a rather nasty problem: LaTeX isn’t a word processor.

If anything, LaTeX shares more in common with a programming language than with any type of application.  In fact, the document processing system is really nothing more than a bunch of re-usable pieces of programming called macros.  Everything is a macro.  That includes the commands that every user is familiar with (\title{}, \section{}, \subsection{}) in addition to the internal formatting commands that allow LaTeX to function.  (Most of the macros were originally created or packaged by Leslie Lamport as a way of making TeX – the typesetting system created by Donald Knuth – easier to work with.)

This has some rather practical consequences: because everything in LaTeX is a macro, it is far more extensible than a word processor could ever hope to be.  If you require a feature that doesn’t yet exist, it typically isn’t all that difficult to add it.  And when your extension is packaged inside a style or class, you can use those customizations in anything that you want to write.

But though creating macros isn’t particularly complicated, it is a different beast than just using the stock macros for writing.  This is not surprising: the craft of design is inherently different from the craft of writing.  There are different conventions to follow and different topics to obsess about.  In the first article of this series, I introduced the xetexCV document class, which is one example of where I decided to don the designer hat.

But before you get too far down the road of customizing and extending, there are some important things that you need to know.  These include the general conventions used when working with document classes, their internal anatomy, an understanding of how macros are created, and how to handle formatting and layout challenges.  In this article, I will look at these issues in more detail, particularly as they pertain to xetexCV.  In the process of reviewing these topics, I will also explain some of my design choices.


Typeset Your Curriculum Vitae – Part 1: The xetexCV Document Class

 | November 25, 2009 12:02 am

Very few documents are more personal than a curriculum vitae (CV).  A CV lists a person’s educational history, who they’ve worked for and what they’ve accomplished.  Moreover, a CV is frequently used to judge a person’s inherent worth and value (or at least exploitability).  A quality curriculum vitae matters a lot.

For that reason, a CV not only needs to include all the pertinent information of a person’s life, but it also needs to look good. An attractive CV with good spacing and contrast leaves a positive impression and makes it easier to find information.  When it is laid out correctly, a reviewer might just find themselves scouring past accomplishments for interesting tidbits: “I didn’t realize that this applicant organized a lecture series with Patch Adams and other notables.  That’s interesting!”


Customizing LyX: Character Styles and the LyX Local Layout

 | November 14, 2009 5:00 pm

Imagine for a minute that you’re writing a book or technical manual.  Let’s say it’s a book on technology, maybe the open source tools used for scientific writing (to randomly pick an example).  As you write this book, you realize that you need some way to cue the reader into different parts of the text.

For instance, you might want all definitions to appear in bolded text so that a reader can pick out key terms quickly.  Or you might want code examples to appear in a different font than the regular text, again, so they’re easy to find.  What’s the best way to do this?

Sure, you could just bold the definitions, or manually change the font for the code examples.  But that’s painful!  Changing typeface and size every time that you have a section of code will eventually result in a lot of lost time.  Moreover, you might make a mistake, which destroys your consistency and makes your writing look unprofessional.  There must be a better way!

Thankfully, there is.  It’s through the consistent use of styles.


Patronage in the Digital Age

 | November 10, 2009 2:20 pm

As wonderful as the internet may be, it causes a lot of problems.  For starters, it is putting newspapers out of business.  It’s also radically changing how artists, writers and musicians make their living.  And in case you weren’t paying attention, it’s starting to look like a crisis.

Different groups have responded to the impending collapse of publishing in different ways.  Some writers sell sponsorships for their books and then offer an acknowledgement when it is printed.  Many musicians have adopted the self-publishing and distribution tools long available to authors, leading to experiments like Amazon’s CreateSpace.  And there are those who have gone the route of directly asking for contributions and donations to support their work; the digital equivalent of a performer passing the hat, you might say.

The problem is that some of these experiments are running headlong into good old American sensibility and propriety.  There are even people saying that some of the new content generation schemes are inappropriate, including that old bastion of American common sense, Miss Manners.  Miss Manners has even gone so far as to say that for a novelist to ask for a contribution is the same as begging, or panhandling.

She says it like it’s a bad thing.  The simple truth is that artists, musicians and storytellers have long been beggars.  The content industry of the 20th century is a tremendously new invention, and as I noted above, it’s running into another pair of time-tested American values: frugality and a love of private property.

In fact, there seems to be this attitude that, “After I’ve purchased the novel or CD, I own the work and ideas.  I’ve invested in its creation.”   This little nugget rears its head most commonly when discussing music.  Even the great Steve Jobs has been known to say, “People don’t want to rent music, they want to own it.”

Except … that’s bullshit.  Paying for an interesting idea, or a well written book, or a beautiful piece of music isn’t like paying for a hamburger.  You aren’t reimbursing someone for providing you a good or service.  And I’m frankly shocked that anyone would think that Beethoven’s 9th Symphony or Antonio Vivaldi’s “Four Seasons” is only worth the price paid on iTunes.  The true worth is far greater than the price of admission.  Would you seriously think yourself exploited for buying a second recording, or for paying to hear it at a concert?

Of course, that’s when people can be bothered to pay for content at all.  An exacerbating factor is that many people expect ideas to be free or very inexpensive.  How many times have you heard a variant of this argument, “I would buy more music (or books) if it wasn’t so expensive!  Nine dollars for an album is just out of my budget!”  Ironically, these same people don’t blanch at dropping hundreds or thousands of dollars for an iPod or iPhone.

While bad, this attitude can further devolve into something much more poisonous: “The artist owes me for reading, viewing or listening to their work.  My piracy is helpful!  After all, I am promoting them and making them famous!”  But being famous doesn’t pay the bills.  There have been many authors, artists or musicians who lived in squalor while enjoying enormous fame and prestige.

A music or literature pirate might even justify their position by saying, “I’m sticking it to the music industry (or publishing industry), they’re a bunch of greedy pigs!”  And the pirate might have a point, if he weren’t doing far more damage to the creator of the content than to its distributor.  Big businesses like record labels and big publishing houses don’t respond to that attitude by lowering prices or dealing fairly with their customers.  Rather, they become more draconian in how that content is disseminated.  Ever wonder why Digital Rights Management (DRM) and related technologies were born?  It might just have something to do with the American sense of entitlement.

Clearly, something needs to change.  Artists and musicians can continue to experiment with different pricing and distribution schemes, but I remain rather unconvinced that they will have a lasting effect.  What we really need is a return to the patronage system of old, with a few major modifications.  Certainly, artists should continue to sell recordings, books and other tangible goods.  But the public should also undergo a shift in its attitudes and ideas about what the arts are and how we support them.  That might mean that we transform our understanding of what a “donation” is.

When buying a book or donating to a writer, it’s foolish to think that you are somehow providing a fair compensation for the ideas and entertainment that you receive.  Instead, it is much healthier to view your contribution as a support so that the artist can continue to create future content.  This notion actually fits in pretty well with the concept of Fair Trade.

We also need to understand that the price we pay for a book or CD isn’t about the value of the materials.  Textbooks aren’t expensive because they are printed on beautiful paper with artwork and in color; they’re expensive because researching and writing their content is hard.  For example, the “Contributors and Reviewers” page for Gray’s Anatomy (the anatomical guide, not the television show) lists sixty different authors and content reviewers, though only the editor-in-chief is credited on the cover.

Except, how do you actually bring about the needed shift in attitudes and culture?

That’s an excellent question, and I’m not sure that I can offer any insight.  The Europeans have tried to shape public perception through generous subsidies.  But direct governmental support of news agencies and publishers is controversial for good reason.  As a cure, it might even be worse than the illness.  If you’ve got any ideas, let’s hear ‘em!

Statistics With R – Part 1: An Old Dog Learns New Computing Tricks

 | November 8, 2009 10:21 pm

When doing math or numerical analysis, the knowledge of the technique is far too often tied to the tool performing the calculation.  Consider an engineer whose understanding of the Fast Fourier transform is inseparably tied to the fft function in Matlab.  Of course this hypothetical engineer understands what the results mean (more or less) but may not be able to duplicate his analysis if Matlab were taken away.

In most cases, it is likely that no deeper understanding will be required.  But what happens if the computer makes a mistake?  Or the program becomes unavailable?  Both situations are entirely possible.  Computer algorithms aren’t perfect and occasionally arrive at results that make little sense; and hardware has been known to fail.

When the engineer understands how the computer arrived at the answer, however, he can recognize, understand, and ultimately correct those cases where the results are unexpected.  This is an important reality check that can prevent costly disasters later down the line.  Or, if the hardware is unavailable, he can use an alternative tool or software package to duplicate the analysis.
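To make the point concrete, here is a minimal sketch of what "duplicating the analysis" without a packaged fft function might look like.  It computes the discrete Fourier transform directly from its definition in plain Python.  The function name naive_dft and the eight-sample test signal are my own illustrative assumptions, and the direct sum is far slower than a real FFT; it is meant as a reality check, not a replacement.

```python
import cmath
import math


def naive_dft(samples):
    """Discrete Fourier transform computed straight from its definition.

    Hypothetical helper for illustration: O(n^2), unnormalized, using the
    common exp(-2*pi*i*k*t/n) sign convention.
    """
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(samples))
        for k in range(n)
    ]


# Assumed test signal: a cosine that completes exactly one cycle in 8 samples.
signal = [math.cos(2 * math.pi * t / 8) for t in range(8)]
spectrum = naive_dft(signal)

# Round away floating-point noise so the structure of the result is visible.
magnitudes = [round(abs(c), 6) for c in spectrum]
print(magnitudes)
```

All of the energy lands in bin 1 and its mirror image, bin 7, which is what a single-cycle cosine should produce.  If a packaged routine ever disagreed with a check like this on a small input, that would be the cue to start asking questions.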

But while such a situation can arise with any type of numerical software, it’s most likely to happen to users of a statistical package.  I find this extremely ironic since a proper understanding of statistics is essential to live in the modern world.  (Much more so than an understanding of the Fast Fourier transform, at any rate.)  The rules of probability, the normal curve, correlation, and multivariate statistics can have a direct impact on how we live our lives.  They are used in making important decisions in finance, medicine, science and government.  A misunderstanding of stats and the methods of science (from which statistics is inseparable) underlies the most divisive issues of our day: abortion, stem cell research, and global warming.

Moreover, neither side has a monopoly on ignorance or misunderstanding.  People fail to distinguish between correlation and causality, or insist on using the word “average” as a slur.  Nearly as bad are those who – like the hypothetical engineer described above – only understand statistics within the narrow context of their stats package.  Casual statisticians are nearly as dangerous as the wholly uninformed.

The Statistical Package for the Social Sciences (SPSS) is one of the biggest perpetrators of this crisis.  Which is hugely ironic, because I happen to love SPSS.  SPSS is probably the first statistical package that has placed advanced statistical methods within the grasp of the novice user.  I’ve been a happy user for nearly a decade (ever since I was introduced to the program in high school).  But there is no doubt that I’ve come to understand statistics within the context of SPSS and its GUI.

Please don’t misunderstand me; I have a pretty good grasp of basic statistics.  I can sling probability with the best of them and take relish in describing when to use the Fisher exact test instead of a Chi-Square; but advanced statistics are a completely different matter.  Advanced stats scare me.  I can certainly use these more complicated methods.  I’ve analyzed and written about multivariate models and even ventured into Analysis of Variance (ANOVA).  But I have to rely on SPSS and the aid of my institution’s biostatistician to help me recognize when there is a problem.

Which is why, in a time of tight budgets, losing the institution’s SPSS license has been a crushing blow to my productivity.  (Whoever made that decision should be hauled out and shot!)  Because I don’t have my statistics software any more, there are certain aspects of my job that are much more difficult to do.  And unfortunately, there is only one logical conclusion to draw: I’ve become a victim of the statistical ease of SPSS.


Customizing LyX: Create an NIH Grant Proposal Template

 | November 2, 2009 6:21 pm

LyX is a wonderful writing program.  It’s easy to use and produces beautifully typeset output.  More importantly, though, it lets an author focus on the content and structure of his writing, rather than the formatting.  It isn’t so easy to customize, though, which limits its usefulness in a big way.  What if you need to create a new layout or take advantage of one of the thousands of specialized LaTeX styles?  How, exactly, do you go about doing that?

That’s why this article was written.  Recently, I was asked to help with a National Institutes of Health (NIH) R21 grant proposal.  After some talk amongst the different investigators, it was decided that we would use LaTeX and LyX to draft it.  Unfortunately, we hit a rather substantial hurdle early in the process:  LyX doesn’t have an NIH grant template.

After additional debate, we decided to proceed with LyX anyway.  But in the process, I found myself saddled with an additional job.  In addition to responsibilities as research flunky and copy editor, I was tasked with creating a LyX and LaTeX template for our NIH grant.  This article will summarize the steps I took and describe how to create a custom template using an available style on CTAN.


Note: All of the files in this tutorial can be downloaded here (.zip).


Create a Unified Inbox in Gnome Evolution

 | October 27, 2009 3:43 am

I have a serious love-hate relationship with Linux.  I love the fact that it’s free and open source.  I love the fact that it can breathe new life into old hardware.  I love the fact that it’s easy to extend.  I love the fact that it has a vibrant and passionate user community.

What I do not love is that many open source programs are incomplete.  They can do most everything that you need, but never get around to adding the one or two features that prevent them from being finished, polished and exceptional.  I’ve ranted about this before, back when I was trying to find the perfect backup program.

Well … I’m at it again; except this time, I’m looking for the perfect email program.


Time Drive 0.3: Better, Easier, More Refined

 | October 26, 2009 12:16 pm

One of the upsides of open source software is that it largely sells itself.  Imagine how awesome it would be if this announcement read: “Time Drive has been completely rewritten from scratch (yet again) to take better advantage of the paradigms of modern computing!  Version 0.3 has hundreds of updates and new features which will make your life easier and more fulfilled!”

There's just one little problem … such a hyper-inflated announcement wouldn't necessarily be true.  (Marketing hyperbole, I never knew thee!)  The truth is this: Time Drive is a simple backup program that does a good job of reliably backing up your data.  It offers a nice list of potential backup options: to an attached hard drive, to a computer over the network, or across the internet.  It makes it easy to search for and restore a lost file.  In short, Time Drive seeks to change the world by making an act of computer maintenance more convenient.

But the real test of a program isn’t how well it works, but how easy it is to fix when broken.  A good program does what you want, but a better program helps you get back on track when things go wrong.  Back when I was looking at other backup programs available for Linux, this was my number one frustration.  Most of the applications would work (for the most part), but I could never troubleshoot or repair problems when they happened.  There just wasn’t enough information available.

For an example, let’s take SBackup.  It’s a lovely little program, except you have no way of knowing if it is working.  It doesn’t keep log files.  It doesn’t notify you if a backup job failed.  It doesn’t let you know if it is running.  Its simplicity is actually symptomatic of a flaw: it’s incomplete.

These were problems that I desperately wanted to avoid with Time Drive.  And version 0.3 includes a number of refinements that solve these issues while at the same time making it better, easier and more refined.  In the rest of this post, I’ll explain why.


On the Surface Versus Working Deep

 | October 22, 2009 4:09 pm

Amongst horse people, one of the fastest ways to raise hackles or hostilities is to call someone a “surface worker.”  It’s just one of those things that you don’t do in polite company.  After all, one of the reasons people are drawn to horses is to enjoy a real and deep connection.  To call them a “surface worker” is to accuse them of putting on a circus act.  Certainly, the relationship may look real and genuine; but it's not.  It's nothing but an act and a fraud.

Given how the word is used and understood, I find it extremely ironic that so few people understand that “surface work” and its attendant ideas of conditioned response, sensitization, desensitization and instinct are actually very important to horse training.  If you want to have any type of real relationship or meaningful communication, you need to do a lot of very tedious surface work to get there.
