Wednesday 6 February 2008

Citations

I've been thinking recently about citations.
I had a conversation with Geraldine towards the beginning of this year, where she showed me a document containing citations for lots of papers she'd read. We discussed how to save references like these, and I mentioned that in the past I'd relied on the facilities of my OSX computer to manage this material. During my MA I had to read a great deal of literature. I took a theoretical course, so it was much more about reading than anything else. My strategy back then was to organise my work using a filesystem hierarchy like so:

University
- Module Name
- - Assignment
- - - References

In the references folder I'd keep downloaded versions of everything I'd read, however briefly. If I spent some time reading a piece then I'd create a text document to accompany the downloaded version, which would include the normal citation data (author, date, title, publisher, etc.). Following that I'd write up one or two paragraphs about the piece, trying to summarise the important points and give some kind of qualitative indication of what I thought about it. Having to sit down and summarise someone else's work was a really useful practice that I think would be well worth including in future ATC courses. It forced me to really consider what the author was saying and to identify the core ideas they used. This practice reinforced my understanding of their work, and I felt that it helped me recall who they were and what they did. In addition, when I came to write up my dissertation and had a lot of references to include, it was really easy for me to copy and paste the citation itself, and also to rework my summary into a format that I could include in the body of my work. I still have access to this annotated bibliography and am sure that it'll come in handy for further work that I'll be conducting on video games.

Another thing to remember is to rename files that you download. Typically when I read PDF papers they have numerical filenames (e.g. "2534595143623.pdf") which are probably only significant as the key to the database from which they originate. The format I use for renaming is similar to the format you'd use for writing a reference anyway:

Surname, Firstname. [other authors.] Title

Clearly this is much more useful than just a number. It also means that I can search my hard drive very quickly (using Spotlight) to find any documents by a particular author, say.
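
A minimal sketch of that renaming scheme in Python. The function name `reference_filename` and the set of characters it replaces are assumptions for illustration, not part of any particular tool:

```python
import re


def reference_filename(authors, title, extension=".pdf"):
    """Build a 'Surname, Firstname. [other authors.] Title' filename.

    `authors` is a list of strings already written as 'Surname, Firstname'.
    """
    # Authors are separated by ". ", following the pattern described above.
    name = ". ".join(authors) + ". " + title
    # Replace characters that are not allowed in filenames on Windows
    # (a superset of what OSX rejects), so the name works on both machines.
    name = re.sub(r'[\\/:*?"<>|]', "-", name)
    # Collapse runs of whitespace left over from copy and paste.
    name = re.sub(r"\s+", " ", name).strip()
    return name + extension


print(reference_filename(
    ["Yekhanin, Sergey"],
    "Towards 3-query locally decodable codes of subexponential length",
))
```

The result could then be applied to a downloaded file with `pathlib.Path.rename`, once you've checked by hand which paper the numbered file actually is.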

On reflection, the trouble with this technique is that I got duplicates, because I filed by module name. It's not unusual to use papers from one module in another - in fact, if you're not doing that then it raises the question of why you're keeping references at all...

This didn't become a big deal though.

Geraldine's technique was just to keep a single plain text file with all of the citation info for each document. Simple but effective.

Since then I've been thinking about citations, and have gone off my original method of just typing out the typical data by hand. The problem with the manual approach is that different publications have different standards for references, so you might end up having to edit the format and/or data when you come to insert it into another document. Apparently this becomes a massive task when you have to write up your thesis.

The University of Sussex do offer a course on how to use EndNote, but I couldn't get on as it was full by the time I heard it was available. I took a look at a couple of different formats and decided that BibTeX is the most human-readable version of a computer format, and so is probably the best format for me to store my personal annotated bibliography in. Being human-readable is especially important for me as I have an OSX laptop and a Windows PC, so I don't want to use an application that can only run on one of those platforms.

Compare the following for readability,

BibTeX:

@article{1326555,
author = {Sergey Yekhanin},
title = {Towards 3-query locally decodable codes of subexponential length},
journal = {J. ACM},
volume = {55},
number = {1},
year = {2008},
issn = {0004-5411},
pages = {1--16},
doi = {http://doi.acm.org/10.1145/1326554.1326555},
publisher = {ACM},
address = {New York, NY, USA},
}

EndNote:

%0 Journal Article
%1 1326555
%A Sergey Yekhanin
%T Towards 3-query locally decodable codes of subexponential length
%J J. ACM
%@ 0004-5411
%V 55
%N 1
%P 1-16
%D 2008
%R http://doi.acm.org/10.1145/1326554.1326555
%I ACM

I'm not entirely sure which is better. EndNote is more concise, but at the expense of having to memorise or infer what the % characters indicate. Either way, if I keep my own bibliography in some standard format then at any time I should be able to convert it into another format if I so desire.
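
For simple entries the conversion looks fairly mechanical. Here is a minimal Python sketch, assuming one field per line in the `name = {value},` shape of the sample above. The tag mapping is read off the two samples, and `bibtex_to_endnote` is a hypothetical helper, not something that ships with BibTeX or EndNote. It does not handle nested braces, multi-line fields, or entry types other than `@article`:

```python
import re

# Only the entry type that appears in the sample above is covered.
TYPE_NAMES = {"article": "Journal Article"}

# Field-to-tag mapping read off the two samples, in the sample's order.
# Fields with no counterpart in the EndNote sample (e.g. address) are dropped.
FIELD_TAGS = [
    ("author", "%A"), ("title", "%T"), ("journal", "%J"), ("issn", "%@"),
    ("volume", "%V"), ("number", "%N"), ("pages", "%P"), ("year", "%D"),
    ("doi", "%R"), ("publisher", "%I"),
]

FIELD_LINE = re.compile(r"^[ \t]*(\w+)[ \t]*=[ \t]*\{(.*)\},?[ \t]*$", re.MULTILINE)


def bibtex_to_endnote(entry_text):
    """Convert one flat BibTeX entry into EndNote tagged lines."""
    header = re.search(r"@(\w+)\s*\{\s*([^,\s]+)\s*,", entry_text)
    if header is None:
        raise ValueError("no '@type{key,' header found")
    entry_type, key = header.group(1).lower(), header.group(2)
    if entry_type not in TYPE_NAMES:
        raise ValueError(f"unsupported entry type: {entry_type}")

    fields = {name.lower(): value for name, value in FIELD_LINE.findall(entry_text)}

    lines = [f"%0 {TYPE_NAMES[entry_type]}", f"%1 {key}"]
    for name, tag in FIELD_TAGS:
        if name not in fields:
            continue
        value = fields[name]
        if name == "author":
            # BibTeX joins several authors with " and "; EndNote uses one %A line each.
            lines.extend(f"{tag} {author.strip()}" for author in value.split(" and "))
        elif name == "pages":
            # BibTeX writes page ranges with a double hyphen.
            lines.append(f"{tag} {value.replace('--', '-')}")
        else:
            lines.append(f"{tag} {value}")
    return "\n".join(lines)
```

Fed the BibTeX sample above, this produces the same twelve lines as the EndNote sample. A real bibliography would want a proper BibTeX parser rather than a regular expression.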

One of the nice things that I've also been doing is maintaining my own personal blogs. Including references in them is pretty handy too because I can include a direct hyperlink to the original document. Having it online is also nice because it means I don't need to worry about syncing between the two computers, nor do I have to worry about backing up my data.

In fact I have another blog that I was using to keep some notes about my DPhil project (which I haven't actually started yet), and one of the draft posts is essentially a TODO list composed of (hyperlinked) BibTeX references. This is a list of documents I should read that are relevant to my project. I've been keeping this blog and the list since September last year, so it's getting pretty big now!


You'll notice that in the sample reference above a hyperlink is included as a DOI. This is important (and something I've changed to recently) because it tries to provide a constant URL where the data will always be available. Normal URLs have a tendency to "rot" or go "stale", that is, when the document is moved the URL no longer functions. This happens over time, so it's important to keep your bibliography "fresh" by using DOIs wherever possible, especially if it's going to be some time before you refer to a particular document again.
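
One way to keep links consistent is to store only the DOI itself and build the URL when needed. A small Python sketch, assuming the general doi.org resolver rather than the publisher-specific address in the sample. The pattern below is a loose approximation of what a DOI looks like, and `doi_url` is a made-up name:

```python
import re

# Loose approximation of a DOI: "10.", a registrant number, "/", then a suffix.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def doi_url(reference):
    """Return a doi.org link for the first DOI found in `reference`, or None."""
    match = DOI_PATTERN.search(reference)
    if match is None:
        return None
    return "https://doi.org/" + match.group(0)


print(doi_url("http://doi.acm.org/10.1145/1326554.1326555"))
```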

Another technique that I've been using recently is following hyperlinked references. Often when I read a paper I want to see the other work it references, but manually searching for these can be time-consuming. That's why I was really happy to see that the ACM list citations as well as references for documents in their electronic library, and many of these are hyperlinked. Following these can be a useful way to find similar work that doesn't necessarily turn up in a search, or to avoid trawling through all the search results.
