Open Source LIMS solutions?

Much is made of the extra effort it would take to produce scientific manuscripts with XML formatting in order to facilitate machine reading and therefore increase our ability to catalog and cross-reference the content therein. While I don’t necessarily agree that the effort itself would be as difficult as some might think, I’m of the opinion that the hard part is just getting people to do it in the first place - a paradigm shift in how the papers are written.

As it stands, writing papers is already sort of tough. You have to convert a large amount of often somewhat disjointed and relatively uncatalogued data (mostly in the lab notebooks and brains of the people who did the work) into a concise and understandable document, complete with figures and references.

One way to smooth both the production of XML-rich manuscripts and the actual process of writing the papers would be to place the data, more or less as it is collected, into an information management system, part of which would (mostly behind the scenes) index the data with XML tags. Many products have been developed to handle research information, and these are usually referred to as Electronic Laboratory Notebooks (ELN) or Laboratory Information Management Systems (LIMS).

Unfortunately, most of the LIMS packages I have come across are closed-source applications marketed to industry or medical labs at a high cost. They tend to be rather inflexible, and often require the users to learn a customized method of interacting with the software which doesn’t correlate to any other user experience. I would much rather see an open-source, cost-effective (free would be nice) system marketed to all labs (although my main interest is in academic adoption). The software should have an intuitive user interface, and the more it resembles tools that the researchers may already be familiar with, the better.

For instance, one component that comes to mind is a Twitter-like box for entering the day’s experiments. These could be brief statements (“Performed minipreps on overnight cultures from 4/1/08. Stored samples at -20 C”) that would be aggregated into a timeline by the software package.
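As a rough illustration of the timeline idea, here is a minimal Python sketch. It assumes each short entry carries an author, a timestamp and free text; the `Entry` class and the `build_timeline` function are hypothetical names made up for this example, not part of any existing LIMS.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class Entry:
    """One short notebook entry (hypothetical structure for illustration)."""
    author: str
    timestamp: datetime
    text: str


def build_timeline(entries):
    """Group entries by calendar day: oldest day first, entries in time order."""
    by_day = defaultdict(list)
    for entry in sorted(entries, key=lambda e: e.timestamp):
        by_day[entry.timestamp.date()].append(entry)
    # Return a plain dict whose keys are ordered by date.
    return dict(sorted(by_day.items(), key=lambda item: item[0]))
```

A real package would presumably keep the entries in a database instead of in memory, but the grouping step would look much the same.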

Most importantly, the software needs to make it simple to cross-reference and link to data. Presumably this means a database with hyperlinks to common folders in which the data would be uploaded and stored. WordPress already allows the user to upload files, and stores them in a folder which is identified by the date. This should not be a difficult system to implement in a more academic setting. In addition, the software could create automatic backups of the data and store them in a location of the user’s choosing, to avoid any data loss.
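To make the date-based storage and backup idea a little more concrete, here is a small Python sketch. The year/month/day folder layout, the `lims-backup-` archive name and the function names `store_upload` and `backup` are assumptions chosen for illustration; they are not taken from WordPress or from any existing LIMS.

```python
import shutil
from datetime import date
from pathlib import Path


def store_upload(source: Path, data_root: Path, day: date) -> Path:
    """Copy a data file into data_root/YYYY/MM/DD/ and return the stored path."""
    # Assumed layout: one folder per day, nested by year and month.
    target_dir = data_root / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    shutil.copy2(source, target)
    return target


def backup(data_root: Path, backup_dir: Path, day: date) -> Path:
    """Write a zip archive of everything under data_root into backup_dir."""
    # backup_dir should live outside data_root so the archive does not include itself.
    backup_dir.mkdir(parents=True, exist_ok=True)
    archive_base = backup_dir / f"lims-backup-{day:%Y%m%d}"
    return Path(shutil.make_archive(str(archive_base), "zip", root_dir=data_root))
```

The path returned by `store_upload` is what a notebook entry would link to, and `backup` could be run on a schedule with `backup_dir` pointing at whatever location the user chooses.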

Now, assuming that the software has been designed to maximize indexing the data via tags (preferably XML markup) and cross-referencing text entries with data sources stored in the shared database, this should really make writing the paper simpler.  All of the data would be rapidly available via search, and (if the cross-referencing is extensive enough) the writer could simply “walk through” the results.  If the XML markup for the software is retained in the manuscript (or perhaps altered to match some agreed-upon standard), then there is no extra effort involved on the part of the writers to make their paper machine-readable.
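Here is a minimal sketch of what the behind-the-scenes XML tagging might look like, using Python’s standard `xml.etree.ElementTree` module. The element names (`notebook`, `entry`, `text`, `tag`, `data`) are invented for this example and do not follow any agreed-upon standard; `entry_to_xml` and `find_by_tag` are likewise hypothetical helpers.

```python
import xml.etree.ElementTree as ET


def entry_to_xml(entry_id, date_iso, text, tags, data_files):
    """Wrap one notebook entry in XML, with tags and links to stored data files."""
    entry = ET.Element("entry", attrib={"id": entry_id, "date": date_iso})
    ET.SubElement(entry, "text").text = text
    for tag in tags:
        ET.SubElement(entry, "tag").text = tag
    for path in data_files:
        # Each data element cross-references a file in the shared store.
        ET.SubElement(entry, "data", attrib={"href": path})
    return entry


def find_by_tag(notebook, tag):
    """Return every entry element in the notebook that carries the given tag."""
    return [
        entry
        for entry in notebook.findall("entry")
        if any(t.text == tag for t in entry.findall("tag"))
    ]
```

Searching by tag and following the `href` links is roughly the “walk through” described above; if the markup were later mapped onto a shared standard, the same entries could travel with the manuscript.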

So, does such a package exist today?  If not, how far away does one seem to be?

The inspiration for this post came from a blurb in my weekly C&E News mentioning MyLIS, a web-based open-source package which charges based on storage space. Unfortunately, a trip to their website gives the impression that this is far from a professional grade product. The corporate website appears to be a boilerplate template, and the “tour” of MyLIS is a single page with a single graphic and a few lines of text. It’s hard to tell from the graphic, but it appears that the software is based on a Wiki-like system, which is fine, although once again the overall fit & polish just looks off. I think you’d have a hard time showing this to a principal investigator and asking them to trust their data to it. This is especially concerning as it seems that MyLIS stores all the data on a remote server, rather than on a local machine in the research lab/department. I think that for reasons of assured access and data security, you need local storage.

Another system, geared towards structural biology labs, is HalX. Once again, their website (the provider of the all-important first impression) is lacking. There is very little information accessible, and it’s really hard to say anything about the system.

Continuing in the theme of underdeveloped products is OpenLIMS, which no longer appears to be under active development.

I’ll stop the survey there, because frankly it’s a bit depressing.  What are we to take from this?  To me, it seems that there is some market for this type of software, since there are several aborted attempts by various individuals to create it.  It also seems that people who want this type of thing enough to work on it suffer from apathy, or perhaps they realize that the task is too daunting.

Here is what I would do (if I knew more coding):  I would leverage one of the fine open-source content management systems that are already available and widely used (Drupal is the first that comes to mind), and write modules which would enable the functionalities needed for a LIMS/ELN.  Eventually this may get to the point where you would have to fork off of the main CMS branch, but it would probably be better for everyone if this didn’t happen (a fork might mean that the LIMS developers would now have to also work on core functionality of the engine).

It’s a difficult and complicated project to be sure, but one that I think is very worthwhile.  I do think that this would be sort of a grass-roots (i.e. closer to the bench) way of making it easier to transition to open access distribution of the research data.  Once again I find myself wishing I had double majored in computer science as well as chemistry.

What do you think? Do you believe that a community-developed LIMS package would be something you’d like to have in your lab? Does it seem like too much effort for not enough reward? Even if it were built, how would we convince people to use it?

5 Responses to “Open Source LIMS solutions?”

  1. Jessica Says:

    I came across this post in searching for information on Open Source LIMS myself, though not in looking for ones that use XML to aid in publishing. It does sound like an interesting idea, though. Thanks for posting the short survey of sites, but I am curious as to whether there were more on your list? I’m not finding too many myself, though I did find a couple you don’t have here (you may have found them, though), some by Bika Labs (http://www.bikalabs.com/) though not a biological lab one like I was looking for, and Enfold Systems Sample Inventory Program (http://www.enfoldsystems.com/Products/Open/SIP/). They’re developed more in the content management systems style, from what I’ve read. I have no experience with either one, nor with their underlying architectures. There’s also the caBig project (https://cabig.nci.nih.gov/), with its sample tracking caTissue. Again, I have no experience with this one, but Indiana University has been involved in its development and plans to deploy it in some departments. This one is closer to community-developed, but still not quite there.

    I’ve been developing an in-house, open-source sample inventory system myself for the past year and a half: PHP/Apache/SQL Server (pretty easily adaptable to at least MySQL, if not other databases as well; the SQL Server/Windows choice wasn’t mine. I prefer MySQL/Linux for development). It’s not public domain at the moment, though there’s the potential for it to be so in the future.

    As for your job search, check out http://www.compbio.iupui.edu/mooney . I don’t know what your specific interests in research are, but feel free to send an email to Dr. Mooney if it sounds like it might be interesting. Programming experience always helps in this lab.

    And a side comment: hindsight seems to be 20/20, as they say, in regards to what you wish you’d majored in. I wish I had the bio to go with the CompSci.

  2. PA Says:

    Thanks for the comment.
    Perhaps I wasn’t clear in the original post - I came across a lot of websites for Open-Source LIMS packages; however, most seemed to be abandoned and unfinished. Those that you have linked seem to be more up to date.

    It’s interesting that you mention you’re developing your own inventory system. At one point I had started toying around with my own version of a LIMS (starting with inventory - it just seems the most straightforward) built on Ruby. It was really more of a reason for me to check out the language, and I gave up on doing anything seriously with it. It seems like there are several people out there who do this sort of thing - start a project, perhaps get it to the level of local deployment, but nothing breaks through to be more widely used.

  3. Jacka Says:

    Hi!
    I think an open source LIMS package would be very helpful!
    I work on my own LIMS system too, based on PHP-AJAX/Apache/MySQL.
    At the beginning of 2007 our company decided to buy a LIMS system.
    Now it is 2008 and there is no system available. But why?
    Our labs need an ELN, a LIMS and a project management tool, but we did not find any software with all of these features. Of course…
    The first supplier showed us a wonderful application with integration of instruments, a powerful report editor and the ability to cluster samples (like projects). But structure search? No, sorry..
    The next supplier had the same problem the other way round..
    So we decided to develop our own system. I found an open source structure search package (http://merian.pch.univie.ac.at/~nhaider/cheminf/cmmm.html) and a free Java-based structure editor (http://www.chemaxon.com/product/msketch.html) for the ELN module, so the first step was done. At the moment I am working on the sample management. The idea is to make it possible to define any number of methods with XML-based flexible reports. Then we need a stability testing module. All of these “modules” can be combined to build your own system. (That’s the idea..)

    The reason why it is very interesting: there is no software that can do everything. With an open source community, everyone can build extensions.
    I wish you the best for your project!

    Greetings from Germany!
    Jacka

  4. Plausible Accuracy » Blog Archive » More on lab management Says:

    [...] who are already spread thin due to competing demands on their time.  This is why I’ve made calls for open source LIMS packages as well as taken some initial steps towards building one [...]

  5. John Says:

    Have you looked at joomlalims (http://groups.google.com/group/joomlalims)?
    This was originally mamboLIMS and seems to be back in development (I played around with a very early version some years ago and it seemed promising, but I’m not a programmer).

    It was also apparently the inspiration for http://www.yourlabdata.com, a service that is used by a few labs (somewhere in the 50s) and 274 registered users to date.
    Both are based on Mambo/Joomla CMS.

    Perhaps you might like to take a look at the code and either modify it for your own use or even join the development team…

    Anyway, keep up the good work!
