Archive for May, 2008

Finding jobs in biotech, via DBiaDW

Friday, May 30th, 2008

Dr. Porter, of Discovering Biology in a Digital World, has been running a series of posts on the job market for scientists.  While I think that in some places her conclusions are dubious, the ensuing discussion in the comments has been interesting.  I’ve posted most of my thoughts there, so rather than go into detail here I’ll just provide links to each of her posts.

Scientific Figures and Photoshop - Two Great Tastes That (Don’t?) Go Great Together

Friday, May 30th, 2008

Via AiE&S, I found an article in the esteemed Chronicle of Higher Education discussing the prevalence of digital manipulation of figures in scientific articles. This is an issue of interest to me, because I’m something of an amateur digital manipulator myself.  For years I’ve taken part in online Photoshop contests, and even won small monetary prizes here and there for my efforts.  One of the reasons I do this (in addition to how much fun I have with it) is the potential for applying the skills I learn to my work.  Mostly by this I mean my web design hobby, but knowing your way around Photoshop really helps when making presentations or other science-related graphics.  The problem is that there is a very fine line between making something look nice and altering the scientific meaning.

Papers are starting to employ tools to look for digital tampering:

New tools, such as software developed by Mr. Farid, are helping journal editors detect manipulated images. But some researchers are concerned about this level of scrutiny, arguing that it could lead to false accusations and unnecessarily delay research.

I have to say I fall on the side of the “concerned researchers” here. In my experience, you tend to have two groups of people. There are those who understand image manipulation, its strengths and weaknesses, and when it can be appropriately implemented. Then there are others who don’t understand what’s going on and tend to think that if you even open your TIFF file in the program you’ve just sabotaged its scientific validity. I worry that we will end up with a situation of Photoshop = bad, and that’s simply not true at all. As a matter of fact, the very description of the software being employed indicates that this is already the path we’re heading down:

The software looks for patterns in the digital code underlying an image. When files are opened and altered in Photoshop, for instance, codes are added that Mr. Farid’s software can detect.

I can hardly think of a single image I haven’t opened in such a program (I usually use The Gimp though).  Almost every figure needs at least to be cropped, have a label or two added, and saved out as different formats.  Does that mean every figure in my papers will be throwing up flags?
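To make the worry concrete, here is a deliberately naive sketch in Python.  It is not how Mr. Farid’s software works (the article doesn’t say); it only assumes that an editing program often writes its own name into a file’s metadata when saving, and the marker strings and function names are my own invention for illustration.

```python
# Naive illustration only: this is NOT how Mr. Farid's software works.
# It scans a file's raw bytes for the name of an editing program, which
# many editors write into the metadata when they save an image.

EDITOR_MARKERS = (b"Photoshop", b"GIMP")  # assumed marker strings


def find_editor_markers(data: bytes) -> list[str]:
    """Return the names of any editing programs mentioned in the raw bytes."""
    return [m.decode("ascii") for m in EDITOR_MARKERS if m in data]


def scan_file(path: str) -> list[str]:
    """Read an image file from disk and report any editor markers found."""
    with open(path, "rb") as handle:
        return find_editor_markers(handle.read())
```

A check like this cannot tell a cropped and labeled figure from a falsified one: both would carry the same marker, which is exactly the kind of blanket flag I’m worried about.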

To be fair, it seems like the journals themselves are still taking a level-headed approach for the most part:

So far the journal’s editors have identified 250 papers with questionable figures. Out of those, 25 were rejected because the editors determined the alterations affected the data’s interpretation.

10% rejection because of altered meaning seems to indicate that the vast majority of digital editing is not scientifically harmful.

And, as we so often see, Open Access may lend a hand in solving the problem:

One new check on science images, though, is the blogosphere. As more papers are published in open-access journals, an informal group of watchdogs has emerged online.

“There’s a lot of folks who in their idle moments just take a good look at some figures randomly,” says John E. Dahlberg, director of the division of investigative oversight at the Office of Research Integrity. “We get allegations almost weekly involving people picking up problems with figures in grant applications or papers.”

I’m not sure that I approve of the online witch-hunt oversight scenario this seems to set up, and this type of method won’t detect image tampering prior to publication (or at least online availability).

It’s a sticky problem, to be sure.  It’s clear that image manipulation programs allow for data falsification that is almost undetectable to the human eye, especially when done by an expert in the use of the software.  These same programs (and largely identical usage of the programs) can pull new scientific information out of otherwise useless images.  In my opinion, specific alterations should be mentioned in the figure captions.  I do this when I present figures at group meeting, for instance noting that I’ve adjusted contrast for easier viewing.  I wouldn’t mind in these cases a requirement to submit both images to the journal, so that they can judge for themselves the validity of the manipulation.  I just don’t want scientists to be discredited simply because they are trying to, with the best of intentions, improve communication of their findings.

Django is like an alien spaceship of awesomeness

Thursday, May 29th, 2008

We sit eyeing one another, this spaceship and I.  The power it holds within is clear, but the methodology for harnessing that power escapes me.  It’s evident that a vastly superior being has designed this device to do amazing things, but a manner of interacting efficiently with it is not forthcoming to my uninitiated cortex.  I have managed to move it a bit, but I don’t think that holding a match to the propellant tank is what the designers had in mind.

It’s an enigma, this machine, but I plan to plumb the depths of its intricacies, perhaps learning more about myself in the process…

Struggling with research ennui

Wednesday, May 28th, 2008

When Mrs. PA successfully defended her thesis, I thought it might finally light a fire under me to push through whatever I needed to in order to finish up here so we could get on with our lives.  Instead, I’ve found myself in a more or less continuous state of ennui.  I have no motivation or interest to work on my thesis project.  Partly it’s because I don’t really believe that it will ever generate results, and therefore I don’t really see the point of even trying.  I know that this sort of defeatism is not unusual among graduate students, but I’m having a hard time yanking myself out of it.  I can’t even manage to use the reasoning “just finish it and you can get out of here” as enough impetus to apply myself.

On some level I feel like my reserve of “well it didn’t work that time, let’s tweak the parameters and try again” has just run out.  The “reward” from a scientific standpoint is more or less the same whether I actually do the experiments or not, because the experiments never work.

I think this is made worse by my particular situation.  Most graduate students at this stage would have enough data to just sort of drag themselves across the finish line.  Since I had to change projects, I’m left sitting in the middle of a pile of half-completed projects and seemingly intractable problems with each of them.

I really wish I could think of a way to snap myself out of this funk.  I know that it’s not helpful in any way.

Our first weekend Geocaching

Tuesday, May 27th, 2008

I’ve been lusting for a G.P.S. for some time, but Mrs. PA has been a hard sell.  Finally, when I managed to drag her into the store to look at one, I casually mentioned the hobby of Geocaching.  For some reason this seemed to sell her on the idea.  That day we ordered a basic model, a Garmin nuvi 200.

This weekend we struck out to do some hunting.  We used Geocaching.com to find the coordinates of some caches near our house.  The first day (Saturday), we had selected 3 locations in wooded areas near our house.  In retrospect, this was not the best place to start.  We only managed to find the first part of a 2-part cache, and couldn’t find the other two at all.  It was a bit disappointing, but the walk in the woods was nice.  After resting up on Sunday, we decided to give it another go on Memorial Day.  This time, we chose 4 caches spread around the university campus.  This experience was much better, and we managed to find all 4 of them (including one about the size of the end of your little finger).

We had a lot of fun, but it’s clear that we’ve also got a lot left to learn.  The G.P.S. isn’t as accurate as it could be.  It’s not that far off, but when you are standing in the woods even a 3ft radius can hold a remarkable number of hiding spots.  It definitely helped to do some in a more developed setting - we got a feel for what types of hiding spots things are placed in and the container sizes/appearances.

The whole experience was a lot of fun.  There are still something like 200 caches within about 10 miles of our house, so I’m sure we’ve got a lot of interesting afternoons ahead of us.

Interview with Jean-Claude Bradley

Tuesday, May 27th, 2008

Bora (of A Blog Around the Clock and PLoS) has a nice interview with Jean-Claude Bradley on several aspects of Open (Notebook) Science.  Check it out!

Brainstorming a Feature Set for an Open-Source LIMS

Friday, May 23rd, 2008

As I investigate Django, I find myself matching up features of the framework with applications I’d like to implement if I were writing my own Laboratory Information Management System (LIMS).  So far my typical cycle goes something like this:

  • Find new (to me) development framework and do a cursory investigation
  • Work through some basic tutorials
  • Choose one component of a custom LIMS that looks to be the simplest to implement with the new framework and work on it
  • Get bogged down
  • Give up

With Django, I’m at step 3 of the process.  The interesting thing this time is that I can envision solutions to writing several of the LIMS modules I have in mind, rather than a rough idea for one and a hope that I’ll figure out the rest as I go.  Maybe this time around I just “get it” a little more than with previous systems.  Perhaps I just think that I do :)

With that in mind, I’ve turned once again to brainstorming the set of features that I would like to see in a LIMS.  Even if I don’t end up writing the software myself (a likely scenario), it’s worthwhile to have the ideas out there.  Here is my list, but feel free to add any you can think of in the comments.

  • User authentication
    • Django largely takes care of this automatically
  • Manuscript repository with version control (for collaborative document writing)
  • To-Do lists/Workflow management
  • Inventory/Re-ordering management
    • Chemical locations, MSDS links
  • Wiki (with the standard Wiki history allowing for reverts)
    • I’m imagining this as being used for protocols, but it could potentially hold a lot of things
  • Literature repository (can hold actual PDFs or link to Institutional/other open Repository)
  • Calendar (group & individual, perhaps the group calendar just aggregates the others)
  • Research Image repository/browser
  • Personal blogs/microblogs
    • Could use tags/categories to separate “lab notebook” entries from other, less formal posts
  • Portal page which can serve as public lab homepage if desired
  • Grant/manuscript tracking (could be integrated with the workflow manager above)
  • Teaching material repository
  • Automated backup of data
    • Daily/weekly database & file backup
  • Instrument interface API?
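As a rough illustration of the inventory/re-ordering item above, here is a minimal plain-Python sketch (not Django code).  Every field and function name is hypothetical, and a real module would presumably sit behind a database rather than a list in memory.

```python
from dataclasses import dataclass


@dataclass
class Chemical:
    """One inventory entry; every field name here is hypothetical."""
    name: str
    location: str      # e.g. a shelf or cabinet label
    msds_url: str      # link to the safety data sheet
    quantity: float    # amount on hand, in whatever unit the lab uses
    reorder_at: float  # re-order once quantity drops to this level


def needs_reorder(inventory: list[Chemical]) -> list[str]:
    """Return the names of chemicals at or below their re-order level."""
    return [c.name for c in inventory if c.quantity <= c.reorder_at]
```

Given a list of `Chemical` entries, `needs_reorder(inventory)` returns the names that should go on the next order.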

That is what I can think of off the top of my head.  Now, tell me all the things I’m missing.

One that I’m aware of is integration of laboratory instruments - the ability to have an instrument dump the data directly into the LIMS.  My reason for leaving this out is that I really think this is the most complicated part.  Every instrument will have different ways of outputting data.  My most ambitious goal would be to have some sort of ability for people to write their own interface modules, which could then be added on by that particular lab.  Even this is a task that I’m not really sure how to start on.
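One possible starting point, sketched here under heavy assumptions, is a simple registry: each lab writes a small parser function for its own instrument and registers it under a name, and the LIMS looks up the right parser when data arrives.  The registry, the decorator, the instrument name, and the “well,absorbance” file format below are all made up for illustration.

```python
from typing import Callable

# Maps an instrument name to the function that parses its output.
# The registry, decorator, and parser below are all hypothetical names.
PARSERS: dict[str, Callable[[str], list[dict]]] = {}


def register(instrument: str):
    """Decorator a lab would use to add its own interface module."""
    def wrap(func):
        PARSERS[instrument] = func
        return func
    return wrap


@register("plate_reader_csv")
def parse_plate_reader(raw: str) -> list[dict]:
    """Parse made-up 'well,absorbance' lines into records."""
    records = []
    for line in raw.strip().splitlines():
        well, value = line.split(",")
        records.append({"well": well.strip(), "absorbance": float(value)})
    return records


def import_data(instrument: str, raw: str) -> list[dict]:
    """Look up the right parser and hand back records ready for storage."""
    if instrument not in PARSERS:
        raise ValueError(f"no interface module registered for {instrument!r}")
    return PARSERS[instrument](raw)
```

The core system only ever calls `import_data`; supporting a new instrument means adding one decorated function, without touching the rest of the LIMS.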

First look: Creating scientific web applications with Django

Thursday, May 22nd, 2008

Unrelated to the actual body of this post, but possibly of more interest to you, dear reader, is that I’ve sent in another job application.  This time it is for an Associate Editor position at the esteemed Science magazine.  My qualifications are a bit less than what they seemed to be looking for, so I’m not terribly optimistic (what’s new).  As usual though I’m nervous…  All right, on to the actual post!

I like to think (perhaps a bit ambitiously) that all of my tinkering around has elevated me to the level of “novice” programmer.  I can usually decipher things that others have written (ok, I can often do so), and I’ve written several command-line scripts that will do something useful.  I think one of the key things I’ve learned is that coding is hard, and I have tons of respect for the people who’ve chosen to do this as their career.  Now that I’m starting to get a handle on everything I don’t know, I feel like I’m also starting to find the handholds I need to climb a little farther up the cliff/learning curve.

So far I’ve had the most success writing things in Python.  This is most likely because it’s a relatively simple language, designed to be accessible to noobs like me.  It’s a fine language which tends to do what I like in ways that (more or less) make sense, and since its usage is fairly widespread in bioinformatics I don’t feel like it’s a waste of time to learn.

The problem with most of my “applications” so far is that, like I said above, they are uniformly command-line scripts which either take console or text file input.  For my own personal use this is fine - I understand the quirks of the program and am comfortable operating from the console.  This tends to be a barrier to more widespread usage, however.  Most people (who might use one of the things I’ve coded) aren’t very comfortable at all with entering commands into the terminal or editing a configuration file by hand.

So, I wanted to start looking into ways to write things that had a friendlier user interface.  I looked into using Glade to make graphical front-ends, but was having trouble wrapping my head around all of the handlers and things.  I was also a little worried that this would restrict the final product to a Gnome-based desktop.  What I really wanted to do was make something accessible via the web, so that I could install the application on our lab’s central machine and let people use it from their own computers.  My problem was that I couldn’t find a decent (i.e. quickly understandable by me) way to build web apps based on Python.  That is, until I found Django.

Django is a web framework based on Python that just makes it easy to develop a Python-based application and distribute it via the web.  I haven’t had time to build anything from the ground up yet (I’ve been working my way through the online tutorial/book), but I can definitely see the potential.  I’ve gotten much farther with Django in a much shorter time than with any of the other solutions I’ve looked at so far.

I’ll keep you up to date as I continue my experimentation.

Where did you find your current job?

Wednesday, May 21st, 2008

Mrs. PA and I are both looking for gainful employment these days, and we’re having a tough time finding two jobs that look interesting, are reasonably near one another, and which we both have a chance of actually getting.

So far I’ve done most of my searching on the internet, of course.  I’ve tried RSS feeds from Craigslist job postings, manually checking individual sites, and other feeds from job aggregators (like ScienceCareers).  I feel like it’s a pretty wide net, but I haven’t caught many keepers yet.  Mrs. PA has found a few positions that she’s applied for, but they tend to be in fairly backwater places where my chances of employment are slim.  I’m worried that we’re looking for jobs in all the wrong places.

So, I guess it’s not a bad idea to ask people who have jobs where they found them.  Did you use one of the big job sites online?  Personal contact?  Newspapers?  How long did your job search take?  Did you go on a lot of interviews, or did you just happen to mesh with the first company?

The American Chemical Society is out of touch

Thursday, May 15th, 2008

Several recent links from Open Access News have reminded me once again of how strong the resistance to OA can be from publishers.

I’m a (5 yr +) member of the American Chemical Society, an organization which publishes several of the prominent journals in the field.  I have to admit that I hold the membership with a certain distaste.

Rudy Baum, the editor-in-chief of the weekly Chemical and Engineering News, has long held anti-OA positions supported by arguments rife with Fear, Uncertainty and Doubt (FUD), as well as a fair dose of self-delusion.  An example of just how confused Baum is about OA comes from a 2004 editorial:

access to the STM literature is more open today than it ever has been: Anyone can do a search of the literature and obtain papers that interest them, so long as they are willing to pay a reasonable fee for access to the material.

Leaving aside the debate of what a “reasonable fee” might be, this quote shows that Baum has a fundamental misunderstanding of the difference between “accessible” and “open”. A front-row seat at the Metropolitan Opera is accessible for a “reasonable fee”, but I doubt people would consider it open to the general public.

OAN linked to a piece in Issues in Science and Technology Librarianship which delves further into ACS’s “war on OA”.  The icing on the ACS cake, however, is this bullshit-ridden interview with the president of ACS publications.  I wanted to excerpt the parts of the OA discussion that were crap and pick them apart one-by-one, but I realized I was going to be talking about almost every sentence.  So here you go (my comments in brackets):

What are your views on open access?

We are in favour of various access models [as long as we get paid] and think authors should have the right to choose. We don’t think that governments or others should mandate what authors do and require them to pay [Except us, of course - we can mandate what authors do and require them to pay].  [Note that this is a textbook FUD argument of “OA means you, the hardworking scientist, will have to pay more” which has been refuted time and time again.]

Immediately on publication each of our authors is given a link that they can put on their websites or funding body’s site free of charge. [How nice, I can link to my own work for free!] There is a limit of 50 downloads of their paper in the first year.  [50 downloads might last a few weeks tops, even for a moderately popular article.]

If the author wants to place the whole article on their website or funding body’s site then we have our ‘AuthorChoice’ model where authors pay [terribly exorbitant fees] to make their articles open access [and ACS still keeps the copyright, it's win-win, really]. Most of our revenue comes from subscriptions, with a bit [ok, a lot] from advertising. We don’t see many authors choosing the AuthorChoice option [because we don't explain it whatsoever in our publication guides]. We’ve had this model out for about a year and less than one per cent of papers are published this way (ibid). Not all authors have access to funds that they could use to pay to publish (Then how do they afford your page charges?  Note: another attempt at the OA = author payment crap) and most of our authors are pleased with the access that others have to their papers anyway [because many of them aren't aware of other options and don't have contact with people at non-R1 research institutions].

We enable authors to submit their raw data too [The more of their intellectual work we can own the better of course]. We put this outside our firewall [but not outside of our copyright] so it is open to non-subscribers too but we do not tag this information.

We left the matter of putting preprints in repositories to editorial discretion on the individual journals and the editors have chosen not to allow this [on pain of death/firing]. After publication there is the option to have the free author directed link or to pay for open access. The society feels it is better to have the published version available.  [I don’t know what these last two sentences mean, but I get the feeling it’s another rehash of the OA = $$$ deal.]

So, that was enlightening I hope.  All in all, this paints a picture of the ACS as a top-to-bottom nemesis of OA and all that it stands for.  I would argue that perhaps its members are not as happy with the organization’s efforts to retain an iron-clad grip on the fruits of research as the ACS might think.  Every time the membership renewal notice comes around, I pause a minute before sending in my check.  Perhaps next time that pause will be more permanent.