
First look: Creating scientific web applications with Django

May 22nd, 2008

Unrelated to the actual body of this post, but possibly of more interest to you, dear reader, is that I’ve sent in another job application.  This time it is for an Associate Editor position at the esteemed Science magazine.  My qualifications are a bit less than what they seemed to be looking for, so I’m not terribly optimistic (what’s new?).  As usual though I’m nervous…  All right, on to the actual post!

I like to think (perhaps a bit ambitiously) that all of my tinkering around has elevated me to the level of “novice” programmer.  I can usually decipher things that others have written (ok, I can often do so), and I’ve written several command-line scripts that will do something useful.  I think one of the key things I’ve learned is that coding is hard, and I have tons of respect for the people who’ve chosen to do this as their career.  Now that I’m starting to get a handle on everything I don’t know, I feel like I’m also starting to find the handholds I need to climb a little farther up the cliff/learning curve.

So far I’ve had the most success writing things in Python.  This is most likely because it’s a relatively simple language, designed to be accessible to noobs like me.  It’s a fine language which tends to do what I like in ways that (more or less) make sense, and since its usage is fairly widespread in bioinformatics I don’t feel like it’s a waste of time to learn.

The problem with most of my “applications” so far is that, like I said above, they are uniformly command-line scripts which take either console or text file input.  For my own personal use this is fine - I understand the quirks of the program and am comfortable operating from the console.  This tends to be a barrier to more widespread usage, however.  Most people (who might use one of the things I’ve coded) aren’t very comfortable at all with entering commands into the terminal or editing a configuration file by hand.

So, I wanted to look into ways of writing things that had a friendlier user interface.  I looked into using Glade to make graphical front-ends, but was having trouble wrapping my head around all of the handlers and things.  I was also a little worried that this would restrict the final product to a Gnome-based desktop.  What I really wanted to do was make something accessible via the web, so that I could install the application on our lab’s central machine and let people use it from their own computers.  My problem was that I couldn’t find a decent (i.e. quickly understandable by me) way to build web apps based on Python.  That is, until I found Django.

Django is a web framework based on Python that just makes it easy to develop a Python-based application and distribute it via the web.  I haven’t had time to build anything from the ground up yet (I’ve been working my way through the online tutorial/book), but I can definitely see the potential.  I’ve gotten much farther with Django in a much shorter time than with any of the other solutions I’ve looked at so far.

I’ll keep you up to date as I continue my experimentation.

Where did you find your current job?

May 21st, 2008

Mrs. PA and I are both looking for gainful employment these days, and we’re having a tough time finding two jobs that look interesting, are reasonably near one another, and which we both have a chance of actually getting.

So far I’ve done most of my searching using the internet, of course.  I’ve tried RSS feeds from Craigslist job postings, manually checking individual sites, and other feeds from job aggregators (like ScienceCareers).  I feel like it’s a pretty wide net, but I haven’t caught many keepers yet.  Mrs. PA has found a few positions that she’s applied for, but they tend to be in fairly backwater places where my chances of employment are slim.  I’m worried that we’re looking for jobs in all the wrong places.

So, I guess it’s not a bad idea to ask people who have jobs where they found them.  Did you use one of the big job sites online?  Personal contact?  Newspapers?  How long did your job search take?  Did you go on a lot of interviews, or did you just happen to mesh with the first company?

The American Chemical Society is out of touch

May 15th, 2008

Several recent links from Open Access News have reminded me once again of how strong the resistance to OA can be from publishers.

I’m a (5 yr +) member of the American Chemical Society, an organization which publishes several of the prominent journals in the field.  I have to admit that I hold the membership with a certain distaste.

Rudy Baum, the editor-in-chief of the weekly Chemical and Engineering News, has long held anti-OA positions supported by arguments rife with Fear, Uncertainty and Doubt (FUD), as well as a fair dose of self-delusion.  An example of just how confused Baum is about OA comes from a 2004 editorial:

access to the STM literature is more open today than it ever has been: Anyone can do a search of the literature and obtain papers that interest them, so long as they are willing to pay a reasonable fee for access to the material.

Leaving aside the debate of what a “reasonable fee” might be, this quote shows that Baum has a fundamental misunderstanding of the difference between “accessible” and “open”. A front-row seat at the Metropolitan Opera is accessible for a “reasonable fee”, but I doubt people would consider it open to the general public.

OAN linked to a piece in Issues in Science and Technology Librarianship which delves further into ACS’s “war on OA”.  The icing on the ACS cake, however, is this bullshit-ridden interview with the president of ACS publications.  I wanted to excerpt the parts of the OA discussion that were crap and pick them apart one-by-one, but I realized I was going to be talking about almost every sentence.  So here you go (my comments in square brackets):

What are your views on open access?

We are in favour of various access models [as long as we get paid] and think authors should have the right to choose. We don’t think that governments or others should mandate what authors do and require them to pay. [Except us, of course - we can mandate what authors do and require them to pay.  Note that this is a textbook FUD argument of “OA means you, the hardworking scientist, will have to pay more” which has been refuted time and time again.]

Immediately on publication each of our authors is given a link that they can put on their websites or funding body’s site free of charge. [How nice, I can link to my own work for free!] There is a limit of 50 downloads of their paper in the first year. [50 downloads might last a few weeks tops, even for a moderately popular article.]

If the author wants to place the whole article on their website or funding body’s site then we have our ‘AuthorChoice’ model where authors pay [terribly exorbitant fees] to make their articles open access [and ACS still keeps the copyright, it’s win-win, really]. Most of our revenue comes from subscriptions, with a bit [ok, a lot] from advertising. We don’t see many authors choosing the AuthorChoice option [because we don’t explain it whatsoever in our publication guides]. We’ve had this model out for about a year and less than one per cent of papers are published this way (ibid). Not all authors have access to funds that they could use to pay to publish [Then how do they afford your page charges?  Note: another attempt at the OA = author payment crap] and most of our authors are pleased with the access that others have to their papers anyway [because many of them aren’t aware of other options and don’t have contact with people at non-R1 research institutions].

We enable authors to submit their raw data too [The more of their intellectual work we can own the better of course]. We put this outside our firewall [but not outside of our copyright] so it is open to non-subscribers too but we do not tag this information.

We left the matter of putting preprints in repositories to editorial discretion on the individual journals and the editors have chosen not to allow this [on pain of death/firing]. After publication there is the option to have the free author directed link or to pay for open access. The society feels it is better to have the published version available. [I don’t know what these last two sentences mean, but I get the feeling it’s another rehash of the OA = $$$ deal.]

So, that was enlightening I hope.  All in all, this paints a picture of the ACS as a top-to-bottom nemesis of OA and all that it stands for.  I would argue that perhaps its members are not as happy with the organization’s efforts to retain an iron-clad grip on the fruits of research as the ACS might think.  Every time the membership renewal notice comes around, I pause a minute before sending in my check.  Perhaps next time that pause will be more permanent.

My personal experience with biological repositories

May 13th, 2008

When I started out as a graduate student, in the dark distant past, I chose to work on a protein that we didn’t yet have in our lab.  One of my first goals was to acquire the gene and clone it into our expression system.  After reading some literature, I found one group who had mentioned getting the gene from another, and so I thought I had my opportunity.  I emailed the group who is on record as having supplied it and asked for the gene.  They replied “sure, no problem”; I waited.  Months.  Occasionally I would send off another email, but I didn’t want to seem too pushy.  They were doing me a favor, right?  In the meantime I worked with the mouse version of the gene.

Finally, the envelope came with the gene.  They didn’t provide any real information, but I went ahead and designed my PCR primers and got to work cloning it.  For months.  For some reason, my cloning reactions just weren’t working really well.  I struggled with PCR, digestions, ligations; it seemed like every step of the way was bogged down for some reason.  Eventually I managed to wrangle the gene into a vector and get a good sequence read.  At this point it became abundantly clear why I had been having so much trouble - the end of the gene was missing.  It turns out the lab I had gotten it from had used an enzyme in their own cloning that clipped the sequence short.

In retrospect, I probably should have figured this out much sooner.  The clues were all there, but as a young graduate student I was sure that an established lab would send me the proper gene, and I was just doing something wrong.

It turns out that around this time, another graduate student mentioned that they had bought the gene for their protein for something like $80 (the fee for this particular repository has since been raised to $120), which is just about as close to free as you could hope for.  It turned out that the same place carried my gene.  Of course I bought it, and within weeks my cloning was successful.

When people like John Wilbanks talk about developing these repositories, this is the type of situation they are looking to improve.  The old system of asking a “favor”, with little to no verification and no real motivation for expediency or quality control, is really shockingly bad.  It’s amazing to me that it’s taken this long to start generating significant interest in validated, standardized, open repositories.  The clones, cell lines, mice, etc. that we generate in great quantities need a better method of sharing and distribution than some antiquated version of quid pro quo.

Why don’t we have more “Principal Scientists” in academia?

May 12th, 2008

This weekend, Mrs. PA and I went out to dinner in town (where, coincidentally, I had one of the best beers I’ve ever tasted).  During the meal, we had a wide-ranging conversation on the difficulties of running a successful lab group.  The training you get as an undergraduate, graduate student, and post-doc does little to prepare you for many of the duties you undertake as a professor.  Teaching, grant-writing, and personnel management are areas that you spend a lot of time working in as a professor but likely have little to no exposure to prior to this position.  Indeed, the level of multitasking it takes in order to be effective as a Principal Investigator at a major research institution is rather astounding.  What tends to happen, in many cases, is that some facet of the position is left to its own devices.  Often this is the personnel management side of things.

We realized that there is already a position, prevalent in industry, which could help ease the burden on professors - the “Principal/Senior Scientist” job.  I did an internship at GiantPharm one summer, and worked in a small group.  There was a leader of the group, but his office was actually in another building on a hallway with other group leaders.  My interaction with him was sort of minimal.  I did, however, spend a lot of time talking to my immediate supervisor.  He was a long-time employee, Ph.D., and incredibly intelligent.  If I was stuck on a task or needed further direction, his office was always open.  Since he was doing research of his own, it was easy to chat with him informally about the work and hash out new ideas.  If I were a professor, I’d love to have someone like this in the lab.

In academia, there are sometimes “Research Scientists” working in a group.  In my experience these tend to be glorified (more or less permanent) post-docs.  They are focused on their own project, and could often not care less about mentoring graduate students or ensuring that the lab is running smoothly (as long as it doesn’t significantly impede their work they are ambivalent).  It’s worth noting that post-docs themselves frequently have a similar attitude.

I think that there is some room here.  Why not delegate some of the roles typically shouldered by a single P.I.?  For instance, the P.I. can focus on “the big picture” (where is the research going, what are our major findings, what is going on in the community), getting money, and their teaching duties.  In the meantime, you can bring in a scientist to be the “research lead”.  By this I mean the person who is in the lab working on a project, but who also oversees the day-to-day activities.  If a graduate student is having trouble getting their affinity column to work, they can go to the Scientist.  This person could be responsible for some of the management of the lab as well - if a student isn’t showing up to work, they can talk to them and/or elevate the situation to the P.I.

Now, I’m not a professor.  I’m sure there are some issues with this plan (or else why wouldn’t it be implemented?).  Some that come to mind:

  • Funding - you are going to have to pay this Scientist more than you pay a post-doc.  $50,000/yr?  Somewhere in that ballpark is my guess.  It’s roughly equivalent to one post-doc plus an additional graduate student.  I don’t think that this is too onerous.
  • Appearance of laziness - Will other faculty members think that you are unable to “handle” being a professor if you have to hire someone else to share the workload?  I’m not really sure about the answer here, but I’d hope this could be minimized.
  • What about the Scientist’s career? Won’t they get unhappy and leave?  In my perfect world, the candidate would be someone who has completed their Ph.D. but is uninterested in some of the aspects of joining a faculty.  Perhaps they just don’t like writing grants and want to work at the bench, but aren’t fond of industrial work either.  There are people out there like this, trust me.  They would be thrilled to have a position like the one I’m talking about here.

I’m sure there are other problems, and I hope you’ll bring them up in the comments.  If you are in a faculty-like position, I’d really like to hear your thoughts on this.  Do you see the utility of hiring such a researcher?  Why is it not done?

Data should be public domain, and more esoteric blog-based ‘rasslin’

May 12th, 2008

Toward the end of last week, I noticed several items coming down the RSS tubes that dealt with the permission barriers we place on scientific data.  These posts also seemed to be interrelated.  At the time I couldn’t give them the attention they deserved, so I filed them away for later digestion.  I’d like to discuss them here now, at the risk of kicking an anthill that seems to have settled down a bit over the weekend.

As far as I can tell, things seemed to start when Chemspider chose to license their data under a Creative Commons license.  This is obviously (from my point of view) an attempt on their part to do the right thing - ensure that their data is freely available, and to give them some controls to “enforce the freedom”.  Then the wonderfully muddy communication medium of the internet kicked in, and people started getting angry at one another.  It seems that Peter Murray-Rust published (somewhat erroneously) a conversation between himself and John Wilbanks.  This conversation was taken somewhat out of context by the folks over at Chemspider, and the ball was rolling.


Do you know who designed the Open Access logo?

May 8th, 2008

Roderic Page (and I) want to know!

Mysterious Logo

Some time back I surfed around trying to find the original source, but didn’t have any luck turning it up.

FoldIt! - Combining slacking off and actual work

May 8th, 2008

The Baker lab at the University of Washington has been influential in the development of computational tools to model protein folding.  They are best known for the ROSETTA package, which does a fantastic job in many cases of predicting a protein fold based on fragments of other known structures.

They have now developed a protein folding game, of all things, called FoldIt!.  The user registers for an account on the website and downloads a program which runs the game locally.  As of now they only have Windows and Mac clients, but I got the game running under Wine no problem.

FoldIt! presents the player with “puzzles” which involve resolving steric clashes in the proteins.  The easy tutorial levels teach you how to play the game (which mostly involves clicking and dragging on different parts of the protein), and you’re guided by a funny cartoon version of Dr. Baker himself.  You get points based on how “good” your solution to the puzzle is, and complete a given puzzle when you cross a certain threshold score.

It’s sort of fiendishly fun to play, although maybe that’s just because I’m a biochemist.  It also seems like it would make a GREAT learning tool for use in biochemistry courses.  You can form groups as well, so it might be fun to make a group out of your class and see how highly they can place.

A writing experiment

May 7th, 2008

I’ve got about 70 pages written on a document that I call “my thesis”. The problem is, I hate it. I’ve written it all in fits and spurts, jumping around from one section to the next. Some days I’ll write pages and pages and it seems like it’s going really well, and other days I’ll spend all day staring at emacs and not getting anything down. Lately it’s been much more of the latter.

I decided to give something a shot. I fired up OpenOffice and set it to full screen web view. This gives you a glorious screen of space to write in, without any distractions. I just started writing, trying as hard as I could to keep things like formatting, citations, paragraph structure, and above all sections out of my mind. I want to write something that flows.  Maybe it’s a crazy idea.  A thesis is a really large document to write from start to finish, and the nature of research (especially my research) makes combining the whole thing into a fluid narrative tough.  Call me stupid, but I’m going to take a stab at it.

I’m not sure what this document is yet, to be honest.  It may become part of my thesis.  It may just be a convenient way to help me organize my thoughts before I write them into the official document.  It’s just as likely (more likely in some ways) that I’ll give up on this a week from now.  Until then, I’ll be posting it here on Plausible Accuracy for you to read and comment on.  Any feedback is greatly appreciated.

I’m planning on just posting chunks whenever they get to some modicum of “done” - that is to say, more or less when one train of thought has been laid down.  I’m not editing this as I go (besides the most obvious of typos), and I’m not including citations or figures.  The idea is just to write.  I’ll post links to all the sections somewhere as I add on.


More testing with VMD and Tachyon

May 6th, 2008

I’m still testing out some of the advanced features of using Tachyon to render nice images of biological macromolecules. I came across these beautiful images of bacteria which are able to consume radioactive waste, and decided to tinker a bit to see if I could get something similar out of VMD.

First of all I loaded in my molecule and set it up similar to the exercises from the other day: white background, surface representation, diffuse material. I also added the Depth Cue feature of VMD, which adds a fog which increases in density with depth. This helps to add a bit of a 3D feel to the representation. I also played around with the various lights, settling on having lights 0 & 2 on.
I rendered the image with:

"/usr/local/lib/vmd/tachyon_LINUX" -aasamples 4 -rescale_lights 0.3 -add_skylight 0.9 %s -format TARGA -o %s.tga

Note: this takes about 8 minutes to render on my laptop at about 700×700 resolution.
If my understanding is correct, this should give a scene that is dominated a fair bit by the skylight parameter, and this is more or less the case. The image, while interesting in some ways, is far too bright!
Let’s drop the skylight down then:

"/usr/local/lib/vmd/tachyon_LINUX" -aasamples 4 -rescale_lights 0.3 -add_skylight 0.6 %s -format TARGA -o %s.tga

Well that darkened the shadows a bit, but the overall image is still way too bright. How about dropping the lights?

"/usr/local/lib/vmd/tachyon_LINUX" -aasamples 4 -rescale_lights 0.1 -add_skylight 0.6 %s -format TARGA -o %s.tga

Well, still far too light. What’s happening is that the depth cue fades the image to the background color (in this case white) as it goes. Let’s drop the depth cue density in order to cut back on the lightening. This setting is found in Display–>Display Settings. I adjusted it to a value of 0.15, still using the Exp2 function for the density. When I rendered this (using the same settings as the last one above), it looked OK, but not fantastic. Mostly it was just “flat”, if that makes sense - not a lot of visual appeal. I rescaled the lights back up to 0.3, and this was better.
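To get a feel for why the depth cue density matters so much, here is a rough sketch of exponential-squared fog in Python. It assumes that VMD’s Exp2 mode behaves like the usual OpenGL-style formula (factor = exp(-(density × depth)²)); I haven’t checked how VMD or Tachyon scale depth internally, so the depths, the surface color, and the comparison density of 0.4 below are made-up values for illustration only.

```python
import math


def exp2_fog_factor(depth, density):
    """Fraction of the object's own color that survives at a given depth.

    Assumes the standard exponential-squared fog formula; 1.0 means no fog.
    """
    return math.exp(-((density * depth) ** 2))


def fogged(color, background, depth, density):
    """Blend an RGB color toward the background color using the fog factor."""
    f = exp2_fog_factor(depth, density)
    return tuple(f * c + (1.0 - f) * b for c, b in zip(color, background))


if __name__ == "__main__":
    white = (1.0, 1.0, 1.0)  # background color used in the post
    blue = (0.0, 0.0, 1.0)   # hypothetical surface color
    # 0.15 is the density from the post; 0.4 is an arbitrary higher value.
    for density in (0.15, 0.4):
        for depth in (1.0, 3.0, 5.0):  # arbitrary depths in scene units
            print(density, depth, fogged(blue, white, depth, density))
```

With a white background, a higher density pushes distant parts of the molecule toward white much faster, which would fit the washed-out look of the earlier renders.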

Something still isn’t “there”, though. To be sure, the Tachyon renders look nice, but I just don’t feel like this is the best that can be done. I’ll have to keep toying with it.
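Since each of the renders above only changes two numbers, the trial and error could be scripted. The following is a minimal sketch that just builds the Tachyon command lines for a small grid of settings; it uses only the flags shown in the commands above. The scene file name (molecule.dat), the output naming scheme, and the helper function names are my own inventions for illustration, and nothing is actually executed.

```python
import itertools
import shlex

TACHYON = "/usr/local/lib/vmd/tachyon_LINUX"  # path used in the post


def tachyon_command(scene, out_base, rescale_lights, skylight, aasamples=4):
    """Build the argument list for one Tachyon render (nothing is run)."""
    return [
        TACHYON,
        "-aasamples", str(aasamples),
        "-rescale_lights", str(rescale_lights),
        "-add_skylight", str(skylight),
        scene,
        "-format", "TARGA",
        "-o", out_base + ".tga",
    ]


def sweep(scene, lights=(0.1, 0.3), skylights=(0.6, 0.9)):
    """Yield one command per (rescale_lights, add_skylight) combination."""
    for rl, sky in itertools.product(lights, skylights):
        # Hypothetical naming scheme so the output files don't collide.
        out_base = "%s_l%s_s%s" % (scene, rl, sky)
        yield tachyon_command(scene, out_base, rl, sky)


if __name__ == "__main__":
    # "molecule.dat" stands in for the scene file that VMD writes out.
    for cmd in sweep("molecule.dat"):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Each list could be handed to subprocess.run to do the actual rendering, though at roughly 8 minutes per image on my laptop even a small grid adds up quickly.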