I do read papers that don’t come out of David Baker’s Lab

But not today.  To be fair, the paper I’ll be talking about today (from today’s issue of PNAS) involves quite a few researchers from several institutions.  In it, the researchers describe a new way of “solving” protein structures, although the technique they describe really sits at the boundary between solution and prediction.

In effect, the method uses only the early stages of solving a protein’s structure by NMR spectroscopy.  NMR is a well-established method which has yielded lots of structures.  One of the nice things about NMR is that it’s relatively easy (compared to X-ray crystallography) to get your sample - you “just” need to have a very pure, highly concentrated bit of protein.  Once you have the sample, it’s also relatively easy to collect data on it - typical NMR experiments take a few days, but pertinent information is available in minutes.  Contrast this with X-ray crystallography, in which growing the crystal might take weeks or months (even once the proper conditions are identified), and data collection takes on the order of hours.

The gist of the method described in the paper is this: you collect the “fast and easy” information on your NMR sample, then construct a model of your protein that uses this data to extract fragments of already-solved structures and put them together in a way which matches your NMR information.  It’s sort of like building a new device from LEGO parts which you’ve gotten by breaking up other devices.  Well, to extend the analogy to the breaking point, what they really do is use the amino acid sequence of the protein they are examining to pull out a bunch of LEGO pieces (fragments of solved structures with similar sequences), and then apply the NMR data to pick the “best” piece for that bit of the protein.  They call this CS-ROSETTA (CS for chemical shift, the NMR data; ROSETTA for the method of picking out the individual segments of structure and assembling them into a new protein).
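
For the programmers in the audience, here is a toy Python sketch of that "pick the best LEGO piece" step.  To be clear, this is not the CS-ROSETTA code and it is not how the paper scores fragments: the Fragment class, the mismatch cutoff, the plain RMSD score, and all of the shift values are things I made up just to illustrate the idea of filtering by sequence first and then ranking by agreement with the measured chemical shifts.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Fragment:
    """A short piece cut from an already-solved structure (hypothetical)."""
    source: str                  # made-up label for the donor structure
    sequence: str                # one-letter amino acid codes
    shifts: Tuple[float, ...]    # one chemical shift per residue, in ppm


def sequence_mismatches(a: str, b: str) -> int:
    """Count positions where two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def shift_rmsd(observed: Sequence[float], candidate: Sequence[float]) -> float:
    """Root-mean-square difference between two equal-length shift lists."""
    total = sum((o - c) ** 2 for o, c in zip(observed, candidate))
    return sqrt(total / len(observed))


def pick_fragment(target_seq: str,
                  observed_shifts: Sequence[float],
                  library: Sequence[Fragment],
                  max_mismatches: int = 1) -> Optional[Fragment]:
    """Filter by sequence similarity, then rank by chemical shift agreement."""
    candidates = [
        f for f in library
        if len(f.sequence) == len(target_seq)
        and sequence_mismatches(target_seq, f.sequence) <= max_mismatches
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda f: shift_rmsd(observed_shifts, f.shifts))


# Invented example data: a four-residue stretch and three library fragments.
library = [
    Fragment("helix_donor", "AVLK", (55.0, 66.0, 58.0, 59.5)),
    Fragment("strand_donor", "AVLR", (51.0, 61.0, 54.0, 55.5)),
    Fragment("unrelated", "GGPP", (52.5, 62.0, 55.0, 56.5)),
]
observed = (52.5, 62.0, 55.0, 56.5)

best = pick_fragment("AVLK", observed, library)
print(best.source)  # strand_donor
```

Notice that the "unrelated" fragment matches the shifts perfectly but never gets considered, because its sequence is too different, and that the exact-sequence match loses to the one-mismatch fragment whose shifts agree better with the data.  That is the whole point of the approach as I understand it: sequence narrows down the pile of pieces, and the experimental data makes the final call.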

[Figure: CS-ROSETTA overlay]

Now that I’ve thoroughly muddled a rather efficient and clean approach, how does it do?  Pretty well, according to the authors.  The figure to the right is from the paper, and shows an overlay of the actual structures of some of the test proteins (determined by X-ray crystallography or NMR, in blue) and the lowest-energy structures from their CS-ROSETTA method (in red).  You can see that they overlap quite nicely.

They follow up the test cases by using the CS-ROSETTA technique on several structures that were in the process of being solved by a proteomics consortium.  Once again, they find very good agreement between their predicted folds and the final structures (which were solved after the CS-ROSETTA modeling was done).

As a structural biochemist myself, I’m always interested in new ways to solve structures as quickly and efficiently as possible.  What concerns me is that methods based on simulations, no matter how accurate or elegant, will always be viewed with skepticism by the scientific community.  Although there is a fair amount of simulation already in the “standard” methods of NMR and X-ray crystallography, it seems that this can be more easily justified in the eyes of the community than the types of calculations needed to “solve” a structure with ROSETTA or other computational methods.

Shen, Y., Lange, O., Delaglio, F., Rossi, P., Aramini, J.M., Liu, G., Eletsky, A., Wu, Y., Singarapu, K.K., Lemak, A., Ignatchenko, A., Arrowsmith, C.H., Szyperski, T., Montelione, G.T., Baker, D., Bax, A. (2008). From the Cover: Consistent blind protein structure generation from NMR chemical shift data. Proceedings of the National Academy of Sciences, 105(12), 4685-4690. DOI: 10.1073/pnas.0800256105

2 Responses to “I do read papers that don’t come out of David Baker’s Lab”

  1. Michael Clarkson Says:

    This is of course very similar to Michele Vendruscolo’s CHESHIRE approach (described last year), although if you compare proteins solved by both methods CS-ROSETTA comes out ahead (I have a little more on this at my site). You make an interesting point about the use of simulations in the definition of structure. Structure determination by NMR almost always involves some kind of simulation, especially when NOEs are being automatically assigned (by Aria or CYANA). Even when the entire restraint set is human-processed, the actual structural determination generally involves a kind of simulation to move the protein into a shape that satisfies nearly all the NOEs. In addition, a true molecular dynamics simulation (in the form of a few hundred ps equilibration) is often appended to the structure determination process in order to improve the characteristics of the structure. To my knowledge, nobody really bats an eye at this, although I have doubts about whether MD simulations added for polish improve structural accuracy in any meaningful way.

  2. PA Says:

    I think it comes down to the fact that in any of the methods proposed by modelers, you are making the assumption that proteins with similar sequences will fold in a similar way. While this seems to be true a large percentage of the time, it is unlikely to be true all the time, and therefore casts some doubt on the results.
    One of the reasons that CS-ROSETTA is an improvement is that you are starting to base this assumption (similar sequence - similar fold) on experimental evidence, in the form of the chemical shifts. It will most likely still require some convincing in the community to gain wider acceptance for the method, however.
