What will we learn from Lab 104 Round 1?

Instead of only looking back at lab results after they come in, or just tinkering with designs until they work, let's identify some questions that different kinds of results from this lab round might answer.

Synthesized Designs for this Round

What will you be looking for when the synthesis results come back? What hypotheses do you think we are testing?

Chris Cunningham [ccccc]

  • 97 Posts
  • 13 Reply Likes
  • confident

Posted 8 years ago


dimension9

  • 186 Posts
  • 45 Reply Likes
The first big one is: how will designs influenced by RNAFold Webserver results do compared to designs that did not consider this input?

Chris Cunningham [ccccc]

Excellent!

I'll post the RNAFold numbers of the synthesized designs after I go grocery shopping this afternoon, if they aren't already posted somewhere.

Chris Cunningham [ccccc]

Sad, GetSat doesn't like HTML tables :/

  • Ding's Star Improved
    • MFE -58.42 kcal/mol
    • frequency of MFE 97.47%
    • ensemble diversity 0.06

  • Berex Star Two
    • MFE -60.71 kcal/mol
    • frequency of MFE 97.67%
    • ensemble diversity 0.05

  • PentaPuppy2
    • MFE -49.59 kcal/mol
    • frequency of MFE 86.83%
    • ensemble diversity 0.31

  • 1337
    • MFE -52.45 kcal/mol
    • frequency of MFE 78.36%
    • ensemble diversity 0.71

  • Alpha Centauri
    • MFE -44.34 kcal/mol
    • frequency of MFE 67.94%
    • ensemble diversity 2.52

  • Twilight
    • MFE -57.82 kcal/mol
    • frequency of MFE 96.55%
    • ensemble diversity 0.08

  • 46 - Return of the RNA
    • MFE -49.65 kcal/mol
    • frequency of MFE 48.05%
    • ensemble diversity 2.42

  • Twinkle Twinkle
    • MFE -47.60 kcal/mol
    • frequency of MFE 61.13%
    • ensemble diversity 1.05
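
For anyone who wants to compare these numbers side by side, here is a minimal Python sketch that ranks the eight designs by MFE frequency and by ensemble diversity. The values are copied from the list above; the names `DESIGNS` and `rank` are only illustrative, and the PentaPuppy design is written with the name used in the tetraloop and multiloop posts.

```python
# RNAfold webserver numbers copied from the list above:
# (MFE in kcal/mol, frequency of MFE in %, ensemble diversity).
DESIGNS = {
    "Ding's Star Improved": (-58.42, 97.47, 0.06),
    "Berex Star Two": (-60.71, 97.67, 0.05),
    "PentaPuppy2": (-49.59, 86.83, 0.31),  # name as used in later posts
    "1337": (-52.45, 78.36, 0.71),
    "Alpha Centauri": (-44.34, 67.94, 2.52),
    "Twilight": (-57.82, 96.55, 0.08),
    "46 - Return of the RNA": (-49.65, 48.05, 2.42),
    "Twinkle Twinkle": (-47.60, 61.13, 1.05),
}


def rank(designs, column, reverse=False):
    """Return design names sorted by one column of the value tuple."""
    return sorted(designs, key=lambda name: designs[name][column], reverse=reverse)


by_frequency = rank(DESIGNS, 1, reverse=True)  # highest MFE frequency first
by_diversity = rank(DESIGNS, 2)                # lowest ensemble diversity first

# Print the two rankings next to each other.
for position, (freq_name, div_name) in enumerate(zip(by_frequency, by_diversity), start=1):
    print(f"{position}. {freq_name:<24} | {div_name}")
```

The two rankings agree on the top five designs and only disagree about the order of the bottom three (Alpha Centauri, Twinkle Twinkle, and 46 - Return of the RNA).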



RNAFold basically thinks that Alpha Centauri's second and third arms will fly apart, as far as I can tell from the positional entropy picture.

It thinks 46 - Return of the RNA is going to absolutely explode regarding the multiloop in the middle, with the A's and U's all grabbing each other.

And it thinks Twinkle Twinkle is going to come apart at every one of the bonds that closes the loop. It also correctly points out (via dot plot) that in Twinkle Twinkle, the UCU at 7-8-9 that is supposed to bond with GGA at 102-101-100 might instead bond with AGA at 104-103-102. Uhoh!

It seems pretty happy about the other designs all-around. :)
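
The Twinkle Twinkle concern above can be sanity-checked with a simple complementarity test. This is only a sketch: it looks at which base pairs are possible (Watson-Crick or G-U wobble) using the triplets exactly as quoted above, and it is not the thermodynamic calculation RNAfold performs. The function name `classify_pairs` is made up for this example.

```python
# Allowed RNA base pairs: Watson-Crick pairs plus the G-U wobble pair.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def classify_pairs(strand, partner):
    """Pair strand[i] with partner[i] and label each pair.

    The partner triplet is given in pairing order, i.e. already listed
    from its highest position to its lowest, as in the post above.
    """
    if len(strand) != len(partner):
        raise ValueError("strand and partner must have the same length")
    labels = []
    for a, b in zip(strand, partner):
        if (a, b) in WATSON_CRICK:
            labels.append("WC")
        elif (a, b) in WOBBLE:
            labels.append("wobble")
        else:
            labels.append("none")
    return labels


intended = classify_pairs("UCU", "GGA")     # UCU at 7-8-9 with the intended GGA
alternative = classify_pairs("UCU", "AGA")  # UCU at 7-8-9 with the nearby AGA

print("intended:   ", intended)
print("alternative:", alternative)
```

Under this simple check the intended pairing starts with a G-U wobble pair, while the alternative is three Watson-Crick pairs, which is at least consistent with the dot plot showing the alternative as a competitor.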

Ding

  • 94 Posts
  • 20 Reply Likes
One thing I think is interesting in terms of failure of RNAfold to predict results is that none of the dotplots I ran for Berex Star Two (using each of the four energy models) showed any probability for the GAU at 17-18-19 to bond with the GUC at 57-58-59, which turned out to be the major weak point in the design.

Chris Cunningham [ccccc]

I'm curious to see how the various tetraloops hold up since the tetraloop reference table went up and showed us some numbers to test against.


  • C[GGAA]G, G[GAGA]C, C[GCAA]G, and G[GUGA]C. [Ding's Star Improved]

  • G[GAAA]C, C[GAAA]G. [Berex Star Two].

  • G[GGAA]C, C[GAGA]G. [PentaPuppy2]

  • C[GAAA]G, G[GAAA]C, G[UUCG]C [1337]

  • G[GAGA]C, C[GAAA]G, U[GAGA]G, G[GUGA]C. [Alpha Centauri]

  • G[GAAA]C. [Twilight]

  • C[UUCG]G. [46 - Return of the RNA]

  • C[UUCG]G, G[GUGA]C, G[GAAA]C, G[GAGA]C. [Twinkle Twinkle]
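
To see how much data each tetraloop will get this round, the list above can be tallied with a few lines of Python. This is just a bookkeeping sketch; the dictionary is copied from the list, and the names `TETRALOOPS`, `loop_counts`, and `context_counts` are illustrative.

```python
from collections import Counter

# Tetraloops per design, written as closing base, [loop], closing base.
TETRALOOPS = {
    "Ding's Star Improved": ["C[GGAA]G", "G[GAGA]C", "C[GCAA]G", "G[GUGA]C"],
    "Berex Star Two": ["G[GAAA]C", "C[GAAA]G"],
    "PentaPuppy2": ["G[GGAA]C", "C[GAGA]G"],
    "1337": ["C[GAAA]G", "G[GAAA]C", "G[UUCG]C"],
    "Alpha Centauri": ["G[GAGA]C", "C[GAAA]G", "U[GAGA]G", "G[GUGA]C"],
    "Twilight": ["G[GAAA]C"],
    "46 - Return of the RNA": ["C[UUCG]G"],
    "Twinkle Twinkle": ["C[UUCG]G", "G[GUGA]C", "G[GAAA]C", "G[GAGA]C"],
}


def loop_only(entry):
    """Return the four loop bases between the square brackets."""
    return entry[entry.index("[") + 1 : entry.index("]")]


# Count by loop sequence alone, and by loop plus closing pair.
loop_counts = Counter(loop_only(e) for loops in TETRALOOPS.values() for e in loops)
context_counts = Counter(e for loops in TETRALOOPS.values() for e in loops)

for loop, count in loop_counts.most_common():
    print(f"{loop}: {count}")
```

GAAA is by far the most common loop in this round, so it should give the clearest comparison against the reference table, while GCAA appears only once.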


Ding

A couple quick thoughts on tetraloop results so far...

In PentaPuppy2, the tetraloop at 42-47 ran into trouble. It's a GGGAAC loop with an AU pair at 41-48, and the Gs slipped to form a GC at 43-47, leaving a GAA triloop at 44-45-46.

Something very similar also happened in 1337 in the tetraloop at 62-67. This one's a GGAAAC loop with an AU pair at 61-68 and again the Gs slipped to form a GU at 61-68 and GC at 62-67.

Aside from those two, all the other tetraloops in the designs analyzed so far seem to have held up well.

One other comment, though, on scoring. Since rhiju explained in this thread how points are awarded for nucleotides that are supposed to be unbonded, I took another look at the tetraloops in these five designs. It looks like the accessibility of Gs in xGAAAx tetraloops is usually still below the threshold for loops (so you lose points on them), but in CGGAAG and GGAGAC they tend to be okay (they'll still show up as blue on the graphic, but you get points for them). I'll try to keep track of this as more results come in.

Ding

I'm also curious about tetraloops.

A couple other things I'm keeping my eye on this round are what the effect is of leaving the central multiloop as all A versus the various changes made in some of the designs, and the bonding pairs closing the stems.

In particular, I'm curious to see whether the AU and UA closings in "Twinkle Twinkle" will hold. Also, in the designs that use all GC or CG to close them, whether it matters if they're all oriented the same way (as in "Berex Star Two", "Ding's Star Improved", and "Twilight" which use all GC or "46 - Return of the RNA" which uses all CG) or a mix (as in "PentaPuppy2" and "1337").

In terms of the multiloop, here are the variations being synthesized:

xAAAAxxAAAAxxAAAAxxAAAAxxAAAAx Twilight, Ding's Star Improved, Berex Star Two
xGAAAxxGAAAxxGAAAxxGAAAxxGAAAx PentaPuppy2
xACAAxxACAAxxACAAxxACAAxxACAAx Twinkle Twinkle
xAUUAxxAUUAxxAUUAxxAUUAxxAUUAx 46 - Return of the RNA
xGUGGxxGGGAxxGGGAxxGGGAxxGGGAx Alpha Centauri
xACAAxxAUCAxxACCAxxAAUAxxAAUAx 1337
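
A quick way to compare these variations is to count the bases in the multiloop segments. The Python sketch below does that, assuming that each `x` marks a stem-closing position and that the letters between them are the unpaired multiloop bases; that reading of the notation, and the helper names, are my assumptions.

```python
from collections import Counter

# Multiloop patterns copied from the list above.
MULTILOOPS = {
    "Twilight": "xAAAAxxAAAAxxAAAAxxAAAAxxAAAAx",
    "PentaPuppy2": "xGAAAxxGAAAxxGAAAxxGAAAxxGAAAx",
    "Twinkle Twinkle": "xACAAxxACAAxxACAAxxACAAxxACAAx",
    "46 - Return of the RNA": "xAUUAxxAUUAxxAUUAxxAUUAxxAUUAx",
    "Alpha Centauri": "xGUGGxxGGGAxxGGGAxxGGGAxxGGGAx",
    "1337": "xACAAxxAUCAxxACCAxxAAUAxxAAUAx",
}


def segments(pattern):
    """Split a pattern on 'x' and keep the non-empty unpaired segments."""
    return [part for part in pattern.split("x") if part]


def composition(pattern):
    """Count each base across all unpaired segments of one multiloop."""
    return Counter("".join(segments(pattern)))


for name, pattern in MULTILOOPS.items():
    counts = composition(pattern)
    total = sum(counts.values())
    summary = ", ".join(f"{base}={counts[base]}" for base in sorted(counts))
    print(f"{name:<24} {summary} (A fraction {counts['A'] / total:.2f})")
```

Once the synthesis scores are in, the A fraction per design could be set against how well each multiloop held its shape.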

dimension9

God, I am loving this thread!!!!! Great Comparative Statistics gathering Chris & Ding! :) We are all going to learn A LOT from this one!!!

dimension9

I noticed that one distinct group of designers so far seems to have emphasized the "Frequency of MFE" and "Ensemble Diversity" numbers over the colors on the "Positional Entropy Color Plot" and a lower, flatter "Entropy/Position Graph", while others did just the opposite: they seem to have designed mostly around achieving a great entropy profile in both of those entropy displays.

I have not gone through and tabulated a list of these, which designs represent which approach (though perhaps I still will if time permits), but I will be very interested to see which group of designs seem to do better overall.

Could be very instructive.

Chris Cunningham [ccccc]

My main problem with the Positional Entropy Color Plot is that I can't get a readout of that data in any way that allows me to compare it to other things. The scale is different every time, and it seems arbitrary sometimes. If you find some way to post comparative statistics from those diagrams, please do: but the graphs with the spikes are useless unless you can get them on a common scale and the color-coded things are the same way.

Edit: Also, which designs fall into which groups? It would be nice to have you say "the Ensemble Diversity people are designs 1 3 and 4, while the Entropy/Position Graph people are designs 2 5 and 6" so that after the results come back I will have a better idea who to look at.

Personally, Twinkle Twinkle ignored RNAFold entirely. :P

Ding

I think the Positional Entropy Color Plot is fantastic for telling which parts of a single design might be problem areas, but as ccccc says, no good for comparing between designs. I know I used it as an important tool in trying to make Ding's Star Improved better than Ding's Star, though in comparing designs I look more at ensemble diversity, maximum entropy, and MFE frequency.

I wouldn't say that the Entropy/Position Graph is entirely useless for comparing between designs - at least the scale is clearly labeled, so you can find where the highest peaks are and compare that to where they would be on another graph (whether that's in a flat line at the bottom or way the heck up off the graph). But certainly just looking for bumpiness is going to lead to scale problems (not to mention entropy is a logarithmic function so the difference between say 0.1 and 0.2 isn't the same really as the difference between 0.8 and 0.9).
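
To illustrate the point about entropy not being linear, here is a small Python sketch. It assumes the positional entropy is the Shannon entropy of one base's pairing distribution (the probabilities of each possible partner plus the probability of staying unpaired) and uses log base 2; both the exact definition and the base RNAfold uses are assumptions here, and the function name is hypothetical.

```python
import math


def positional_entropy(probs, base=2.0):
    """Shannon entropy of one base's pairing probabilities.

    `probs` lists the probability of each possible state of the base
    (each partner, plus unpaired) and must sum to 1.
    """
    if not math.isclose(sum(probs), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log(p, base) for p in probs if p > 0)


# A base with one intended partner and one competing alternative.
for p_intended in (0.99, 0.95, 0.90, 0.75, 0.50):
    h = positional_entropy([p_intended, 1.0 - p_intended])
    print(f"intended pair at {p_intended:.2f} -> entropy {h:.3f}")
```

Equal steps in entropy do not correspond to equal steps in probability, which is why comparing raw peak heights between two graphs can mislead even when the scales are labeled.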

Ding

Something else I'm interested in seeing this week is to compare the results of the different energy models available in RNAfold to the synthesis results.

Mostly we've been using Turner 1999, since that's what EteRNA uses. But I ran the eight selected designs through the other three models available (Turner 2004, Andronescu 2007, Matthews DNA parameters 2004) just to compare. In all cases I used the "unpaired bases participate in at most one dangling end" option.

Here are the results:

[Results table (image) not preserved.]

edit to add an explanation of a couple of columns:

"MFE and centroid same shape" means whether or not the graphic result for the minimal free energy configuration and the centroid configuration were identical. Y means yes. Y* means yes, but neither matched our target shape. N means no, but one matched our target shape. N* means no and neither matched our target shape.

Entropy Range is the maximum entropy at any single base in the entire sequence, as read off the Entropy/Position graph. Entropy Range Unlocked Structure is my eyeball estimate of the maximum entropy at any unlocked base, again read off the Entropy/Position graph. Some of these are more accurate than others, because of scale differences in the graphs.
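
For anyone repeating this comparison, the Y / Y* / N / N* labels described above can be written down as a small Python function. This is a sketch that assumes the three structures are available as dot-bracket strings (the original comparison was done by eye on the graphics); `shape_label` is a hypothetical name.

```python
def shape_label(mfe, centroid, target):
    """Label a design by comparing MFE, centroid, and target structures.

    Y  : MFE and centroid are identical and match the target
    Y* : MFE and centroid are identical but do not match the target
    N  : MFE and centroid differ, but one of them matches the target
    N* : MFE and centroid differ and neither matches the target
    """
    if mfe == centroid:
        return "Y" if mfe == target else "Y*"
    return "N" if target in (mfe, centroid) else "N*"


# Toy dot-bracket structures, made up purely for illustration.
target = "((..))"
print(shape_label("((..))", "((..))", target))
print(shape_label("(....)", "(....)", target))
print(shape_label("((..))", "(....)", target))
print(shape_label("(....)", "......", target))
```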

I think the rest of it should be familiar to people who have been looking at RNAfold.

alan.robot

  • 91 Posts
  • 36 Reply Likes
I'm curious myself how arbitrary/realistic the dangling end treatment for eterna is. According to my readings, the end effect should be approximately the same for A's AND G's, but somehow eterna really likes G's as people have discovered for stabilizing loops and bulges. Maybe I'm missing something. . . .

http://rna.urmc.rochester.edu/NNDB/tu...

There are also other settings that allow for dangling ends on both sides of a helix instead of just one side, and coaxial stacking which is important for multi-loop junctions.

pbangham

  • 2 Posts
  • 1 Reply Like
Very interested to see how the reality matches up to the modelling - would like to see if anything comes out that completely contradicts what 'should' work, and therefore wouldn't have a chance of being synthesised outside of a programme like this.

Matt Baumgartner [mpb21], Alum

  • 128 Posts
  • 33 Reply Likes
Fairly off topic, and I'm not sure where a good place to post this is, but I came across a paper about RNA folding algorithms, and ViennaRNA is mentioned.
I'm not sure if everyone can read it. If you are at a university, you likely can, but I don't know about everyone else.

http://www.nature.com/nbt/journal/v22...
btw, it's only 2 pages.

Edit: Go to this link for a publicly accessible link

chaendryn

  • 29 Posts
  • 1 Reply Like
Can't access it, Matt :( Not at university unfortunately

Matt Baumgartner [mpb21], Alum

alan.robot sent me a publicly accessible link.
ftp://selab.janelia.org/pub/publications/Eddy-ATG5/Eddy-ATG5-reprint.pdf

Thanks, alan!

chaendryn

Awesome :) Thanks Alan and Matt