Is NUPACK's energy estimation of flush stacking leading us astray in the OpenTB puzzles?

  • 2
  • Question
  • Updated 3 years ago
@nando, @rhiju: I've been noticing a number of designs being submitted that rely on what seems to be to be a mis-prediction on the part of NUPACK.

Compare the following prediction, which seems reasonable:



to this is one, where I shifted bases 21-25 to the left, creating the potential for two flush stacks:

NUPACK predicts that the the reporter will not bind in this scenario, where it would if there is a base separating the two stacks.

To understand why NUPACK made this prediction, I set the target mode to have both stacks form:


Nupack is predicting a 0.0 energy bonus between the two stacks.  My understanding of the Turner 2004 parameters, on the other hand, is that flush stacks should be evaluated as though both backbones were unbroken. In this case, that corresponds to a -1.7 value instead of 0.

I know the very low concentration of the reporter oligo that we are using in this round leads to predictions I am not used to seeing.  (Specifically, the estimated energy for a folding that binds the reporter can need to be not just lower, but a lot lower than the non-binding folding energy, for the former to be predicted.  And I understand why that is.)  But I can't see how the low concentration would be a factor in the comparison above.  Am I missing something?

I don't expect Nando to try to tweak NUPACK.  But if I am right, we can at least alert players that designs that rely on flush stacking to eject the reporter are not likely to work that way in vitro.
Photo of Omei Turnbull

Omei Turnbull, Player Developer

  • 980 Posts
  • 308 Reply Likes

Posted 3 years ago

  • 2
Photo of rhiju

rhiju, Researcher

  • 403 Posts
  • 123 Reply Likes
Omei, that's potentially an important obsercation. and actually that does look like a bug in NUPACK.

I know michelle was looking at how the code treats stacked helices, and it has some 'special' code to handle them, which may warrant a closer look.

but more likely is that in multistrand mode, NUPACK 'forgets' to handle flush stacks. if that's the case, perhaps if we find the bug fix we can implement and also let the NUPACK devs know about it.

@nando, any thoughts?
Photo of wuami

wuami, Researcher

  • 22 Posts
  • 6 Reply Likes
I haven't specifically looked into coaxial stacking in the multistrand case, although I agree that it seems likely that that's what is happening here.
Photo of nando

nando, Player Developer

  • 388 Posts
  • 71 Reply Likes
About the specifics of the presented issue: to begin with, NuPACK's engine is tailored for older parameters (1995-1999) and is missing some code for handling parameters that were introduced only in Turner 2004. This is not necessarily the case of the one you're mentioning, but I wanted to make it clear that there are things specified in Turner 2004 that current NuPACK builds *cannot* even handle.

Also, in the case you're presenting, I would be quite doubtful that Turner 2004 has it right in this case. Really? A break in the backbone and the stability is supposed to be exactly the same? I propose people make following experiment: take 2 identical ladders. On one of them, use a saw to break through one of the rails, and now test and compare their stability thoroughly.

Now, from a general point of view: our models are just that, models. They are clearly imperfect and have always been. If they had been good enough, Eterna players would *not* have been capable of intuiting methods to improve the design of single state structures like they did a couple years ago.

Another important factor is that the MFE is not necessarily a very good predictor of how RNA strands behave in solution. It is a reasonable start, but it's no guarantee, by far, no matter what the model or the folding engine. 

In short: I don't believe that it is worth spending much time investigating the presented issue. Many other factors can influence the result/prediction much more strongly anyway.
Photo of Omei Turnbull

Omei Turnbull, Player Developer

  • 980 Posts
  • 308 Reply Likes
Good idea about doing experiments.  I'll propose a different set of experiments, though. Instead of ladders, we could use RNA molecules.  As I understand it, there is a crazy professor at Stanford who is willing to run experiments on RNA molecules for no charge.  In fact, the latest thing I heard is that he is actually encouraging people to publish the results of their experiments in scientific journals.  Now that's really crazy.
Photo of rhiju

rhiju, Researcher

  • 403 Posts
  • 123 Reply Likes
@omei it might not take long for you to verify this omission of stacking bonus in multistrand -- just send in some of these test examples onto the NUPACK server and see if flush stack bonuses go in properly for single-strand cases but not multi-strand. 

if that's the case, the nupack source code is available for download -- perhaps if you or others check it out you might find a few lines that would fix it. And then we could easily implement it in the game as well as coordinate with the nupack developers to see if they'd be interested in updating their code.
Photo of Omei Turnbull

Omei Turnbull, Player Developer

  • 980 Posts
  • 308 Reply Likes
@rhiju: following up on your suggestion to see if flush stack bonuses go in properly for single-strand cases but not multi-strand, I created two series of flush or near flush stacking scenarios, one using a single strand and one using 2. At least for the closing CG/UA pairs that I tried, NUPAC makes no distinction between single and two strands when predicting the effect of inserting additional As between two flush stacks.  In both cases, the predicted MFE drops by .9 kcal/mol for the first A, and .3 for the second.  
Photo of Omei Turnbull

Omei Turnbull, Player Developer

  • 980 Posts
  • 308 Reply Likes
It looks to me that NUPACK is indeed leading us astray.  In R104, I submitted some designs that ignored the stated puzzle goals and instead did a controlled test of stacking predictions.  I've only looked at one of my better-thought-out experiments so far, but the results seem unambiguous. 

For the A (AND) B - RO puzzle, I submitted 7 designs with the name 
Omei Coaxial Stacking Series A v<N>, where N varied from 1 to 7.  Here is state 4 of v4:


In state 4 (both A and B present at 100nM), all but the 16 center bases are tied up either in static stems or by one or the other oligo.  The only difference between the 7 variations is where the reporter complement is positioned.  In v1, it is positioned all the way to the left of the 16 center bases; in v7, it is all the way to the right.

NUPACK predicts that the MFE is lowest for the middle 3 positions, increased by .02 kcal where the there is a single A separating the reporter complement from either of the other stacks, and significantly higher when the stacks abut.  In fact, when slid all the way to the left, NUPACK predicts that the MFE structure doesn't have the reporter binding.

The lab data appears to say the opposite:


Unless I am just totally confused (which is always a possibility), low values of KD100nM_A_B say that it takes only a small concentration of the the reporter for the reporter to bind to the design. Higher values say that it takes higher concentrations, i.e. the binding of the reporter is weaker.  So the lab data seems to say that the binding of the reporter is stronger the closer the reporter stack is to one or the other of the A or B oligo stacks.

Please (gently) correct me if I've misinterpreted the data.
Photo of Omei Turnbull

Omei Turnbull, Player Developer

  • 980 Posts
  • 308 Reply Likes
This observation for the reporter oligo, and the generalization that co-folding gets stronger, not weaker, as the gap between the co-folded helices gets close together, is turning out to explain a lot of things that are otherwise puzzling in the data, both from R104 and previous rounds.

I'l start with an example from the R104 Sensor A - RIRO puzzle.  I submitted a series of 7 designs that had (at least appeared to have had) a simple pattern that has always done well for me in the past.  They all had (almost) the same active switching sequence, but unlike the example above, the surrounding bases weren't systematically controlled.

Here is NUPACK's prediction for one of them:

I've marked the strong kernel attractions that can bind to either the reporter or the A oligo.

Much to my surprise when I saw the very first preliminary data, five of these seven had a fold change less than 1.0, meaning that instead of the reporter being ejected by the A oligo, it was supposedly binding more strongly when A was present.  (In fact, this design was supposedly acting as a good RI/RI switch, with a fold change of 13.8 when viewed that way.) Since the data didn't make sense, I mentioned it to Johan, with the idea that it might point to a problem in some step of the processing.  But he found nothing wrong in the processing, so eventually I had to accept the data and figured out what really happened. 

The data implies this is what actually happened:

The energy reduction inherent in adjacent stacking was allowing the design to bind simultaneously to the A oligo and the reporter, making the net energy of the dimer lower than having only A bound.  NUPACK wouldn't predict this because its energy model adds a penalty for the adjacent stacking instead of a bonus. 

Note that I have also marked another pair of adjacent stacks in yellow.  Thus, the folding of this design, which had the lowest fold change of the seven (as a RIRO), is reinforced by two adjacent stacks.  Other designs in the series had differing number of bases between these two stacks, and that seems to be the major contribution to the ordering of KD values for this state.  There are no other designs with this second adjacent stack; the only design with one separating base between the stacks has the next lowest KD, and the only design that has a fold change significantly greater than 1.0 has no supporting stack at all on that end.
(Edited)
Photo of Omei Turnbull

Omei Turnbull, Player Developer

  • 980 Posts
  • 308 Reply Likes
I did another series of designs for the A (AND) B - RO puzzle, just like the one above but with the A and B complements reversed.


Results for the last two designs got lost in the processing, but the other 5 tell the same basic story.



Again, the reporter binds much more strongly when it is flush against the oligo stack.  

Here, there is no confirmation that  the single-base separation has an energy value between those for 0 and 2 base separations.  But the error factors say that the differences among the latter four numbers shouldn't be taken as being statistically significant.
Photo of Atanas Atanasov

Atanas Atanasov

  • 42 Posts
  • 15 Reply Likes
Last time I checked v6 and v7 were not lost, just under strange names (probably their initial names), search in the sequence column to find them.
Photo of Omei Turnbull

Omei Turnbull, Player Developer

  • 980 Posts
  • 308 Reply Likes
That's what I thought at first, too.  But the ids of those incorrectly named designs don't match the ids of my designs, either.  The ids are what ties the eterna database to the measurements, so I'm pretty sure that those two lines of data go with their ids, whatever designs those happen to be.

Thank you for caring!  
Photo of Atanas Atanasov

Atanas Atanasov

  • 42 Posts
  • 15 Reply Likes
Did you enter those two designs under those names initially and later delete and resubmit under the new names? Maybe the DB for some reason kept the first entries.
Photo of Omei Turnbull

Omei Turnbull, Player Developer

  • 980 Posts
  • 308 Reply Likes
No, I didn't change the name.  The flow of events that led to the publishing of these incorrect entries is collectively known, though I'm not sure any one dev has the whole picture.  Only about 1 percent of the entries were lost, and I think most of those disappeared completely rather than showing up under an assumed name.  I've given a heads up to the three players that seemed to be most affected.
Photo of Omei Turnbull

Omei Turnbull, Player Developer

  • 980 Posts
  • 308 Reply Likes
@rhiju: You had suggested that NUPACK's mis-prediction might be a bug in the code.  But now that Vienna has a server for predicting dimer structure, I ran a test on it.  Here's the results of two runs, the first with the reporter complement directly adjacent to the main RNA's hairpin, and a second with a single A inserted between the reporter complement and the hairpin stem.





Vienna also believes that abutting the stems is energetically unfavorable.

This makes me wonder whether there just isn't good experimental data for dimeric coaxial stacking, and whether we should submit more experiments of this type in the current round with the intent of publishing the data.  

Your thoughts?
Photo of rhiju

rhiju, Researcher

  • 403 Posts
  • 123 Reply Likes
hey, that's a super-interesting idea. can you outline a systematic set of expts that would let us measure the dimeric coaxial stacking energy in many different ways?
Photo of Omei Turnbull

Omei Turnbull, Player Developer

  • 980 Posts
  • 308 Reply Likes
I think I would start out by running the same series (7 designs) using the oligos in the present round, which are different than the first:  For example,

Using one of the A, B or C oligos at either end, this would be 9 series.  The primary goal here would be to compare the flush values (at either end), against the middle values, where presumably interaction between the two oligos is negligible.

(We could try for a larger variety of base pairings by considering designs of this form:
But I don't think we should use these, at least initially, because of the fact that in the experiment, the ends of our design are not really free, but tethered to DNA molecules, and we really don't know how that would affect things.)

That's only 63 synthesis slots.  The next tier of experiments, I think, should be to measure what happens when there are one or two single strand strands emerging from the abutment of the stacks, e.g. 

The number of possibilities here (e.g. which one or both of the helixes has an extending strand, the length of the strands, the bases that make them up) quickly explodes, so we could take advantage of how many synthesis slots were available.  I would have to think more about what to prioritize once I had some notion of how many slots were available.

But before spending too much time working on the finer details of this proposal, I would want to see if others have alternate suggestions for entirely different experimental configurations.
Photo of Atanas Atanasov

Atanas Atanasov

  • 42 Posts
  • 15 Reply Likes
My suggestion is that you reduce the complexity by using only 3 interacting oligos. It would be good to have some way to test all possible pairs next to the nick - the energy would probably be different for different pairs. Testing all pairs with these limited inputs might be hard though.
Photo of Omei Turnbull

Omei Turnbull, Player Developer

  • 980 Posts
  • 308 Reply Likes
Thanks for your interest, Atanas.  

I was thinking that each specific design should bind to only three oligos (the reporter and two others, one at each side). I'm thinking that is what you meant.  Or did you mean only gather data on one combination of the A/B/C choices.

And agreed it would be good to get some data for all possible pairs across the gap.  Although the experimental setup doesn't give us enough flexibility to do that without having one or both single strands sticking out, I think that should be one of the considerations for choosing designs for the second, more open ended, level of experimentation.
Photo of Atanas Atanasov

Atanas Atanasov

  • 42 Posts
  • 15 Reply Likes
I mean the design, the reporter and one oligo, so that you can test the gap between the reporter and the oligo. Considering we can't test all possible pairs anyway, maybe the design+reporter is also a viable test even though the design has one end tethered, the other could be used for the test.
Photo of Omei Turnbull

Omei Turnbull, Player Developer

  • 980 Posts
  • 308 Reply Likes
@Atanas: OK, I see how your count of three differed from mine.

As far as "test the gap between the reporter and the oligo", my experiment did that.  But the gap on one side was widening at the same time the gap on the other side was narrowing.  So now I'm thinking that you mean there might be a better way to design the experiments that would better isolate the contribution of just one of the gaps.  I can think of other ways to design a series, but do you have a specific alternative in mind?

As for what is actually happening at the ends of the our design, unless I misunderstood his comment, Rhiju recently said that there was a DNA strand at both ends of our design RNA.  @rhiju, did I understand correctly?
Photo of Atanas Atanasov

Atanas Atanasov

  • 42 Posts
  • 15 Reply Likes
I think the most flexible would be the one where only the design and the reporter interact, because we can tweak the design as we wish, while the input oligos are fixed. But if both ends of the design are tethered this won't be possible.
Photo of Omei Turnbull

Omei Turnbull, Player Developer

  • 980 Posts
  • 308 Reply Likes
Ok, I think I have a proposal for how to quantifying dimeric stacking energies.

@Atanas, your suggestion that we reduce the complexity by limiting ourselves to three oligos at a time really helped me come up with this proposal.

I'm creating a document Quantifying the thermodynamics of dimeric stacking to put in more details, but here's the main idea:



Extending this outline in three dimensions (rightward, downward, and putting A, B or C on either end of the reporter), we can systematically create an extremely large set of data on dimeric stacking.

The interpretation of the data, though, is going to be somewhat challenging, and refining that plan will probably influence which of the above measurements are most important.

I think the goals of the analysis should be
  1. First fit the data to a nearest neighbor model that only takes into account the four bases that form the base of the two helices, i.e. ignoring the single-stranded dangles.  This is clearly an oversimplification of the situation, but should be a lot more accurate than what NUPACK is currently using.
  2. Extend the analysis to include the two dangling bases next to the helix.  These bases are clearly important, because they can form non-canonical bonds with the other helix.  The most commonly interaction might be a dimeric version of the A-minor motif, where dangling A would fit into the minor groove of the other helix, forming a base triple.  A reasonable good "nearest neighbor" model for dimeric stacking will probably need to consider these two dangling bases in addition to the four that make up the nicked 0-loop.
Is anyone interested in collaborating on this effort?  You wouldn't need to have a lot of Eterna experience.  The primary skills that would be useful would be programming (for generating the designs algorithmically and perhaps some manipulation of the data after collection), data analysis, and paper writing.
Photo of Atanas Atanasov

Atanas Atanasov

  • 42 Posts
  • 15 Reply Likes
Have you checked if nupack reports 0 for the case of a dangling base(s). As far as I understand their algorithm, the case of dangling bases is evaluated as a outer loop with that consists of the dangling bases and should report non-zero energy for that loop. That is, the second and on series probably have already been modeled by nupack.

Also if I correctly understood, there is a more updated model of Turner? Maybe someone already researched this?

Another concern is how do you actually measure this energy. My understanding is that dG reported by the experiment is the difference in the reporter energy of being bound and unbound. Would that be enough to isolate just the energy of the gap? I don't see any relation between the dG reported by the experiments and the ones in the simulation regarding the energy of the reporter binding (even if I disassemble it into Kd-s per complex and include pseudo knots). So maybe a model that explains the dG-s reported by the experiment and their relation to the simulation might be required.
Photo of Omei Turnbull

Omei Turnbull, Player Developer

  • 980 Posts
  • 308 Reply Likes
Good questions. :-)

I haven't looked into NUPACK's estimate on the dangling base associated with stacking of separate oligos.  At this point, I'm working on the presumption that NUPACK is not helpful for predicting our lab results once the two oligos become adjacent.  That may turn out to be unduly pessimistic, but at the moment I am focusing on the opportunity to gather experimental data from the current round, which is closing in only a couple of days.

I have looked at the Turner parameters, and can say that they don't address the question of multiple oligos at all.  I've also searched the net for any sign of thermodynamic parameters for adjacent dimeric stacks, and have come up empty. But I'm not an expert in the field, so this doesn't say a lot.

As for calculating the energy difference between two conditions, it is my understanding that the natural log of the ratio of the KDs (or the difference in the logs of the KDs) is a measure of the energy difference, in units of kT, between the two experimental states. (Given, of course, some simplifying assumptions).  But I really need to get more input from @rhiju before saying anything about that with confidence.
Photo of Atanas Atanasov

Atanas Atanasov

  • 42 Posts
  • 15 Reply Likes
Do you know if the experiment uses the input oligos as defined in the games or it uses the full-length tuberculosis RNAs?
Photo of Omei Turnbull

Omei Turnbull, Player Developer

  • 980 Posts
  • 308 Reply Likes
Yes, the same as those in the game.
Photo of Omei Turnbull

Omei Turnbull, Player Developer

  • 980 Posts
  • 308 Reply Likes
@atanas, prompted by your question about how the energy contribution due to stacking, I've modified my proposal;  I realized that some of the data I collected in the previous version wouldn't have a simple comparison.

Here's the current screenshot from https://docs.google.com/document/d/1-G1pVWzacMBXuwISi2qaGfGrluHHdWEpE-qzYeUQD14/edit:



The idea here is to make sure that every observation of directly connected stacks has a an instance of a single base separation to be compared against.  

I've actually added two comparisons, using either an A and a U for the intervening base.  Ideally, it wouldn't make a difference.  But it probably will, and I would like to know what that is.  Hopefully, that knowledge can be used to refine the estimate of how much of the difference is due to the coaxial stacking, as opposed to some base pairing involving the separating A.
Photo of Atanas Atanasov

Atanas Atanasov

  • 42 Posts
  • 15 Reply Likes
I think dangling oligos would create too much interference as the number of dangling bases increase. That is, the first series look ok, the next series start to introduce more and more factors that could contribute to the energy. So I think we should reduce the tests to the 0-gap vs 1-gap cases.

We can measure the difference between the 0-gap and the 1-gap it seems by the dG-s of the two experiments. What bothers me is the to find the dG of the 0-gap we need to know the dG of the 1-gap. We can take the 1-gap energy from the current model used by nupack.
Adding a test with A replaced with U is a good idea because then we would have to values for the 0-gap vs 1-gap model and if we both values match then probably the model is good and our measurement might be ok. Maybe we should try with not only U but C and G.
Still there is the danger that this extra base might influence the dG by influencing the pairing probability and thus preventing or helping the reporter bind.
In general the experiment is very limited by the fixed inputs and that the inputs and the reporter are not equal in their chance to interact with the design because of the way the experiment is set up. Thus probably the result wouldn't be deemed very reliable. But if we are lucky to get the exact same energy value in all 4 measurements this might be a good start.
Photo of Omei Turnbull

Omei Turnbull, Player Developer

  • 980 Posts
  • 308 Reply Likes
Agreed that there is a possibility of longer dangles influencing the results.  For example, it is known that triplet stranded RNA helices can form.  But if we limit the experiment to no more than single dangles, we only have 6 * 3 = 18 data points we can measure with the fixed set of oligos.  But all of the other data points, 1680 in all, are relevant to future rounds of the OpenTB challenge because all can be present in a plausible switch.  What's in question is how well be can distinguish the relevant contributions of pi-pi stacking, pair-pair bonding, other hydrogen bonds, backbone torsion, ... .  And that's always an issue issue in interpreting RNA thermodynamic data.

I'm think that when we get the results we would examine them critically before deciding exactly how to process it and what to claim as a conclusion. For example, if ddG values for some (or all) of the measurement points are clearly outside the distribution predicted by most, we might eliminate extreme outliers and/or fit the data to a more complex model, perhaps one that includes dangle lengths, and not claim it is all due to pi-pi stacking. 
Photo of Omei Turnbull

Omei Turnbull, Player Developer

  • 980 Posts
  • 308 Reply Likes
I think I have a better plan for picking the non-adjacent stacking case against the adjacent one.

As I was writing the script to generate the designs, I realized that my plan for using both A and U as spacing bases wouldn't be sufficient in the case where the two nearest dangles were A and U; the spacing base would pair with one or the other of the dangling bases.

A better plan, I think, is to always use a spacing of two, with each of the spacing bases selected so as not to form a pair (of the strong ones Eterna recognizes.)

So for the adjacent stacks on the left, a possible comparison case would be on the right, where the the spacing base U would be randomly chosen between U and C, since A and G would form a pair with the dangling U.  Similarly, the second spacing base would be chosen between A and G, so as not to pair with the dangling G.

This way,
  • This requires only two designs per data point,
  • Neither of the stacks will be inadvertently extended, and
  • We'll get sample of all the various adjacent bases, so we can better distinguish the base stacking energy from non-classical hydrogen bonding of the unpaired bases.
Photo of Omei Turnbull

Omei Turnbull, Player Developer

  • 980 Posts
  • 308 Reply Likes
@rhiju, in one of the sets of data I published above,



the KD values for stacks that were flush and separated by a single A were 0.28 and 5.38.  It is my understanding that the natural log of the ratio of the KDs is a measurement of the energy difference, in units of kT, between the two experimental states. In this case the computed value is -2.96 kT

Two questions:
  1. Is this indeed how we should interpret Johan's measurements, and
  2. If so, what is the conversion of kT units to the kcal/mol units we normally use? 
Photo of rhiju

rhiju, Researcher

  • 403 Posts
  • 123 Reply Likes
1. yes. that's totally valid.

2. k = Boltzmann constant[ 0.0019872041(18), kcal/(mol⋅K)] and T = 37°C = 273.15+37 K = 310.15 K, so kT = 0.616 kcal/mol.   In your example, -2.96 kT = - 1.8 kcal/mol. Pretty big!