Switch Scores for EteRNA Switch Puzzles

An exciting direction in EteRNA is the study of riboswitches!

We have recently finished our pilot experiments with great initial success. Using a new technique that measures switching directly on a sequencing chip, we can observe the switching of thousands of designs at once. The signal is generated by a fluorescent RNA-binding protein, MS2. Instead of the standard EteRNA score, which is based on the correct folding of each base, we have introduced a new Switch Score.

The Switch Score (0 - 100) has three components:
1) The Switch Subscore (0 - 40)
2) The Baseline Subscore (0 - 30)
3) The Folding Subscore (0 - 30)
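
As a quick illustration of how the three components relate to the total, here is a minimal sketch. It assumes the Switch Score is the plain sum of the three subscores (which the ranges above suggest, since 40 + 30 + 30 = 100). The function and argument names are made up for this example, and it says nothing about how each subscore is computed from the data.

```python
# Maximum value of each subscore, taken from the list above.
MAX_SUBSCORES = {"switch": 40.0, "baseline": 30.0, "folding": 30.0}

def switch_score(switch, baseline, folding):
    """Combine the three subscores into a 0-100 Switch Score.

    Assumption: the total is the plain sum of the subscores.
    """
    parts = {"switch": switch, "baseline": baseline, "folding": folding}
    for name, value in parts.items():
        if not 0.0 <= value <= MAX_SUBSCORES[name]:
            raise ValueError(
                f"{name} subscore {value} is outside 0-{MAX_SUBSCORES[name]:g}"
            )
    return sum(parts.values())

# A design that folds and binds MS2 well but does not switch at all
# cannot get above 60 under this assumption.
print(switch_score(switch=0, baseline=30, folding=30))
```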

The scoring scheme is summarized below. A more detailed description is given in this PDF:
https://drive.google.com/open?id=0B_N0OA9NROPGel80SG5LM0wtZms&authuser=0

A typical example of a switch puzzle is shown below:


The player designs the structures in [1*] and [2]. To observe the switching we then measure the fluorescent signal of MS2, which binds specifically to the MS2 hairpin seen in [2]. In the absence of FMN, the MS2 should bind and the switch is ON. On the other hand, if we introduce FMN, the ligand in [1*], the switch should be OFF and not exhibit fluorescence.

No switch is 100% ON or OFF in the absence or presence of ligand, but a good switch can come very close (and get a perfect EteRNA Switch Score!). At some MS2 concentration, the difference should be large (e.g., at ~100 nM MS2 in the figure below). In practice, we don't know this concentration beforehand, so instead we perform measurements at many concentrations to obtain binding curves. When the switch turns OFF (red curve), the effective dissociation constant increases. The dissociation constant, Kd, is the concentration where half of the RNA binds MS2.
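
To make the role of Kd concrete, here is a small sketch using the simplest possible binding model (one MS2 site per RNA, fraction bound = [MS2] / ([MS2] + Kd)). Both the model and the Kd values are assumptions chosen for illustration. They are not the fitting procedure or the numbers from the experiments.

```python
def fraction_bound(ms2_nM, kd_nM):
    """Fraction of RNA bound by MS2 under a simple one-site binding model."""
    return ms2_nM / (ms2_nM + kd_nM)

# Hypothetical Kd values for one design (not measured data).
KD_ON_NM = 10.0     # no FMN: MS2 hairpin forms, tight binding
KD_OFF_NM = 1000.0  # with FMN: hairpin disrupted, weak binding

for ms2 in (1, 10, 100, 1000, 10000):
    on = fraction_bound(ms2, KD_ON_NM)
    off = fraction_bound(ms2, KD_OFF_NM)
    print(f"{ms2:>6} nM MS2: ON {on:.2f}  OFF {off:.2f}  difference {on - off:.2f}")
```

In this model the fraction bound is exactly one half when the MS2 concentration equals Kd, and the ON/OFF difference is largest at a concentration between the two Kd values.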


The Switch Subscore quantifies how far apart the Kd's are in the absence and presence of FMN (horizontal distance between the red and blue curves).

The Baseline Subscore is a measure of how close the ON-state is to the original MS2 hairpin (lower Kd is better, i.e., the blue curve should be far to the left).

The Folding Subscore is high if MS2 binds properly in the ON-state at any concentration (the score should be high for the blue curve at high concentrations of MS2, i.e., high values to the right).

In our first experiments, we found that the easiest score to maximize is the Folding Subscore, followed by the Baseline Subscore. These two ensure that the MS2 hairpin is properly formed in the ON-state. The hard one is the Switch Subscore, which is the highest when the energy difference between the states is finely-tuned to the energy conferred by binding to FMN (or other future ligands).

johana, Researcher


Posted 4 years ago


Eli Fisker

Part III - Potential switch pattern?

While doing the color marking of the repeats in riboswitches - as mentioned in the above post - I noticed a peculiar pattern that appeared in many of the switches. I noticed it because I was looking for patterns in the positions of the G segments relative to each other. A lot of the switches had at least one, sometimes more, of their G segments really close together, whereas the C segments often also had some closeness, but not to the same degree. And this pattern kept turning up: (Y)GGNNGG, (Y)GGNGG, and to some degree also the reduced patterns (Y)GNNGG, (Y)GGNG and variations on that theme (Y signifying pYrimidine, meaning C and U bases; N meaning any base). Regularly there is a pyrimidine at the end too, plus one at the beginning of the second G repeat.
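
For anyone who wants to scan sequences for the two full patterns, here is a minimal sketch using a regular expression. Requiring the leading pyrimidine is a choice made for this example (drop the `[CU]` to make it optional), the function name is made up, and the reduced patterns are not covered.

```python
import re

# (Y)GGNNGG and (Y)GGNGG: a pyrimidine (C or U), two G's, one or two
# bases of any kind, then two more G's. The lookahead lets overlapping
# hits be reported.
G_REPEAT = re.compile(r"(?=([CU]GG[ACGU]{1,2}GG))")

def find_g_repeats(sequence):
    """Return (1-based position, matched text) for each hit of the pattern."""
    seq = sequence.upper().replace("T", "U")
    return [(m.start() + 1, m.group(1)) for m in G_REPEAT.finditer(seq)]

# Hypothetical test sequence, not one of the riboswitches discussed here.
print(find_g_repeats("AACGGAGGUUUCGGAUGGA"))
```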

Peculiarly enough, this G segment repeat pattern often appears to land in the switching area. I can also find it in a number of the Eterna switch winners, both FMN and TEP, although not all, and there the pyrimidine start gets lost due to the locked FMN sequence, as this is where the two close double G repeats often turn up. One of the G repeats is often in a stem and the other in a loop. Which reminds me of something else: a number of the G and C repeats are placed in loops, and I think they are left there to initiate the switching.

Later I ran the naturally occurring riboswitches through Vienna RNAfold, to see if this sequence would land in a high entropy area, and it regularly does.


>Magnesium riboswitch mgtA: E. coli. Alteration: Normal.
CUUACCGGAGGUUAUAUGGAACCUGAUCCCACGCCUCUCCCUCGACGGAGAUUAAAACUUUUCCGGUAAGCCCGUCUUUUCACGGCGUUACCGGAUGCGUAAGGCCGUGA

The pseudoknot riboswitches mostly seem to be exempt from this sequence pattern. Perhaps they have another switching mechanism?

The two close G repeats often work this way: in one state, one repeat is embedded in a stem and the other is in a loop area, and the repeat G’s in the loop then help as an anchor for the shift to the other state. This is similar to the twin G’s in FMN, which sit in the aptamer loop but often get bound up when the state is shifted.

Now I think I finally understand why there are so many G and C repeats in riboswitches. I think the C repeats help raise entropy, as do U repeats. Plus, when one makes the entropy of the design higher through repeat sequences, and thus greatly raises the probability that the design can fold into many structures other than the target structure(s), one also needs to make the binding parts stronger = lots of GC = lots of G and C repeats.

I think one can balance the entropy at a good level by playing with the right amount and types of repeats. I even think the different kinds of repeat occur at different frequencies. C and G repeats occur to a much higher degree, whereas A repeats and to some degree U repeats normally dominate in static puzzles. I think the ratio between the different base repeats matters.

Omei Turnbull, Player Developer

Eli, I like your line of thought. I would like to support it from the perspective of the finer details that are ignored in nearest neighbor energy models (which is basically all the state of the art offers).

Consider the following. It is Chimera's "ladder" rendition of the 3D structure of a hairpin from the human 7SK snRNA in complex with arginine. (I selected it just as an example of an RNA that does not seem to rely on changing its shape to perform its function.)



This model comes from NMR imaging, which is capable of "seeing" multiple configurations that the RNA takes on. (Unlike X-ray crystallography, which requires that all the molecules be "frozen" into one configuration, so a crystal can form.)
All the configurations are superimposed here, and you can see that for the most part, the differences are small. (I've called out the one exception, where the uracil bulge will occasionally form a hydrogen bond with the uracil on the other side of the helix.)

In contrast, consider the following NMR model, which is for a riboswitch (specifically, a preQ1 riboswitch in the bound state).



The image in the upper left shows all the configurations. Notice how much more variety (i.e. entropy) there is. In particular, there is a lot of switching of specific hydrogen bonds, while the overall structure remains essentially unchanged. The other three quadrants each show just one of the 21 states that are superimposed in the first quadrant.

What I think is happening is that this local variation in states (substates?) forms a "broad energy valley" of states that increases the stability of the general shape more than the single minimum free energy value suggests.

But entropy is a two-edged sword. If there is a lot of possible variation that stabilizes each of the two desired states (e.g., in the Exclusion case, one and only one of the FMN and MS2 bound), that is good. But variation that allows for neither of them to be bound would result in a "mushy" switch, which wouldn't get good Eterna scores.

Eli Fisker

Hi Omei!

Thx :)

This is super cool! I love the images and the explanation. When I run the two sequences through Vienna, it shows quite accurately where change is happening according to the NMR you show. For the first one, only one base is higher entropy in the bulge region. The general entropy is low, but entropy spikes at the bulge base and its sometime partner.





So while the two designs seem similar in entropy when running them through Vienna (the first one 0.9 and the second one 1), they are not. The higher entropy in the RNA hairpin is a spike at only two bases, but the slightly higher entropy in the riboswitch is spread over the whole stem. The latter fits quite nicely with there being much more movement in the NMR images.
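
The spike versus spread distinction can be put into numbers. The sketch below uses one common definition of per-base (positional) entropy computed from base pair probabilities. It may differ in detail from what the Vienna server plots, and the probabilities are invented for illustration rather than taken from either RNA.

```python
import math

def positional_entropy(pair_probs, length):
    """Per-base Shannon entropy (in bits) from base pair probabilities.

    pair_probs maps (i, j) with i < j (0-based) to the probability that
    bases i and j are paired. The unpaired probability of each base is
    taken as 1 minus the sum of its pairing probabilities.
    """
    partners = [[] for _ in range(length)]
    for (i, j), p in pair_probs.items():
        partners[i].append(p)
        partners[j].append(p)
    entropies = []
    for probs in partners:
        unpaired = max(0.0, 1.0 - sum(probs))
        entropy = 0.0
        for p in probs + [unpaired]:
            if p > 0:
                entropy -= p * math.log2(p)
        entropies.append(entropy)
    return entropies

def bases_above(entropies, threshold=0.5):
    """Count bases whose entropy exceeds a threshold (spike vs. spread)."""
    return sum(1 for s in entropies if s > threshold)

# Hypothetical examples: one uncertain pair vs. a whole wobbly stem.
spike = {(2, 7): 0.5}
spread = {(0, 9): 0.8, (1, 8): 0.8, (2, 7): 0.8, (3, 6): 0.8}
for name, probs in (("spike", spike), ("spread", spread)):
    s = positional_entropy(probs, 10)
    print(name, f"mean {sum(s) / len(s):.2f}", "bases above 0.5:", bases_above(s))
```

Counting how many bases exceed a threshold is only one crude way to tell a local spike from entropy spread along a stem. The threshold of 0.5 bits is arbitrary.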





I find your thoughts on entropy being a double-edged sword interesting. It's needed for getting work done, but it might not be doing what we intended it to.

I can't see anything about entropy in relation to state with Vienna, since it only treats the RNA as single state. It only shows where it thinks there is action. So anything that could show us entropy trends for the different states would be most helpful. Or if we could find some features that can help us tell if one of the states is not going to form.

jandersonlee

I use RNAsubopt a lot, which shows multiple foldings rather than just the MFE shape. I've not developed a bot for switch design using it yet, but if I do I will definitely keep the entropy concept in mind.
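
As a rough sketch of how a list of suboptimal foldings could feed into the entropy idea: given dot-bracket structures (for example pasted from RNAsubopt output), count how often each base is paired. This is an unweighted count that treats every structure equally, so it is only a crude proxy for real pairing probabilities. The function name and the structures are made up for illustration.

```python
def paired_fraction(structures):
    """For each position, the fraction of structures in which it is paired."""
    if not structures:
        raise ValueError("need at least one structure")
    length = len(structures[0])
    if any(len(s) != length for s in structures):
        raise ValueError("all structures must have the same length")
    counts = [0] * length
    for s in structures:
        for i, ch in enumerate(s):
            if ch in "()":
                counts[i] += 1
    return [c / len(structures) for c in counts]

# Hypothetical dot-bracket structures, typed in by hand for illustration.
structures = [
    "((((....))))",
    "(((......)))",
    ".(((....))).",
]
for pos, frac in enumerate(paired_fraction(structures), start=1):
    print(pos, f"{frac:.2f}")
```

Positions that are paired in some foldings and unpaired in others (fractions well away from 0 and 1) are the ones that would show up as higher entropy.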

Eli Fisker

Hi JL!

Thx for the RNAsubopt tip. I will try to take a look at it.

I also really like the thought of a switch bot taking entropy into account. :)

Omei Turnbull, Player Developer

Good news on using ViennaRNA: Version 2.2, which is still considered beta but which is being used on the Vienna Web server, can calculate the partition function for hard constraints. What that means is that you can get statistics that represent the whole ensemble of possible foldings (like entropy and base pairing probabilities) using the same constraints that the Eterna UI currently uses to estimate the bound state MFE. You can find the constraints for the original six MS2 puzzles on Nando's eternadev server, here for example. And just knowing that in the constraint language, "|" means paired, "x" means not paired, and "." means "don't care", you can figure out the proper constraint for any other placement of the FMN aptamer.
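
Building such a constraint string by hand is error prone, so here is a small sketch of a helper that does it from lists of positions. The function name is made up, and the positions in the example are hypothetical. They are not the real FMN aptamer constraint for any puzzle.

```python
def make_constraint(length, paired=(), unpaired=()):
    """Build a constraint string: '|' paired, 'x' not paired, '.' don't care.

    Positions are 1-based.
    """
    constraint = ["."] * length
    for pos in paired:
        constraint[pos - 1] = "|"
    for pos in unpaired:
        if constraint[pos - 1] == "|":
            raise ValueError(f"position {pos} is marked both paired and unpaired")
        constraint[pos - 1] = "x"
    return "".join(constraint)

# Hypothetical positions, not the real FMN aptamer constraint.
print(make_constraint(20, paired=[3, 4, 15, 16], unpaired=[5, 6, 7]))
```

With the command line tools, the constraint string typically goes on the line after the sequence, and RNAfold is run with -p (partition function) and -C (read constraints). Check the options against the version you have installed.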

Note that Nando has pointed out that this hard constraint is not exactly the same thing as modeling the actual FMN binding. But given all the other simplifying assumptions being made, I think it is good enough to yield some insight into the bound state. Nando hacked up (his description) a Vienna 1.x version that does a better calculation. I tried that version out, but it seemed like the current 2.2 version gave more plausible results in the unbound state, so I have been using 2.2 in my current investigations of predicting the MS2 switch scores from the partition functions of the two states. (Which, btw, has yielded some positive results, but I still have more work to do before "publishing" that.)

One other caveat: There was a bug in release candidate 2 of ViennaRNA that often caused it to abort when calculating the partition function in combination with the constraints for our MS2 RNAs. I submitted a bug report, and got a quite prompt response acknowledging the bug and saying it had been found, and the fix would be appearing in release candidate 3. I just checked, and the source code for release candidate 3 is now available. I'm guessing the Web server has also been updated to RC3, but I don't know how to verify it short of trying it out. If you do get the error "unbalanced brackets in make_pair_table", post the sequence. I'll be downloading RC3 so I can use it to check if the bug lives on, or if RC3 just hasn't made its way to the server yet.

jandersonlee

One thought on the "entropy" concept for switches is that "zipper" style switches may show higher entropy on some of the bases. In these sorts of switches, one part of the shape (typically a stack/stem) has bonds that break and can form an intermediate shape with another part (normally another stack/stem) as the two shapes flow back and forth (unzip/zip).

One example (though not necessarily a great one) might be JL ENG3 1.03 in the Exclusion NG 3 lab, where forming the aptamer loop requires breaking one bond in the MS2 arm so that the closing pair can form for the loop. As further pairs break in the MS2 arm, more pairs can form in the stem holding the aptamer loop closed, until enough pairs have formed so that the rest of the arm is more stable in a different configuration.

The zipper style seemed to work well for the miRNA lab at least.


Eli Fisker

I like your line of thought. :)

In the design you show, only the last end of the MS2 sequence seems to split up and zip onto one of the aptamer gate doors, though the middle of it does a slide. In the designs where both ends of the MS2 seem to pair up according to the simulation, entropy gets even higher.

The zipper style dominated the microRNA labs, but was less prevalent in the MS2 labs, although present in some Exclusion 2 designs. I think the reason for this is that the FMN has some complementarity to the MS2 C's - and more can be created by a sequence add-on. This means that it is advantageous to use small and strong C segments for turnoff of the FMN G's or the MS2 G's. Basically FMN is easy to turn off with a small magnet segment. In the MIR lab, there are no FMN's, so it all depends on the MS2 sequence, and a sliding zippering is often what did the MS2 turnoff. Only a minority of the high scorers used magnet C/U segments.

I got really inspired by your comment. I had considered that there might be differences between different switch solving types, but had not looked into it. I have worked up the courage for it now.

There seems to be a difference in entropy between switch designs using a complementary style and those using a magnet segment style. Just as I found that some of the major solving types among the MS2 switch lab high scorers used different amounts of GUs depending on type, I think the same can be the case for switch solve types in relation to entropy. (https://getsatisfaction.com/eternagam...)

For some reason I had expected entropy to be higher for the magnet solving style over the complementary one. But it seems to be the exact opposite.

Complementary style typically has a longer and less strong complement that pairs up with part of the MS2 sequence, or two stretches that together pair up with most of it. It often appears after the MS2, sometimes also before the MS2 or even both. Complementary style typically sets a bigger part of the switch moving, and it seems to have higher entropy.

Magnet solving style designs use a small strong magnet element of mainly G's or C/U's and seem to be a good deal lower in entropy. Some designs have a mix of both styles.

Now I wonder if a next-door slide takes lower entropy compared to a sequence jump to a pairing a bit away (the latter is in the design you linked).

I also wonder if there is an entropy difference between the labs where MS2 is off in state 1 and the ones where MS2 is off in state 2.

The MIR labs seem to be lower in entropy compared to the MS2 labs. On the other hand only part of the switch stays in the MIR ones, since the microRNA is binding in only one state. The Turnoff V2, variant 2 seems to have a lower entropy compared to the MIR 208a designs.

Eli Fisker

MORE ABOUT MS2 GATES

I have some more I have been thinking about and hope can benefit our MS2 designing.

Closeness of switch elements matters

While I think FMN and MS2 were too close together (next to each other or 1 base apart) in the first MS2 exclusion labs, I think they do need to be quite close - if not in sequence, then at least in 3D space. Several of the better designs from the second round of labs specifically inserted a static stem between the MS2 and the FMN element that was furthest away from it. Also in the microRNA labs, the MIR complement gets brought close to the MS2 sequence.

It seems to be a principle that to make a switch happen, the two elements that connect to either molecule or ligand, be it FMN, MS2 or mir, need to be brought within close range, and then some part of them will often pair up close to each other, each one taking a turn to turn the other on or off. In the Same State labs, by contrast, they both helped each other turn on or off, by pairing directly with each other.

But the aptamer still has quite specific wants for its gate sequence, and so may MS2 to some degree. (https://getsatisfaction.com/eternagam...)

In the MS2 player puzzles (computer simulation) it is my experience that MS2 does not really need an MS2 gate to solve, at least not with FMN around. It first does in the logic gate puzzles. I think the reason for that is that both mir sequences get brought really close to the MS2 sequence. Actually they have a complement on either side of the MS2 sequence, which works both as turn-on for the individual mir and as turn-off, when the mir complements pair up with each other in front of the MS2, prohibiting the mir sequences from pairing up with their complement. On top of that, the left sequence after the MS2 is not only turn-on for one of the mirs and turn-off for both of the mirs, but also a turn-off sequence for the MS2. No wonder this inverted XOR puzzle is a grumpy one and doesn't like changes.

Even the microRNA labs have quite specific wants for the MS2 closing doors. They seem to be somewhere halfway between complementary to parts of MS2 and complementary to the microRNA. I simply think there is a limited number of legal solves that can both turn on and off on their own (not involving things like FMN). OK, when involving part of the microRNA - which is a changeable variant - the complement stretch between MS2 and MIR will have to change accordingly.

But generally the MS2 hairpin gates do not allow much change - although more than the FMN aptamer. They are typically a bit longer than the aptamer gates.

I even had problems interchanging MS2 surrounding solves from other lab puzzles, unless the MS2 sequence has a somewhat similar position in the puzzle. So for some puzzles there will only be a few good variants of MS2 gates. This at least seems to be the case for the logic gate puzzles - the inverted one in particular. But knowing when and why there may be few could be kind of like a toolbox, and perhaps later something to teach the robot, if we find when and where it can reuse past good solves with good success. Plus we may also learn something about a good distance from the MS2 gate doors to the rest of the puzzle.

When I took a look at the switch archetypes drawings for the high scorers in the MS2 switch lab (https://getsatisfaction.com/eternagam...) I found that some pair ups were more likely to happen than others.

The early FMN sequence (FMN1) is more likely to be involved in the switch mechanism - bound when the aptamer is not in use - than the later one (FMN2). And generally both ends of MS2 liked to be paired up somewhere when the MS2 hairpin shouldn’t form. Rough numbers below.



Also note that, despite counting from the drawings, each lab design type doesn’t have an equal number of solves, as the drawings count the pattern for the high scorer, which may be a single design but may also be the solve style among the majority of the winners.

While I say that MS2 likes to be close to one or both of the FMN sequences, I also think that different solving styles take different distances. The complementary solving style seems to take a greater distance than the magnet solving style.

Only with the coming results of the last rounds will we be able to see more about which solve types will dominate among the winners.

Eli Fisker

MS2 Gates Continued



I have been talking a lot about MS2 gates. I think I can now say more about where and when they like to happen.

Background on MS2 gates

I have been working on the NG Same State labs first, as I think, based on past experience, those are the ones we will have an easier time getting to work. However, yesterday I started submitting designs for the Exclusion NG1 lab, when a thought popped up. The designs I worked with seem to keep insisting on having an MS2 gate door, something that was not the case to the same extent for the NG Same State labs.


Hypothesis

I think that MS2 gates are more needed in the Exclusion type lab - or in other words in the turnoff labs - where MS2 has to form in state 1 and get turned off in state 2. I think MS2 gates will be less frequent in the turn-on labs, where MS2 should not be present in state 1.

Now the distance, or rather lack of same, between aptamer sequence and MS2 was forced in the early exclusion labs, which means that the MS2 gate would often happen as a means to get a sequence to pair up with the one FMN sequence that was next to the MS2, which was a pretty good way of ensuring 1: that MS2 formed, 2: that FMN didn’t form. But now we can choose the distance, and the MS2 gate thing still very often happens, one way or the other.


MS2 and shield bases

Now I have been talking about MS2 gates forming for the turnoff labs. But in several cases an MS2 gate doesn’t form at all. It can also regularly be either an internal loop or a multiloop, but with something added. What happens instead is that shield bases are put at both sides of the MS2. (I think the shielding bases can also be helpful for turn-on labs, to avoid single base areas pairing up with each other.)

So contrary to the usual mainly A’s in single base areas, what happens are base patterns that specifically shield each other from pairing up and becoming stem.

The bases don’t pair with each other, but rather ensure that the sequences just around the MS2 don’t pair up.

It can be some G bases on either side, or more often it is U bases. U’s seem to work great for shielding.

I have also been talking about U base shielding for a while - but for static designs: stretches of U’s spread at specific frequencies in bigger single base areas, which prevent the single base areas from pairing up with each other and forming stems. It was rather helpful for big hairpin loops.

Loop shield of U’s and nearest stem effect...


Shielding bases, sliding and the first switch lab

It also seems that having a few repeat U’s in the single base area between sections that need to get moving is great for creating slide and action.

Even bases of other colors spaced out by A’s seem to help with getting movement. I recall our first switch lab. The round 2 high scorer, Tebowned (89%) by Mikestrange, had an odd pattern of spaced-out G’s in a single base area, which was unusual for single base areas in good static designs.



Some of us did try to remove these G's and replace them with A’s - because it looked nicer. :) However mostly the design got very grumpy about it. :) As can be seen from the designs sorted after score, most of the high scorers retained the weird pattern.



Makes more sense now that we know that many A’s and long repeats of them seem to be a cardinal sin in switches, to a higher degree than in static designs.


MicroRNA and MS2 gate doors

Back to MS2 gate doors. Both the MS2 gate doors and the shielding behavior are much more pronounced in the turnoff labs compared to the turn-on labs.

But the microRNA labs have MS2 gate doors in both turn on and turn off labs. What is going on?

I think the fact that the microRNA (ligand) is not part of the design sequence itself is the reason for that.


Sum up

I think the seeming need for having an MS2 gate form, or shielding bases around the MS2, is what makes the turnoff designs (MS2 forming in state 1) harder than the turn-on labs (MS2 not formed in state 1).

salish99

Ah, some light reading at night, always nice.

salish99

so, here we go, analysis of mirna hsa 208.

First, it became clear while designing submissions for the lab that we needed a good "hinge", around which the designs could move freely between the states to score high. I experimented with which base that should be made of; it seemed obvious that we'd need at least 3, maybe 5 bases of the same type to do the trick.
It could not be U's, as we needed 5 U's in a row to satisfy the mirna coming along, and twice 6ish U's in a row struck me as strange.
My hypothesis was that it shouldn't be A's, since the yield would be extremely low for these designs. Well, I was proven wrong, it was A's after all.
Let's take a look:
I analyzed the five best scorers:

and the one design of mine that sucked the most, for comparison.

No.1
The first three follow similar lines, and they all use the same hinge:


and No. 2


and No.3


and No. 5



All excellent designs that use the smart hinge trick. The A's indicated by the arrows pivot between the states, acting as the flexible connection between the main hairpin and the short hairpin in state 1, and as a separator piece in state two, allowing the mirna to dock solidly to the unravelled hairpin.

Consequently, having such a pivot hinge is critical in allowing good attachment for this micro RNA.
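
To check other designs for a candidate hinge, one could simply look for runs of the same base. Here is a minimal sketch. The function name, the default minimum run length of 3 and the example sequence are choices made for illustration, and the sequence is not one of the lab designs.

```python
from itertools import groupby

def base_runs(sequence, base="A", min_length=3):
    """Return (1-based start, length) for each run of `base` at least min_length long."""
    runs = []
    position = 1
    for letter, group in groupby(sequence.upper()):
        run_length = len(list(group))
        if letter == base and run_length >= min_length:
            runs.append((position, run_length))
        position += run_length
    return runs

# Hypothetical sequence, not one of the lab designs.
example = "GGCAAAACCUUUUUGGAAGC"
print("A runs:", base_runs(example))
print("U runs:", base_runs(example, base="U", min_length=5))
```

A run of A's found this way is only a candidate. Whether it actually acts as a hinge depends on where it sits relative to the two target structures.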

Now, we come to one of the surprises, number four.




By adding a miRNA non-compatible A on position 29, as indicated by the arrow, the miRNA gets effectively split. Now, I have tried that in many positions, but splitting up the 24-28 row of abovementioned Uracils never resulted in a good design, so I tried to split it right after or right before this row of what I deemed critical connection points for the miRNA (the 5 U's). In this case, it worked marvellously. And not only that, the score is insanely high, which goes to show that the miRNA can be folded along a well-offered RNA connector, and does not have to line up as a straight one half of a hairpin. That's very good to know, as it offers switch designers more choice and variety in the making of the molecules.

So, now let's compare this with my design 009. At 60/100, one of the three criteria scored 0 points. While I don't know which, my hypothesis would be that the Kd value difference is low.


And one can immediately see why it is such a bad design. The available stretch of similar bases is too short. Likely, the molecule is sterically or isometrically hindered in switching between the states, which may make docking difficult for the miRNA, and thus results in low Kd scores. This problem is compounded by having a solid wall of immovable RNA glue (GC) on the 4-9/16-21 stretch, which allows for no shifting in the molecule.

Your thoughts on this analysis are welcome.

Eli Fisker

Hi Salish!

I like what you have been up to. :)

I have added an image of the lab list that demonstrates what you say, that the A’s in line are the top solutions. The blue highlights mark the minority of the winners that use the other solving method. (The one that made you wonder, when you checked your image No 4.)



I can confirm. The 208 design really likes its spacer A’s. I tried placing single bases other than A in the region in already high scoring designs, and most of the designs didn’t get any happier about it.

Most of these designs where I added a G to the A region - to make a GU in the structure - scored 94% in round 1. Except for the D97 and Zipper 23 that originally scored 92%.

I have marked my mods with green, blue and red color to indicate whether they were improved, neutral or worsened in score. Most worsened. Only one design actively improved its score - it’s also the one that had an A area of 3.




Do the U line and A hinge happen in the turnoff labs?

In the turnoff labs this U line doesn’t fully happen as the microRNA’s A’s often gets used for ring bases in a multiloop. But still there is a bit of an A hinge between the 1 static stem and the switching part of the design.



Instead of the full-blooded U’s line, some of the U bases get made into G’s.



The line of U’s, but now with G’s, is a pattern that goes through the good designs of this lab. And shorter variations on the theme go through the other turnoff labs before the MS2 sequence.

I guess one of the G’s is to make it connect up with part of the C/U element (Green highlight) and perhaps partly so it doesn’t pair with the yellow microRNA bases.

Salish99_TOv2_r3_018 (100%)




CU segment in turnoff labs

Another pattern that goes through this lab is the CU magnet segment after the MS2 sequence, which works for turnoff of the MS2. It was already present in the Variant 2, V2 lab and continues in the Variant 3 ones.





Perhaps it is more like some shielding effect happening, as there is a CU segment before the MS2 also.





Where do these A hinges turn up?

There actually is kind of a pattern to where the A hinges turn up. They seem to turn up between what is made a static stem and the rest of the design that is switching.







I wonder if the A hinge bases are some kind of separator between parts of the design. Like separation between switching and non switching parts.

Eli Fisker

Salish’s hinge

I have been thinking more about the hinge thoughts of Salish.

I think it is partially there because of the complementary sequence that the Mir 2 part of the mIR sequence calls for (image to the right). However, I suspect that a hinge, or a need for a space between the moving and non-moving parts of the puzzle, or potentially between the parts pairing with the microRNA and the other part, will turn up nonetheless when we run other labs with different mIR’s.


Comment on microRNA tails

The logic gates puzzles have these dangling "sequences/tails" too, in connection with the microRNA's when they are turned off.

https://getsatisfaction.com/eternagam...

Actually the dangle sequence itself is already kind of built into the microRNA sequence via its linker sequence.

I think that this complementary tail dangle is going to be a hallmark of future microRNA switches.

MicroRNA switches versus other kind of switches

After I realized what JL had been doing with microRNA dangling tails to initiate the switch, it got me thinking about if this was going on in other kinds of switches. It is to some degree. Particularly those where ends of the RNA sequence are somehow involved in the switching.

If this is the case then we should be able to reuse JL's strategy for future microRNA design and perhaps also for other switches, when the switch goes on in the tail region.

This makes me think that unpaired tail sequences are only wanted if they can be of use by taking part in the switch. In Ex 1 and Ex 4, the tail dangles are involved in the switching, but not in pairing with each other. There, one of the tails works as a small catching dangling sequence, helping the switch along. I believe those labs that involve the tails in the switching will have a much more limited solve space.

Riboswitches and switching area

I have been running some of the natural riboswitches through Eterna to get an idea of what they looked like. I picked those that had either single or double dot notation - but no pseudoknots. Then I also ran them in Vienna, to get an idea from the entropy highs of what areas in the switch were moving. (I know that Eterna and Vienna - both using Vienna energy models - cannot show them correctly, but it could still give me an idea of likely areas of switching.)

Most of them seemed to have their switch go on somewhere safe inside the RNA sequence and not at the tail ends. Most of them also had some kind of multiloop going on in relation to the switch. There were only a few that had the switch going on in an open tail region.

Natural Riboswitches

So unless the tails at the beginning or end of the RNA sequence are very short, and if they are not actively involved in making the switch happen, I think it will generally be better to make loose tails pair up with each other or, if only one of them is long, make it fold a static stem with itself. If they are short, not involved in the switch itself, and too short to make static stems or pair, then giving them shield bases so they don’t pair can be helpful too.

Brourd

  • 437 Posts
  • 79 Reply Likes
I'm not too sure about the presence of "A hinges" being a necessity to any riboswitch design. Rather, in the cases presented above, they are all relatively identical in the way they were designed as riboswitches, being they are modifications of a previous high scoring design.

The counterexample is a rather peculiar result in itself:
https://s3.amazonaws.com/eterna/labs/...
The kd for no FMN shows quite a range and distribution in the cluster ensemble, with the MS2 aptamer being a significant part of the resulting probability space. As for parity between the counterexample and the other results, there are several changes in secondary structure and sequence mutations that alter the thermodynamics of both the ON and OFF states that may make comparison of shorter and longer danglings an issue.

As for Eli's comment about some sort of sequence space between "moving and non moving parts of the puzzle," it would be a safer bet to design with some sort of space between helices, given unknowns in the thermodynamic properties of stems that are sterically close. Granted, I would assume that there have not been too many designs with the objective of testing this hypothesis, so perhaps we'll one day see the opposite as our preferred design strategy.

salish99

  • 295 Posts
  • 58 Reply Likes
In the case of the 570136/5766376 puzzle, the quad A is missing. Granted, this may not be the main reason for the low scoring.
Do we know the steric limits of the bases? How bendable are AA, AU, AG, AC, UG, UC, GC, UU, GG, CC neighbors, and AAA, AAU, AAG, AAC, AUA, etc., up to 6 bases deep?
If this is significantly higher for xAAAAx than for, say, xGAAGx as in the worst scoring case listed above, this may give us a clue to how easily the switch can happen.

Eli Fisker

  • 2222 Posts
  • 483 Reply Likes
MicroRNA welcoming dangling tail

Salish, you got me inspired. :)

I think what you are saying with the hinge thing is that the microRNA uses it as a hinge for docking. This made me think about that loose tail of single bases at the end of the design. It is kind of unusual to have this many non-A’s dangling and not doing anything in one state, even in a static design. Usually they would go look for some action.

So the winners have this microRNA complementary dangle. It's usually a complement to the early stretch of the microRNA, although it can be to the late part too. I see this dangling complement missing in some of the low scoring designs.

Salish99_TOv2_r3_018 (100%)



This happens not just in the 208a lab, but also the turnoff ones.

I simply think this dangling end works as a landing spot for the microRNA and, once there, it can force the rest of the design open for full attachment.

Salish99_TOv2_r3_018 (100%)



Often either end of the microRNA attachment is rather weak. From my mods of a past round winner, where I added GU's all over the design in turn, I noticed that it was here that GU’s were more often welcome than further in.

Small mods - what is preferred?


Length of the tail

I have seen the dangling tail - the one that is not involved in the design when the microRNA is not present - be anywhere between 4 and 12 bases long, counting the dangling bases that seem to pair up with the microRNA when the microRNA is around.

Zipper 46 - 13 (100%) (Your No. 1 image)



This dangling even happens in the minority solve no. 4 and its siblings. The number 4 you were wondering about is one of the rarer but well-working minority solves that take a different road.

Sensor 100 (99%) (Your No. 4)


I have a few pictures of the visual difference between them here:

Minority versus majority archetype

This minority still has its line of U’s that are complementary with the microRNA A’s. But the design itself is kind of reversed. It has its static stem in the opposite end of the design compared to the majority of the winners.


Dangling tail early or late

If the dangle is not at one end of the design, it is at the other. I think late dangles are the more common.

Early dangle


Sensor 3, v1 87 (100%)

I wonder if there is a pattern for the dangles. The minority ones seem to have the weaker and shorter miRNA complementary stretch dangle. I think this might be why there are sometimes minority and majority archetypes of solve types for the labs.

I also wonder if there is a pattern for which type of labs has the longest dangling tail. The 208a lab seems to have the longest dangles. Plus I managed to find some designs in the 208a lab which seemingly had the dangle pair up with itself. But again, these had pretty long dangles like 12 bases long.

Here is one of them:

salish99_208a-r3_107x (99%)


Eli Fisker

  • 2222 Posts
  • 483 Reply Likes
MS2 and entropy

Out of curiosity I ran MS2 sequence through Vienna and it seems to be in the high end of entropy for a static stem.



It also appears to be less stable at both closing ends, which isn't too bad when it has to get moving. :)


Brourd

  • 437 Posts
  • 79 Reply Likes
The following posts will showcase some interesting data points from the R95 puzzle results.

1. A Series of Mutations

Player salish99 submitted a number of point mutations to uracil, for a sequence in round 95, in the same state 2 riboswitch puzzle. Only those residues unlocked or not originally uracil were mutated.

The following images are a visual representation of the Eterna scores for these mutations, mapped against the predicted secondary structure targets for both the OFF and ON states. It would probably be better to map other statistics, such as KDON and KDOFF to these images, however, the average number of clusters for this entire set was approximately 8 clusters, so the analysis following should probably be considered purely speculative and an example of the usefulness of this data with better statistics.

The color scale is set at 100-80 being green to yellow. 80-50 being yellow to red, and all values lower than 50 being red, except those values without any point mutation data, which are colored grey in this instance.
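The color scale described above can be written down as a small function. This is only a sketch: the linear interpolation between the anchor colors and the exact RGB values for green, yellow, red and grey are assumptions, since the post only names the colors and the score breakpoints.

```python
def score_to_rgb(score):
    """Map an Eterna score (0-100) to an (R, G, B) tuple.

    None stands for a residue without point-mutation data.
    The RGB anchor values and the linear blend are illustrative assumptions.
    """
    if score is None:
        return (128, 128, 128)  # grey: no data for this position
    if score < 50:
        return (255, 0, 0)  # everything below 50 is plain red
    if score >= 80:
        # 100 -> green (0, 255, 0), 80 -> yellow (255, 255, 0)
        t = (100 - min(score, 100)) / 20
        return (round(255 * t), 255, 0)
    # 80 -> yellow (255, 255, 0), 50 -> red (255, 0, 0)
    t = (80 - score) / 30
    return (255, round(255 * (1 - t)), 0)


for s in (100, 80, 50, 30, None):
    print(s, score_to_rgb(s))
```

A stepwise palette (a few fixed colors per band) would serve equally well; the continuous blend is just one reading of "green to yellow" and "yellow to red".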





From a purely hypothetical and speculative perspective (again, the data for these would most likely not be the most robust), we can see that several point mutations to uracil caused a significant drop in score/activity for the riboswitch, including several single stranded regions that would appear to have no obvious effect on the riboswitch or change in secondary structure. In addition, the initial closing base pair of the FMN aptamer is important as well. Finally, in the first helix of the design, disruption of the G-C base pairs appears to also cause a significant change in score.

If we were to apply the M2 destabilizing mutations to some of the "winning" sequences every round, including those that are a part of the MS2 hairpin and FMN aptamer, information such as this would be incredibly useful for understanding the nature of the chip riboswitches, the robustness of these sequences to mutations, as well as a general understanding of RNA ensembles. Especially if the data was based on sequences with far more accurate and precise results.

Here is a link to the histograms, sequences and mutations used for this
https://docs.google.com/spreadsheets/...

salish99

  • 295 Posts
  • 58 Reply Likes
Thanks for this analysis, Brourd.
Especially interesting to me is the disruption of the GC base pair leading to an instant step change in the scoring.
As for the Uracils, I tried to break them up, so that the hinge effect (as described above) could be analyzed in more detail - this 5-chain U really seems to be one of the cruxes of this particular riboswitch set.
Lars

Eli Fisker

  • 2222 Posts
  • 483 Reply Likes
MS2 Round 3 + part 2

I was looking at my mods of the high scorers in round 2 of the MS2 labs, and I was wondering, because many of my designs that did score higher than the original design then removed themselves from my data set by getting cluster counts so low that I likely cannot trust the data.

For the time being I’m primarily interested in designs with a cluster count of 20 and above and an error rate below 1.4, as Omei set it in his new fusion table, plus Johan's warning that 10 clusters may not be enough.
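As a sketch of that filter outside the fusion tables: the two thresholds come from the text above, while the field names (`clusters`, `error`) and the example rows are invented for illustration and are not lab data.

```python
MIN_CLUSTERS = 20  # "cluster count at 20 and above"
MAX_ERROR = 1.4    # "error rate below 1.4"


def trustworthy(designs, min_clusters=MIN_CLUSTERS, max_error=MAX_ERROR):
    """Keep only designs backed by enough clusters and a low enough error."""
    return [
        d for d in designs
        if d["clusters"] >= min_clusters and d["error"] < max_error
    ]


# Invented rows, only to show the call shape; these are not lab results.
example = [
    {"name": "mod_a", "score": 96, "clusters": 7, "error": 0.8},
    {"name": "mod_b", "score": 88, "clusters": 41, "error": 1.1},
    {"name": "mod_c", "score": 91, "clusters": 25, "error": 2.3},
]

for d in trustworthy(example):
    print(d["name"], d["score"])
```

Here the highest scorer drops out because of its cluster count, which is the situation described in the previous paragraph.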

Fusion Tables by Omei

Round 88 + 93
Round 95

I noticed something else. More of my seemingly fine high scoring designs in the Same State labs had low cluster counts, compared to those in the Exclusion labs. This got me wondering whether there were differences between the labs and their cluster counts.

When I look at the data with Omei’s fusion tables from Round 93 and 95, this seems to be the case.


Cluster count versus lab type and score

Higher cluster count in Exclusion labs


Lower cluster count in Same State labs


The irony is that the labs which have the higher Eterna scores in general are also the ones with the lower number of clusters.


Switch direction versus cluster count

Since the Same State labs differ from the Exclusion labs by having a different switch direction, I used this for sorting against cluster count. (The Same State labs have MS2 gone in the first state, whereas Exclusion labs have MS2 forming in state 1.)

Exclusion labs left, Same state right


Exclusion labs left, Same state right




MicroRNA’s and cluster count

The microRNA labs also have higher cluster counts in the turnoff labs, but the trend is reversing.

Turnoff labs to the left, Sensor 208 to the right.



So why do some labs get higher cluster counts than others?

However, having a high cluster count is absolutely no guarantee of a good score. Quite the contrary.

Now all this got me wondering whether there was anything characterizing those designs that did end up with a high score.

I picked out the Same State 2 lab and the Exclusion 4 lab as those were the two labs in each category with the highest average score.

I checked through the ones that had cluster counts higher than 100 in the Same State 2 lab. What was true for those was that they were mostly or totally full moving switches, whereas the main part of the designs in that lab that do end up with a good cluster count and a great Eterna score are partial moving switches.

I checked the designs with above 300 clusters in the Exclusion 4 lab. There were some full moving switches, but they were a minority. The main part of the top scorers of this lab with a decent cluster count were also partial switches.

Other differences between these two labs:

Exclusion 4 has the MS2 dangling on the “outside” of the design, where Same State 2 has it more embedded inside the sequence.

Another difference between these labs is that Exclusion 4 tends to have dangling tails, whereas Same State 2 prefers its tails paired with each other.


Sum up

So basically I don’t know why the one type of lab (Exclusion - turnoff type) has higher cluster counts than the other (Same State). I just find it very interesting.

Brourd

  • 437 Posts
  • 79 Reply Likes
My original assumption for this behavior was that the Same State 2 puzzle was approximately 10 nucleotides longer than the other puzzles, leading to an overall lower yield of DNA clusters as other sequences are amplified. However, I heard that the sequence lengths were standardized in Round 95, so it may not be the case.

Omei Turnbull, Player Developer

  • 966 Posts
  • 304 Reply Likes
Brourd, where did you hear that sequence lengths were standardized, and what does that mean? Johan's spreadsheet gives the sequence lengths as being different for the various labs. Are you perhaps referring to the padding of the DNA templates so they are all the same length? As far as I know that has always been done, and ever since round 80 (but not before!?), there has been a definite correlation between sequence length and DNA amplification success.

It seems to me that your original assumption is supported by the data. The R^2 value in the linear correlation between switch direction and cluster size in Eli's graph is 0.121. That means that about 12% of the variation in the cluster size can be predicted by the switch direction. If I substitute RNA length for switch direction, I get an R^2 value of 0.116. So sequence length is essentially just as good a predictor as switch direction, and has a much better intuitive justification.
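For anyone who wants to repeat this kind of comparison on the exported table, here is a minimal sketch of the R^2 of a one-variable linear fit, which equals the squared Pearson correlation. The numbers in the example are invented to show the call; they are not the lab data behind the 0.121 and 0.116 values.

```python
def r_squared(xs, ys):
    """R^2 of a simple linear regression of ys on xs (squared Pearson r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when either variable is constant")
    return (sxy * sxy) / (sxx * syy)


# Invented numbers: 0 = Exclusion-type lab, 1 = Same State-type lab.
direction = [0, 0, 0, 1, 1, 1]
clusters = [310, 120, 260, 90, 200, 60]
print(round(r_squared(direction, clusters), 3))
```

Running the same function with RNA length in place of switch direction gives the second number to compare against, as described above.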

Omei Turnbull, Player Developer

  • 966 Posts
  • 304 Reply Likes
A promising structure-based design pattern

I've observed what seems to be a very strong pattern for constructing good FMN switches. It entails forming, in the unbound state, one (but only one) end of the (bound) aptamer interior loop.


There are two ends to the aptamer loop, which I've started calling the near half-aptamer and the far half-aptamer. The whole explanation seemed too long to merit posting in full here, so here's the link to the document.

I'll include the final graph from the document here. In the R95 Same State lab, after filtering the designs on other criteria that I've observed to be correlated with good switch scores, I got this:


Of the 65 designs that passed the other filters, the average score of the 37 designs that conformed to the near half-aptamer pattern was 91.6, compared to 70.7 for the 28 designs that didn't.

Brourd

  • 437 Posts
  • 79 Reply Likes
The document link seems to be missing.

Omei Turnbull, Player Developer

  • 966 Posts
  • 304 Reply Likes

Eli Fisker

  • 2222 Posts
  • 483 Reply Likes
Overall trends

Generally, most of the MS2 labs so far have primarily favored partial moving switches. However, a hallmark of the top scorers in the new Exclusion 5 and Exclusion 6 labs is full moving switches, or close to it. These labs didn’t score too well.

The aptamer

Again, the aptamer seems to prefer to have one of its ends stabilized by a static stem. Only a few switches have a switching area on both sides of the aptamer.

I think this is one of the reasons why designs that have their MS2 sequence placed between the FMN aptamer sequences tend to have their tail bases pair up and form a static stem with each other (Ex 2, Ex 3, SS2): pairing the tails with each other closes up the early end of the aptamer.

However, for the designs that have the late end of their aptamer closed, the switching has to happen at the early end of the aptamer. I think what happens will depend on how much sequence there is before the aptamer sequence - the beginning and end of the RNA sequence. If there is little, I think dangling tails will happen, leaving the switching to happen between the MS2 and one of the tails.

Exclusion 5 and 6

I would have tipped Exclusion 5 in particular, and Exclusion 6 too to some extent, to end up with their tail sections paired up with each other. And I think in a better world they would have. But I’m guessing that the bigger distance between the MS2 and the FMN sequence, compared with the previous MS2 labs, may have caused the tail sections to be needed in the switch to make it happen.

Even in Exclusion 6, which only has 4 bases between FMN and MS2, those extra bases seem to cause the majority of the top scorers to go full moving switch - which is bad for the chances of getting a lot of winners. The same is the case for the Exclusion 5 lab. The exclusion labs that did best were Ex 2 and Ex 3, which had a distance of 0, and Ex 1 and Ex 4, which had a distance of one. I still think that perhaps something like 2-3 bases may work too.

I think the long distance between MS2 and FMN forces the design to go full moving switch, and this makes the tail too weak to stick together as a static stem, as would normally have been a good solution.

The Exclusion 5 lab makes me think that I have been wrong about the distance between the MS2 and the FMN. I have been saying it can be too small. While a bigger distance works well for the Same State labs, Same State labs are different from Exclusion labs in a fundamental way.

Exclusion labs

It is as if the exclusion designs can better tolerate distance between MS2 and FMN if the MS2 is not between the FMN sequences. On the contrary, those labs that have the MS2 “outside” the FMN sequences - that is, before or after and not in between the two FMN sequences - tend to want their tails to dangle (Ex1, Ex4, Ex6, SS1).

The cutting line seems to lie somewhere between Brourd’s mod of Exclusion 4 and Exclusion 6. The latter has slightly shorter tails and it can’t keep its tails together, whereas the one with slightly longer tails can.

Dangling tails tend to go mainly C and U to avoid them pairing up, typically with the main part of the U’s in the first tail and more C’s in the last tail. That is, if the tails are not involved in the switching.

Exclusion versus Same State

Exclusion labs seem to want to have the MS2 really close to one of the aptamer sequences, as the labs where more distance was forced, Exclusion 5 and 6, didn’t score too well and the majority of the high scorers became full moving switches. I had actually thought that having the MS2 and FMN close was a hindrance to higher scores, but it seems that making the distance large (4 bases and up) forces a much more severe shift in the design, whereas designs with the FMN and MS2 fairly close can have an overall stable structure and only switch in a small area. There, it will be more a matter of the switching parts moving instead of the full design moving. So if the MS2 and FMN cannot slide, the rest of the design will.

Designs that have MS2 in state 1 and FMN in state 2, plus some distance between them, seem to need a fuller moving switching pattern than designs of this type that keep them really close. As a result they will also have higher entropy, due to the fuller move.

Same State lab types, by contrast, actually prefer some distance between MS2 and FMN. The whole difference between these lab types is whether MS2 is already on in state 1 and needs to get turned off (Exclusion/turnoff labs) or needs to get turned on in state 2 (Same State, Sensor 208a lab type).

Sum up

I was wrong about the distance thing I said earlier for the exclusion labs. They don't seem to like a much bigger distance between MS2 and FMN. I think what really matters is that whether MS2 is present in state 1 or state 2 affects things a lot. Any lab that has MS2 present in state 1 (and FMN in the opposite state) will be harder to solve (MS2 turnoff labs) than those which have MS2 present in state 2 (MS2 turn-on labs). And if a big distance between FMN and MS2 is forced in exclusion labs, it causes the lab to go full moving switch, which will make the designs even harder to solve.

I think the exclusion/turnoff labs can’t have as much space between FMN and MS2 because they will often need MS2 gates.

For a design to turn on MS2, it just needs some complementary stretches somewhere, and the complementary stretches can easily jump a bit. Hence the bigger distance between the FMN and MS2 sequences in Same State 1 and Same State 2 (MS2 turn-on labs). However, when the MS2 needs to get turned off, the turnoff sequence needs to be close to the MS2 (Exclusion, turnoff labs). Plus, when there is a need for a turnoff sequence, this often calls for an additional turn-on sequence too (MS2 gates). These are close to the MS2, just like aptamer gates are close to the aptamer.

So I think that the distance needed between MS2 and FMN is to a great extent dictated by switching direction.

Eli Fisker

  • 2222 Posts
  • 483 Reply Likes
Turning off MS2

I have been talking about MS2 turnoff sequences earlier. They seem to use the same method of operating across the turnoff labs, working in concert with the MS2 gate.

In the turnoff labs (Exclusion type), MS2 is on and formed in State 1 and needs to get turned off in State 2.

MS2 is particularly fond of having its turnoff sequence right after or right in front of itself, depending on the position of the FMN. When an FMN sequence is close in front of the MS2 sequence, the turnoff sequence lands after the MS2 sequence. However, when the FMN is close after the MS2 sequence, the MS2 turnoff sequence lands before the MS2 sequence.

Usually this turnoff sequence lands right next to the MS2 sequence. It typically consists of 4-6 bases, although it can in rare cases be shorter or longer. These 4-6 bases are typically complementary to a stretch inside of the MS2. Most of the time it contains a surplus of C’s and U’s, and also what I have earlier called a strong CU magnet segment, although these do not always need to be right next to the MS2 sequence.
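The "complementary to a stretch inside of the MS2" check can be sketched in a few lines. The MS2 hairpin sequence used here is an assumption (the commonly quoted 19-nt hairpin) and should be checked against the locked bases in the actual lab, and the function names are made up for this example.

```python
# Watson-Crick pairs plus the G-U wobble pair.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}

# Assumed MS2 hairpin sequence; verify against the lab's locked bases.
MS2_HAIRPIN = "ACAUGAGGAUCACCCAUGU"


def pairs_antiparallel(flank, window, allow_gu=True):
    """True if flank (5'->3') can pair base by base with window read 3'->5'."""
    allowed = WATSON_CRICK | WOBBLE if allow_gu else WATSON_CRICK
    return len(flank) == len(window) and all(
        (a, b) in allowed for a, b in zip(flank, reversed(window))
    )


def turnoff_targets(flank, ms2=MS2_HAIRPIN, allow_gu=True):
    """0-based start positions inside ms2 that the flank could pair with."""
    k = len(flank)
    return [
        i for i in range(len(ms2) - k + 1)
        if pairs_antiparallel(flank, ms2[i:i + k], allow_gu)
    ]


# A 4-base candidate turnoff segment sitting next to the MS2 sequence.
print(turnoff_targets("GGGU", allow_gu=False))
```

This only tests sequence complementarity; whether a given stretch actually competes with the hairpin still depends on the energy model and on the rest of the design.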


Image examples with MS2 turnoff



What is quite interesting here is that the Sensor v3, variant 2 lab, that does not have an aptamer, has a kind of pseudo FMN sequence in front of its MS2 sequence, so it gets similarities to the Ex 3 and Ex 4 labs.




Perspective

One of the exclusion labs that stands out from the MS2 turnoff pattern is Brourd's mod of Exclusion 4. In that lab most of the top scorers don’t use a long turnoff sequence for the MS2, nor do they make an MS2 gate. Instead they tend to solve in a style much like the Zipper complementary style of the turn-on labs like Same State 2, which I find interesting. I look forward to seeing if this pattern shows a way of escaping the more fixed pattern of MS2 gates and turnoff sequences.

jandersonlee

  • 549 Posts
  • 122 Reply Likes
Thanks for the analysis.

jandersonlee

  • 549 Posts
  • 122 Reply Likes
Clean(er) Dot Plots

I don't have a quantitative analysis, but from looking at the higher scoring submissions from the last round it looks like most of them have cleaner dot plots than I would have expected for a switch. Given Eli's thoughts on entropy and switches I found this surprising.

This seems to go along with Nando's idea for ViennaUCT (assuming I interpreted it correctly) that non-switching pairs should be 100% bound, OFF-only pairs about 96% bound, ON-only pairs about 4% bound, and always unbound NTs 100% unbound. Perhaps it is one reason why ViennaUCT has been doing so well in the switch labs.

For myself at least, I plan to pay more attention to the dot plot this round and spend extra time designing the "fixed" stacks to prevent mismatches.

Has anyone else noticed this (or the contrary)?
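To make those target numbers concrete, here is a rough sketch that turns two target structures into per-pair target probabilities and measures how far a predicted dot plot is from them. It is not ViennaUCT's code; the function names, the use of dot-bracket strings, and the mean-absolute-gap measure are all assumptions made for illustration.

```python
def pairs_of(dotbracket):
    """Return the set of 0-based (i, j) base pairs in a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dotbracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def target_probabilities(off_structure, on_structure):
    """Targets quoted above: shared pairs 1.00, OFF-only 0.96, ON-only 0.04."""
    off, on = pairs_of(off_structure), pairs_of(on_structure)
    targets = {}
    for pair in off | on:
        if pair in off and pair in on:
            targets[pair] = 1.00
        elif pair in off:
            targets[pair] = 0.96
        else:
            targets[pair] = 0.04
    return targets


def mean_gap(predicted, targets):
    """Mean absolute gap; any pair missing from either side counts as 0."""
    keys = set(predicted) | set(targets)
    if not keys:
        return 0.0
    return sum(abs(predicted.get(k, 0.0) - targets.get(k, 0.0)) for k in keys) / len(keys)


off_state = "(((...)))..(((...)))"
on_state = "(((...))).(((...)))."
targets = target_probabilities(off_state, on_state)
print(sorted(targets.items()))
```

Pairs that appear in the prediction but in neither target structure are compared against 0, which is how the "always unbound NTs 100% unbound" part is represented here.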

Eli Fisker

  • 2222 Posts
  • 483 Reply Likes
Hi JL!

I very much agree with you. I have noticed it too, and I was surprised, since this was not at all how the dot plots of our first switch winners looked. They were generally messy and ugly. I find Nando's idea interesting.

There is one more trend, which is that the state 1 prediction matches up quite well with the state 1 estimate and even keeps showing in the state 2 estimate. Whereas it is often a bad sign if much of the state 2 prediction shows up as more than a ghost in the estimate.

Here is a slide from my switch lab intro:



The cleaner dot plot is very much an overall trend, especially for the partial moving switches. For the full moving switches, or close to full moving switches, the trend shifts a bit, toward some of the switching stems that are supposed to be on in state 2 showing up more in the state 1 estimate - which will not match the state 1 prediction. Plus the dot plot becomes messier, but mostly still not to the degree of past switches. Vinnie has more of a tendency to make full moving switches than players do.


A point of wondering

- Designs that seem to have most of the MS2 pair up when turned off have quite messy dot plots.

Hyphema

  • 91 Posts
  • 25 Reply Likes
I agree with JL’s observation about the cleaner dot plot theory and better scores. My best example of this so far is a mod I did of Eli’s Sensor 3 Turn Off Variant 1 #59 in the miRNA Switch Lab Round 2 Eli’s Lab
I made a single nt mutation at 11 from A to C and the dot plot got much cleaner. This single mutation improved the score from a 47 to an 89. Here is a link to my design My Lab
Here are images of the Dot Plots of both.

Notice the change

[Images: dot plots of Eli’s design and of the modified design]

Obviously there are far more important things to make a good switch but having a clean dotplot may signify a higher score over a dirty one.

Brourd

  • 437 Posts
  • 79 Reply Likes
Reducing pairing probabilities for extraneous sequences is an excellent place to start, hyphema. For Eli's design, there existed at least some structures in the ensemble that prohibited the formation of the MS2 aptamer, as indicated by the relatively high kd for the no FMN/miRNA 208a condition.

https://s3.amazonaws.com/eterna/labs/...

Your sequence mutation, on the other hand, had a lower kd for the same condition, indicating a more robust formation of the MS2 hairpin as a part of the MFE structure.

https://s3.amazonaws.com/eterna/labs/...

Granted, simply cleaning the dotplot by reducing all pairing probabilities may have a negative effect as well, given this is a riboswitch and some potential base pairing situations may be unavoidable in solutions.

Hyphema

  • 91 Posts
  • 25 Reply Likes
You are absolutely right, Brourd. There are definitely more points to be concerned about here than just the dot plot. And having a clean dot plot does not mean you will have a good riboswitch.

A couple of points about the dot plot I used as an example. The smudgy area I cleaned up would have had a very detrimental effect on the MS2 forming. That can be seen if I intentionally mutate nts to "darken" the mispairing in that dot plot: I easily end up forming a stem that prevents the formation of the MS2 (according to both Vienna models). So in this case removing the mispairing shown in Eli's dot plot did indeed help the riboswitch. Another point is that my single mutation was from adenine to cytosine. The mutation to C strengthened the stem associated with the formation of the MS2 hairpin (see image). Left as an adenine, the stem in the ON state would have been a weaker 2-2 loop. As you can see, the mutation to C had little effect on the secondary structure of the OFF state.



In summary, the nucleotide 11 mutation to C did a couple of things: it reduced the mispairing probability (cleaned the dot plot) that would have hurt the formation of the MS2, directly strengthened the formation of the ON state, and had little influence on the OFF state. My mods of this very sequence in round 3 of the lab should shed some light on how important each or all of these points are. Well, hopefully, at least. : )

Eli Fisker

  • 2222 Posts
  • 483 Reply Likes
JL bringing up the dot plot reminded me to revisit the melt plot.

The same trend as for many earlier switch winners, with a rise and a dip at an early point (left side of the melt plot), still holds for many good switches. (It's a trend but not an always; there are switch winners without it.)

However, there are a few new trends I thought worth a mention. Here is an alternative slope that I have seen regularly for switch winners. So when I see this in one of my upcoming designs, I count it a good thing, while its absence won't necessarily make me dump the design either:



The microRNA lab winners seem to have their own melt pattern going. I think it is related to the microRNA unzipping the rather long MS2 gate, and I'm guessing it is in a hurry. :)




Just a reminder for those who are new to this. A flat beginning is usually good also, even if there are not changes between the states.


A few points of wondering

- Is there any kind of difference in melt plots between turn-on and turnoff labs? Are the Exclusion labs messier?

- I wonder if the alternative rise is related to MS2 gates? If so, this is related to exclusion labs. Oh - moment of realization... This is what makes the melt plot in the microRNA labs look so different - the MS2 gate. For now I deem the MS2 gate the culprit. :) This is what causes this alternative melt plot look. This also explains why they tend to show up in the exclusion type of labs and why I like them in my designs. Because I also like MS2 gates. :)



Background article

This melt plot drop, which was once quite rare, turned out to be specific to switches:

https://getsatisfaction.com/eternagam...

salish99

  • 295 Posts
  • 58 Reply Likes
Eli, can you extract a correlation between the slope of the dotplot and the score?
(I tried, and failed, so far).
Can we extract the actual data of the dotplot in numbers, rather than just an image, somewhere?

Eli Fisker

  • 2222 Posts
  • 483 Reply Likes
I can't say anything from a melt plot alone. Eg if the melt plot is awesome - having a good flat start at left for a couple of squares or a nice dipping slope like many switches seems to like, it says nothing, if the dot plot is really ugly.

Melt plot versus dot plot:
https://docs.google.com/document/d/1Y...

It is more like that there are certain similarities between melt plots of good designs. I usually look most at the first few squares at the left.

I don't know if we can extract numbers from dot plots. I think Jennifer Pearl may have tried.

jandersonlee

  • 549 Posts
  • 122 Reply Likes
Dot plots are generated from numbers. Better to use the source. It's in the Vienna code. Nando might have a moment. Alas, I do not right now.
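One route that avoids digging through the Vienna source, assuming a local ViennaRNA install: `RNAfold -p` writes a PostScript dot plot file whose data lines have the form `i j sqrt(p) ubox`, with 1-based positions and the square root of the pair probability. The sketch below reads those lines from the file text; the sample text is a hand-written fragment in that format, not real output.

```python
import re

# Matches data lines such as "3 17 0.948683 ubox" (upper triangle of the plot).
UBOX_LINE = re.compile(r"^\s*(\d+)\s+(\d+)\s+([0-9.eE+-]+)\s+ubox\s*$")


def read_pair_probabilities(ps_text):
    """Return {(i, j): probability} from the text of a ViennaRNA dot plot file."""
    probabilities = {}
    for line in ps_text.splitlines():
        match = UBOX_LINE.match(line)
        if match:
            i, j = int(match.group(1)), int(match.group(2))
            root = float(match.group(3))
            probabilities[(i, j)] = root * root  # file stores sqrt(p)
    return probabilities


# Hand-written fragment in the same line format, for illustration only.
sample = """\
%data starts here
1 20 0.5 ubox
2 19 0.9 ubox
1 20 0.95 lbox
"""

for pair, p in sorted(read_pair_probabilities(sample).items()):
    print(pair, round(p, 4))
```

The `lbox` lines mark the pairs of the minimum free energy structure and are skipped here, since they carry no probability.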

Eli Fisker

  • 2222 Posts
  • 483 Reply Likes
Thx, JL. :)

salish99

  • 295 Posts
  • 58 Reply Likes
yeah, thought so - i'll add it to my wishlist.

Brourd

  • 437 Posts
  • 79 Reply Likes
Another addition to this incredibly long thread:

Some Exclusion 4 Observations Based on Salish99's Modifications of Ex4: 344

For the sake of brevity, this post will focus on the highlights.

Player Salish99 once again used a high scoring design and made several strategic mutations to the sequence. The WT (Wild Type) sequence and histograms are available here:

https://s3.amazonaws.com/eterna/labs/...

http://eterna.cmu.edu/game/browse/544...

And this is the histogram for the sequence duplicate that Salish submitted in round 95:

https://s3.amazonaws.com/eterna/labs/...

So, from this, the first observation I would like to point out is the distribution of clusters in each of these histograms. The ensemble of clusters shifts without any significant outliers, the slope peaks near the median and the KD's for both the no FMN condition and 0.2 mM condition are similar.

From this, Salish used a variant of the wild-type sequence and made a C16U mutation to alter the next nearest base pair of the FMN aptamer to a U-G base pair. From this altered sequence, Salish made several point mutations to the apical loop of the helix following the FMN aptamer, for both the WT variant and the C16U mutant.

The numerical data for this is available in this Google spreadsheet.

https://docs.google.com/spreadsheets/...

Here are the histograms for the A23U mutation as an example.

G-U Mutant
https://s3.amazonaws.com/eterna/labs/...

WT Variant
https://s3.amazonaws.com/eterna/labs/...

For the G-U mutants, this histogram and the histograms of each of the systematic mutations indicate that the dissociation constant is lower in the no FMN condition, that Fmax is typically lower in the 0.2 mM FMN condition, and that the distribution and range of the clusters is typically quite high.

In contrast, the WT variant shows a higher Kd in the no FMN condition, Fmax for the 0.2 mM FMN condition is typically higher than that of the G-U mutant, and the distribution and range of the clusters is tighter.

From this data, we formulate the hypothesis that during sequence design, aspects such as the next nearest neighbor of the closing base pairs of the FMN aptamer will have some effects on the final result we see. In this case, a U-G base pair as the next base pair after the locked base pair of the FMN aptamer caused a change in both the dissociation of the MS2 coat protein and its aptamer, and the Fmax cluster intensity in the 0.2 mM FMN condition.

Granted, this is based on a single secondary structure and sequence. It's possible these effects can be mitigated with a different strategy or secondary structure.
Photo of jandersonlee

jandersonlee

  • 549 Posts
  • 122 Reply Likes
The energy model backs up the idea that the nearest neighbor of a closing pair (what I call the "backing pair") can make a significant difference.
Photo of Brourd

Brourd

  • 437 Posts
  • 79 Reply Likes
Besides the initial free energies of the base pair loops and the resulting change in pairing probabilities, there really is not much that the parameters reflect as different in comparison to any other base pair. For instance, the free energy of the FMN aptamer does not decrease due to the substitution of G-C to G-U or vice versa. It's also possible that the G-U "Backing pair" also affects the kd for the binding of FMN to its respective aptamer.
Photo of salish99

salish99

  • 295 Posts
  • 58 Reply Likes
This means that the closest neighboring base really does make a huge difference - this will make programming this into our new bot challenging.
Photo of salish99

salish99

  • 295 Posts
  • 58 Reply Likes
what does it mean if the fmaxFMN_std is 900299 and fmaxFMN_sem is 78659? That the clusters were too low for accurate counting?
Photo of Brourd

Brourd

  • 437 Posts
  • 79 Reply Likes
I would assume that's the standard deviation and standard error of the mean for Fmax in the 0.2 mM FMN condition. Why the values are so high is not something I know, given that would be based upon the individual cluster values. It's possible that is error related to a handful of outliers with incredibly high values for Fmax.
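As a rough cross-check on those two numbers: if the spreadsheet computes the standard error as std / sqrt(n) (an assumption; the thread does not say how the column is derived), the cluster count implied by the two values can be backed out. A minimal sketch, with a hypothetical function name:

```python
# Assumption: sem = std / sqrt(n), so n = (std / sem) ** 2.
# The two input values are the ones quoted in the question above.

def implied_sample_count(std, sem):
    """Return the sample count implied by a standard deviation and its SEM."""
    return (std / sem) ** 2

fmax_fmn_std = 900299
fmax_fmn_sem = 78659
implied_n = implied_sample_count(fmax_fmn_std, fmax_fmn_sem)
print(round(implied_n))
```

Under that assumption the two values imply roughly 131 clusters rather than a handful, which would fit the idea that a few very bright outliers, not a low count, are inflating the spread.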
Photo of salish99

salish99

  • 295 Posts
  • 58 Reply Likes
Ya, we had them in round 1, too - usually at a cluster count of 1.
Data, then, must be ignored.
(then again, may have another reason in round 3.2)
Photo of Eli Fisker

Eli Fisker

  • 2222 Posts
  • 483 Reply Likes
Which FMN sequence to catch?

I have been wondering about the Exclusion 2 lab, since I made this drawing of the MS2 turnoff sequences.

It stood out from the switch-off patterns of the Exclusion 1, 3 and 4 labs. In these labs, many of the high scoring designs use a magnet segment of C’s and U’s to switch off the MS2 by attaching to the G section in it, and to help the MS2 turn on again by pairing with the G’s in the FMN sequence just on the other side of the MS2 sequence.



Image of such typical design:
salish99_ex3_r3_098x (97%)


http://eterna.cmu.edu/game/browse/573...

Notice that the aptamer sequence closest to the MS2 design (before the MS2 hairpin) is what the MS2 turnoff sequence (after the MS2 hairpin) pairs up with in state 1 (left)



Exclusion Lab 2 and the strange design

However, in the Exclusion 2 lab, most of the high scorers followed a totally different pattern, with the main part of the MS2 sequence pairing up for turnoff and not just a section of it. And there was no involvement of the FMN sequences. In other words, a more complicated solution.

However, there was one design, made by Parushev, that behaved differently from the others. It did have the MS2 pair up with one FMN sequence. But it was not the closest one, as was the case with the others of the first 4 exclusion labs. And there was hardly any MS2 gate - just one extra base pair in front of the MS2. This design kept me wondering.

ChP 11-04-2015 #5 (88%) by parushev

http://eterna.cmu.edu/game/browse/573...

In 3D, the distance from the MS2 C’s to each FMN sequence is around the same, since there is a small static stem abbreviation (left 4 bp stem). However, by choosing the far off and early FMN sequence for a pair, the situation around the MS2 gets a little less locked. I think this is important. This may open up further options for solving. So while this design is in the minority of the high scorers for this lab, I think that it has potential to become more prevalent in the next round.

Basically this specific MS2 turnoff sequence for the above design is an abbreviated MS2 hairpin sequence, with most of the strongest segments included: the G’s and C’s.


Perspective

So it seems that we have a way to avoid creating an MS2 gate, just by the choice of which of the two FMN sequences we target: the closest one in sequence/3D space, or the furthest away.

So now I wonder what is actually working best? Creating exclusion designs where the FMN closest to the MS2 sequence also pairs up with the MS2 sequence, or creating freer designs that have the MS2 turnoff pair up with the furthest away FMN?

My past lab drawings of switching tendencies showed both types of switches, both ones where the MS2 turnoff sequence went for the one or the other of the FMN sequences.

For now closest to MS2 seems to have won. However it also puts quite a lock on what possible solutions can be made. So I wonder if the other may have a say again. At least I will use a good deal of my exclusion slots to swap FMN targets in high scoring designs, to try to figure that out.



Background articles

Turning off MS2: https://getsatisfaction.com/eternagam...
Photo of Eli Fisker

Eli Fisker

  • 2222 Posts
  • 483 Reply Likes

I mentioned in the above post that a different aptamer sequence was targeted than I had expected:


"However were one design, made by Parushev, that behaved different from the others. It did had the MS2 pair up with one FMN sequence. But it was not the one closest by as was the case with the other of the first 4 exclusion labs."


I have one correction. I said it was the FMN paired up with the MS2, which is not correct. It was the MS2 turnoff sequence that paired up with the FMN1. But still it holds that it was the unusual FMN getting in use, as it was not the one closest to the MS2.


ChP 11-04-2015 #5 (88%) by parushev

http://eterna.cmu.edu/game/browse/573...


However this design actually behaves according to another pattern that I have also seen: namely that there seems to be an overuse of the FMN1 sequence (first half of the aptamer sequence) over the FMN2 (last half of the aptamer sequence) as switching point. Something which I mentioned from round 2.


I think it has only come through even more strongly in round 3. In the drawings I did after round 2, here are the trends for what came through for the lab types I drew. With some of them having a bit of new variation added.


[Image: Exclusion A type]


I count only the highest scoring designs, with cluster limits set on 20-30 and Folding error rate at max 1.4. The labs that had many winners I now routinely set at minimum cluster count of 30+.  


Ex1       Type A won - with some variations

Ex2       Type A mostly won - with some variations

Ex3       Type A won

Ex4       Type A won


SS1      Variation of A + B

SS2      Variation of B + new variations


In Exclusion 1, 3 and 4, the A type solves all have FMN1 involved. Exclusion 2 doesn’t. But then again, the puzzle that has it, Parushev’s, is still among the high scorers. And since I trust the pattern from the other 3 labs over the pattern in the actual Exclusion 2 lab winners, I am pretty confident that modifications of Parushev’s designs, or designs that make the switch involve the FMN1, are going to yield winners.



Background articles


Lab drawings

Targeting FMN1 versus targeting FMN2


Photo of Eli Fisker

Eli Fisker

  • 2222 Posts
  • 483 Reply Likes
I suggested in the first post in this section to try switching the aim from one FMN sequence to the other in some of the winning designs of past rounds, as a lab experiment.

For now, having the MS2 turnoff sequence target FMN1 has been successful in a good part of the exclusion labs. However it also generally results in a rather long MS2 gate which needs to be broken - which may slow down the switching. So in an attempt to find out if things can be done better, I’m inviting you to take part.

Here is a bit of inspiration for how to do it:



A rather simple way to change which FMN to target with the MS2 turnoff sequence, is simply reversing the order of U's and C's.

UUCC will target FMN1

But CCUU(C) will target FMN2
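
The reversal can be sketched in a few lines. This is only an illustration with hypothetical helper names: it reverses the turnoff segment and prints the plain Watson-Crick reverse complement, i.e. the motif each segment would pair with most directly. It ignores G-U wobble pairs, and which half of the FMN aptamer contains which motif is taken from the post above, not checked by the code.

```python
# Hypothetical helpers; only standard Watson-Crick complements are used.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_segment(segment):
    """Reverse the order of bases in a turnoff segment."""
    return segment[::-1]

def reverse_complement(segment):
    """Return the RNA motif (5' to 3') that pairs antiparallel with the segment."""
    return "".join(COMPLEMENT[base] for base in reversed(segment))

# Compare the original segment with its reversed form.
for segment in ("UUCC", reverse_segment("UUCC")):
    print(segment, "pairs with", reverse_complement(segment))
```

So UUCC looks for a GGAA-like stretch, while the reversed CCUU looks for an AAGG-like stretch, which is one way to read why swapping the order changes the target.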
Photo of salish99

salish99

  • 295 Posts
  • 58 Reply Likes
2 questions:

1) Could we either unpromote all answers (and, yes, they are actually all very important to this thread, but the page has become unloadable when post and promotional report both need to be loaded and scrolled through)
OR
could we not break this getsat post in multiple pages? i.e. there are now three, and soon to be four, and on each page the promoted ones show up first, I lose track of where the actual posts begin.
(sorry for all people promoted, you all still deserve it!)

2) The labs have numbers in the spreadsheets, I remember 89 and 95.
But we now had round 2, round 3, round 3.2, now HSa results, round 4, etc etc, and in the results section they still say "DNA template ordered", when the actual results are already out. Could somebody make a Round 1-4 plus small intermediary rounds - to - eternalabnumber conversion table? Thx.
Photo of salish99

salish99

  • 295 Posts
  • 58 Reply Likes
test
Photo of Eli Fisker

Eli Fisker

  • 2222 Posts
  • 483 Reply Likes

A way to improve the exclusion labs?

As of now, the exclusion labs use the aptamer for turn on in state 2, against having the MS2 on in state 1. MS2 is far stronger than the aptamer.

I wonder whether the exclusion labs would fare better if their states were reversed.

As for the same state labs I don’t think a reversal would improve things, rather the contrary. As is, both the aptamer and the MS2 gets turned on in the second state. So there is a pretty strong pull towards turning on and getting the switch moving.
Photo of Eli Fisker

Eli Fisker

  • 2222 Posts
  • 483 Reply Likes
Are dangling overhangs causing high cluster counts?

I noticed that my designs in the lab Exclusion NG 2 had gotten an unreasonably high number of clusters, compared to the general average of clusters for designs in that lab.

What was different about them, besides that many of them had the MS2 turnoff sequence aim for the last bit of the FMN, was that they had an overhang of C and U bases. Which, it reminded me, was exactly the case for the microRNA labs, which didn't exactly suffer from bad cluster counts either.

One of my designs (83%) and cluster count 221

http://eterna.cmu.edu/game/browse/5851780/?filter1=Id&filter1_arg1=5958889&filter1_arg2=5958...

So now I wonder if any kind of overhang will cause higher cluster counts or it is mainly C and U ones. I'm guessing at the latter.

These unpaired dangling C and U stretches in one or both states are kind of a hallmark of switches; they are not isolated to the microRNA labs. I have earlier found that C and U overhangs in particular were overrepresented as the switching strand in high scoring switches. Also in past switch labs. Not in all high scorers, but in a great deal.

I have done more extensive notes on the cluster counts in relation to the NG labs here:

Cluster counts
Photo of Eli Fisker

Eli Fisker

  • 2222 Posts
  • 483 Reply Likes
High cluster count - low KDOFF

Another thing I find worth noting.

Many of the designs with high cluster counts in the NG labs tend to have unusually low KDOFF, compared to normal.

Normally high scoring designs will have very high KDOFF numbers in comparison with KDON. There is kind of like a normal range.
Photo of Eli Fisker

Eli Fisker

  • 2222 Posts
  • 483 Reply Likes

One of my designs with high cluster counts however did escape the death of the low KDOFF number.


I wondered what went on in it.

It had some C and U dangles, but more importantly I think, it has the complementary zipper pattern, which has been gaining ground in this round of data, in both exclusion labs and same state labs.

Fiskers NG3ss - 39    (96.1%)

[Image: Mirror stem]

http://eterna.cmu.edu/game/browse/5851791/?filter1=Id&filter1_arg1=5946026&filter1_arg2=5946026


I decided to look into what characterized the designs that managed to get both high cluster counts (beyond 100) and still keep a decent KDOFF (above 100).
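
For anyone who wants to repeat that selection on their own export of the data, here is a minimal sketch. The column names (`clusters`, `kd_off`) and the three rows are hypothetical placeholders, not lab results; only the two thresholds of 100 come from the post.

```python
# Hypothetical rows and column names; the values are placeholders, not lab data.
designs = [
    {"name": "design_a", "clusters": 210, "kd_off": 35.0},
    {"name": "design_b", "clusters": 140, "kd_off": 260.0},
    {"name": "design_c", "clusters": 48, "kd_off": 310.0},
]

def high_count_decent_kdoff(rows, min_clusters=100, min_kd_off=100.0):
    """Keep designs with cluster count and KDOFF both above the thresholds."""
    return [
        row for row in rows
        if row["clusters"] > min_clusters and row["kd_off"] > min_kd_off
    ]

selected = high_count_decent_kdoff(designs)
print([row["name"] for row in selected])
```

Only the placeholder row that clears both thresholds is kept; the other two fail on KDOFF or on cluster count respectively.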

What I came to is that there seems to be a relation between high cluster count, high KDOFF, CU dangles and repeats throughout the design of the MS2 CAUG sequence.

I think one of the advantages of this repeat of the MS2 pattern is that if repeated twice elsewhere - eg in the aptamer gate - it allows for grabbing hold of two ends of the MS2 instead of just 1. So I think it may enable a more secure switch, than just a single MS2 turnoff sequence. Also I think it helps reduce the need for a longer MS2 turnoff sequence in the exclusion labs - which again leads to a long MS2 gate that needs to get broken.
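
A quick way to look for such repeats in a design is to list every position where the motif occurs. The sketch below is only an illustration: the function name is hypothetical and the example string is a made-up sequence, not a lab design.

```python
def find_motif_positions(sequence, motif="CAUG"):
    """Return the 0-based start index of every occurrence of motif."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Continue one base further on so overlapping hits are not skipped.
        start = sequence.find(motif, start + 1)
    return positions

# Made-up example sequence, used only to show the call.
example = "GGCAUGAAACAUGCC"
print(find_motif_positions(example))
```

A design with the motif only inside the MS2 hairpin would show just those positions; extra hits elsewhere (for example in the aptamer gate) are the repeats discussed above.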
 
As my investigation into it got rather lengthy I put the main part in a new document.

High cluster counts + High KDOFF


Photo of Eli Fisker

Eli Fisker

  • 2222 Posts
  • 483 Reply Likes
I added the wrong link to the document in the first post in this section. Here comes the right one:

Cluster Counts
Photo of Eli Fisker

Eli Fisker

  • 2222 Posts
  • 483 Reply Likes

Multiloops for the switch


I have earlier mentioned that I thought switches benefited from multiloops in either of the states or in both at once.

“It could look like designs with multi loops have a slight advantage.”


Now there are a good deal more switch data to base that on and multiloops are hugely present in the MS2 labs where we have had best luck.

There seems to be something about having a multiloop in each state, that helps the design escape from making a step on the road towards a full moving switch.

Here is one of the reasons I hate full moving switches. Unless the switching elements are really far apart, the design doesn't really need to be a full moving switch. The partly overlapping multiloops help keep the switching elements in close range of where they are needed.

Also notice that the designs in lab rounds Ex 5 and Ex 6 mostly have no multiloops in either of their states. And they are also mostly full moving switches. Whereas anything that has tails long enough to tie up (Ex 3, Ex 4B, SS2) often has multiloops in both states, contrary to Ex1 and Ex4, which had really short tails.

Just as Omei mentioned, there is something about the presence of hairpin loops that prevents snake designs. As he said:


"The requirement that there are at least two hairpin loops in both states eliminates “snake”

designs, which seem to be too rigid to switch well."


I honestly think that is a strategy on its own. ;)
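
Omei's hairpin requirement is easy to check on dot-bracket structures. The following is a minimal sketch, not Omei's own code: the function names are hypothetical, and a hairpin loop is taken to be an opening bracket followed only by unpaired dots and then a closing bracket.

```python
import re

def count_hairpin_loops(structure):
    """Count hairpin loops in a dot-bracket string: '(' + dots + ')'."""
    return len(re.findall(r"\(\.+\)", structure))

def passes_two_hairpin_rule(state1, state2, minimum=2):
    """True when both states contain at least `minimum` hairpin loops."""
    return (count_hairpin_loops(state1) >= minimum
            and count_hairpin_loops(state2) >= minimum)

# Toy structures: one long "snake" stem versus two separate hairpins.
snake = "((((((......))))))"
branched = "((...))..((...)).."
print(count_hairpin_loops(snake), count_hairpin_loops(branched))
print(passes_two_hairpin_rule(branched, snake))
```

A design is rejected as soon as either state folds into a single long stem, which is the "snake" case the quote refers to.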


Omei's document:
A Structure Design Pattern for Eterna MS2 Riboswitches


Ways to make multiloops happen


To get multiloops happening in the switching area for sure in both states:


  1. It is generally a help tying up the tails at each end of the RNA sequence with each other.

  2. Add a static stem in the switching area.


Those two pieces of advice often end with the same result, namely tied tails, which are often the added static stem. It isn’t always a small static hairpin loop stem that gets added in the switching area; a static neck, when involved in the switching area, counts too.


I am inclined towards saying that tails should always be tied. Although I know we would have a hard job with Ex 1 and Ex 4, due to their short tails. And I have seen riboswitches that switch at their ends. However these typically do not involve multiloops.

At the bottom of this post, I have a link to a collection of natural riboswitches, as shown in their ON state. While not all of them have a multiloop, many of them do. Especially the bigger ones. And the ones that have multiloops are the ones I suspect are switching inside of themselves. Those without seem to be switching with their tail regions. But this is of course guessing on my part.

Pattern sum up


  • Pattern: Multiloops present in both states.

  • Where? The multiloops should be present in the switching area in each state.

  • What do they do? The multiloops help steer the switch, by keeping a part of the multiloops stable, while allowing the rest of the stems in the multiloop to move. Having 1 or 2 static stems in each of the multiloops gives the design a stable scaffold to do the switching from and also helps keep the switching confined to a very small area.

  • What does it look like? Two multiloops in either state, with one static stem as overlap between the multiloops and with some of the other stems as switching parts.

  • How: Make multiloops at the switching end of the aptamer, with the static stem being at the side of the aptamer gate which is furthest away from the MS2 hairpin. The aptamer gate will be one of the multiloop stems in one of the multiloops and the MS2 hairpin will be another multiloop stem. Each state should have at least two static stems.

  • Demonstration of pattern in my winner from this round, based on one of Vinnie's designs.


Fiskers NG3ss - 39   Score: 96.1%

[Image: Multiloops]

http://eterna.cmu.edu/game/browse/5851791/?filter1_arg1=5946026&filter1_arg2=5946026&filter1=Id


  • Connection with other patterns: I think the multiloops owe a big part of their magic to the static stem as a scaffold for holding the switch - along with the static end of the aptamer.  Also the multiloops help bring the switching elements into close range.
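
To check a design against this pattern, one can count the multiloops in each state's dot-bracket structure. The sketch below is an illustration with hypothetical function names. It assumes balanced brackets without pseudoknots, and it treats a multiloop as a base pair that directly encloses two or more stems; the open exterior loop is not counted.

```python
def pair_table(structure):
    """Map each paired position to its partner in a dot-bracket string."""
    stack, partner = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner

def count_multiloops(structure):
    """Count base pairs that directly enclose two or more stems."""
    partner = pair_table(structure)
    multiloops = 0
    for i, j in partner.items():
        if i > j:
            continue  # visit each pair once, from its opening base
        branches, k = 0, i + 1
        while k < j:
            if k in partner:          # start of an enclosed stem
                branches += 1
                k = partner[k] + 1    # jump past that stem
            else:
                k += 1
        if branches >= 2:
            multiloops += 1
    return multiloops

# Toy structure: one closing stem with two hairpins inside it.
print(count_multiloops("((..((...))..((...))..))"))
```

Running it on both target states of a design shows whether each state has a multiloop at all; whether it sits in the switching area still has to be judged by eye.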

Thx to Omei for showing me a way to nicely sum up what I think a pattern does.


Background article

Lab Drawings

Multiloops in switches

Rfam riboswitch picture archive - of bound shape
Photo of Eli Fisker

Eli Fisker

  • 2222 Posts
  • 483 Reply Likes

On Static Stems = Stability for Switching



I have earlier been wondering about the static hairpin loop stem that seemed to turn up in the switching area of many of the lab high scorers.


I’m really starting to think that this static stem has a function, other than just packing single bases away out of misfolding harm's way. I think it serves a stabilizing purpose, because it tends to turn up in the same spot: right after or close after one of the aptamer gates.


There seems to be something about multiloops too that seems to be helpful to the switching, since multiloops turn up in both states in a huge number of the switch labs now. Perhaps not because the multiloop itself is a help alone, but perhaps more that in the most successful labs (SS2 and Ex3), the switch has two static stems - the one stabilized end of the aptamer and one static stem in the switching area.


I think this static stem works in concert with the multiloop in the switching area. I think the multiloop together with the static stem allows the structure to be overall stable enough to support a switch. The static stem is the stable and unmoving part in the multiloop, while the two other switching stems - the aptamer gate and the MS2 hairpin - have a stable foothold to swap around between them (and a possible turnoff/turn on sequence too). With the static end of the aptamer holding the whole switching piece of art in an outstretched arm.



Normal amount of static stems


It seems that most of the high scoring designs like to have at least two static stems in state 1 and two in state 2.


So preferably each state should have 2 static stems - one static stem at the non moving end of the aptamer and 1 static stem involved in the multiloop in the switching area. There are some lab designs with more than 2 static stems, but I think having 2 static stems is the basis for a good design.
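
One way to count static stems for a design is to look for runs of stacked base pairs that appear in both target structures. This is a minimal sketch under stated assumptions: the function names are hypothetical, the structures are balanced dot-bracket strings of equal length, a stem means strictly consecutive stacked pairs (no bulges), and the default minimum length of 3 is only a suggested cutoff.

```python
def base_pairs(structure):
    """Return the set of (i, j) base pairs in a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs

def static_stems(state1, state2, min_length=3):
    """List (start, end, length) for stems present in both states."""
    shared = base_pairs(state1) & base_pairs(state2)
    stems = []
    for i, j in sorted(shared):
        if (i - 1, j + 1) in shared:
            continue  # not the outermost pair of its stem
        length = 0
        while (i + length, j - length) in shared:
            length += 1
        if length >= min_length:
            stems.append((i, j, length))
    return stems

# Toy states: the first stem is kept in both, the second only in state 1.
state1 = "((((....))))....((((....))))"
state2 = "((((....))))................"
print(static_stems(state1, state2))
```

With real designs, the two inputs would be the target structures of state 1 and state 2, and the length of the returned list is the static stem count discussed above.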


For now, Same State 1, Exclusion 1 and Exclusion 4 generally don’t have 2 static stems in each state and therefore they don’t have multiloops. But I bet they want them.


In the Same State 1 lab, this is what one of the current top scorers looks like. It doesn’t have the 2 static stems in each state, which I think are beneficial for getting the structure stabilized and switching. So I think it needs one more. So here is what I predict that a lot of the winners are going to look like if we get one more SS1 lab. (Green pen drawing)


Salish99_ss1_r3_093  (94%)

[Image: Same State - Prediction]

http://eterna.cmu.edu/game/browse/5736163/?filter1_arg1=5744854&filter1_arg2=5744854&filter1=Id



Color explanation


Pink = MS2

Light blue = static stems

Orange = FMN sequences

Yellow = Salish hinge

Grey = the aptamer gate furthest away from the MS2 sequence

Green and red = switch magnet segments



Perspective: Where to put the static stem?

I think I can say where there needs to be a static stem. Based on which aptamer gate is furthest from the MS2 sequence.


From how the grey MS2 mirror sequence is placed in the aptamer gate, the grey aptamer gate needs to be furthest away from the MS2 sequence. Which leaves the static stem to land at tail ends. Which is what I prefer it in any case. That makes the tails knot up. :)


So an added static stem doesn’t necessarily need to be a hairpin with a loop. It can just as well be a neck, that is, two end tails tied together. What determines where it needs to be is the position of the aptamer in relation to the neck of the puzzle.


For Exclusion 1 I think the grey sequence should be after FMN2 and then a static stem - for which there are too few bases in the sequence.


[Image: Prediction for Ex1]


Similarly, for Exclusion 4 I think the grey sequence should be before FMN1 and the static stem before the grey area.


[Image: Prediction for Ex4]


When I had drawn it, I realized that the top scorers in the lab Brourd’s mod of Exclusion 4 already fit this. The highest scoring design (87%, when counting 20+ clusters and the rerun in round 96) follows a closely similar pattern.


http://eterna.cmu.edu/game/browse/5807490/?filter1_arg1=5808453&filter1_arg2=5808453&filter1=Id



Pattern Sum Up


  • Pattern: Static stem in switching area

  • Where it happens: FMN/MS2 switches benefit from having a static stem in the switching area at a particular spot. Which is after the aptamer gate that is furthest away from the MS2 or furthest away from MS2 interaction.

  • What problem does this pattern address? The static stem packs away unpaired bases, which is an advantage since our switches have a general dislike of longer stretches of unpaired bases. The static stem helps stabilize the multiloop in the switching area, so the MS2 hairpin and the aptamer gate get the stability needed to interact either directly or with intermediate sequences to turn on or off.

  • What does it look like? The static stem is like a regular hairpin loop or neck made of the tail sequences paired. It can have different lengths.

  • How: Make the static stem at least 3 base pairs long, and preferably 4.


Demonstration of the same static stem attached in a multiloop shared between the states:



  • Which patterns is this pattern related to? The static stem causes a multiloop to form in the switching area. The static stem works in concert with the static end of the aptamer, holding the RNA design in position so that interaction between MS2 and FMN and intermediate sequences can take place. The static stem, together with the multiloops, helps bring the switching elements into close range.

Thx to Omei for showing me a way to sum up easily what I think is going on.

Background articles


Link to drawings

Multiloops for the switch

When to make tails pair
Photo of Omei Turnbull

Omei Turnbull, Player Developer

  • 966 Posts
  • 304 Reply Likes
Reproducibility data

To make it easier for interested players to look into how the same design scores in different rounds, I merged fusion tables from Rounds 95 and 96 for sequences that were synthesized in both rounds. It turns out that there were 600 such sequences, 23 more than the 577 that johana purposely created. The resulting table is here.

I haven't tried to analyze the table in any depth, but here is one interesting graph that gives a feel for the data.



There's obviously a very high correlation between the scores, but it isn't perfect.  In the worst case, the Same State 2 design Rediin score differed by 23 points between the two rounds.

Caveat: In the merged table, there are generally two columns for every column name, one from each round.  The certain way to tell them apart is that whenever you are presented with a list of columns to choose from, the R96 columns all come before the R95 ones.  For the handful of fields of most interest, I manually changed the R95 names by prepending "Previous" to the R95 data.  If you are going to analyze data from other columns, I recommend you make a copy of the table and make the names unambiguous for the columns you care about.
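
For players who export the two rounds themselves, the same kind of merge can be sketched in plain Python. This is an illustration, not the procedure used for the table above: the column names (`Sequence`, `Score`), the "Previous " prefix format, and the rows are hypothetical placeholders.

```python
def merge_rounds(r96_rows, r95_rows, key="Sequence"):
    """Join two rounds on sequence, keeping only sequences present in both.

    R95 columns are renamed with a "Previous " prefix so that every
    column name in the merged rows is unambiguous.
    """
    r95_by_sequence = {row[key]: row for row in r95_rows}
    merged = []
    for row in r96_rows:
        previous = r95_by_sequence.get(row[key])
        if previous is None:
            continue  # sequence was not synthesized in the earlier round
        combined = dict(row)
        for name, value in previous.items():
            if name != key:
                combined["Previous " + name] = value
        merged.append(combined)
    return merged

# Placeholder rows, not lab data.
r96 = [{"Sequence": "GGAAACC", "Score": 90}, {"Sequence": "GGUUUCC", "Score": 70}]
r95 = [{"Sequence": "GGAAACC", "Score": 84}, {"Sequence": "GCAAAGC", "Score": 55}]

merged = merge_rounds(r96, r95)
largest_gap = max(abs(row["Score"] - row["Previous Score"]) for row in merged)
print(len(merged), largest_gap)
```

Renaming at merge time avoids having to remember which of two identically named columns belongs to which round.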