Cooperativity Puzzle Analysis

  • Updated 4 years ago
This is a new thread for analyzing cooperativity puzzles, starting with Cooperative Binding - double MS2 from R96.

We are very excited about the results, which have been posted in Eterna and also summarized in a PDF and a Google Doc.

Looking forward to hearing your thoughts!

johana, Researcher


Posted 5 years ago

johana, Researcher

What we are trying to accomplish with our RNA designs is a sharper transition from unbound to bound as we increase the ligand concentration. In the first puzzle we used the MS2 protein as the ligand and the MS2 hairpin as the binding site. The cooperativity is described by the Hill coefficient, n, which is larger for stronger positive cooperativity.

The results include the experimental data and the fits to the Hill equation.
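
For readers who want to see what the Hill coefficient does to the shape of the binding curve, here is a minimal Python sketch. It assumes the common textbook form of the Hill equation, fraction bound = [L]^n / (K_d^n + [L]^n). The names `hill_fraction_bound`, `k_d` and `transition_width` are illustrative and are not taken from the Eterna analysis, whose exact parameterization may differ.

```python
def hill_fraction_bound(ligand_conc, k_d, n):
    """Fraction of RNA bound at a given ligand concentration.

    Assumed form: [L]^n / (K_d^n + [L]^n), where k_d is the concentration
    at half saturation and n is the Hill coefficient.
    """
    if ligand_conc < 0 or k_d <= 0 or n <= 0:
        raise ValueError("need ligand_conc >= 0, k_d > 0 and n > 0")
    l_n = ligand_conc ** n
    return l_n / (k_d ** n + l_n)


def transition_width(n):
    """Fold change in ligand concentration needed to go from 10% to 90% bound.

    Setting the fraction bound to 0.1 and 0.9 gives ([L]/K_d)^n = 1/9 and 9,
    so the ratio of the two concentrations is 81^(1/n).
    """
    return 81.0 ** (1.0 / n)


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, round(transition_width(n), 2))
```

Under this form, a non-cooperative binder (n = 1) needs an 81-fold increase in ligand to go from 10% to 90% bound, while n = 2 needs only a 9-fold increase. That narrowing is the "sharper transition" the puzzle is after.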

johana, Researcher

The number of clusters for the cooperativity puzzle designs in R96 was very low. The median number of clusters was 4, compared with 21 for the switch puzzles, and the distributions look very different.

Having few clusters results in a larger spread of fit values due to poor statistics. As a consequence, many of the high-scoring designs are probably artifacts of having only one or a few clusters. The blue points in the plots represent the cooperativity puzzle.
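
As a rough illustration of why few clusters means a wider spread, here is a small Python sketch. It assumes each cluster is an independent measurement with the same noise, so that the uncertainty of the averaged signal scales as 1/sqrt(number of clusters). That assumption is a simplification and is not something stated in the analysis; the function name `relative_spread` is illustrative.

```python
import math


def relative_spread(n_clusters):
    """Relative uncertainty of a mean over n_clusters independent measurements.

    Assumption: equal, independent noise per cluster, so the standard error
    of the mean scales as 1 / sqrt(n_clusters).
    """
    if n_clusters < 1:
        raise ValueError("need at least one cluster")
    return 1.0 / math.sqrt(n_clusters)


# Median cluster counts quoted above: 4 (cooperativity) versus 21 (switch puzzles).
ratio = relative_spread(4) / relative_spread(21)
print(f"spread ratio: {ratio:.2f}")
```

Under that assumption, the median cooperativity design would carry a bit more than twice the noise of the median switch design.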


Brourd

It's possible we are once again looking at an issue where length is affecting synthesis yields within the pipeline. In addition to that, isn't there a different and shorter sequence attached to the 3' end of the RNA designs for sequences with lengths greater than 85 residues?

johana, Researcher

Yes, this is likely related to the length of the designs.

The cooperativity puzzle had the shorter flanking sequences on both ends, which may affect the PCR efficiency.
The clustering on the sequencing chip is also known to be worse for longer sequences.
A third issue is that of synthesis yield. An extra ~20 residues means 20 more opportunities to introduce synthesis errors. It is worth going back to see how the yield drops off as each new base is added.

These three issues probably conspire to worsen the yields we see on the chip.
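
To put a number on the synthesis point, here is a minimal Python sketch. It assumes every added base succeeds independently with the same probability (`step_yield`), so the fraction of error-free full-length product is `step_yield ** length`. The 0.99 per-base value is purely hypothetical and is not a measured figure from this pipeline.

```python
def full_length_fraction(length, step_yield):
    """Fraction of molecules synthesized without error over `length` bases.

    Assumption: each base is added correctly with independent probability
    `step_yield`, so the fractions multiply.
    """
    if not 0.0 < step_yield <= 1.0:
        raise ValueError("step_yield must be in (0, 1]")
    return step_yield ** length


def relative_yield(extra_bases, step_yield):
    """Yield of the longer design relative to the shorter one."""
    return full_length_fraction(extra_bases, step_yield)


# Hypothetical per-base yield of 0.99 and the ~20 extra residues mentioned above.
print(round(relative_yield(20, 0.99), 3))
```

With these made-up numbers the longer designs would lose roughly a fifth of their correct product to synthesis alone, before any PCR or clustering effects are counted.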

Brourd

It may be an excellent idea to try and analyze all usable clusters and sequences from the cooperativity lab puzzle in order to determine if specific regions of the wildtype RNA sequences are more prone to mutation events, and then try to minimize the use of these sequences in future designs.

Eli Fisker

U repeats

Designs with a high number of clusters again generally have an insane number of U repeats, often very long ones. The same holds for the designs with very high cluster counts in the other MS2 labs.

However, in the other MS2 labs the winners, including those with a somewhat decent cluster count of at least 20-30 clusters, seem to actively dislike having many long U repeats.

So somehow long and multiple U repeats help cluster counts - but they don't seem to increase the number of actual winners.
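
One way to check the U repeat observation against the data is to count U runs per design. Below is a minimal Python sketch; the function names and the default minimum run length of 3 are my own choices for illustration, not a definition used in the lab analysis.

```python
import re


def u_runs(sequence, min_len=3):
    """Return every run of consecutive U's that is at least min_len long."""
    pattern = r"U{%d,}" % min_len
    return [m.group(0) for m in re.finditer(pattern, sequence.upper())]


def longest_u_run(sequence):
    """Length of the longest run of consecutive U's (0 if there are none)."""
    return max((len(run) for run in u_runs(sequence, min_len=1)), default=0)


if __name__ == "__main__":
    example = "GGUUUUACUUG"  # made-up sequence, not a lab design
    print(u_runs(example), longest_u_run(example))
```

Plotting `longest_u_run` against cluster count and against score for each design would show whether the two trends described above really pull in opposite directions.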

Repeats of repeats

One more thing stands out in the lower-scoring designs: they have a high number of repeat bases in general.

Background posts

U repeats

Repeats of repeats:

Eli Fisker

Cooperative lab data

Most of the designs had very low cluster counts, so I set a cutoff of 10 clusters for what I wanted to take seriously, just as I did for the first MS2 lab results. I intend to raise that number when future data allow it.

In most of my designs, the MS2 sequences were somehow in contact with each other in the OFF state, for either short or long stretches. I had mostly not built in strong preventions against MS2/MS2 interaction. What I had hoped was to use the identical sequences, or parts of them, for turnoff/brake.

I sorted the data by score, keeping only designs with a cluster count of at least 10. The two designs that got a somewhat decent score were both by Brourd. What they have in common is locking up the MS2 aptamers far from each other.

ThBP B  (86%)

ThBP C  (84%)

Looking at these designs, the MS2s are practically prevented from interfering with each other at all.

Omei’s mod of one of my riboswitches, which did not do too badly, also had the MS2s somewhat locked away from each other, while still leaving complementarity for an MS2 turnoff with the MS2 sequences themselves.
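
The filter-then-sort step described above can be sketched in a few lines of Python. The design records here are placeholders with invented names, scores and cluster counts, only there to make the example runnable; they are not the R96 data.

```python
def rank_designs(designs, min_clusters=10):
    """Keep designs with at least min_clusters clusters, best score first."""
    kept = [d for d in designs if d["clusters"] >= min_clusters]
    return sorted(kept, key=lambda d: d["score"], reverse=True)


# Hypothetical records for illustration only.
designs = [
    {"name": "design_1", "score": 95, "clusters": 2},   # high score, too few clusters
    {"name": "design_2", "score": 80, "clusters": 12},
    {"name": "design_3", "score": 60, "clusters": 30},
]

for d in rank_designs(designs):
    print(d["name"], d["score"], d["clusters"])
```

Raising `min_clusters` later, as more data comes in, only needs a change to that one argument.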


What I was often attempting was both switching, around the individual MS2 switch, but also switching between domains.

But overall these lab results got me thinking, as this was not what I had expected. I had kind of expected cooperativity between MS2s to be like a game of dominoes. I assumed that the domino effect of making one MS2 turn on and getting it to spill over to the other had to do with some closeness in space, so that a change in binding at one MS2 would directly touch and affect binding at the other MS2. So more of a push interaction, not seemingly detached, independently binding units.

This all got me wondering what real natural double riboswitches would look like.

Naturally occurring tandem riboswitches

I was reading about the glycine riboswitch that Johan mentioned in his summary of the first round of cooperative switches. Results for Eterna R96: Cooperative Binding - double MS2

So I decided to look the glycine riboswitch up and found an image.

The repeated riboswitches appear to be quite similar to each other, both in sequence and structure, but with the fundamental detail that both structure and sequence vary slightly between them. The presence of both structural and sequence variation fits with what I would have expected from what I have seen in Eterna labs when there are identical sections of structure.

Another interesting thing is that each switching area in the glycine riboswitch contains what appears to be a static stem. (P2) :)

Each of the glycine riboswitches also has multiloops, even though this kind of aptamer involves only one switching element and not two like our FMN/MS2 ones. I see multiloops turn up in many other riboswitches with a single ligand as well.

Digital Riboswitches

I read a few papers on the glycine riboswitch and I found this bit interesting: the aptamers are more separate, and they operate cooperatively nonetheless.

I found it particularly interesting that the glycine riboswitch functions as a digital sensor. Or explained like this: a light switch, where 1 = light and 0 = darkness. So extreme and big changes, rather than just turning the light up or down a bit. So I’m guessing the other aptamer types can be called analog. :)

Advice for next round

Now we have very few winners and still shaky data for round 1, so take this as only pointers towards what I think will work.

  • Make separate MS2 riboswitch units

However, this bit I am very certain about. When you make two separate RNA sections - here two riboswitch sections, each holding its own MS2 - you should:

  • Make sure that both MS2 riboswitch units vary from each other in structure (can be either stem or unpaired base region)

  • Make sure that both MS2 riboswitch units vary in sequence

So hereby the designing tip is passed on.

Background articles

RNA domains

Huge RNA


Omei Turnbull, Player Developer

Thanks, Eli.  This is a good summary of a rather hard-to-interpret set of data.

Expanding a bit on Brourd's ThBP B submission, compared with its variations: Taken together, ThBP B, ThBP C and ThBP D (which differed only in a few mutations that didn't change the structure) were represented by a total of 30 clusters, and they had an average score of 84.  On the other hand, ThBP A (15 clusters) had an unpaired strand in place of the static hairpin, and it had a score of only 70. This suggests that binding up long strings of unpaired bases into static hairpins may be beneficial, just as it has been shown to be beneficial for the switches.

Eli Fisker

Hi Omei!

Good point also with the ThBP A.

So there are two factors: the designs seem to want the MS2 aptamers well separated, and our MS2/FMN switches often seem to prefer having as few naked unpaired sections as possible. I'm guessing both.

Eli Fisker

There might even be a third factor. The ThBP A design differs from the other 3 in the number of U's and the length of its U repeats.

I have noticed that multiple and longer repeats of U's tend to raise cluster counts. However they also tend to decrease score.

Eli Fisker

How did the Glycine riboswitch get a twin aptamer?

More thoughts about the glycine tandem riboswitch that I have been interested in lately. I put them in this post because the glycine tandem riboswitch is a naturally occurring cooperative riboswitch. I have been wondering how it came about.

I have been reading about proteins in the book How Proteins Work by Mike Williamson, in an attempt to satisfy my curiosity about how proteins are similar to and different from RNA. While they do share many characteristics, I am starting to think proteins and RNA really are two very different “animals”.

I was also checking out fluorescence, as I read a recommendation (I forgot where) to check out Osamu Shimomura’s life story and his part in unraveling the mystery of bioluminescence over at the Nobel organization's page. The introduction captured my attention and I have since been watching a huge bunch of videos on fluorescence. He was the first to crystallize a luciferin (the small light-emitting molecule of the sea firefly) and later isolated bioluminescent and fluorescent proteins such as aequorin and green fluorescent protein.

Fluorescence has ever since been a critical tool for scientists to understand what goes on at the cell level and to figure out mechanisms of disease. We owe our lab results, which are brought to us by green fluorescence, to Shimomura’s determined work.

So I have been watching multiple videos on the discovery of the fluorescent proteins. It has a rather fascinating story line, and it led me to a video on the discovery of different kinds of fluorescent proteins later found in nature. Some of the fluorescent proteins were originally found as dimer or tetramer units held together and connected by some sequence. The scientists had to make mutations in the connection area to get the units separated, so that 1 unit could be used instead of 2 or 4.

The scientists also had to mutate a long bunch of amino acids in the unit itself to get it working well alone. So this is not only happening to RNA. A protein domain can be affected by mutations in its border regions and by whether it has a neighbor or not. Although I suspect a protein domain tends to be less affected than an RNA domain, I am just guessing that a glowing protein is more specific in its wants than other proteins. However, that is only a guess for now.

My main point in relation to proteins versus RNA is that four such big and identical-looking structures, as in the Red Fluorescent Protein, would not work in an RNA. Though for sure, this is a beautiful beast.

(From minute 41)

So now I speculate how the development of the glycine tandem riboswitch went on.

For proteins, domain copying has happened many times. Proteins seem to dig reusing both sequence and structure for some domains, and only seem to need asymmetry for making bigger structural switches.

From what I understand, sequence variation mainly creeps in at the regions where the protein domains attach to each other.

For proteins, the sequence can in general be used to predict the structure. Multiple identical units can often be found in a protein, just linked together, and it will fold. E.g. hemoglobin is built of two dimers (twin units) that each have two units. Together they can bind 4 oxygen molecules.

However, many identical units in RNA simply don't fly - they cause serious misfolding instead.

So my headache was just how in the world evolution managed to get away with making a glycine tandem riboswitch in the first place. I’m guessing that it has been business as usual with a duplication of the riboswitch sequence - just like many proteins have acquired additional copies of the same domain - by a copy error.

I think it went like this: first a copy error, and then structural and sequence mutations acquired over time. The newly made tandem riboswitch would be identical both structurally and in sequence. I imagined the copies being close to each other, as I think that would be needed for cooperativity to happen. All of this is a really bad starting point, as I have earlier pointed out that identical sequence and structure across several units is a magnet for misfolding. A misfolded riboswitch does not bind any ligand.

I think sequence and structure variation would be needed from the get-go. However, what I have a hard time imagining is all these needed changes happening in the same go. Perhaps this is why there are so few naturally occurring RNA tandem riboswitches?

Sequence mutations should be necessary to get both aptamers to work properly and not misfold with each other. Structural variations would be necessary too, to avoid having too many identical strands with the same sequence that are complementary not only to their intended partner but also to a whole extra set of matching strands. With two identical sections - in structure and sequence - the number of potential matches for each strand just got more than doubled.

One thing that could have helped would be if the individual aptamers had been copied to a place not right next to each other, but at a distance, and had time to acquire some mutations and structure variation before getting copied in close together.

This would make more sense than a one go copy process, as I think two fully identical aptamers in sequence and structure, will not be working well if close together in sequence. The glycine tandem riboswitch has its two domains fairly close together. I think size of domain matters too. The bigger the domain the harder it will be to make a copy and have it working as next door neighbor. Basically I think the bigger the RNA, the more structure and sequence variation is needed. I think for the glycine riboswitch, the copy process could have placed the copies a good deal apart and then later when structural and sequence variation had already crept in, then copied them back close together.

My evolutionary headache is cured. At least for now. :)

Eli Fisker

Adventure in Riboswitch Wonderland

Since my adventure started in the cooperativity riboswitch department, here is where I will put it up. Thx to Omei for discussion.

Switch Elements - the bases of switching

While I’m aware that it is very well possible to create aptamers to pretty much anything via lab evolution, I have started to wonder whether there is some sequence pattern to the natural riboswitches.

I keep seeing a lot of the same sequences come up, both in our switches and lately also in the naturally occurring and engineered ones. I keep seeing exactly the same 2 kinds of sequences pop up: stretches of purines (G and A bases) and stretches of pyrimidines (C and U bases), particularly in the switching area but also to some extent outside of it, and at frequencies that are not normal - at least not for static designs, where I have been scolding them quite seriously. :) I even keep seeing these purine/pyrimidine stretches turn up at similar spots. I find this very interesting, as it suggests that we might be able to use it rationally to make better switches.

RNA is a game of frequencies - and it is about balancing those frequencies against each other. Too much of one, two or more frequencies - however good they are on their own - will cause misfolds.

However, different types of RNA have their own distinct frequency patterns. For example, RNA with short stems needs a high GC base pair frequency, and RNA with long designs prefers a more AU-heavy solution.

For static designs, the normal pattern is that purines primarily turn up in the loop regions and pyrimidines tend to turn up more in the stem regions. However, normally the base pairs are far better off when mixed. Long stretches of either purines or pyrimidines in stems are suspicious - unless the stems are very long or perhaps part of a coaxially stacked 4-way junction.

For the switches I see a purine/pyrimidine frequency pattern that breaks the usual pattern. So I say there is a pattern - that's interesting - let's figure out how to use it.
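
To look for this pattern systematically rather than by eye, one could scan each sequence for unmixed purine or pyrimidine stretches. Here is a minimal Python sketch; the function name `unmixed_runs` and the default minimum length of 4 are assumptions for illustration, not a threshold established in the labs.

```python
import re


def unmixed_runs(sequence, min_len=4):
    """Find stretches made only of purines (A/G) or only of pyrimidines (C/U).

    Returns a list of (start_index, kind, stretch) tuples sorted by position,
    where start_index is 0-based.
    """
    seq = sequence.upper()
    patterns = (
        ("purine", r"[AG]{%d,}" % min_len),
        ("pyrimidine", r"[CU]{%d,}" % min_len),
    )
    runs = []
    for kind, pattern in patterns:
        for match in re.finditer(pattern, seq):
            runs.append((match.start(), kind, match.group(0)))
    return sorted(runs)


if __name__ == "__main__":
    # Made-up sequence for illustration, not a lab design.
    for start, kind, stretch in unmixed_runs("GAGGACUUCUG"):
        print(start, kind, stretch)
```

Comparing the number and position of such stretches between switch designs and static designs would be one way to test whether the frequencies really differ.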

Riboswitches sum up

Pyrimidine C and U bases at active aptamer site

  • The switching often seems to be initiated by particular sequences, either with CU segments at the active molecule-catching site or with GA segments instead. Especially the pyrimidine CU segment, with its short and flexible bases, seems to be very good at catching a molecule.

  • This kind of riboswitch typically contains an aptamer loop with 5-6 CU bases in the loop ring and with two GA stretches in stems, close by in either sequence or space.

Purine G and A bases at active aptamer site

  • But also regularly the aptamer loop bases that are doing the catching are the more rigid and bigger purine GA bases. (Like in the FMN aptamer)

  • This kind of riboswitch typically contains an aptamer loop with 4-5 GA bases - can be spread on either side of the stem attached to the aptamer loop and two CU elements in stems close by either in sequence or space.

Pyrimidine stretches outside of the aptamer

  • The MS2/FMN turnoff sequence and variations of it keep turning up in the switching area close to the aptamer and sometimes even inside it.

  • Small hairpin loops regularly contain a pyrimidine sequence in the on state that is likely used for turnoff in the off state.

Purines versus pyrimidines - or G and A bases versus C and U

I’m currently learning about alkanes on Khan Academy. I have a chemistry book that I try to follow alongside.

It mentioned that alkanes formed into rings are less flexible than the same number of carbons in a chain. (Page 60 in Fundamentals of Organic Chemistry, 7th edition, International version)

This made me think about how pyrimidines - U and C bases - have only one ring of 6 atoms (though not all carbons), whereas the purines have two rings, one of 5 and the other of 6. (For more on pyrimidines and purines check this forum post)

This made me think that the purines must be less flexible than the pyrimidines, and as such pyrimidines must have a lower melting point than purines. So I went and checked, and it really was so.

Another thought also popped up. I had seen those flexible pyrimidines in action a lot somewhere very specific: in the switching area of the switches. Not only in the FMN/MS2 labs, but also in the solely FMN labs and the TEP labs too. Even in the winner and top scorers of the Theophylline Hammerhead riboswitch lab we did a long time back.

Now the bases, no matter which type, are attached to the same kind of backbone, but I still think how much space they fill has some role to play.

Perhaps something else is in play? I read that when two proteins bind to each other, they rearrange themselves such that the hydrogen bonds between them are at similar distances. (How Proteins Work, Mike Williamson, page 22, figure 1.30) So perhaps this is the explanation for similar purine bases sitting next to each other, and likewise for pyrimidines. Then I would expect to see short U's and C's at the ends of long purine stretches in aptamers, and similarly A's and G's at the ends of pyrimidine stretches.

Omei: One thing I did want to say as I was reading your recent thoughts, but then couldn't find exactly where you wrote it.  But it was about the strand segments of all purines or all pyrimidines. It makes sense to me because of the spatial arrangement of the base pairing geometry when they form a double strand.  I'll be very interested to see if these always play the same role in riboswitches, or whether they are more of a multi-purpose tool.

I think it is actually both. Both the purine/pyrimidine strands should make it somehow easier to break a stem, but also that these pyrimidine stretches or the purine ones are good for holding substrates at an equal distance. I added a drawing, the one below. Because I keep seeing these CU and GA stretches not only turn up in the stems around the switching area, but also in the aptamer loops themselves. Not as the sole bits of the aptamers always, but as a strong player. I even think that these players have somewhat fixed roles. They are more likely to turn up some places than others, both in sequence and geometric space.

That does not mean we can't stray from the pattern that nature seems to like. I just think the natural and engineered riboswitches have a lot in common. Some of the different types are really just negatives of each other. Either they keep the GA pattern in the aptamer ring, or they keep the CU in the aptamer. The rest is just a matter of matching up and making sure the complementary stretch is there in a stem nearby, to be able to turn the pattern in the aptamer on and off. Plus I think there is a trend for the pyrimidine or purine pattern to be longer in the aptamer and shorter in the stems, probably to make the aptamer pattern win out.

Aptamers of purines or pyrimidines

Similarly, I had seen this C and U sequence pop up in the riboswitch that I had been reading about lately - namely the glycine riboswitch.

So now I started wondering if this pyrimidine CU pattern, or conversely its counterpart purine GA, was in action in other naturally occurring riboswitches. And I keep seeing it turn up, either as CU in the aptamer loop directly or as its negative GA in the aptamer loop instead.

The FMN piece inside the MS2 hairpin

A while back Omei sent me an image he had taken of a lab he was working on.

"I was working to create an Exclusion 5 lab switch that relies on a pseudoknot forming in the bound state when I stumbled across a pattern that seems very general. It also, in retrospect, seems very obvious, so I am guessing that we already have some data on it.

The basis for the pattern is that the aptamers for FMN and MS2 share the common sequence of AGGAU.  So the general pattern, which seems promising for any lab, would be to use the complementary pattern AUCCU in a way that binds with either one or the other of the aptamers."

Omei’s image
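
The complementarity Omei describes can be checked with a few lines of Python. This sketch only considers Watson-Crick pairs (A-U, G-C) and ignores G-U wobble pairs, which is a simplification on my part; the function name is illustrative.

```python
# Watson-Crick partners only; G-U wobble pairing is deliberately ignored here.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(sequence):
    """Return the strand that pairs antiparallel with `sequence`, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence.upper()))


shared = "AGGAU"   # sequence shared by the FMN and MS2 aptamers, per the quote
turnoff = "AUCCU"  # complementary pattern proposed in the quote

print(reverse_complement(shared))             # AUCCU
print(reverse_complement(shared) == turnoff)  # True
```

So AUCCU is exactly the reverse complement of AGGAU, which is why one turnoff strand can pair with the shared segment in either aptamer.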

Omei asked if I recognized seeing this sequence in other labs.

I did recognize this little fellow. I mean not the exact sequence, but I recognize it for what it does. It's the MS2 turnoff sequence, and there are a number of variations of it, like GUUCC and GCCUU. The microRNA labs have other similar but less locked variations, since they have no FMN to conform to. The main ingredient, however, is the CC's, since they are compatible with both the G's in the MS2 and the twin G's in the FMN. The MS2 turnoff in MS2/FMN labs in general has 2 C's and then some U's. Not always a G. The MS2 turnoff sequence is often very similar inside the same lab. How strong it needs to be partly depends on how far in sequence and space it has to move.

The case Omei had found is a perfect match between MS2 and aptamer and makes it much more obvious what goes on. It is not only the MS2 turnoff. It is the aptamer turnoff as well.

While I knew that with this turnoff sequence the goal was often to target both the MS2 and FMN, I had not fully thought through that the FMN and MS2 actually carry two identical sequences. The MS2 has a kind of mini FMN inside it.

This CU pattern runs like an undercurrent through many Exclusion labs. The CU aptamer turnoff pattern also runs like wildfire through the past FMN labs, as does its GA partner. (Periodic repeats) Mostly it is important for turning off the MS2. It is often, but not always, involved with the FMN also, as the MS2 C's are regularly used directly for aptamer turnoff. It is regularly used for both MS2 and aptamer turnoff, though. Also, the more zipper-complementary solves target the aptamer gates instead of the aptamer sequence itself. Basically they do the same as the aptamer and the MS2 sharing sequences - they just make the aptamer gate strands complementary to the strands of the first stems in the MS2.

The MS2 has the pyrimidine CU pattern hidden up inside it, whereas the FMN aptamer has its purine negative of GA’s.

Actually the MS2 aptamer is magic. It has both the purine GA and the pyrimidine CU pattern available inside it.

The FMN aptamer is even more magic. Not only does it go both ways - it can be involved on either side at each of its ends. Sometimes even both at once. :)

As I said to Omei, it was as if the FMN and MS2 were meant for each other. Both because they have identical sections - carrying a part of the same switching element - so they could be made to share a partner (Exclusion), and because they have parts that could directly interact (Same State): MS2 C’s to FMN G’s. However, they do care - a lot - about how they are placed in relation to each other. Too far or too close and they are not effective.

So the next question was whether these pyrimidine/purine strand patterns were more than just an oddity of our in-game switch labs and a few outside ones.

Glycine riboswitch

But why there regularly seem to be so many additional CU and GA stretches in switches, I have no idea. Can it really be that the whole unbound switch is moving in some cases, and bigger parts of it in others?

I have earlier been complaining about these specific unmixed base sections in static labs. Bases in stems that are not well mixed have more of a habit of breaking open and mispairing. Longer stretches of CU’s, GU’s, CA’s or GA’s are bad, especially if there are multiple strands of them. OK, in the beginning I thought that long lines of CU’s were beneficial, because I saw them in longer stems. (I wonder if that kind of sequence has any effect on coaxial stacking, or whether it is mainly loop regions that contribute to that?)

So basically what I’m speculating is that there are particular sequences that are particularly good at switching, compared with others. I have earlier been pointing out that the switches seemed to have more repeat bases. Purine and pyrimidine repeats are just a different flavor of repeat.

Loose thought. What if the potentially free GA stretches after this imagined turnoff could start to sense the glycine when it is around? Glycine likes to bind to GA stretches anyway. And what if this is enough to make the turned-off riboswitch let loose and open the real binding site to the glycine? Could this be an explanation for why so many of these seemingly unnecessary extra GA and CU stretches turn up? I mean, really only the ones in the aptamer are needed.

Theophylline Hammerhead ribozyme and the turnoff sequence

Orange highlights the only sequence we could change.

Winner by AndrewM2A

The highest scoring designs have a high rate of C’s and U’s, and sequences that could look like MS2 and FMN turnoffs.

Notice that the 4th highest scoring design has the exact sequence which is a match for a section in both FMN and MS2.

What the theophylline riboswitch looks like in Eterna. (Locked bases) It practically has an FMN/MS2 turnoff sequence built in. :) Double U and double C work like a charm to turn off MS2, and when reversing them, one can choose which of the two FMN sequences to target.

Again it has one of its CU segments in the hairpin loop in the on state. Actually this pyrimidine hairpin loop, with rather specific sequences, is something I see turn up in a lot of switches in the ON state. I generally think it is used for turnoff with a purine stretch in the off state. (I'm not sure why the two sequences are not fully identical)

More unmixed bases. (Orange box) In the past I have been particularly angry at longer GU stretches, since they had a habit of splitting our fine single state puzzles and making misfolds happen.

I had been wondering what it was with that G that kept turning up after, before and sometimes on both sides of the CU lasso. I think this image shows it very well. They work as stabilizers to hold the aptamer when it has its molecule around.


This is really interesting. The homepage says that the binding of the theophylline makes this small AGG section available for the ribosome and translation. So the dangling section of GA’s is needed for the translation process. The riboswitch is not an island - it is part of a bigger whole. :)

I asked Google if ribosomes really did want purine starter sites and found a page by iGEM.

Ha! It really does look like it.

“Very roughly speaking, ribosome binding sites with purine-rich sequences (A's and G's) close to the Shine-Dalgarno sequence will lead to high rates of translation initiation whereas sequences that are very different from the Shine-Dalgarno sequence will lead to low or negligible translation rates.”

This is about bacteria. But really that's also where aptamers and riboswitches have been found. So I’m game. :)

It looks like these iGEM guys are doing a lot of thinking about sticking different bioelements together, with intention of being able to build with it. Could be a rather interesting site for us guys. Looks like we are family. :)

Small switches

I have been doing some color highlighting in the image. A, B and D have the CU sections outside of the aptamer loop area, but the GA purine section in the aptamer. I notice they catch the ligand by making a bow circle around it. It's pretty.

Notice how remarkably similar the hairpin loops are. Actually they carry a variation of the FMN and MS2 turnoff sequence - CUUCG - with a G at both ends of the sequence. I'm not too sure about the C figure, as it behaves quite differently. But in the others I have highlighted what I guess is the switch.

I’m guessing that if anything switches for real in the off state, it will be the aptamer GA’s with the hairpin loop CU’s. Those are the shorter ones and easier to get moving. The neck of the design is sealed off pretty strongly with GC base pairs.

And basically this is what I think these CU and GA stretches are for: getting the switch to move more easily. If the GA’s are in the aptamer, then I guess the closest (in sequence if not space) pyrimidine CU stretch is used for riboswitch turnoff. Similarly, if the pyrimidine CU stretch is in the aptamer loop, then I guess a GA section is used for turnoff of the aptamer when there is no ligand around.

Adenine riboswitch

Another thing I find interesting is that natural riboswitches can bind purines. The adenine switch above was one of them.

That led me to wonder whether such an aptamer would keep pyrimidines in the aptamer ring to catch the purines. And it looks like it.

Actually there are even more of these CU and GU stretches. That's an unusually high frequency for such a small design. It's not something I would have recommended if one wishes one's design to be static. This is why I recommend crossing at least some of the GC pairs in most of the stems, which this one actually follows. But it is still an unusual sequence frequency, especially if the design had to be static.

I find it a beautiful detail that pyrimidines are holding an adenine.

Also it is visible that the two hairpin loops are actually kissing each other. This detail also got mentioned in a paper by Rhiju: Link. I have been wondering if static stems had a function, and they surely do. :)

This odd frequency pattern reminded me of a classic Eterna lab, the Cross lab.

Back then I actually thought this long CU pattern (the blue green strand) was beneficial. Later I changed my mind: it was only tolerated in longer stems, and it was bad in anything with short stems that was supposed to be static, especially if the pattern went between loop and stem - which is exactly what I see it do in these switches with CU’s in the hairpin loop.

The blue green strand pattern ran rampant in the Cross lab, and I suspect that it is somehow beneficial when coaxial stacking is going on. But I have no idea. I’m guessing that it prevents neighboring stems from pairing up in unwanted ways. Usually two complementary strands wish to pair with each other, especially if they are close in sequence: not too close for a loop to form, but not too far apart either. If two fitting strands are neighbours with enough spacer bases between them to form a hairpin loop, they are as good as a match. This antiparallel pattern between neighbouring strands ensures that a strand doesn’t go for the neighbor before it when it is meant for the neighbor after it. At least that is my guess. So perhaps something similar is in play in this adenine switch, as it looks fully capable of lassoing the adenine for a bind by getting two stems coaxial (an energy bonus) and two hairpin loops kissing (I'm guessing that's an energy bonus too).

Further in my riboswitch search, I even found this thing called a G box.


Ah, I think this is what is called a G box = guanine binding. Ha, I think I rediscovered the G box riboswitches. :) So there is a whole category of G box riboswitches, having CU stretches in the aptamer.

If this is the case, then a C box category is needed too :) since I have seen other riboswitches use purines for the aptamer catches, like the FMN aptamer.

MicroRNA versus pyrimidine dangle

I think the pyrimidine/purine switch thing is also a reason why these overwhelmingly pyrimidine dangles are productive in microRNA labs. They are also flexible enough to sniff out their partner microRNA.

And our microRNA-catching designs like to have dangling tails of C’s and U’s. Plus, a combo of 4+ pyrimidines (in mixture) seems to raise the yield in cluster counts. Such runs are often present in Exclusion lab designs with 100+ clusters. Unfortunately they also seem to hurt the KDoff.
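For anyone who wants to check the "4+ pyrimidines" observation against their own designs, here is a minimal sketch that lists every run of C/U bases of a given minimum length. The function name `pyrimidine_runs` and the example sequence are made up for illustration; they are not taken from any lab data.

```python
import re


def pyrimidine_runs(sequence, min_len=4):
    """Return (start, run) for each run of at least min_len C/U bases.

    start is the 0-based index of the first base in the run.
    T is treated as U so DNA-style input also works.
    """
    seq = sequence.upper().replace("T", "U")
    pattern = re.compile("[CU]{%d,}" % min_len)
    return [(m.start(), m.group()) for m in pattern.finditer(seq)]


# Made-up sequence, only to show the output format.
example = "GGAUCUUCGAACCUCUG"
for start, run in pyrimidine_runs(example):
    print(start, run)
```

Counting the runs per design and comparing that count with the cluster count would be one way to test the trend instead of eyeballing it.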

Photo of Eli Fisker

Eli Fisker

  • 2328 Posts
  • 541 Reply Likes
By the way, G magnet dangling tails work too, not just C magnet/pyrimidine tails.
Photo of salish99


  • 295 Posts
  • 58 Reply Likes
From the R98 results, cooperativity influence on cluster formation:

There seems to be a slight though significant dependence between cooperativity and cluster formation. In simple terms, the higher the cooperativity, the more clusters are created. The bulk of designs score at or below subscore 22, and the general trend holds in that range. For subscores higher than 30, the actual values all fall short of the predicted trend. Then again, only five of us managed to get such high cooperativity scores, so it is difficult to apply any real data analysis there. On the other hand, all five of these values come from designs with more than 10 clusters, so they are valid designs.
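One way to put a number on a trend like this is a rank correlation between cooperativity subscore and cluster count. The sketch below is a plain-Python Spearman correlation; it is an assumed analysis, not the method used for the plot, and the lists at the bottom are placeholder values, not R98 data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / (vx * vy) ** 0.5


# Placeholder numbers, only to show the call.
subscores = [5, 12, 18, 22, 31]
clusters = [8, 15, 14, 40, 25]
print(round(spearman(subscores, clusters), 3))
```

A value near +1 would support "more cooperativity, more clusters", while a value near 0 would suggest the trend is mostly noise. With only a handful of points above subscore 30, the result for that range would not mean much either way.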
Photo of Eli Fisker

Eli Fisker

  • 2328 Posts
  • 541 Reply Likes

Cooperativity, Round 3

A small summary of the trends I see now.

Cooperative Binding - multi MS2 (shorter)

  • Not happy about MS2 gates - with stems before the MS2’s

  • Seems to dislike great distance between MS2’s - although they can also be placed too close (few bases distance)

  • Not too happy about static stem between the MS2’s.

  • Not too happy about a neck of the tails either.

  • Not particularly fond of GU’s

I have been attempting to stick both static stems and necks into these labs. The designs haven’t been particularly willing so far.

But JR almost got away with both a neck and a static stem in the switching area, and that design has a cluster count of 95.

Score 87%

Cooperative Binding - multi MS2

Carries many of the trends of the shorter lab.

  • Not happy about MS2 gates - with stems before the MS2’s - but might be more tolerant

  • Seems to like a static stem outside of the switching area

  • Seems to have a wider range for distance between MS2’s

  • Slightly opposed to necks and static stems between MS2’s, but has a bigger tolerance for them compared to the shorter lab.

  • Still not too fond of GU’s

But as Johan mentioned in his analysis, this longer lab had a far lower average cluster count, and a good cluster count seemed to be related to good data quality, so I have probably already said way too much about this one.

GU content

One thing I do think I can say something about is the GU content, or rather it not being strongly present. I think the reason these lab designs don’t need many GU’s is that the loops that form between the MS2’s already work as a more effective splitter than GU’s. The bigger the loop, the lower the resulting kcal between the pairing MS2 sequences - or whatever sequences one has pairing with each other. Thus the loops are already helping with the splitting.
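To check the GU observation on a particular design, here is a minimal sketch that counts GU pairs given a sequence and its dot-bracket structure. The function name `count_gu_pairs` and the example design are made up for illustration, and the sketch assumes a plain dot-bracket string with only `(`, `)` and `.` (no pseudoknot brackets).

```python
def count_gu_pairs(sequence, structure):
    """Return (gu_pairs, total_pairs) for a sequence and dot-bracket structure."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    seq = sequence.upper().replace("T", "U")
    open_positions = []  # indices of '(' still waiting for a partner
    gu_pairs = 0
    total_pairs = 0
    for i, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(i)
        elif symbol == ")":
            if not open_positions:
                raise ValueError("unbalanced structure: extra ')'")
            j = open_positions.pop()
            total_pairs += 1
            # A GU pair is a G opposite a U, in either orientation.
            if {seq[j], seq[i]} == {"G", "U"}:
                gu_pairs += 1
    if open_positions:
        raise ValueError("unbalanced structure: extra '('")
    return gu_pairs, total_pairs


# Made-up hairpin, only to show the call.
gu, total = count_gu_pairs("GGCGAAAGUU", "(((....)))")
print(gu, "GU pairs out of", total)
```

Running this over the top-scoring and low-scoring designs of a lab and comparing the GU fraction would show whether the "not too fond of GU’s" trend holds up in the numbers.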