Pseudoknots and riboswitches

I haven't been able to track down the exact percentage, but it seems like many, if not most, naturally occurring riboswitches involve pseudoknots. There must be a reason for this. But as far as I know (and I could be very wrong), there has not been much attention to purposely incorporating pseudoknots into our switch designs. So for the current round, I have been trying to do that with my submissions. But I haven't really had any basis for creating specific pseudoknot structures.

But then it occurred to me that even if players haven't been purposely trying out pseudoknots, we've created and tested so many designs that some pseudoknots have probably been created by chance. And if pseudoknots do indeed have a beneficial effect on switches, they should show up in some of the top designs.

The web server I have found most helpful for pseudoknot prediction is KineFold. As with ViennaRNA, submission is easy -- you simply provide your sequence and the server predicts the most probable folding(s), along with free energy predictions. But unlike ViennaRNA, its algorithm takes into account the possibility of pseudoknots forming.

Although KineFold does have an option that might be useful for approximating the effect of bound FMN and/or MS2 molecules, I didn't want to start there. So I focused on State 1 of the Same State labs. After filtering out designs with fewer than 20 clusters, Vinnie's Soteed design (Same State 2, round 2), with a switch subscore of 34.7 (total Eterna score of 94.7), came out on top. So I submitted it to KineFold and this is what I got:


So KineFold did predict that a pseudoknot formed, with a 6-pair helix running from GU base pair (30, 50) to the GC pair (35, 45). Cool! What was at that sequence of pairs? Why, it was the base segment of the MS2 hairpin! So KineFold was predicting that the majority (but not all) of the MS2 hairpin bases were actually pairing up in the OFF state. That seemed really interesting!
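
What makes a helix like this a pseudoknot is that its pairs cross another helix: for pairs (i, j) and (k, l), i < k < j < l. Here is a minimal Python sketch of that check. The six pairs between (30, 50) and (35, 45) are filled in on the assumption that the helix is stacked without bulges, and the second helix (`other`) is purely hypothetical, invented only to show a crossing; it is not taken from the KineFold output.

```python
def crossing_pairs(pairs):
    """Return every combination of base pairs (i, j), (k, l) with i < k < j < l."""
    ordered = sorted(tuple(sorted(p)) for p in pairs)
    crossings = []
    for a, (i, j) in enumerate(ordered):
        for k, l in ordered[a + 1:]:
            if i < k < j < l:  # the two pairs interleave, so they cannot be nested
                crossings.append(((i, j), (k, l)))
    return crossings


# The 6-pair helix from the post, assuming consecutive stacked pairs:
# (30, 50), (31, 49), ..., (35, 45)
helix = [(30 + n, 50 - n) for n in range(6)]

# Hypothetical second helix, for illustration only
other = [(40 + n, 60 - n) for n in range(4)]

print(crossing_pairs(helix))               # a single helix is nested: no crossings
print(len(crossing_pairs(helix + other)))  # every helix pair crosses every other pair
```

A structure with no crossings at all can be written in plain dot-bracket notation; any crossing means a pseudoknot, which is why nested-only predictors cannot report it.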

That brought to mind something that Nando has been emphasizing recently -- that knowing the MFE folding (or the entire ensemble) that a switch will take when it reaches equilibrium doesn't say anything about how long the switch will take to reach that equilibrium. It is quite possible that many of the switch designs we have submitted that don't seem to switch much actually would, if only the experimental process gave them enough time.

A switch could take a long time to form the MS2 hairpin when going from the OFF to the ON state if the two foldings were such that the RNA had to open up a lot before the backbone had enough freedom of movement to rearrange itself in a way that would allow the hairpin to form. But if the bulk of the hairpin was already formed in the OFF state, all it would take is for the bonds that were preventing the final two base pairs from forming to break, and the complete hairpin would just fall into place.

To better visualize this, I ran KineFold's 2D structure prediction through RNA Composer to get some six possible predictions for the 3D structure. Some of the predictions didn't look plausible because they created actual knots in the backbone, but here's one that didn't:


This is Chimera's "ladder" rendition, with the bases colored according to the Eterna conventions. The pseudoknot is on the right. In order to better see what was happening in the knot, I zoomed into the pseudoknot.

I changed the coloring so that the bases that will make up the MS2 hairpin when it forms are light blue, and the bases that are preventing the final pairs of the hairpin from forming are magenta. Here it is easy to see how, as soon as the presence of the FMN molecule makes the blue/magenta helix energetically unfavorable, the MS2 hairpin will snap into place. Intuitively, it seems like the ideal way of constructing a fast-acting switch.

Omei Turnbull, Player Developer


Posted 4 years ago


rhiju, Researcher

This is an unexpected hypothesis. I'd assumed that pseudoknots form in real riboswitches to help produce non-trivial 3D structures with nice 'pockets' to bind small molecules. But this idea is another feasible explanation -- here with the pseudoknot in the small-molecule-free state.

One way to check if the pseudoknot is really there would be to design mutations that would disrupt the pseudoknot (and check if the switch remains constitutively active), and then mutations that would restore the pseudoknot. You could check the mutants in silico (through KineFold) before submitting in the next round.
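
As a rough illustration of the disrupt-then-restore idea, the sketch below builds the two mutants for a single base pair by swapping base identities. The function name, the toy sequence and the positions are hypothetical; this is one simple way to pick the mutations, not a prescribed protocol, and it says nothing about unintended new pairings, which is what the in silico check is for.

```python
def disrupt_and_rescue(seq, i, j):
    """Build two mutants for the base pair at 1-based positions i and j.

    disrupted: base i is replaced by the identity of base j, so the two
               positions hold the same base and can no longer pair.
    rescued:   base j is additionally replaced by the original base i,
               so the pair is restored in flipped orientation (G-C -> C-G).
    """
    a, b = seq[i - 1], seq[j - 1]
    bases = list(seq)
    bases[i - 1] = b
    disrupted = "".join(bases)
    bases[j - 1] = a
    rescued = "".join(bases)
    return disrupted, rescued


# Toy hairpin, for illustration only: G1 pairs with C9
disrupted, rescued = disrupt_and_rescue("GGGAAACCC", 1, 9)
print(disrupted)  # CGGAAACCC  (C1 faces C9: pair broken)
print(rescued)    # CGGAAACCG  (C1 faces G9: pair restored)
```

Swapping rather than complementing also works for G-U wobble pairs, since the flipped U-G pair is still a valid pair.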

Eli Fisker

Omei, this is really interesting!

Big thx for the fine intro to KineFold. I have had fun playtime at my end. :)

You asked about percentage of pseudoknots in riboswitches. Roughly half the riboswitch sequences that I nicked from a paper I mentioned in the post on base repeats in switches, are known to have pseudoknots too. I know it is likely not a complete set, but might give a feel.

Natural Riboswitches

One thing I have noticed when watching entropy in riboswitches and later MS2 labs is that there are particular regions in switches that have a habit of lighting up with high entropy.

Besides the often high-entropy GGNNGG and GGNGG patterns and their shorter variants, CCNNCC, CCNCC and their shorter variants are also often highly active, high-entropy areas in riboswitches. This pattern is also related to the C’s in the MS2 hairpin, which I have accused of being quite active in making the switch happen by pairing up with magnet segments elsewhere.

This CNNC pattern seems to be more prevalent in the pseudoknot riboswitches compared to the other riboswitches that seemed to have more of the GGNNGG pattern. But they also often have them both.

In particular, the C’s in the loop of the MS2 hairpin are often a high-entropy area when I run some of the switch designs through Vienna. Similarly, with the GGNNG pattern in aptamers, it is often the G inside the aptamer loop that gets the high-entropy peak.
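
For anyone who wants to look for these patterns in their own sequences, here is a small Python sketch that scans for them with a regular expression. The function name and its parameters are my own, and reading "shorter variants" as single flanking bases (`doubled=False`) is an assumption.

```python
import re


def find_motifs(seq, base, gap_min=1, gap_max=2, doubled=True):
    """Find motifs like GGNNGG / GGNGG (base="G") or CCNNCC / CCNCC (base="C").

    Returns (1-based start, matched text) tuples. With doubled=False the
    flanks are single bases, giving shorter variants such as GNG or GNNG.
    """
    flank = base * 2 if doubled else base
    # A lookahead lets overlapping motifs be reported; the quantifier is
    # greedy, so the longest allowed gap is reported at each start position.
    pattern = re.compile(rf"(?=({flank}[ACGU]{{{gap_min},{gap_max}}}{flank}))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(seq.upper())]


# Toy sequences, for illustration only
print(find_motifs("AAGGAUGGAA", "G"))  # [(3, 'GGAUGG')]
print(find_motifs("CCACC", "C"))       # [(1, 'CCACC')]
```

A motif hit only says that the pattern is present in the sequence; whether the region is actually high entropy still has to come from a folding tool.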



Related background post:

Part III - Potential switch pattern?


KineFold and the riboswitches

For the fun of it I ran some of the natural riboswitch sequences through KineFold.

It seems like there are a huge number of CNNC and GNNG segments that want to pair with each other, although not all pseudoknots keep exactly this pattern.

Downstream-peptide motif: Synechococcus sp. CC9902. Alteration: motif
CGUUGAGCUUCCAAUCGAAGCUGCAGUCAGACCCAUGCCAAGCAACGGGGGCGUGGG



Pseudoknot with entropy active GNGG and CNCC patterns

SAM-SAH riboswitch metK: Roseobacter sp. SK209-2-6. Alteration: Normal. Structural Homology Inferred
Sequence: CCUGUCACAACGGCUUCCUGGCGUGACGAGGUGACCUCAGUGGAGCAA



Pseudoknots - knots of unmixed bases

These are the same CCUUCC... and GAAG patterns, and various combos of them, that are very often seen in switches in general, where the base pairs in switching stems are not mixed well: stems with non-crossed GC pairs, something which caused trouble in static designs if overused.

SAM-SAH riboswitch: Oceanibulbus indolifex. Alteration: Normal.h
AGAGCAUCACAACGGCUUCCUGACGUGGUGCGUAAUUUUUAUUGGAGCA



Actually, what I see is that the beginning and end bases in the pseudoknot are often strong bases. The pseudoknot riboswitches with endloop/endloop interactions actually sometimes do look like kissing loops. This reminds me of a discussion Cody and I had on kissing loops. We both agreed that the kissing loop sequences pretty much resembled the good old stem-forming pattern, with GC’s at each end and something else in between. But generally they had a little less of the non-mixed pattern and more of the typical stem pattern, with the GC bases at each end of the stem flipped.


Pseudoknot recipe


Image by Jim Mullhaupt

For the fun of it I brewed up a pseudoknot recipe based on my play session with KineFold.

Short riboswitches with pseudoknots

Put two G’s at close range 1-2 bases apart in dangling area.

Put two C’s at close range 1-2 bases apart in endloop area

Have the bases in between the strong bases be complementary

= voila! Pseudoknot :)
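
The recipe above can be sketched in a few lines of Python: given the G...G segment for the dangling area, the matching C...C segment for the endloop is its reverse complement. The function names are hypothetical, only Watson-Crick pairs are considered (no GU wobble), and whether the two segments really form a pseudoknot still depends on the rest of the fold.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(segment):
    """Return the strand that pairs with `segment`, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(segment))


def knot_partner(dangle_segment):
    """Given a G-N-G or G-NN-G dangle segment, return the C...C endloop segment."""
    if not (3 <= len(dangle_segment) <= 4
            and dangle_segment[0] == "G" == dangle_segment[-1]):
        raise ValueError("expected two G's placed 1-2 bases apart")
    return reverse_complement(dangle_segment)


print(knot_partner("GAUG"))  # CAUC
print(knot_partner("GCG"))   # CGC
```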


Pseudoknot trends

Ok, there seems to be a sequence distance thing. The two halves of a knot can’t be too far apart in sequence, except for endloop/endloop interactions - those can stretch for quite a space.

It also seems pseudoknots mostly have their bases placed outside of loop boost area. Similar to the kissing loops.

There is this funny trend toward G’s ending up in single base area, like hooks, internal loops, multiloop rings and the C’s ending up in end loops. That seems to go for many of the bigger riboswitches too.

If there are no endloop, but gap bases and an internal loop, the internal loop takes the C’s instead. (THF riboswitch folT: Eubacterium siraeum. Alteration: Normal.e)

This should be lucky for our microRNA designs at the moment, as they keep a dangling tail.

But since the microRNA causes the complementary tail to be mainly blue/green, the designs shouldn’t be in danger of forming pseudoknots with themselves instead of being ready for the microRNA. :)

Trends for contact types:
- Short riboswitches (shorter than 60 bases): dangle/endloop
- Middle riboswitches: dangle/loop, endloop/internal loop, loop/loop
- Long riboswitches: loop/loop, gap bases versus endloop, and often multiple knots

There seem to be more mixed base sequence patterns for the longer riboswitches.

Alternative version - more often used in longer riboswitches with pseudoknots and loop loop interaction

Put 2-3 strong bases 0-1 base apart and make a match elsewhere. Additional bases can be added.

I was actually surprised to see how many strong bases are involved in the pseudoknot formation, since I have seen other, much weaker ones. But I guess that is a consequence of the riboswitches themselves being quite rich in GC base pairs.

Eli Fisker

I just found something really cool.

When one has run a sequence through KineFold, one can get it to make and play a video of the folding.



I just ran the one pseudoknot through, that I was looking at anyway:

Name: Purine riboswitch Guanine-sensing xpt mRNA aptamer domain: B. subtilis. Alteration: Normal.

Sequence: CACUCAUAUAAUCGCGUGGAUAUGGCACGCAAGUUUCUACCGGGCACCGUAAAUGUCCGACUAUGGGUGAGCAAUGGAACCGCACGUGUACGGUUUUUUGUGAUAUCAGCAUUGCUUGCUCUUUAUUUGAGCGGGCAAUGCU

The first pseudoknot forms fast. It almost looks as if the pseudoknots control the folding path.

Brourd

Soteed may not be the best design to base this strategy on, given the statistics in the third round.

https://s3.amazonaws.com/eterna/labs/...

Soteed no mod (which was submitted by Omei) was given an Eterna score of 68.

Granted, pseudoknots in designs are a clever way to add base pairs and helices to a secondary structure without increasing the length of preexisting helices. Still, they are tricky to model, and it's even more difficult to apply any of these strategies to the general Eterna player population without significant modification to the secondary structure targets and constraints. I believe I tried a few pseudoknots in the first round. However, none did well, but they were probably more for the sake of curiosity than anything else, with long helices and convoluted systems.

nando, Player Developer

When a given sequence can score either 95 or 68, supposedly with a good degree of confidence in both cases, can we actually be hopeful that there are any designs to base a strategy on?

Eli Fisker

I found some other trends when looking at the set of riboswitches with pseudoknots, that I mentioned above.

Natural Riboswitches

Small riboswitch pseudoknots 0 to 60

- The knot part placed in the endloop, tends to have a middle position
- The endloop with the knot structure, tends to be big
- The dangle carrying part of the knot, tends to be late.

Middle riboswitch pseudoknots 60 to 100

- The knot placed in the endloop, tends to be placed early
- Middle sized endloop
- Mixed dangle position

Bigger riboswitch pseudoknots 100 to 140

- Small endloops carrying the knots
- Tendency toward early dangles
- The first knot placed in endloop, tends to have a middle position. If there is a second one, it is skewed towards early position.
- Often multiple pseudoknots

Really big riboswitch pseudoknots 140 to 236

- Endloops can be really big or small + a mixture too.
- Often multiple pseudoknots

Omei Turnbull, Player Developer

Thanks for doing this Eli!

Couple of questions:

* What is your source for the predicted folding? Measured (e.g. NMR, X-ray crystal) or predicted (ViennaRNA, KineFold, ...)? Is it always the unbound state, the bound state, or mixed?

* When counting base repeats, would the sequence "AGGGGGA" count as 1, 2 or 4 G repeats?
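
To make the three possible answers concrete, here is a small Python sketch of the counting conventions I have in mind. The function name is made up, and which convention (if any) applies to the actual counts is exactly the open question.

```python
import re


def count_g_repeats(seq):
    """Count G repeats in `seq` under three different conventions.

    Returns (maximal_runs, non_overlapping_gg, overlapping_gg):
      maximal_runs       - each unbroken run of 2+ G's counts once
      non_overlapping_gg - number of GG doublets that fit side by side
      overlapping_gg     - every position where a GG starts
    """
    runs = re.findall(r"G{2,}", seq)
    maximal_runs = len(runs)
    non_overlapping_gg = sum(len(run) // 2 for run in runs)
    overlapping_gg = sum(len(run) - 1 for run in runs)
    return maximal_runs, non_overlapping_gg, overlapping_gg


print(count_g_repeats("AGGGGGA"))  # (1, 2, 4)
```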

Eli Fisker

Np, Omei! Thx for the fun. :)

My source is only KineFold, though I have run some in Vienna too. So it is only guesswork based on what patterns KineFold lights up in the natural riboswitches with pseudoknots. But I assume KineFold knows a bit about real pseudoknots, as known structures are often the starting point for training prediction algorithms. I could be wrong.

I just couldn't help but get fascinated that there seemed to be patterns, as I had not expected that many. So it is kind of a play expedition, exploring a potential area of interest in an attempt to connect some dots.

In this particular case I'm not so much counting base repeats as counting the designs with a specific pattern that happens to contain repeats. I noticed those patterns in the set of riboswitches often enough that I started to wonder. So I ran them through Vienna and saw that patterns like GGNNGG and CCNNCC and variations of them often lit up with high entropy. These patterns are also often (though not always) present in our own homemade switches, in the switching area. It's a regular guest in connection with the FMN aptamer, due to the twin G's, and it is built into the MS2 hairpin (GAGG).

I simply think they are kind of switch makers. Sequences that make things move.

However the pattern for the pseudoknots is often slightly abbreviated, often without the repeats. Like:

GNG, GNNG, GNNNG, plus CNC, CNNC and CNNNC.

Those I have in a later column N called CNNC/GNNG