Thx Johan, for the awesome lecture on the dimers and MS2 results! For those who have not seen it yet, check it out here:
Riboswitches On Chip results presentation
The microRNA lab is also related to the XOR puzzle challenge, which also has microRNAs. Here is a link to the forum post on that as well.
Our first multi-input puzzle!
Sensor for hsa-mir-208a
- Mir likes to pair in full with lane 2
- MS2 mainly prefers pairing with the mir and itself. Usually the first stretch of MS2 slides and pairs with itself, leaving the last part of the sequence to pair with the early part of the mir, before its stretches of A’s.
- This lab has more variation in how its winners are solved than the Turnoff v2 variant. For GU’s there seems to be a pattern in the designs where the CCCAC stretch forms a loop/stem, such as having 0 GU in state 1 and 1 in state 2.
- The high scorers really don’t use G or C segments. Instead they have a mirror fragment that is complementary to an early stretch of the MS2 sequence.
- Has a static stem that seems to form from otherwise inactive bases (6-7 base pairs).
- Generally they seem to be partial moving switches
Turnoff v2, variant 2
- Mir likes to be part of a multiloop and pair with both late lane 1 and late lane 2, thus forming a multiloop with the design.
Actually, this behavior, with both ends of the mir being split up and the loop section forming in between, very much reminds me of how the microRNAs behave in the XOR puzzle. This goes through most of eternabot’s solves. Although here the middle “loop part” of especially the “yellow” microRNA gets made into an internal loop, and both microRNA “stem stretches” are made to form stems with the design. Similarly for the “green” microRNA: its ends often form stems with the design, whereas its middle “loop part” ends up as a multiloop ring stretch.
Wuami's doc of Eternabot solves:
- MS2 likes to pair with early lane 2 + self
- Has a C/U segment. It was more often the red stretch in the MS2 hairpin that was targeted, while the C stretch was often partly left outside in a single-base area.
- State 2 seems to like having more GU’s than State 1, but generally both states seem to like them.
- Has a static stem that seems to form from otherwise inactive bases (10 base pairs). I bet the length will largely depend on how many single bases are left over and not part of the switching. Bases not used for switching are generally better off getting tucked away by pairing with themselves, except for a few spacer bases in multiloops, internal loops and endloops, and sometimes dangling ends.
- Generally they seem to be partial moving switches
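For anyone who wants to check the GU counts per state on their own designs, here is a minimal sketch. It assumes you have the design sequence and a dot-bracket structure for each state; the function names and the toy sequence are made up for illustration and are not taken from any lab design.

```python
def pair_table(structure):
    """Return a list of (i, j) index pairs from a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    return pairs


def count_gu_pairs(sequence, structure):
    """Count G-U wobble pairs in one state of a design."""
    wobble = {("G", "U"), ("U", "G")}
    return sum((sequence[i], sequence[j]) in wobble
               for i, j in pair_table(structure))


# Hypothetical toy design, not a real lab sequence
toy_seq = "GGGAAAUCU"
toy_state = "(((...)))"
print(count_gu_pairs(toy_seq, toy_state))
```

Running it once with the state 1 structure and once with the state 2 structure gives the two GU counts to compare.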
Spreadsheet Turnoff v2, variant 2
Turnoff v2, variant
This one is more mixed.
- Mir likes to be part of a multiloop and pair with both lane 1 and lane 2. Preferably close to the MS2 sequence.
- Mir likes early lane 2 and middle lane 2
- MS2 has no clear preferences in the data there is so far.
- Most seem to be full moving switches
Spreadsheet Turnoff v2, variant
Thoughts on the microRNA labs
Here is a screenshot from the lab list. As can be seen, most of the top scorers follow the exact same pattern for a solve. First a static stem in both states, and then a mirror complementarity game of 6 base pair stems going complementary from MS2 to MIR.
Broken down in more detail
MIR 208A - variant solve
Now there is also a variant solving style, which is highlighted in the above lab list. Here is a drawing of how those designs fold up.
The minority solves follow a magnet segment style, having short but strong segments do the switching, whereas the majority of the designs follow the complementary fragment style. Or zipper, as Nando calls it.
In this solving style the mir pairs up in lane 1, which is the opposite of most of the top scorers in round 1. I made a variant for round 2, where I made the MIR sequence cover lane 1 (the stretch before the MS2 sequence).
I’m not expecting it to work as well, as I can't make the MS2 turnoff sequence also be complementary to the last bit of the mir sequence. I think mir fragment 1 is far less suited to be complementary to the MS2 complement. Therefore I strongly suspect that having the mir pair with lane 1, instead of with lane 2 as in the top scorers, will not do very well.
Sensor V2, Turnoff variant 2
First an image of the design
Now the drawing
Notice how each strand is complementary with the next, or almost - there is a 1-1 loop forming in each state in the zipper areas, as can be seen in the first image.
In the other mir lab, Sensor 2, turnoff variant 2, there are similar tendencies. However, the strands are not directly overlapping to the same degree as shown above, but are often sliding a base or two between complementary shifts.
However, this is not necessarily a bad thing, as it shows that if it is impossible to get a direct complementary match going from MS2 fragment to MIR fragment, then one can make the strands slide compared to each other and still achieve the same thing - a working switch.
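The sliding idea can be checked with a small sketch like the one below. It counts how many positions of one fragment can pair with another fragment (antiparallel, GU allowed) when the second fragment is slid by a base or two. The function names, the `max_shift` of 2 and the example fragments are my own assumptions for illustration, not actual lab fragments.

```python
# Watson-Crick pairs plus GU wobble
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def paired_count(frag_a, frag_b, shift=0):
    """Count positions where frag_a pairs antiparallel with frag_b,
    with frag_b slid by `shift` bases relative to frag_a."""
    rev_b = frag_b[::-1]
    count = 0
    for i, base in enumerate(frag_a):
        j = i + shift
        if 0 <= j < len(rev_b) and (base, rev_b[j]) in PAIRS:
            count += 1
    return count


def best_shift(frag_a, frag_b, max_shift=2):
    """Return the slide (within +/- max_shift) that gives the most pairs."""
    return max(range(-max_shift, max_shift + 1),
               key=lambda s: paired_count(frag_a, frag_b, s))


# Hypothetical fragments: the second only matches fully after sliding one base
print(best_shift("GGACUC", "GAGUCC"))   # direct match
print(best_shift("GGACUC", "GAGUCCA"))  # match after a one-base slide
```

A direct zipper gives a best shift of 0, while a slid pair of strands gives a small non-zero shift with roughly the same number of pairs.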
Notice that it uses the exact same turnoff mechanism for MS2 as in the above drawing of the mir 208A lab (disregarding that the states are reversed).
When the MS2 hairpin is turned off, segments 1 and 2 pair up.
Example with lab design based on Mat's highscorer mod of JL: http://eterna.cmu.edu/game/browse/548...
Drawing of switching mechanism
And a special shout out to Eli, Mat and salish99 for having so many winners in this round!
The trends that I mentioned from last round's 208a microRNA lab held.
There are two main solve variants, a minority and a majority type. The majority type uses complementarity for switching and the minority one relies more on magnet segments.
More about it here:
I was curious to find out what normal entropy is for microRNAs. I found a database of microRNAs and checked under human microRNA.
I ran around 30 different microRNAs through Vienna to get a feel for what entropy I could expect. Most of those I have seen wouldn’t be very happy about forming a strong stem with themselves. Most are in a 0.8+ entropy range. I only found 1 with lower entropy (0.2), and most are above 1. And there is a tendency for raised entropy in whatever stem region may form.
They are often a long mix of A, G and U with few or no C’s.
And even if they have a more balanced share of G’s and C’s, entropy is still way high.
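To put a number on the "little or no C’s" observation for any given microRNA, a minimal sketch like this could be used. It only counts base fractions and does not compute the Vienna entropy. The function name is my own, the example sequence is made up, and the sequence is assumed to be non-empty.

```python
from collections import Counter


def base_fractions(rna):
    """Return the fraction of A, C, G and U in a non-empty RNA sequence."""
    counts = Counter(rna.upper())
    total = len(rna)
    return {base: counts.get(base, 0) / total for base in "ACGU"}


# Hypothetical sequence, not a real microRNA
print(base_fractions("AAGUAGGAUUAGAUGA"))
```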
It probably makes good sense that microRNAs are not too happy about pairing with themselves. It means they are ready for action when a fitting messenger RNA or one of our designs comes by. :)
I lately found a couple of really interesting videos about microRNA.
Medical Animation – MicroRNAs from Katharina Sophia Petsche on Vimeo.
Friendly scientist presentation style
MicroRNA - Amy E. Pasquinelli, University of California, San Diego from Kavli Frontiers of Science on Vimeo.
For those who want more and who haven't already found the videos I previously shared, here is a playlist with my favorite microRNA videos so far. I'm particularly fond of the first two by microRNA pioneer Anna M. Krichevsky.
I have been making some new drawings of the microRNA labs. Not so much because the tendencies in my drawings from round 1 changed much, but more as a way to cast light on different areas that I think can be of potential interest.
A landing spot for the microRNA
I mentioned earlier that all the topscorers in the microRNA labs seemed to have a dangle of single bases hanging around, complementary to the microRNA. Salish got me inspired, so it ended up over in the MS2 switch post, while it was really about microRNA.
MicroRNA tail dangle
I simply think the microRNA uses this complementary dangling spot in the design as an easy landing spot.
Main solve types versus minority solve type
All the majority type solves have a loose dangle of a tail that matches one end of the microRNA. Normally designs pack away most of their single bases, at least at the ends of the RNA sequence. And single stretches of bases with a majority of non-A’s are usually hidden between elements rather than at ends. If at ends, they have a habit of causing misfolds, especially when there is as much non-A content as is the case for these particular microRNA complementary dangles. In the single stranded barcode labs I made with a dangling barcode, the barcode tails paired up unwanted with all sorts of things - like nearby sequence, or sequence nearby in 3D space such as the 5' hook or neck bases. That’s why I found the dangle present in the microRNA designs rather peculiar.
Single Strand Barcode
But the minority solves don’t have their dangling microRNA complement fully at the end of the section. They have an element in between.
When I mentioned this difference between majority and minority solves to Machinelves, she came to the same conclusion as I: That having the complement stretch inside of the sequence makes it harder for the microRNA to get access, because the design itself would get in the way of the microRNA hooking up.
As she said: I have no idea but a loose guess of why dangling inside is less optimal, is energy interference and blocking from the molecule itself. Things on the ends, are more free.
The energy part reminded me of something Cody said:
Entropy lesson: A tail sequence has higher entropy compared to a loop sequence. This is because a tail can gyrate in more directions than a loop can (which is locked down on each end).
I simply think the microRNA is more free to feel out a potential landing spot for binding at the end of a dangling sequence, compared to binding to single bases inside a loop or a multiloop ring.
High entropy is normally bad when it comes to static designs. But in these microRNA switches I think it actively helps the connection form between microRNA and receiver design. And I think once the microRNA has attached, it can push open the rest of the design just by the strength of base pairing.
This brings me to something that I think matters for the number of high scoring designs: the length of that complementary dangle.
Length of the dangle
I made a lab summary with Omei’s fusion table:
Eterna R95 results
Based on the designs scoring over 94, having 20 clusters and max 1.4 error rate. Counting dangling bases that are complementary to the mir. Round 1 has few high scorers and low cluster count, so I excluded it.
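For those who prefer a script over Fusion Tables, the same cut can be sketched in a few lines. This is only an illustration: the field names (`lab`, `score`, `clusters`, `error_rate`) and the sample rows are hypothetical, and I am reading "having 20 clusters" as at least 20 clusters.

```python
from collections import Counter


def is_high_scorer(design, min_score=94, min_clusters=20, max_error=1.4):
    """Score over 94, at least 20 clusters (assumed reading), error rate at most 1.4."""
    return (design["score"] > min_score
            and design["clusters"] >= min_clusters
            and design["error_rate"] <= max_error)


def high_scorers_per_lab(designs):
    """Count the designs passing the cut, grouped by lab name."""
    return Counter(d["lab"] for d in designs if is_high_scorer(d))


# Hypothetical rows, not real lab data
sample = [
    {"lab": "Mir 208A", "score": 96, "clusters": 25, "error_rate": 1.0},
    {"lab": "Mir 208A", "score": 94, "clusters": 30, "error_rate": 0.5},
    {"lab": "Variant 2", "score": 97, "clusters": 19, "error_rate": 1.0},
    {"lab": "Variant 2", "score": 95, "clusters": 20, "error_rate": 1.4},
]
print(high_scorers_per_lab(sample))
```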
Now, the turn-on labs (like mir 208A) usually have a score advantage over the turnoff labs (something which has been clear from the MS2 labs also). Anyway, I did a comparison of the round 2 microRNA labs. Just keep in mind that 208 is in a different category from the others.
Which labs came out with a higher count of high scorers than the others?
The labs that kept the longest mir complementary dangle. :)
Mir 208A has a dangle that is typically around 7-12 bases long. (Late in sequence)
The minority solves have 8 + 4-5 dangle bases. 8 gap bases late in sequence and 4-5 internal loop bases early in sequence.
Sensor v3, turn-off variant 1, has a dangle around 2-4 bases long. (Early in sequence)
The minority solves have 5-7 (in a multiloop)
Sensor v3, turn-off variant 2, has a dangle 8 bases long (Late in sequence).
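Counting the end dangles by hand gets tedious, so here is a minimal sketch for measuring the unpaired tail at either end of a design from its dot-bracket structure. It only measures end dangles, not the internal loop or multiloop stretches of the minority solves, and it does not check that the tail is complementary to the mir. The function names and the toy structure are my own.

```python
def three_prime_dangle(structure):
    """Number of unpaired bases at the end (3') of a dot-bracket structure."""
    return len(structure) - len(structure.rstrip("."))


def five_prime_dangle(structure):
    """Number of unpaired bases at the start (5') of a dot-bracket structure."""
    return len(structure) - len(structure.lstrip("."))


# Hypothetical toy structure with a 2 base 5' dangle and an 8 base 3' dangle
toy = "..((((....))))........"
print(five_prime_dangle(toy), three_prime_dangle(toy))
```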
Another thought hit me. Perhaps this is why the microRNA labs can tolerate such long MS2 gates, since the dangle is helping untangle the design?
How to make a lab summary in Fusion Tables
It was jandersonlee who taught me how to make a lab summary in Fusion Tables like above, so I can display the labs against each other and compare how many high scorers each lab has.
Here is an intro on how to do it.
Which part of the mir is targeted?
So far it seems that the mir complementary dangle targets a specific region in the mir. In the winning designs so far, it is the early part of the mir (Mir 1 as I call it in my drawings, colored pink, shown below).
This can be seen in all the drawings, even of the minority ones. The mir 1 (pink) is aligning in sequence with the purple dangle.
The only exception is Sensor V3, Turn-off variant 1 - Majority, where the mir complementary dangle is at the beginning of the RNA sequence. There the mir part targeted is Mir 2. I wonder if this will be the tendency for all the solves where the dangle is placed early in the RNA.
I think there will be relatively few ways of legally solving each microRNA lab, with the majority of the winners following the same main pattern for overall structure.
So I think if we get the length of the complementary mir dangle right, and place it at the ends of the RNA sequences, we should be making a ton of fine microRNA designs in no time. :)
There may even be a preference for which end of the RNA design the dangle turns up in, and for now it looks like the end of the RNA sequence. But we need more data to know for sure.
I’m also guessing that a base in the static area allows for more sequence interchanging than a base in the switching area, in particular the area binding up with the microRNA. At least that’s what I think I learned from my lab mods. Most of the base changes I did in the microRNA complementary region didn’t improve the original design score, rather the contrary.
Drawings of tendencies for high scorers
Main type solve
Minority solve type
While I was working on my microRNA lab designs, I realized something. So I'm hereby passing the tip on.
The microRNA labs seem to strongly favor pairing up with a complementary dangle of bases at the end of the design sequence (the 3' end). Plus the microRNAs prefer doing so with their beginning bases (5' end).
And I think this is what microRNA prefers, but in the third lab this was not the case. So although we managed to make winners in the Sensor V3, turn-off variant 1 lab, I think that another approach may work better. Or at least be worth a try.
In this previous round of Sensor V3, turn-off variant 1, the dangle lands in the beginning of the design sequence instead of the end. But it doesn't land at the very start. And I think that the dangles strongly prefer being at sequence ends and not in between elements, as the rest of the design then has the option of getting in the way of a proper pairing up with the microRNA.
Inspiration for lab designing
I realized that it would be possible to move the static stem from the end of the sequence, so it didn't take up the favorite spot of the microRNA. So I moved it in between the MS2 and the complementary dangle.
Here is my first design following this approach. I have not yet fully managed to make a longer late sequence dangle at the sequence end that also shows some strong bases. But that's what I'm working towards. Feel free to mutate and modify as you wish.
Title: Winner reversed engineered
You can read more about the patterns microRNA designs seem to follow in these earlier posts:
MicroRNA welcoming dangling tail
GoogleImagining the microRNA labs
Position of the MS2 turnoff sequence in microRNA
Last round I attempted to move the MS2 turnoff sequence from JL’s winner in the Turnoff lab, variant 2, to the Turnoff lab, variant 1. I used exactly the same MS2 turnoff sequence as found in JL’s variant 2 winner, from round 2, but the results were not uplifting.
Original winner in Variant 2, round 1 by JL (94%)
What variant 1 wants - for now round 2
Sensor V3, v1 - 91 (94%)
This exact trend came through in the rest of the variant 1 lab too. The lab disliked a late positioning of the MS2 turnoff and liked an early positioning. A strong C segment was preferred for MS2 turnoff before the MS2 segment, not after as in the variant 2 lab.
I think it is simply dependent on the position of the MS2 itself in the RNA sequence. If early, then the MS2 turn off is right before MS2.
And if the MS2 sequence is late in the RNA sequence, then the MS2 turnoff sequence lands right after MS2.
This trend shows up to a far lesser degree in the non-microRNA labs. Probably because there is not a long microRNA sequence that needs to pair up almost in full, which is a very radical change between two states in a switch. In the exclusion labs there seems to be a general preference for the MS2 turnoff sequence to be after the MS2 sequence, unless the MS2 is placed very, very early in the RNA sequence (Exclusion 4).
Turning off the MS2: Memory rule
The MS2 turnoff sequence is for turnoff labs.
The same state labs do not have MS2 turnoff sequence right next to the MS2.
Rather they are like wound up springs, waiting to get released and turned on. The MS2 is usually bound up at each of its ends a good space away, and usually has to make a switch jump to get turned on.
Wound up spring in state 1 (left), released spring - MS2 bound up - state 2 (right)
Turn on lab (Sensor 208a) MicroRNA landing spot
The microRNA strongly prefers fully pairing up with the RNA design from 6 bases to around 27 bases after the MS2 sequence. This lab had a strong preference for a 5 base pair MS2 gate containing a GU at the MS2 gate end furthest away from the MS2.
I’m counting on this MS2 gate trend to continue. I expect the same switch mechanism to be in use in future labs, although I know the sequence will need to change slightly when we get a similar lab with a different microRNA, as the MS2 gate sequence needs to be complementary to both the MS2 and the microRNA, as already mentioned here:
There are a lot of designs that attempt to make the microRNA pair up with the design before the MS2 also, but they are generally low scoring.
Only the minority solve type, based on one of Brourd’s round 1 designs, manages a middle solution, where the late part of the microRNA pairs up before the MS2 and the early part of the microRNA pairs up with the design after the MS2 hairpin. It actually looks more like the majority type in the turnoff lab solves, which likes a similar and not totally bound up positioning of the microRNA.
So overall I think that the turn on RNA lab is less than pleased about any pairing up early in the microRNA design.
For now I’m guessing it has to do with the MS2 turnoff sequence preference. There seems to be a general preference for MS2 to get turned off with around 4-6 bases after the MS2 sequence. Unless the design is forced to do otherwise, because of a nearby FMN sequence or by early placement of the MS2 in the RNA sequence.
MicroRNA Turnoff labs - MicroRNA landing spot
The turnoff lab prefers having each end of the microRNA pair up on either side of the MS2 sequence. The variant 2 lab does well. The variant 1 lab has trouble, as it has a harder time getting the microRNA complementary dangle on the 3’ side of the design.
The turnoff labs generally seem to have a harder time solving. Not just for microRNA, but for MS2 switches overall.
In the microRNA turnoff lab Version 3, variant 2, it shows by the lab having a far longer complementary dangle to catch the microRNA than the turn on lab, Sensor 208a. 12 bases in the turnoff against 8 bases in the turn on.
MicroRNA seed region
I was rewatching some of the microRNA videos I shared earlier, when it dawned on me that there was something I recognized.
What the microRNA designs seem to love more than anything, in the round 1 winning designs in 2 of 3 labs, is dangling a microRNA complementary stretch at their 3’ tail to specifically catch the 5’ tail of the microRNA. (The labs Sensor 208a and Variant 2)
Something I described in more detail in the post MicroRNA welcoming tail dangle.
MicroRNA on the prowl
Our microRNA catching designs like to lay out a single base snare of microRNA complementary bases. 8 bases and sometimes even more.
This is exactly what I have been hearing about how microRNA likes to land on a messenger RNA (mRNA).
The microRNA generally likes to use its 2-7 first bases (its 5’ end) to hook up with the messenger RNA. This section of the microRNA is what the scientists call the seed region.
Here I found a fine definition of what the microRNA seed region is:
The microRNA in our labs is attaching in a way that resembles how microRNA attaches to messenger RNA.
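To make the seed region and the matching snare concrete, here is a minimal sketch. It pulls out bases 2-7 from the 5' end of a microRNA and builds the design stretch (written 5' to 3') that would be complementary to the first 8 bases of the microRNA. The function names, the default snare length of 8 and the example sequence are assumptions for illustration. The sequence is made up and is not one of the lab microRNAs.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the strand that pairs antiparallel with `rna`, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_region(mirna, start=2, end=7):
    """Bases start..end of the microRNA (1-indexed, inclusive, from the 5' end)."""
    return mirna[start - 1:end]


def landing_dangle(mirna, length=8):
    """Design stretch complementary to the first `length` bases of the microRNA."""
    return reverse_complement(mirna[:length])


# Hypothetical microRNA, written 5' to 3'
toy_mir = "AUGGCUAACGUUCAGG"
print(seed_region(toy_mir))
print(landing_dangle(toy_mir))
```

Because pairing is antiparallel, a snare for the 5' end of the microRNA reads in reverse when written 5' to 3', which is why it fits naturally at the 3' tail of a design.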
Third stubborn lab - Variant 1
The third lab was low on winners in round 1 and 2 and is actively open now for round 3.
On the basis of what I had seen jandersonlee have success with in round 1 - namely dangles - I decided to repeat that strategy.
The tail of the dangle would far rather be last in the RNA sequence (3'), not in gap bases between elements or in a multiloop ring.
But since there was no really good space to put it the way that was successful in the two other labs, I decided to go naughty. As there was no good space at the 3' end, I put the dangle at the 5' end instead.
I managed to make a winner - that also had decent cluster counts (20+) - by making a complementary dangle that targeted the “wrong” end of the microRNA (3’) :)
Sensor 3, v1 - 91 (94%)
Though it was possible to solve it this way, I strongly suspect that microRNA really prefers saying hello with its 5’ hand and that the 3’ hand is far less effective.
So it seems that this tiny microRNA worm behaves as it and its family members always have. :) OK, at least since this mechanism originally evolved. It uses its 5’ end to sniff out its target messenger RNA(s), so it can silence them or mark them for destruction. This way it can very quickly regulate how many proteins of a certain kind get made, in a very fine-tuned way, so that not too many are made.
So even though this microRNA has been thrown into a new context and given to us as a task, to design microRNA sponges, it looks like it goes about business as usual. :)
Funny observation for the second round of microRNA and reporter labs: it is rather hard to make pyrimidine dangles at the end of the RNA sequence without having the sequence misfold and get hidden away, with no unpaired stretches for microRNA hook up.
It looks like this lab strongly demands G dangles instead. I think those should work fine as well. Main thing is that complementary microRNA traps are laid out and waiting and preferably strong or long ones.
I have made a drawing of the R2 labs with colors and symbols for attraction, showing one of the things that I think could work well. I colored C stretches green and gave them a + sign to show attraction, and similarly I colored G stretches red and gave them a - to show their main attraction.
I did this both for the microRNA inputs and for the design. I think this could be useful for showing what a specific microRNA would be inclined to prefer.
E.g. if one microRNA has a C stretch and the other microRNA has a G stretch, then placing them on either side of the MS2 is going to be hard, as they would naturally want to pair with each other and not let the MS2 go. So you can use their nature to say something about what they will prefer and need.
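The C stretch / G stretch marking can also be done with a small script. The sketch below finds runs of C or G in a sequence and flags a pair of inputs where one has a C stretch and the other a G stretch. The function names and the minimum run length of 3 are my own assumptions, and the example sequences are made up.

```python
import re


def runs(seq, base, min_len=3):
    """Return (start, length) for each run of `base` at least min_len long (0-indexed)."""
    pattern = f"{base}{{{min_len},}}"
    return [(m.start(), len(m.group())) for m in re.finditer(pattern, seq)]


def likely_to_cross_pair(mir_a, mir_b, min_len=3):
    """True if one input has a C stretch while the other has a G stretch."""
    a_c, a_g = bool(runs(mir_a, "C", min_len)), bool(runs(mir_a, "G", min_len))
    b_c, b_g = bool(runs(mir_b, "C", min_len)), bool(runs(mir_b, "G", min_len))
    return (a_c and b_g) or (a_g and b_c)


# Hypothetical inputs, not the real lab microRNAs
print(runs("AACCCAGGGGU", "C"), runs("AACCCAGGGGU", "G"))
print(likely_to_cross_pair("AACCCAU", "UGGGGAA"))
```

This is only a rough flag. Whether two inputs actually pair with each other depends on much more than one strong stretch.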
In this lab, where TB A needs to kick out TB B, I take advantage of the fact that MS2 and TB A need to be gone in the same state (1), and use the TB A complement to help turn off the MS2. I have called these designs A and B sharing lanes.
Here is an illustration of another route to get to a solve. I have called this design route B before A. Here I take advantage of the fact that MS2 and TB B are directly complementary, and use it for MS2 turnoff in state 1, where both need to be gone.
Notice that in both illustrations I have put the static stem first. Our earlier microRNA lab data has shown that the microRNA mainly prefers to land in the late end of the RNA sequence (3') in turnon labs, which R2 is.
Turnon and turnoff take the perspective of the MS2. Here the MS2 needs to get turned on in the later state, hence a turnon lab.
MS2 gates in the microRNA labs
Short sum up: My latest advice for single microRNA input labs is to make MS2 gates. I think I know why the pattern jandersonlee made for the first microRNA round is needed. The long MS2 gate is simply a way to ensure that the MS2 is held by force from both sides, which means a better chance of turning it off and on. The word-game-like complementarity of letters between the switching stretches is just a way to get it to glide as well.
Static stems and the microRNA labs
I think I have been too quick ruling out static stem as having a role in the microRNA labs.
I decided to fuse the Same State 2/NG 2 blueprint with that for the microRNA's and I ended up with this. Second state even has a multiloop. Which I like. :)
I haven't gotten the 4 red G's in a row dealt with yet. So anyone capable of that, please give it a go:
Here is my sequence for now:
R2 - 2 states model - Analysis
I set up an experiment series to test the presence of a static stem in a microRNA design with two inputs, as mentioned above. So how did it go?
Leaving sequence before staying sequence
I have been claiming that A needs to go before B, when A needs to be kicked out and B needs to stay in an exclusion type lab.
In the R2 lab, both sequences share a strong kennel attraction sequence - CCCA. A strong shared sequence is often a good starter for an overlap and lane sharing. Or for making a common sequence in between them.
However, in the R2 lab it is by far the easiest to put B before A to make the sequences share, but that order is the opposite of what I wanted.
And while I think the easiest way to achieve a good score in an exclusion lab is with an overlap between the sequences, having them share a lane in this particular order, that is not to say it can’t be done otherwise. It is all really a game of balance. You just have to tip the scale right.
Static stem in the switching area
I have been talking about the potential use of having a static stem in between switching elements, both for shielding elements from each other and for bringing together other elements that need to be close.
For the experiment I mentioned in the post above, I made an almost winning design.
I placed half of the TB A sequence at one side of a static stem and half of the MS2 at the other side. By this maneuver they could turn each other off, but both halves would also be bound in another setting in state 2.
By this I was mimicking the reporter labs of the type miRNA-in, reporter-in, which had both their sequences turn each other off in state 1, while each was bound by its own sequence in state 2. A zero to 2 sequence binding step.
Outline from drawing made for reporter labs:
Here the sequence inputs have either a static stem in between them or a short loop.
Stem or loop between sequence inputs for turnoff?
I think there may even be a pattern to what to pick when: a loop or a static stem between sequences that need to turn each other off in one state but both be on in the next state.
If you want two sequences disappearing and then both binding, having a stem in between seems to be better than having just a loop in between. At least the mir-in, reporter-in lab with the short reporter sequence more exclusively prefers having a stem between the inputs.
If you want two longer sequences to go from not being there to both binding up, a loop in between the sequences may be more effective than a static stem.
At least the cooperative results seem to suggest that a static stem is less effective than having just a loop in between the two MS2’s. And these MS2 sequences are longer.
Plus the top scorers in the mir-in, reporter-in lab with the long reporter, where the sequences are of more equal length (similar length to MS2), seem to be more in doubt about whether to go for a static stem or a loop in between the sequences.
Perspective on order of sequences shared
I can’t make the weaker TB A sequence, which should be the staying sequence, go last in order while sharing lanes. Only the opposite. While I can balance the design by adding a static stem, and it will probably end up having a winning sibling after this round, judging from the many lower scoring designs in my series I also believe it may be hard to achieve a lot of winners this way.
So I think the way to get really high fold change and a better shot at getting winners will be to go with the strategy of having the leaving sequence before the staying sequence, with an MS2 in between, and have these spaced well out. Which is exactly what JR did in his R2 winners. I think it is the superior strategy, provided that a sequence share in the preferred order is not possible.
Or put another way:
If sequence abbreviating can’t be used - like sequences sharing lanes, or MS2 overlapping with one sequence (R2) or both (R3) - then the RNA designs with two inputs tend to use most of or the full RNA design.
The overlapping strategy where switch elements share lanes seems to be mainly of use if only one variant of the puzzle needs to be solved. I think for logic gate labs, where many puzzles need to be solved with the same sequences, it is going to be really hard to have perfect matches for sequence overlap whichever way the oligos are placed in relation to each other.
Extra added note: Actually I did make the B go before A and 86% was the highest achieved score. I may have placed the design complex at the less optimal end. The design complex, if not taking up the whole space of the RNA, seems to have favorite ends.
So I have moved this design to the opposite end of the RNA in this round to see if I can get a better score. One thing that may speak against it is that the MS2 will get dangerously near to the end of the design, something it has regularly shown to dislike. On the other hand, two input labs seem to be a bit more tolerant of this.