Designing switches for the first round of OpenTB

  • Article
  • Updated 3 years ago
When analyzing data from past rounds, we have had some very productive discussions here on the forums.  But we've never really had a discussion that focused on ideas for designing molecules for a current lab.  As the current OpenTB puzzles are the most complex lab puzzles we have ever dealt with, I think it is time to see if we can step up our learning curve by sharing ideas and questions during the design phase.

I will start by following up with thoughts on a specific design strategy, but the intent for the conversation is to elicit diverse techniques.

Omei Turnbull, Player Developer

  • 980 Posts
  • 308 Reply Likes

Posted 3 years ago


Omei Turnbull, Player Developer
Designing around Attraction Patterns

A basic requirement for a riboswitch to have a high fold change value is for there to be a strong coupling between the switch's sensor and effector domains, i.e. its inputs and its outputs.  In the OpenTB puzzles, the input domains are the sections of the switch that bind to one of the A, B or C TB RNAs and the output domain (or possibly domains) is the section(s) that binds to the reporter RNA.

For Round 100, Eli and I put together (the beginnings of) a tutorial on a way of designing switches around a few simple "attraction patterns".  The basic idea is to 1) decide on an appropriate coupling (the simpler the better) between inputs and outputs, and  2) find short design segments that can act as the "bones" of the design, binding as units to input/output domains or to each other.  I generally refer to these "bones" as kernel attractors.

That tutorial was written after the Round 98 and 99 puzzles were submitted, but before they were analyzed.  These two rounds were unique at the time because, for the first time, RNA molecules were used for both inputs and outputs.  When the data came back from those two rounds, many of the designs that exhibited these simple design patterns had unprecedentedly high fold changes.  (These designs were submitted by many players, who may well have come up with them using a different design process.)  So it appears that this design process can be especially powerful when both inputs and outputs are RNA molecules.  This makes sense, since all the details of binding between the switch, inputs and outputs are the same as within the switch itself.

I'm not to the point of having a full plan for solving the final puzzle, but I wanted to share the work I have done in identifying various possible kernel attraction sequences that should be useful for anyone wanting to try this approach.

whbob

  • 193 Posts
  • 58 Reply Likes
Thank you for starting this topic, Omei :)  In the "possible kernel attraction..." Google Doc, what are the yellow/tan blocks with the letter R and the light blue blocks with the letter Y?

Omei Turnbull, Player Developer
The R stands for puRine, which represents the bases A or G.  The Y is for pYrimidine, which represents the bases U or C.  These are examples of a standard nomenclature for referring to any subset of the four bases.  There is a table in the Wiki that summarizes them all.

My choice of colors is motivated by the Eterna coloring (which is not a standard). Orange for purine is a mix of red (G) and yellow (A).  Cyan is a mixture of green (C) and blue (U).

The distinction between purines and pyrimidines is especially important because the canonical base pairings are all composed of one purine and one pyrimidine.  When designing switches, I often find it useful to consider first just the assignments of Y or R.  After I have an assignment that seems promising from that point of view, I can make the more specific choice of the exact base.
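As a minimal sketch of the "assign R or Y first" idea, here is one way to reduce a sequence to its purine/pyrimidine pattern. The function names `to_ry` and `can_pair_by_class` are made up for illustration and are not part of Eterna or any library. Note that the class check is only a necessary condition for a canonical pair (A and C are of different classes but do not pair).

```python
# Map each RNA base to its class: R = purine (A, G), Y = pyrimidine (C, U).
BASE_CLASS = {"A": "R", "G": "R", "C": "Y", "U": "Y"}


def to_ry(sequence):
    """Return the purine/pyrimidine (R/Y) pattern of an RNA sequence."""
    return "".join(BASE_CLASS[base] for base in sequence.upper())


def can_pair_by_class(a, b):
    """True if the two bases are one purine and one pyrimidine.

    This is necessary, not sufficient, for a canonical pair.
    """
    return BASE_CLASS[a.upper()] != BASE_CLASS[b.upper()]


print(to_ry("GGAACUU"))  # RRRRYYY
```

Working at this level first narrows the choices; the exact base at each position can be picked afterwards.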

whbob
Cool! Another set of data to consider.  I've felt that I understood the "static" attractions and repulsions between bases in stems, but haven't visualized what the serial forces between bases might be.  So, along comes Brourd's solution to A*B/C^2 DEC.  Two strong TB-C oligos and a reporter at the 5' end of the sequence are bound to the RNA sequence.  Along come TB-A and TB-B, both only 50 nM strength compared to the two TB-C oligos at 100 nM.  A & B are way down the other end of the sequence.  Somehow, A & B cause the two C's to pull the ejection handle.  Never saw force at a distance like that before in the game.  What's that all about? I see what Brourd did, but it's still magic to me :)

Omei Turnbull, Player Developer
As an example of how the kernel attractions can be used, here's a story from yesterday.

I started off thinking about how to best visualize the kernel attraction patterns of a completed design.  So I selected a design I had submitted for the [A]/[C] DEC puzzle and started playing around with alternatives.  Here's the version I liked best:


(See https://getsatisfaction.com/eternagame/topics/ui-view-boxes?utm_source=notification&utm_medium=e... for an explanation of my conventions.)

While looking at the above visualization, I realized that the only thing needed to convert the [A]/[C] DEC design into an [A]/[C] INC design was to invert the result, i.e. change A/C = R to A/C = 1/R.  Achieving this just required changing the attraction pattern between C and R from the "mutual" Changing Pairs (In/In) pattern in State 1 to the "competitive" Bachelor's Dilemma (In/Out) pattern in State 2.  So I copied the diagram, removed the gray arc that denotes a hairpin, and moved the R sequence from State 1 to State 2.  Then I slid the R sequence back and forth against the design and found a perfect complementary match:


This looked like a very plausible design, needing only energy balancing to turn it into a valid solution.  So I cut and pasted the design sequence into the [A]/[C] INC puzzle page and got this screen:

Whoops! An unwanted hairpin had formed.  But it only took one mutation to weaken that hairpin and get:


Success!

(Actually, to be totally honest, I made 3 mutations, not 1, to satisfy the energy model. It was only when I started to polish the design for submission that I realized only the one mutation was required.)

Still, the visualization basically guided me to create (what looks to me to be) a high quality switch to a brand new puzzle in under 10 minutes -- something I had never done before.  So I thought it was worth sharing.


Brourd

  • 452 Posts
  • 82 Reply Likes
Based on my first solution, from a purely in silico approach, you can treat the system as a two input logic gate, with the true condition being dependent on the presence of input CC, and the false condition being dependent on the presence of AB.

Of course, the inputs differ based on concentrations of the oligo sequences, rather than based on a T/F condition for the oligos themselves.

Also, the first two true conditions require two separate inputs as well: CCA and CCB, requiring a truth condition for the reporter sequence. Essentially, this means that when the A and B oligos act in a concerted fashion, there will either be a true or false condition, and when they are their individual components, it will need to be always true.

Brourd
Well, for AB/C^2 [DEC] anyway

Eli Fisker

  • 2239 Posts
  • 495 Reply Likes

Here is an intro to one of the hard lab puzzles for the Open TB lab. I have written down the thought process that helped get me to a solve.


[Image: Open TB Lab.png]


Eli Fisker
Recycling the structure of a good solve

You can often reuse the same structure of a solve between sibling labs, if you just change the one input sequence to fit the new sequence input.

So if you have a solve for the A/C DEC lab and want a solve in the B/C DEC lab, you just copy it in and update the A input to match the B input instead. It won't always work, but it will save some work when it does.


Related past labs

Also we have past labs that resemble the Getting started lab. Especially in the RIRI labs you can use this recycling trick.

miRNA-in, reporter-in
http://www.eternagame.org/game/browse/6116602/

Related design group: Sensor A (GBP5) - RIRI, Sensor B (DUSP3) - RIRI, Sensor C (KLF2) - RIRI

miRNA-in, reporter-out
http://www.eternagame.org/game/browse/6116601/

Related design group: Sensor A (GBP5) - RIRO, Sensor B (DUSP3) - RIRO, Sensor C (KLF2) - RIRO

Omei Turnbull, Player Developer
I'd like to call attention to a strategy that is generating lots of new (A*B)/C^2 Descending submissions, and that is the use of two separate sequences that bind to the reporter.  In all the cases I have looked at, this is the result of combining an A/C solution with a B/C solution, which I think is an excellent approach.  (To solve a hard problem, first break it up into smaller problems that are easier to solve, solve those, and then combine those to solve the bigger problem.)

Kudos to atanas.atanasov, who seems to be the first player to have submitted a design of this type.

Omei Turnbull, Player Developer
I've developed a tool for creating compact switches using just the simple Bachelor's Dilemma attraction pattern.  It's very much a work in progress, but since I've "implemented" it as a Google Drawing that anyone who is interested can use, I thought I would share it as it currently is.

Below is a screenshot of https://docs.google.com/drawings/d/12KuDUnA3j5q2S53S5F6B2rA94Uk_8i5RPNC0wpZm7Uk. If you are interested, you should be able to just make a copy of the drawing and try it out for yourself. But be warned, it requires moving the various "playing pieces" around using the drawing application's UI, which is not nearly as convenient as that of a finished game.



The first section is the playing pieces.  There is one orange/cyan playing piece representing each oligo (in reverse sequence), with the orange/cyan coloring indicating whether that base is a purine (orange) or a pyrimidine (cyan).

The yellow one represents the design we will be creating.  

The rest of the sections are not necessary to play the game; they are the "player's guide".  Once you understand the rules, you can delete them from your own copy to make space for your solution.

The second section is an algorithmic technique for first expressing the puzzle statement as an arithmetic expression and then translating that into an attraction pattern.  The example is only suggestive; I haven't formalized the general process yet.  But the key idea is that if we assume linearity, the Bachelor's Dilemma pattern (denoted with a "|") can be used to calculate division.  If that makes sense to you and you want to try your own calculation, keep in mind that in this notation division is right associative, i.e. X/Y/Z needs to be interpreted as X/(Y/Z), not (X/Y)/Z.
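To make the grouping concrete, here is a small sketch that evaluates a chain of divisions from the right, as the pattern notation requires, and contrasts it with the usual left-to-right evaluation. The helper name `right_assoc_divide` and the numbers are purely illustrative; this shows only the arithmetic convention, not how a real switch behaves.

```python
from functools import reduce


def right_assoc_divide(values):
    """Evaluate v0 / (v1 / (v2 / ...)), grouping from the right."""
    # Start with the last value and fold leftwards: each earlier value
    # is divided by the result accumulated so far.
    return reduce(lambda acc, v: v / acc, reversed(values))


x, y, z = 8.0, 4.0, 2.0
print(right_assoc_divide([x, y, z]))  # 4.0, i.e. X/(Y/Z)
print(x / y / z)                      # 1.0, i.e. (X/Y)/Z, the usual reading
```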

The third section shows a completed solution. The steps I took to get there were:
  1. Decide on the basic arrangement of pieces I needed.  In this case, the attraction pattern is R | A | C, which means that one segment of the design -- represented by "R | A" -- should be able to form a bond with either the R piece or the A piece, but not both at the same time. So I put one of those pieces above the yellow one and the other below.  (It doesn't matter which.)  The second part of the attraction pattern -- "A | C" -- says that A and C should be on opposite sides of the yellow piece, so I put C above, opposite A.  But I haven't yet made any decision about whether R comes before or after C, or exactly where along the bottom the A piece is located.
  2. Since I didn't care yet where the switching sequence was going to be in relation to the whole RNA design, I arbitrarily placed the A piece against the yellow one near its middle.
  3. Now I slid the R and C pieces back and forth along the top, looking for a good alignment of contiguous orange and cyan sections.  (What you see in the third section is a pretty strong match, probably the best for this puzzle, but that doesn't guarantee it will make the best switch.)  I've highlighted the contiguous sections that match with red boxes.
  4. Next, I filled in the design bases that are inside the red boxes. These bases are the kernel attractors that I am always writing about.  In most cases, there will be a base that clearly makes the strongest canonical bond with both the upper and lower pieces.  (If the upper and lower bases are A and C, there won't be a base that forms a canonical pair with both, so you get to use your discretion.)
  5. Now I had to make a decision about filling in the remaining positions in the yellow piece.  I started with the R piece, because it is the shortest and hence forms the weakest attraction.  I filled it in with the strongest choices available, meaning no GU pairs.  The lengths of the A and C oligos are the same, so when it came to a choice between pairing with the A piece or the C piece, I tried not to bias the decision too heavily either way.
At this point, I copied the yellow piece's sequence I had assigned bases to (UCGGAACUUAGUGAUGAGCUGUGUAGCCCAAAA) to the clipboard, planning to insert that sequence into the A/C INC puzzle.  (I used an EternaScript I had just written that makes it easier to manipulate strings of bases, instead of just individual bases; it also is a work in progress, but feel free to try it out; see http://www.eternagame.org/web/script/7049401/).
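For anyone who would rather not drag pieces around by hand, steps 3 and 4 can be roughly sketched in code. This is only my reading of the procedure, under several assumptions: both oligo strings are already written in the orientation in which they lie along the design (i.e. reversed), a "match" means the upper and lower pieces have the same purine/pyrimidine class at a position, and GU wobble pairs are acceptable for kernel bases. The sequences are toy examples, not the real TB oligos, and all function names are hypothetical.

```python
BASE_CLASS = {"A": "R", "G": "R", "C": "Y", "U": "Y"}

# Design base that can pair with both the upper and the lower oligo base.
# Keys are the set of the two oligo bases; "N" is used when none exists.
KERNEL_BASE = {
    frozenset("A"): "U",
    frozenset("G"): "C",
    frozenset("C"): "G",
    frozenset("U"): "A",
    frozenset("AG"): "U",  # U-A canonical, U-G wobble
    frozenset("CU"): "G",  # G-C canonical, G-U wobble
}


def longest_shared_run(upper, lower, offset):
    """Longest contiguous run where upper[i] and lower[i - offset] share a class.

    Returns (run_length, start_index_in_upper).
    """
    best_len, best_start, run = 0, 0, 0
    for i in range(len(upper)):
        j = i - offset
        same = 0 <= j < len(lower) and BASE_CLASS[upper[i]] == BASE_CLASS[lower[j]]
        run = run + 1 if same else 0
        if run > best_len:
            best_len, best_start = run, i - run + 1
    return best_len, best_start


def best_offset(upper, lower):
    """Slide lower against upper; return the offset with the longest shared run."""
    offsets = range(-(len(lower) - 1), len(upper))
    return max(offsets, key=lambda off: longest_shared_run(upper, lower, off)[0])


def kernel_bases(upper, lower, offset):
    """Design bases for the longest shared run at the given offset."""
    length, start = longest_shared_run(upper, lower, offset)
    return "".join(
        KERNEL_BASE.get(frozenset((upper[i], lower[i - offset])), "N")
        for i in range(start, start + length)
    )


upper, lower = "GGAUCC", "CAAGCUU"     # toy pieces, not real oligos
off = best_offset(upper, lower)
print(off)                              # -1
print(kernel_bases(upper, lower, off))  # UUUGGG
```

The longest run is only a starting point. As noted in step 3, the strongest match does not guarantee the best switch, and the remaining positions still have to be filled in and energy balanced by hand.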

When I pasted the sequence in, it did not immediately fold into the correct shapes, and I then had to make manual adjustments.  Basically, I had made the attraction to the C oligo too strong, and had to weaken it.  Important advice: Before tweaking the design, mark the kernel attraction bases with a black ring, and don't change them when trying to balance the energies between states.

The last section of the screenshot shows the changes I made to turn my initial switching sequence into one that balanced the energies so that NUPACK predicted it would fold appropriately in all states.

Anyway, feel free to try it out.  If you do manage to make a design from the sketchy directions above, let me know what you think. :-)

Atanas Atanasov

  • 42 Posts
  • 15 Reply Likes
Isn't that for A/C Dec? A/C Inc is an AR/C switch

Omei Turnbull, Player Developer
Hi Atanas!  Yes, you're absolutely right.  This demonstrates that I'm still struggling with the best way to express the isomorphism between the numerical expressions and the attraction patterns that should calculate them.

I see from your profile at http://www.math.harvard.edu/~nasko/ that you're a mathematician.  Any suggestions?

Atanas Atanasov
Haha, that is not me; there are many people with the same name. I have some math background though. I'll try to write a guide on how to use the energies reported by the game to create mods or get an idea of how "close" to a solution you are. It can be useful for slow puzzles, as those calculations are simple and can be done in Frozen mode.

Omei Turnbull, Player Developer
A guide like that sounds great!  I'm sure I could learn something from it.

JR

  • 241 Posts
  • 20 Reply Likes
Atanas Atanasov - I would like to know how you can tell that you are close to a solution on these puzzles. Previous puzzles were no problem and my switch solving strategy/design has always worked, but these are throwing me for a loop. But with that waiting time I sure am getting a lot of stuff done that I have been procrastinating on.

Atanas Atanasov
The short story is that there is a switch inequality that defines the range the energies of the two competing structures must fall in for the switch to work. I used Target Mode (and, for slow puzzles, Frozen Mode) to calculate the energies of the desired structures and see how close I am to getting within the desired range. In Frozen Mode I calculate how much 1-base changes affect the energies and how those shift the balance away from or closer to the desired range.



The white number there is the energy of the shape and the red one is the penalty for the bound oligo. The total energy is the sum of the two (note that the white number is negative). The penalty is not affected by the structure (in the simulation). This allows me to produce the following inequality for A/C or B/C puzzles.
I call the energy for state 1 with Shape C T1, and the energy for state 2 with Shape A T2.
The inequality is:
0 < T1 - T2 < 1.85 kcal (this is for a switch with 5 nM to 100 nM)
Of course, getting T1 and T2 as low as possible while maintaining the above difference is a good idea, because that would prevent structures other than those of T1 and T2 from folding and breaking things.
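A quick way to check a candidate against this window is sketched below. The post itself does not say how the 1.85 kcal bound is obtained; the sketch assumes it is RT ln(100 nM / 5 nM) at 37 °C, which does come out to about 1.85 kcal/mol, but treat that as my assumption rather than the author's stated derivation. The function names and the example energies are hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872  # kcal/(mol*K)


def window_width(low_nM, high_nM, temperature_c=37.0):
    """Assumed upper bound of the window: RT * ln(high/low), in kcal/mol."""
    rt = GAS_CONSTANT_KCAL * (temperature_c + 273.15)
    return rt * math.log(high_nM / low_nM)


def in_switch_window(t1, t2, width):
    """True if 0 < T1 - T2 < width, with T1 and T2 as total energies in kcal/mol."""
    return 0.0 < (t1 - t2) < width


width = window_width(5, 100)
print(round(width, 2))                         # 1.85
print(in_switch_window(-20.3, -21.5, width))   # True  (difference 1.2)
print(in_switch_window(-20.3, -23.0, width))   # False (difference 2.7)
```

Here T1 and T2 are the totals read off the game (shape energy plus oligo penalty), so a single-base mutation can be judged by how it moves T1 - T2 relative to the window.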

Here is a draft of the long story and how to cope with some of the obstacles that the game has in Target Mode and Frozen Mode:
https://docs.google.com/document/d/1EqsStCD5BfcTP8T7TtqHIfIOyQAG97LVmy6YWXe1-qc/edit?usp=sharing
I'll try to simplify it a bit and add a formal description of how to get the inequality for a generic puzzle.

This works for AB/CC puzzles too. There I care only about states 3 and 4 and hope that states 1 and 2 will be OK. I choose some desired shape for state 4 and do the calculation.
The inequality there is (I need to double check this):
0 < T3 - T4 < 1.36 kcal (if you have no C in the target shape for state 4)

Any comments are welcome.

Atanas Atanasov
I've updated the document with a formal explanation of how the equation is calculated and included the equations for the other puzzle. I added a form of the equation that is more user friendly when used in the game.

Atanas Atanasov
I was thinking about what the possible bound oligos are for state 4 of the AB/CC INC puzzle. The puzzle does not provide many restrictions, but I felt that there are some. I've updated the document with my thoughts on why I think the only two possibilities are ABR and ABRC. I haven't done the exact math for the AB/CC DEC puzzle, but I feel that there might be a similar requirement (without the R of course).

worseize

  • 29 Posts
  • 12 Reply Likes
I use notes to rewrite sequences; maybe it helps somebody: