Why we can't solve the unsolvables

  • Updated 8 years ago
I think I have figured out why we can't solve the unsolvables. I was already on to
it two months ago, but I hadn't yet read about AnticNoise's
energy gap theory. Here are my comments on the challenge puzzle Possibly unsolvable series 3 - HgcG RNA:

Eli Fisker on Tue, 03/15/2011 - 18:28.

There is a very high amount of energy in that little last unclosable loop.

Eli Fisker on Mon, 05/16/2011 - 01:28. (edited the text a bit since then)

I think the energy jump theory might be part of the answer to why we can't solve these unsolvables. As I mentioned above, there is a very high amount of positive energy in the last unclosable loop compared to the rest of the puzzle (5.7 is the lowest possible amount, between nucleotides 24-29). The same is going on in the other unsolvables. When I mentioned this to Mat, he said that D9 had talked to Sneh about the energy on that bulge loop/end.

Link to the puzzle:
http://eterna.cmu.edu/content/possibl...

The lowest energy difference I can get between this loop and the formation of 5 nucleotides is 1.1. The scale goes up to an energy difference of 5.2.

The puzzle most similar to the HgcG RNA is RNA in Seamaize 2 (difficulty level 1). Here the 5 nucleotide formation can be solved with an energy difference ranging from 0 to 1.1. The biggest difference is that it's possible to adjust the energy level in the Seamaize loop, as it is one nucleotide longer. Since we can't do that in the HgcG, we can't get the energy difference between the multiloop and the formation of 5 nucleotides within range of the admissible difference in energy level.

In Bombyx mori 1 (difficulty level 1), which has the same 5 nucleotide formation but a 10 nucleotide long loop, the allowed energy difference is bigger than for the Seamaize puzzle. This points in the direction that a smaller loop, in this case, means a smaller allowed energy difference.

So don't feel bad guys, there's a reason why we can't nail these puzzles.

Eli Fisker


Posted 8 years ago


Jeehyung Lee, Alum

Hmm...I'm guessing that the big energy jump at the unclosable loop basically means we can't create a stack that can overcome the positive energy of the loop?

Eli Fisker

Yes, that's right...

And the same problem, sometimes in a different disguise, shows up in the rest of the unsolvables.

But as I understand it, those unsolvable challenge puzzles are modelled on real RNA which is known to function in nature. So why can't we make them? Does it have something to do with the rules set up for the lab? I'm a bit confused here.

Joshua Weitzman

The rules in this game don't match nature.

Berex NZ

@Eli Good work trying to tackle this. Can you please clarify where the 5 nucleotides are in puzzle 372206, Possibly unsolvable series 3 - HgcG RNA?

In my version, I have trouble in a different area.


In my example, I am having trouble reconciling the area between nt 239 and 246. Would be very interesting if you are having trouble in a different part of that puzzle.

To a certain extent you are right, though this is purely my interpretation. The Challenge puzzles are loosely based on real RNA sequences, although I believe more so on their shape.

As a caveat, eterna is based on the Vienna algorithm. And yes, there are flaws; that's why we have encountered puzzles like Prevotella and now the Unsolvables. We are figuring out where the gaps are in the algorithm. Yes, it's not perfect, but it's one of the best algorithms out there in this field. Welcome to the leading edge of science. :)

Additionally, you might find while modelling real RNA that eterna can model most of them, but there are sequences out there it cannot model yet.

That's one of the reasons why we are here, should you choose to accept the challenge: to help enhance those algorithms and to deduce the rules that nature is using.
So for now we are focusing on the most common pairs (Watson-Crick and wobble pairs) before we start to tackle the more complex pairs.

Eli Fisker

Hi Berex!

Thanks for the explanation. Unfortunately we are having trouble at the exact same spot. Had it been two different spots, then we surely could have nailed the bastard with a collective effort. Sorry, I somehow got the nucleotide numbers wrong.

The five nucleotide structure I'm talking about is nucleotides 239, 240, 244, 246 and 247. I don't know what such a structure is called. But it is exactly in this kind of structure, combined with a small loop ending in 3 nucleotides (is it too small to be called a loop?), that trouble arrives. If this loop follows the exponential curve for expected values when going from bigger loops towards smaller loops, which it appears to do, then there is no way we can close it (compared with other puzzles with the same 5 nucleotide structure but a bigger closing loop). If we want this part of the puzzle done, we would have to add an extra nucleotide to the closing loop; then we would at least be in a range where we might be able to close it.

Challenge accepted. Love being on the edge of science. :)

Eli Fisker

Possibly unsolvable series 3 – HgcG RNA

I made a part of the puzzle in the Puzzlemaker. It is not solvable here either.





Here is the dot-bracket structure if anyone wants to try it in the Puzzlemaker themselves.

(((((((......(((...).)).......)))))))

If I make the structure just one nucleotide longer in the loop, then there is no problem.

Note the puzzle's overall positive energy. As I understand it, a puzzle needs a negative energy at a certain level before it starts folding and is able to finish. This is one of the reasons why the cub scout designs do so badly in the lab. Not only are the base pairs not strong enough to stay together, the overall energy is also not negative enough for the puzzle to fold.





The dotbracket structure:
(((((((......(((....).)).......)))))))

I also had the idea that lengthening the arm might relieve the pressure on the unsolvable region. But that doesn't appear to be so. Instead I discovered that adding more to the design made the overall energy more negative. That is why all the big unsolvable designs have an overall negative energy. It's only when looking at the problem region that this energy problem becomes obvious.





Dot bracket structure:
(((((((......((((((((...).))))))).......)))))))
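For anyone who wants to compare the three dot-bracket strings in this post without opening the Puzzlemaker, here is a minimal Python sketch. It only parses the notation and reports the hairpin loop size and any bulges or interior loops; it does not compute energies. The function names are made up for this illustration and are not part of eterna or Vienna.

```python
def parse_pairs(structure):
    """Map every paired position (0-based) to its partner."""
    stack, pairs = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            j = stack.pop()
            pairs[i], pairs[j] = j, i
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


def hairpin_sizes(structure):
    """Number of unpaired bases in each hairpin loop, 5' to 3'."""
    pairs = parse_pairs(structure)
    sizes = []
    for i in sorted(pairs):
        j = pairs[i]
        # A hairpin is closed by a pair with nothing paired inside it.
        if i < j and all(k not in pairs for k in range(i + 1, j)):
            sizes.append(j - i - 1)
    return sizes


def interior_loops(structure):
    """(unpaired on 5' side, unpaired on 3' side) for every bulge or
    interior loop; plain stacked pairs and multiloops are skipped."""
    pairs = parse_pairs(structure)
    loops = []
    for i in sorted(pairs):
        j = pairs[i]
        if i > j:
            continue
        k = i + 1
        while k < j and k not in pairs:
            k += 1
        if k == j:
            continue  # hairpin loop, nothing enclosed
        l = pairs[k]
        if any(m in pairs for m in range(l + 1, j)):
            continue  # more than one enclosed helix: multiloop
        left, right = k - i - 1, j - l - 1
        if (left, right) != (0, 0):
            loops.append((left, right))
    return loops


# The three structures quoted in this post.
structures = {
    "original": "(((((((......(((...).)).......)))))))",
    "loop one longer": "(((((((......(((....).)).......)))))))",
    "longer arm": "(((((((......((((((((...).))))))).......)))))))",
}

for name, s in structures.items():
    print(name, len(s), hairpin_sizes(s), interior_loops(s))
```

The first and third structures come out with a three-nucleotide hairpin closed over a single-nucleotide bulge, while the second has a four-nucleotide hairpin, which is the only change between the unsolvable and the solvable version.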

Ding

It seems like the problem we're having in this area of HgcG RNA has to do with the comparative energies assigned to the desired structure versus the six-nucleotide loop that is the alternative structure:

HgcG RNA in target mode:


HgcG RNA in natural mode:


As you can see, the free energy of the six-nucleotide loop is 4 kcal whereas the free energy of the two loops making up the desired structure is 6.2 kcal with this sequence.

In order to solve this section of the puzzle, we'd need to find a six-nucleotide loop design that gets assigned an energy of at least 6.1 kcal (the lowest I can get the two-loop structure, with 5.7 kcal in the triloop and 0.4 kcal in the single-nucleotide bulge).
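The bookkeeping behind these numbers can be written out as a small Python sketch. It only adds and subtracts the loop energies quoted in this post; nothing is computed by Vienna, and it assumes the rest of the structure contributes the same energy to both folds. The names are hypothetical, and the 4 kcal value belongs to the sequence shown, so a different sequence may give a different six-nucleotide loop energy.

```python
# Loop energies in kcal, copied from the post (assumed to be the values
# the game displays for each loop of the example sequence).
TRILOOP_BEST = 5.7   # lowest triloop energy found
BULGE_BEST = 0.4     # lowest single-nucleotide bulge energy found
TARGET_SHOWN = 6.2   # two-loop energy of the sequence shown
SIX_NT_LOOP = 4.0    # competing six-nucleotide loop for that sequence


def loop_gap(target_loops, alternative_loop):
    """How much more the target's loops cost than the alternative loop.

    A positive result means the alternative loop is favoured, assuming
    everything outside these loops is identical in both structures.
    """
    return round(sum(target_loops) - alternative_loop, 2)


best_target = round(TRILOOP_BEST + BULGE_BEST, 2)
print("best two-loop energy:", best_target)
print("gap for the sequence shown:", loop_gap([TARGET_SHOWN], SIX_NT_LOOP))
print("gap with the best target loops:",
      loop_gap([TRILOOP_BEST, BULGE_BEST], SIX_NT_LOOP))
```

Read this way, the six-nucleotide loop would have to become roughly 2 kcal more expensive before the two-loop structure could win.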

As to why Nature can solve this puzzle but we and the bots can't, I think it's just a matter of the imperfect energy model. I'd be interested to know what the natural sequence in that area of the RNA is, but without knowing that it's going to be hard to figure out where exactly the model is wrong. My suspicions are that it's either in the energy values given to triloops or in the treatment of single pairs in the model (as I understand it, since the energies are based on quads rather than pairs the only contribution of a pair like 240-244 is reflected in how it affects the values of the two loops it separates; perhaps there should be some additional bonus?)

Eli Fisker

Hi Ding!

Yes, this structure seems to prefer to even out the energy by creating a loop rather than staying as it should. Triloops, that's a great term, I'll adopt it.

Ding

As an aside, probably the closest I've come to solving this section of HgcG RNA has an energy difference of 1.2 kcal between the target and predicted structures:

target:


natural:


Pretty much anything else I do (like substituting in AU or GU pairs for some of the GC pairs) seems to have an equal or worse energy effect on the desired structure as it does on the six-nucleotide loop that's being predicted instead of the desired structure.

Ding

Gosh I miss the edit function here so I could just edit in further ideas that occurred to me right after posting rather than spamming the topic with new replies ;)

I just did a search on triloop energies and found this paper:
http://pubs.acs.org/doi/abs/10.1021/b...

I don't have access to it, but am wondering if there's anything in there that might end up being added to the Vienna Turner 1999 energy model and would change predictions for structures like HgcG with triloops enough to make them solvable?

Eli Fisker

This sounds really interesting. I'll print it when I get near my university.

alan.robot

I extracted the table of interest from the paper Ding linked above (look at the last two columns, delta G triloop experiment vs prediction). Note that delta G (free energy) for triloops is, in general, very positive (unfavorable) due to the very high entropic cost (delta S) of forcing such a rigid loop to form. The difference between the Turner model (which does not have any triloop data in it, as Ding points out) and the experiments is ~0.5 kcal/mol. So that's the most that can be attributed to an inaccurate energy function.


Eli Fisker

Thanks for extracting the data and making sense of it. Couldn't have done that. Had an idea though that sharp angles weren't a thing RNA liked to fold. :)

This finding of Ding's means that we might be able to subtract 0.5 kcal from the positive energy level in this triloop. I hope that this might be just what's needed.
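As a rough check of that hope, here is a small Python sketch using only numbers quoted in this thread: Ding's closest attempt (a 1.2 kcal difference between target and predicted structure) and the ~0.5 kcal difference alan.robot read from the table. It assumes the whole 0.5 kcal comes off the target's triloop and that the competing loop is unchanged; both are assumptions, and the names are made up for the illustration.

```python
BEST_GAP = 1.2        # kcal, Ding's closest attempt (target vs. predicted)
MAX_CORRECTION = 0.5  # kcal, approximate Turner-vs-experiment difference


def remaining_gap(gap, correction):
    """Gap left if the target structure becomes `correction` kcal cheaper
    while the competing structure keeps its energy (an assumption)."""
    return round(gap - correction, 2)


left = remaining_gap(BEST_GAP, MAX_CORRECTION)
print("gap after the triloop correction:", left, "kcal")
```

Under these assumptions the correction narrows the gap but leaves part of it open, so it may need to be combined with something else, such as the single-pair bonus Ding suggested.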