Nature of the bots

  • Updated 7 years ago
I have been thinking about why the bots fail certain puzzles. For the strongest bot, INFO-RNA, there are so far only three types of puzzles it fails. It doesn't like zigzags, and it doesn't like really big, symmetrical puzzles. Then, for some reason, it fails half-circles made of circles. Iroppy says about the bots (in the crop circles): "It is like they have no guide to the obvious boost points."

I think the bots dislike patterns not commonly found in nature, as this is all their inbuilt algorithm knows of so far. The bots generally don't like sharp angles either; by this I mean strings that take a very sudden turn. Nature likes things smooth and curvy. Even when it comes to rocks, it just takes a lot of time.

Do others have an opinion on why some of the bots fail certain types of puzzles? This could be an interesting discussion, now that the Eterna crew wishes us to point to lab puzzles that beat Vienna and Nupack.
Eli Fisker

Posted 8 years ago

Eli Fisker
Now there are more types of puzzles that Infobot fails. But I'm starting to see a new pattern: Infobot fails puzzles that SSD-bot can do. This is interesting.
Quasispecies
This page briefly describes the bots. INFO-RNA apparently starts by trying to find a sequence that minimizes free energy when the molecule is held in the target conformation (I'll call it the MFE sequence). My gut says that the MFE sequence is probably a bad place to start when solving structures that are highly symmetric or full of closely-spaced loops. Here's my two cents:

Zigzags and symmetric designs seem to be rare structures. Not many sequences fold into them. Those that do might be separated from the MFE sequence by a considerable distance.

The MFE sequence in a symmetric design probably has a lot of repeats and alternative base pairing options. The MFE sequence for a design with many closely-spaced loops probably has the potential to form fewer, larger loops of even lower energy.

To find a sequence that folds to the target, you need to make several changes to the MFE sequence. I would bet that most close neighbors of the MFE sequence are less similar to the target structure than the MFE sequence itself.

What if the algorithm accepts or rejects random changes to the initial sequence based on whether they bring the new sequence closer to the target structure? Considered individually, most changes on the path from the MFE sequence to the properly-folding sequence fold to something further away from the target structure than the unchanged sequence. Depending on how the algorithm is designed, the bot could get stuck in a local minimum of structural similarity.
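
To make the local-minimum idea concrete, here is a minimal Python sketch of the accept/reject walk described above. It is not INFO-RNA's actual search: `greedy_walk`, `base_pair_distance`, and the contrived `all_or_nothing_fold` are hypothetical names made up for illustration, and `fold` stands in for whatever folding engine a bot would call. Structures are written in dot-bracket notation.

```python
import random

def base_pair_distance(a, b):
    """Count base pairs found in exactly one of two dot-bracket structures."""
    def pairs(s):
        stack, found = [], set()
        for i, ch in enumerate(s):
            if ch == "(":
                stack.append(i)
            elif ch == ")":
                found.add((stack.pop(), i))
        return found
    return len(pairs(a) ^ pairs(b))

def greedy_walk(seq, target, fold, steps=1000, seed=0):
    """Accept a random point mutation only if it moves the fold closer to target."""
    rng = random.Random(seed)
    best = base_pair_distance(fold(seq), target)
    for _ in range(steps):
        if best == 0:
            break
        i = rng.randrange(len(seq))
        candidate = seq[:i] + rng.choice("ACGU") + seq[i + 1:]
        d = base_pair_distance(fold(candidate), target)
        if d < best:  # strict improvement only: neutral or worse steps are rejected
            seq, best = candidate, d
    return seq, best

# Contrived stand-in for a folding engine (an assumption, not a real energy model):
# the hairpin forms only when both stem halves are already fully in place.
def all_or_nothing_fold(seq):
    return "(((...)))" if seq[:3] == "GGG" and seq[-3:] == "CCC" else "." * len(seq)

print(greedy_walk("AAAAAAAAA", "(((...)))", all_or_nothing_fold, steps=200))
# -> ('AAAAAAAAA', 3): no single change gets closer, so the walk never moves
```

With a real folding engine the landscape is less extreme, but the same thing can happen whenever every single change on the way to a working sequence looks no better than staying put.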
Eli Fisker
Hi Quasispecies!

Thanks for your fine explanation. Things make a bit more sense now.
paramodic
I'd like to resurrect this thread and give a suggestion. We haven't been looking at the bot algorithms (player puzzles) nearly as hard as we've been looking at the labs. I'd like to think that there's almost as much information to be gleaned there as there is from the labs. To illustrate, we know generalizations about the bots: they don't do well with zig-zags, they don't do well with bond-sparse folds, etc. But that's really not specific enough to learn anything from.

After talking with Eli Fisker, Starryjess, Edward, and a few others, I feel that it would be beneficial to start a puzzle makers' collaborative. The purpose of the collab would be to test bots for specific errors and trouble points when solving RNAs. Anyone interested?

Also, it's curious to me that Infobot failed this puzzle when SSD and Vienna bot killed it.
http://eterna.cmu.edu/eterna_page.php...
stevetclark
I'm interested.
paramodic
If anyone would like to join me, I'm currently testing how the bots fare against structures formed by short, repetitive sequences.
Edward Lane
I'm in. I'm still trying to build 'simple' sequences that the bots can't manage but humans can. Mismatching so far seems to be the biggest source of bot confusion.
paramodic
Agreed. I suspect that the bots also lack a separate protocol for handling bases that are meant to be unpaired. The result is that RNAs with many loops and sparse bonds, those with sharp bends and, as a result, large bulges, and those with repetitious structures, especially repetitious loops, will defeat the bots. One of the big differences is that a human generally won't mess with the bases that are meant to be unpaired unless they have a specific reason to. Judging by the bots' lab submissions, the bots don't show the same discrimination.

For reference on my findings thus far, check out the thread titled 'The problem with random bots'.
Eli Fisker
Thoughts from the recent discussion today. My focus is on puzzles that stump InfoRNA.

It is my perception that too much symmetry and too much asymmetry both make the bots go mad.

Paramodic mentioned something: For example, they (bots) seem to also have issues with structures formed by repetitive sequences, so not just symmetrical structures, but repetitive ones.

I think Paramodic is on to something. Repetitive sequences might be the key to part of what makes the bots fail.

Just as Ding's mirrored snowflakes are symmetric, they are also repetitive.

Big symmetric puzzles are energetically pressured, at least Ding's snowflakes were, as the strings were relatively short and close to each other.

But small and asymmetric puzzles stump bots too. Like Kudzu. Here the strings are very short, and the structure is very energetically pressured.

I'm just thinking that what the big symmetric puzzles and the small symmetric puzzles that stump bots have in common is relatively short strings.

But smaller symmetric puzzles stump InfoRNA too, especially if mirrored on more than one axis. I wonder if there is something there? Brourd's clothespin spring is mirrored around two axes.



Ding's snowflakes 4 (all of them) were mirrored on more than three axes. Notice the mirroring on the smaller arms too.



I think mirroring itself is a problem for the bots. In a mirrored puzzle, too many regions are similar, which in turn gives bigger chances for mispairing if the arms are solved similarly. Mirroring in itself can also put energetic pressure on the puzzle, especially if elements are close together and strings are short. The puzzle becomes like a handful of tightened bows, wanting to release the energy somewhere and pushing the structure apart.
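
One way to put a number on this mispairing risk is to count, for each short stretch of a sequence, how many other stretches could pair with it. The sketch below is only an illustration: `partner_sites` is a made-up helper, it considers strict Watson-Crick complements only (no GU pairs), and the window size `k=4` is an arbitrary choice.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(segment):
    """Return the strand that would pair perfectly with the given RNA segment."""
    return segment.translate(COMPLEMENT)[::-1]

def partner_sites(seq, k=4):
    """Map each k-mer start to its partner starts, keeping only ambiguous ones.

    A start index is reported only when more than one non-overlapping window
    in the sequence is a perfect complement, i.e. the stretch has a choice.
    """
    windows = {}
    for j in range(len(seq) - k + 1):
        windows.setdefault(seq[j:j + k], []).append(j)
    sites = {}
    for i in range(len(seq) - k + 1):
        wanted = reverse_complement(seq[i:i + k])
        partners = [j for j in windows.get(wanted, []) if abs(i - j) >= k]
        if len(partners) > 1:
            sites[i] = partners
    return sites

one_arm = "GGGGAAAACCCC"
two_identical_arms = one_arm + "A" + one_arm
print(partner_sites(one_arm))             # {}
print(partner_sites(two_identical_arms))  # {0: [8, 21], 8: [0, 13], 13: [8, 21], 21: [0, 13]}
```

A single arm has no ambiguity, but as soon as two arms are solved the same way, every stem strand has two possible partners.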

With mirroring around axes come sharp angles. Sharp angles are often a problem for the bots, like 90 degrees or smaller. The more closely two strings are bent toward each other, the harder the puzzle is for us to solve, and the bots don't like them either.
Eli Fisker
Thanks for your comments. I really like your observation that SSD-bot and Vienna fail on the fractal structures. It reminds me of the title of a paper I once found: The fractal nature of RNA secondary structure. I must admit I haven't read it. I was just searching for fun to see if anyone had taken a fractal approach to RNA design. And the answer was yes. Thought you would like it. Click on the text and it becomes bigger and readable. It is just an intro.
jandersonlee
The real question to me is: are the bots failing at something that folds in the wet lab, or is the EteRNA game model too lax, allowing puzzle shapes that do not exist in nature, like the all-GU 2-2 loop?
Eli Fisker
I'm inclined to think the latter, but I do not have the mathematical or scientific background to say that it is so. So this is just my personal feeling.
Eli Fisker
To cite what Paramodic said during our chat debate on bots and their puzzle solving: For example, they (bots) seem to also have issues with structures formed by repetitive sequences, so not just symmetrical structures, but repetitive ones.

InfoBot just failed my whole Christmas series: Christmas bird and Christmas special 1 & 2.

There certainly are repetitive sequences in these three puzzles. :) They are built on a small pattern, a double bulge structure I found in Paramodic's (RNA) GC only puzzle.

Maybe repetition is why lots of 2-2 loops stump bots too, although it probably is not the whole explanation in that case. A few 2-2 loops alone are usually not enough to stump InfoBot.
paramodic
I strongly feel that the bots don't have a special or separate protocol for handling unpaired bases, or for boosting. I think that's why loops tend to stump the bots.
paramodic
A new observation: Infobot seems to have a great deal of difficulty with puzzles featuring 1 NT bulges. The details require a lot more fleshing out at the moment, but it seems to follow this pattern:
-The bulges are on the same side of the fold as each other (ex: Left side)
-The bulges are close to an internal loop
-The bulges appear on sequential or adjacent strings.

As evidence, I present the puzzles '600!', '(RNA) Cytosine Free', and 'Comet Tail'. 600 is where it really caught my attention, since Vienna and SSD bots knocked this relatively simple puzzle out of the park, but Infobot failed. Bear in mind that the (RNA) puzzle may be contaminated by the structure formed by a lack of cytosine, as infobot also failed the uracil free puzzle. I'd love to see this explored more. Any ideas why Infobot gets stumped by the bulges?
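
For anyone who wants to check puzzles against this pattern, the following Python sketch lists the 1 NT bulges in a dot-bracket structure and says which strand of the stem each one sits on. The helper names `pair_table` and `single_nt_bulges` are invented for illustration, and "side" is taken here to mean the 5' or 3' strand of the stem, which is an assumption about how the left/right description maps onto the sequence.

```python
def pair_table(structure):
    """Return a list where entry i is the pairing partner of i, or None."""
    partner = [None] * len(structure)
    stack = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner

def single_nt_bulges(structure):
    """List (index, strand) for every unpaired base wedged between two stacked pairs."""
    partner = pair_table(structure)
    found = []
    for i, j in enumerate(partner):
        if j is None or j < i:
            continue  # visit each pair once, from its 5' member
        # 5' strand: i pairs with j, i+1 is unpaired, i+2 pairs with j-1
        if i + 2 < j and partner[i + 1] is None and partner[i + 2] == j - 1:
            found.append((i + 1, "5'"))
        # 3' strand: i pairs with j, j-1 is unpaired, i+1 pairs with j-2
        if i + 1 < j - 2 and partner[j - 1] is None and partner[i + 1] == j - 2:
            found.append((j - 1, "3'"))
    return found

print(single_nt_bulges("((.((...))))"))  # [(2, "5'")]
print(single_nt_bulges("((((...)).))"))  # [(9, "3'")]
```

Running this over the failed and solved puzzles would give a quick tally of how often same-strand bulges show up in each group.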
Eli Fisker
Good observation, Paramodic.

I have been thinking it is not all about size. New combinations of elements might also stump bots.

My puzzle A bulge and 1-2 loops made InfoBot time out. A new combination of elements.

Brourd's Small and easy 2. This one required a new and surprising boost solution.

And those puzzles are very small, like Comet tail as you mentioned.
paramodic
Now to be fair, there is a large, repetitious puzzle involving many 1 nt bulges called 'the fishhook' that SSDbot solved, but Vienna and Infobot couldn't. I think that SSD bot is exceedingly good with repetitious structures, though I don't quite understand why yet. Infobot seems to be hit-and-miss with them, and Vienna obviously has the most trouble with them.
Eli Fisker
Very well noticed. Yes, I think you have got something there. SSD bot is good with repetitious structures, compared to Infobot and Vienna. I have been wanting to know for a long time why SSD bot solved puzzles that Vienna and especially Infobot failed. Thanks.
paramodic
Looking at it, I think it's safe to say that infobot seems to handle loop-heavy puzzles better than the other two do. As such, it still baffles me that it would have such a hard time with bulges.
paramodic
So, I launched two test puzzles: 'InfoRNA bulge test' 1 & 2. Both feature series of same-sided bulges. The first is a series of two bonds, one bulge, two bonds, one internal loop, two bonds, one bulge, etc. The second is slightly more chaotic in design, but features longer strings of bonds, multiple bulges on the same string, and bulges after loop closures. All of the bulges, except for one on the second puzzle, are 1 NT and same-sided. Infobot failed both puzzles, Vienna failed one, and SSDbot passed both.
Edward Lane
I created a 123and4 bulge puzzle, which took a puzzle the bots had all succeeded in (called 'very easy') and added all 4 bulges (I had to cut a few stacks shorter as I was hitting the 85 limit), but the bots all failed that puzzle.

My next attempt will be the same puzzle with the 4 bulge removed. If that succeeds, I'll add it back as a 1/2/3 bulge, and I can keep iterating until I find the most complex version of that puzzle the bots can solve.
Edward Lane
The bots all failed that too, so one more bulge removed. Rinse, repeat.
paramodic
A clever idea, sir.
Freywa
As a further note, InfoRNA has recently caught up on trying to solve my (historical) Kyurem puzzles. There are 19 of them: 17 in the main series, a 2-2 loop test puzzle and a non-lab puzzle which combines Kyurem 9 and Kyurem 10.

InfoRNA managed to solve 1 and 2, but I never expected it to fail everything else (except the 2-2 loop test and Kyurem 9). Hence, I propose an InfoRNA-foiling structure: corners and multiloops placed close to each other.
paramodic
Quite the observation, good sir, thank you. Personally, I think that it still has to do with the bulges, even though your puzzles feature rather large ones as opposed to the 1 nt bulges I've been playing with. You may be on to something with the multiloops, though. I hadn't considered testing with those. Now that I think about it, having bulges in sequence with internal loops (particularly of the uneven variety) seems to also mess with infobot's success level.
Eli Fisker
That is a great question, JL. I look forward to getting the answer on that one.
paramodic
It's definitely a valid question, but I don't think the bots are right. EteRNA exists, in part, because Vienna can't solve for structures that do exist in nature.
Eli Fisker
To get the answer to that, we could put the puzzles into the beta lab and try to solve them to the best of our knowledge. If nature likes our solutions, then the bots are wrong. And if nature does not sanction the puzzles, then the bots are right and the energy model is wrong.
paramodic
I'm game. I just need to find a simple enough 60 NT structure that at least beats InfoBot.
Eli Fisker
Just pick one of the small puzzles mentioned in this post; a lot of them are shorter.
paramodic
I would like to take this moment to DECLARE SHENANIGANS. Eli Fisker's 'Two Bulges and a 1-2 loop' puzzle defeated Infobot when it was run in the player puzzle section. I copied and then utilized the same structure, but facing the opposite direction (bulges that were on the left are now on the right, etc.), and Infobot solved it. What is this tomfoolery?!
paramodic
Not only did infoRNA solve it, but it did so at a horrifically fast speed, something like .9 or .09 seconds.
paramodic
I think I figured it out. Infobot only fails on puzzles where the bulges are on the left side of the watch hand, so to speak; in other words, when the bulges are set up so that the end loop of the RNA has a clockwise trajectory relative to the stem. Case in point: Eli Fisker's 'two bulges and a 1-2 loop' player puzzle ( http://eterna.cmu.edu/eterna_page.php... ). Infobot and Vienna both failed where SSD succeeded. I then took the structure and mirrored it to have a counter-clockwise or right-wise trajectory so that I could post it in the player project beta and not get told I was making a duplicate puzzle. That's when something really crazy happened: Infobot solved the puzzle in .09 seconds. It was truly a confounding result. Why would Infobot fail one puzzle, but be able to solve that puzzle's exact mirror?

It's because Infobot has more trouble with bulges on the 5' half of the fold than on the 3' half. I don't know why, but this is supported by almost every simple bulge-using puzzle that Infobot has failed. Go on, look and see for yourself. I think we've found a much more specific 'what' to our question. The next order of business, I think, is determining the how and/or why, if you all agree that we've pinned down what causes the failures.
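
As a rough illustration of what the mirroring does to the structure, here is a small Python sketch. It assumes that mirroring a puzzle amounts to reading its dot-bracket string from the other end, so that anything on the 5' strand of a stem ends up on the 3' strand; `mirror_structure` is a made-up helper name, and the example structure is invented rather than taken from the actual puzzle.

```python
SWAP_BRACKETS = str.maketrans("()", ")(")

def mirror_structure(structure):
    """Reverse a dot-bracket string and swap brackets so it stays balanced."""
    return structure[::-1].translate(SWAP_BRACKETS)

original = "((.((...))))"   # 1 NT bulge at index 2, on the 5' strand of the stem
mirrored = mirror_structure(original)
print(mirrored)                                # ((((...)).))
print(mirror_structure(mirrored) == original)  # True
```

In the mirrored string the unpaired base sits at index 9, on the 3' strand, which is the kind of change that separated the failed puzzle from the solved one.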
Eli Fisker
Hi Paramodic!

Yes, there seems to be a difference between puzzles based on which way they are turned. I have reversed some of Brourd's, and my versions were easier to solve as puzzles. Which proves that Brourd publishes the hardest version possible :) I have also reversed some of my own. Depending on its direction, a puzzle is easy for the bot or sometimes stumps it. Somehow direction seems to matter. I haven't figured out the why. But I like what you are on to.
paramodic
Thank you, sir. What I've noticed is that as long as the curving arms follow a clockwise path, the bots have a much worse time trying to solve the puzzle. What I mean is that it's not just some directions for some puzzles, but clockwise with puzzles containing 1 NT bulges.
Eli Fisker
That is actually really cool. Good noticing.
Edward Lane
Right- and left-handedness looks to apply to the "rnassd" bot if you look at the puzzle "2 bulge turned other way", though I suppose that might be random chance. So I'm now trying 123 bulge with 2 & 3 reversed.
Edward Lane
Hmm - wondering why bots treat very similar puzzles differently

http://eterna.cmu.edu/eterna_page.php...
and
http://eterna.cmu.edu/eterna_page.php...

are very similar and 2 out of 3 bots failed for both
but a different bot succeeded in each case - why ?

is it just random ?
If it's random and we want to test this more accurately, is it possible to 'fix the random seed' that the bots start with, so that when testing similar shapes there is the option to have them all start on the same series of random bases/changes?

if it's not random - what is the underlying difference between these two puzzles ?
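
For what fixing the seed would look like in principle, here is a minimal Python sketch. Whether the bots expose a seed at all is unknown; `mutation_series` is a hypothetical stand-in for the series of random bases/changes a bot might draw, not anything taken from the bots' code.

```python
import random

def mutation_series(length, n_changes, seed):
    """Return a reproducible list of (position, base) changes for a given seed."""
    rng = random.Random(seed)  # private generator, unaffected by other random calls
    return [(rng.randrange(length), rng.choice("ACGU")) for _ in range(n_changes)]

run_a = mutation_series(60, 5, seed=42)
run_b = mutation_series(60, 5, seed=42)
print(run_a == run_b)  # True: identical seed, identical series
```

If two similar puzzles were both run from the same seed and the bots still disagreed, the difference would have to come from the structures rather than from chance.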
Eli Fisker
Hi Ed!

Though the two designs are seemingly similar, there are differences between the two puzzles. Tilted-baby has adjacent strings, while angry puppy has three nucleotides between the arms. There is also a difference in the number of nucleotides.