Eterna dreams

  • Idea
  • Updated 8 years ago
I would love to have all the past lab submissions synthesised in the lab. It's an old wish of mine. I have been discussing this with Brourd and Mat, and both think it is a good idea. Actually, if it were not for them discussing this topic and including me, I would probably still just be "complaining" in the chat at regular intervals about the lack of data, instead of writing about it. Thanks to you both for ideas and feedback on this.

I think there is a lot more to be learned from the designs we have already submitted to the lab - those which did not make it through the voting process. In most of the lab series we need more data to uncover interesting tendencies or back up the ones we already have found.

I will illustrate with an example why I would like all the data. I made a spreadsheet where I looked at energy conditions around the neck area. My data criterion was that I would look only at working necks. I had the feeling that an energy pattern was at play in the neck area. I was interested in finding out how the placement of energy around the neck helped make the neck work.

I was onto this energy tendency for months before I wrote a post about it. I was waiting for more data to see if the tendency I had spotted was confirmed in the other labs as well.

One look at my data and it becomes obvious that in the labs with few rounds or few working necks, there is too little data to draw clear conclusions from. I couldn't get clear tendencies from the star and the finger labs. (The data spreadsheet originated from this post.)

Having the previous lab submissions folded and scored would solve my data problem. Then I might be able to predict more tendencies about what the right energy condition around a neck is for it to be successful. If all the submitted lab puzzles had been synthesised, I could have written my post and theory about it much earlier.

Another positive side effect is that there will be many more near-twin designs, where only one or a few nucleotides have been changed. Those designs have already proven to be very valuable when it comes to seeing what pays off and what does not.

Brourd mentioned that he thinks it would definitely bring in more players: the chance to have one's design synthesized, instead of having to sit back while other players always have their designs picked.

The more results we get, the more we learn from our submissions. The more we learn, the more we can contribute to science.

More synthesized slots would also mean we could have a new game area where we could test negative hypotheses, to see if designs we think will fail actually do fail.

For example, the lab “Things to test” had a new element, the 2-2 loop. It did not behave as usual. We were not sure how to make it work, so we experimented a lot. Things that haven't worked before just might. I have earlier described why filling multiloop rings and loops with blue nucleotides was a bad idea.

In the second lab round, I deliberately made a mod of Mat's successful round 1 design (94% score), with pure blue inside the 2-2 loop, to rule out that this would work in a design. I also added an extra GC-pair, which, for the sake of the purity of my experiment, I probably shouldn't have. But as I suspected, the blue 2-2 loop didn't work. And for the next two rounds people dared not vote for me. :)

If we had a playground in the lab where we could test negative hypotheses, we could test things like this unpunished. It would even be cool if we got points for failing in there. People could be allowed to bet for or against an experiment successfully failing. As Mat says: I think the designs would need to be submitted into their own voting category for the idea to work fully.

Experiments with negative hypotheses could lead to finding patterns for why certain things don't work, sort of the rules of the misfolds. If we can find the rules for what surely won't work, we are well on the way to discovering more rules about what works and what is to be avoided.

So for Eterna past, present and future – more slots, please...

Eli Fisker

  • 2222 Posts
  • 484 Reply Likes

Posted 8 years ago


Adrien Treuille, Alum

  • 243 Posts
  • 33 Reply Likes
It will happen. Seriously!

One thing I've been thinking about: suppose there are, let's say, 10,000 syntheses per week. Right now, that's more slots than we have submissions, but that will quickly cease to be the case. There will always be slot scarcity, no matter how many we have. We need to find a way to allot slots when we have slots aplenty, but voting is definitely no longer going to cut it.

Any thoughts?

rhiju, Researcher

  • 403 Posts
  • 122 Reply Likes
To reiterate what Adrien said ... This will happen, and it will be amazing.

It is going to take some fairly challenging experimental innovations, but rest assured that we are ordering the DNA as we speak, and hope to have all the kinks worked out of the pipeline in the next 3-6 months. It's a big undertaking, but a top priority for my lab and for EteRNA.

Now the question for the players is -- with that much throughput (10,000 slots/month, perhaps per week), how are we going to analyze all the data? How can we "publish" our insights? We have some ideas, but we are most looking forward to yours. Please put them here, and I think we should plan another 'chat summit' meeting in late October where we discuss this not-too-distant and incredible future...

Quasispecies

  • 100 Posts
  • 9 Reply Likes
10K/month? Per week? Holy smokes. I've thought about posting a strategy based on a library of structural subunits, but I thought the lack of synthesis data (currently) would make it of limited use. That's a lot of RNA, but it would be great for the strategy. I'll try to post it this week when I'm not about to fall asleep.

mat747

  • 130 Posts
  • 38 Reply Likes
Hi rhiju, Adrien

"10,000 slots/month"
OK, let's say we have 10,000 slots per month.

What will be the percentage of slots given to players?

How long will a lab cycle take per target shape/round?

Rhiju

"We have some ideas"
What ideas? Could you post them?

Eli Fisker

  • 2222 Posts
  • 484 Reply Likes
In my ideas for how to structure the coming big heap of RNA data, I mentioned that near-twin designs are very valuable.

That made me think about this. I would love to have a search mechanism, like the one in internet browsers that gives you pages similar to the one you already have. So instead of related pages or articles, give me related designs.

The Eterna game sort of already has this function, as it gives scores to designs that don't get synthesised, based on their similarity to those that do.

So I would like a function that can give me designs similar to the ones I want near twins for. Or just give me similar designs that fulfill certain criteria.
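
A near-twin search like this could be sketched roughly as follows. This is only a minimal illustration, assuming that designs for the same puzzle are plain sequence strings of equal length and that "similar" means differing in at most a few nucleotides (Hamming distance). The function names and the `max_diff` cutoff are made up for the example and are not anything Eterna provides.

```python
from typing import List, Tuple


def hamming(a: str, b: str) -> int:
    """Count the positions where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def near_twins(query: str, designs: List[str], max_diff: int = 2) -> List[Tuple[str, int]]:
    """Return (design, distance) pairs within max_diff changes of query, closest first."""
    hits = [
        (d, hamming(query, d))
        for d in designs
        if len(d) == len(query) and d != query
    ]
    return sorted((h for h in hits if h[1] <= max_diff), key=lambda h: h[1])


# Toy sequences, invented for illustration only.
designs = ["GGAUCC", "GGAUCG", "GCAUCG", "AAAAAA"]
twins = near_twins("GGAUCC", designs, max_diff=2)
for seq, dist in twins:
    print(seq, dist)
```

Raising `max_diff` widens the net from strict near twins to loosely related designs.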

Eli Fisker

  • 2222 Posts
  • 484 Reply Likes
Hi Adrien and Rhiju!

Seems I'm going to have my wishes fulfilled... and then more. :)

Wow, you guys are nuts! But in just the right way. :)

You asked about our thoughts, so here are mine.

RNA LIBRARY SYSTEM (I liked that Quasispecies mentioned the word library)

With that many slots, we need a system (maybe more than one) to sort the results of the designs in a way that makes it easy to find or pull out the data we want to look at.

We need to be able to search the data by specific criteria, just like in ”Current lab”: e.g. a given number of GC-pairs, in an interval from a minimum of 12 to a maximum of 18 GC-pairs, combined with a given amount of free energy. Just much more advanced.
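
As a rough sketch of such a criteria search, assuming each synthesised design is stored as a small record with a GC-pair count, a model free energy and a synthesis score (the `Design` fields and the `search` function are hypothetical names for this example, and the numbers are made up):

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Design:
    name: str
    gc_pairs: int
    free_energy: float  # kcal/mol, as predicted by the energy model
    score: float        # synthesis score, 0-100


def search(designs: List[Design], min_gc: int = 12, max_gc: int = 18,
           max_energy: Optional[float] = None) -> List[Design]:
    """Keep designs whose GC-pair count is in [min_gc, max_gc] and,
    if max_energy is given, whose free energy is at or below it."""
    hits = []
    for d in designs:
        if not (min_gc <= d.gc_pairs <= max_gc):
            continue
        if max_energy is not None and d.free_energy > max_energy:
            continue
        hits.append(d)
    return hits


# Illustrative records only, not real lab data.
library = [
    Design("a", 10, -20.0, 55.0),
    Design("b", 14, -31.5, 92.0),
    Design("c", 17, -18.0, 70.0),
    Design("d", 20, -40.0, 88.0),
]
print([d.name for d in search(library)])
print([d.name for d in search(library, max_energy=-25.0)])
```

More criteria (for example a score interval) can be added as further optional parameters in the same way.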

It should have groups like in ”Current lab”, but with more categories, as we know more about the RNA after it has been through the lab, than when it enters. We should think through which kind of extra data this gives us, and try to set up categories for it.

This RNA library system should also be able to register which of the elements in an RNA design are working: is the neck working, do the smaller outer loops fold as they should, how many of the strings are working, and which ones.

ENERGY

I know the energy numbers we have on the RNA come from an energy model, and that we can't be sure it looks the same in the real folded RNA. But I would love to be able to compare energy in specific spots, or for whole elements (like a neck), on the RNA.

Just like in the example I gave above in the post. I would want to compare the necks that are working. Then I would want energy shown for spot 1, 2 and 3 in combination with the element ”working neck”

Spot 1: energy level inside multiloop, spot 2: energy level inside of hook and spot 3: the collected energy inside of the whole neck.

I should also be allowed to drag out a certain portion of this data, intervals of it. I would also like to be allowed to compare the same spots in different lab shapes.
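
One way to picture this comparison, as a sketch only: assume each design has been reduced to a record holding the lab it belongs to, a flag for whether the neck worked, and the model energy at the three spots. The field names and all the numbers below are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Made-up records; "multiloop", "hook" and "neck" stand for spots 1, 2 and 3.
records = [
    {"lab": "star", "neck_works": True, "multiloop": -1.0, "hook": -2.0, "neck": -8.0},
    {"lab": "star", "neck_works": True, "multiloop": -2.0, "hook": -3.0, "neck": -10.0},
    {"lab": "star", "neck_works": False, "multiloop": 0.5, "hook": -0.5, "neck": -4.0},
    {"lab": "finger", "neck_works": True, "multiloop": -0.5, "hook": -1.5, "neck": -6.0},
]


def mean_spot_energy(rows, spots=("multiloop", "hook", "neck")):
    """Average the energy at each spot over working necks, grouped by lab."""
    by_lab = defaultdict(list)
    for r in rows:
        if r["neck_works"]:
            by_lab[r["lab"]].append(r)
    return {
        lab: {s: mean([r[s] for r in group]) for s in spots}
        for lab, group in by_lab.items()
    }


summary = mean_spot_energy(records)
for lab, energies in summary.items():
    print(lab, energies)
```

Putting the per-lab averages side by side is what would make cross-lab comparison of the same spots possible.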

We would also need a system that can somehow be used to cross-examine things between the different lab shapes. As I'm fascinated with necks, I would love to be able to compare working necks from one lab to those of another. I know many of them have different lengths. But some are of the same length. I would also like to be able to compare designs that have multiloops of the same size.

DESIGN STRATEGIES

We will have to rethink our RNA designing. Here is one idea: Change in one puzzle, equals changes in many.

I could imagine taking up an old lab puzzle and, let's say, pulling out all the existing designs for it. And then, through one puzzle, make changes at chosen spots, e.g. in the multiloop, and change all the basepairs in the multiloop to GC-pairs that turn in the right direction. And then synthesise all the designs with that specific change. Sort of to test if it improves the statistics for correctly folded designs.

Or same situation, different strategy. I could throw a successful neck into a group of old synthesised designs, and then see if it improves the overall success of the designs. With many more slots coming, I guess that my strategy ”Catalog of necks” would pose no problems. :)
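
The "change in one puzzle equals changes in many" idea could look something like this in code. It is a minimal sketch, assuming designs are plain sequence strings for the same target shape and that the change is given as a mapping from 0-based position to new nucleotide; `apply_change` and `apply_to_all` are names invented for the example.

```python
from typing import Dict, List


def apply_change(sequence: str, changes: Dict[int, str]) -> str:
    """Return a copy of sequence with the given positions overwritten."""
    bases = list(sequence)
    for index, nucleotide in changes.items():
        if nucleotide not in "ACGU":
            raise ValueError(f"not an RNA nucleotide: {nucleotide!r}")
        bases[index] = nucleotide
    return "".join(bases)


def apply_to_all(designs: List[str], changes: Dict[int, str]) -> List[str]:
    """Apply the same set of changes to every design in the list."""
    return [apply_change(seq, changes) for seq in designs]


# Toy example: force a G at position 0 and a C at position 5 in every design,
# standing in for "make this multiloop closing pair a GC-pair".
old_designs = ["AUAUAU", "GAUUCC"]
new_designs = apply_to_all(old_designs, {0: "G", 5: "C"})
print(new_designs)
```

The modified batch would then be submitted for synthesis and compared against the scores of the unmodified originals.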

ENSEMBLE ALGORITHM

I guess most of the many slots are going to be the work of our ensemble algorithm. If I understand it right, then having tons of synthesised RNA will help the algorithm learn what is right and wrong.

EXTRA IDEA

What about some sort of built-in spreadsheet in Eterna, where we can drag the data in from the ”RNA library”? Also, I know too little about making graphs. Make the tools easy for us to use, like you have done so far, and we'll give you scientific results :)

Adrien Treuille, Alum

  • 243 Posts
  • 33 Reply Likes
Eli,

Thanks for this fantastic set of ideas. I've suggested that we go over them in our next developers meeting, and I'll let you know what ideas come out.

Eli Fisker

  • 2222 Posts
  • 484 Reply Likes
Hi Adrien! Happy to help. This sounds promising. I will look forward to all this new eterna playing with old eterna.

slydog

  • 19 Posts
  • 1 Reply Like
Perhaps the lab focus could be expanded to test proposed hypotheses. Players would submit hypotheses they would like to test. A first round of voting would select which hypothesis to pursue. Players would then propose one or more puzzles to test that hypothesis. Other players could also propose puzzles for the hypothesis. Then, players would submit different designs with a goal (this will work because ... , this won't work because ...). We could vote first on which puzzles are a good test case for the hypothesis, then on which solutions best add new information.

This is a bit loose, but I hope you get the idea which is something more than the simple "let's try to synthesize this design."

Adrien Treuille, Alum

  • 243 Posts
  • 33 Reply Likes
That sounds like an awesome idea, Slydog. We have been thinking on similar lines, but haven't worked out the details yet. Don't worry, though, keep playing and stay abreast of the forums! We won't do anything without a lot of discussion with the players!

mat747

  • 130 Posts
  • 38 Reply Likes
Hi Adrien, slydog

Adrien
Last week you suggested the idea of testing "Negative Hypothesis" designs.

I think the best way for people to explore new ideas is to have those designs in their own category where there are no points given. (The designs would still be scored.)
The "Negative Hypothesis" and "proposed hypotheses" (slydog) could be tested on the same lab puzzle, in a similar way to how the Bots/Ensemble now have their own categories/slots.

The Hypothesis designs could be displayed/listed in the lab together with Main designs, similar to the player puzzles area, which has separate "Lab candidates only" and main puzzles.

In the lab, when a player has completed a design and submits it, he/she could be given a choice of which category to submit the design into: Hypothesis or Main.

Eli Fisker

  • 2222 Posts
  • 484 Reply Likes
Hi Slydog, Adrien and Mat!

Mat and I have discussed the negative hypothesis and points lately. Here is a short outline from a chat the other day, to underline Mat's point.

mat747: back to the testing - me and d9 were asking for a way to test things like "negative hypothesis" before
Eli Fisker: Good.
mat747: i think the best way for people to explore new ideas is to have a category where no points are given
mat747: the designs would be still scored
Eli Fisker: There could be a point to that. There is a certain shame, when one is doing something that does not work. So fear of failure may prevent some experiments
mat747: yes and no fear of losing points
Eli Fisker: Exactly. I'm against the idea of losing points, for the same reason. Fear is a motivator, but not in here.
Eli Fisker: More people will stay, if we don't punish, and they will have a nicer experience. This is a game, we should have fun.

So negative hypotheses could be sort of like Market strategies, where we don't get points either.

slydog

  • 19 Posts
  • 1 Reply Like
Hi mat and Eli,
Good ideas. Absolutely hypotheses could be (proposed and) tested in a separate section of the lab design competition using the current design competition as a vehicle. Lots of ways to incorporate testing of hypotheses into the lab.
For the design competition part, from postings 6 months ago, it seems that a change to an Elo system is in the works. An Elo system is a rating system in which ratings can go up or down, a form of penalties and rewards for failures and successes. Like you, I don't think that would be appropriate for hypothesis testing, but for the design competition, I think it's right on. With such a system, or something incorporating penalties, participants would be much more thoughtful when voting. As it is now, it's stupid not to vote for the higher scoring designs because only those will be synthesized and that's the only way extra points can be earned. That's more game-playing than I like. Carefully reasoned choices are what count. Incorporating both incentives and penalties would, over time, encourage players to become much better at analyzing designs and understanding what works.
- Sly
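
For readers who have not met Elo before, here is the standard rating update as used in chess, sketched in a few lines. How it would be mapped onto Eterna voting or design scoring is not specified anywhere in this thread, so treating a "match" as one design (or voter) against another, the starting rating of 1500 and the K-factor of 32 are all assumptions for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability-like expected result for A against B (standard Elo formula)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return the new ratings of A and B.

    score_a is 1.0 if A won, 0.5 for a draw, 0.0 if A lost.
    """
    e_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - e_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b


# Two equally rated players: the winner gains exactly what the loser drops.
print(update(1500.0, 1500.0, 1.0))
```

The key property is the one Sly points to: a wrong call costs rating, and beating a higher-rated opponent earns more than beating a lower-rated one.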

Eli Fisker

  • 2222 Posts
  • 484 Reply Likes
Sly, you are absolutely right about the unscientific use of some of our votes now.

Eli Fisker

  • 2222 Posts
  • 484 Reply Likes
Sequence search tool

Another thing I would like to do in a new RNA library system is to search for a sequence in a string. I would like to be able to hunt down problem patterns.

A sequence tool would make it easier to find probable problem sequences: sequences with a high failure rate. The same goes for suspected successful patterns.

I would also like to be given statistical data for such a search. With huge masses of data, being given statistical tendencies will be useful.

Problem patterns sometimes do work. With sequence search, one might be able to answer this question:

Why does it work sometimes, what conditions must be present for it to work, and why does it not work most of the time?

I would hate it if we had to type letters in a line (like in the present sequence saver in the lab). I would like to be asked how long the string/pattern I want to search for is. Then I should be given a visual string of the desired length, where I can color in nucleotides with the mouse, just as usual for puzzles and lab. This will make the sequence search tool easier to use for non-biology students.

It would save one from having to look through all the designs that are not relevant to the specific thing one is looking for.
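
The search-plus-statistics part of this could be sketched as below. This is an illustration under assumptions, not a description of any existing Eterna tool: designs are taken to be (sequence, synthesis score) pairs, and a design counts as a "failure" when its score falls below a `pass_score` threshold of 80, which is an arbitrary cutoff chosen for the example.

```python
from typing import Dict, List, Optional, Tuple


def motif_stats(designs: List[Tuple[str, float]], motif: str,
                pass_score: float = 80.0) -> Dict[str, Optional[float]]:
    """Count designs containing motif and how many of them scored below pass_score."""
    scores = [score for seq, score in designs if motif in seq]
    matches = len(scores)
    failures = sum(1 for s in scores if s < pass_score)
    rate = failures / matches if matches else None
    return {"matches": matches, "failures": failures, "failure_rate": rate}


# Toy data, invented for illustration only.
lab_results = [
    ("GGGGAAAC", 45.0),
    ("AUGGGGCU", 85.0),
    ("AUAUGCGC", 90.0),
    ("CGGGGAUA", 60.0),
]
stats = motif_stats(lab_results, "GGGG")
print(stats)
```

Listing the matching designs that did pass, alongside those that failed, would be the starting point for asking what conditions let a problem pattern work.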

mat747

  • 130 Posts
  • 38 Reply Likes
Hi Dev

rhiju asked
"Now the question for the players is -- with that much throughput (10,000 slots/month, perhaps per week), how are we going to analyze all the data?"

I think my idea for "Computationally selected elements" could, with the right UI, help other players do what I have been doing visually for some time now.

Mat

Quasispecies

  • 100 Posts
  • 9 Reply Likes
mat - i posted an overview of the sequence fragment / substructure library thing that we were talking about the other day. It's on another thread. Take a look and tell me if it's similar to what you had in mind.

mat747

  • 130 Posts
  • 38 Reply Likes
Hi

Yes, very similar to what I have been using.

The "Computationally selected elements" idea is a bit different in how I think the "substructure" could be determined.

Mat

Eli Fisker

  • 2222 Posts
  • 484 Reply Likes
Clarifying comment to my idea for a sequence search tool:

I would like the option to search for both single stranded and double stranded patterns. And the single stranded part of the search tool could search for patterns in loops too.
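
A sketch of how the single-stranded versus double-stranded option might work, assuming the target shape is available in dot-bracket notation (`.` for an unpaired base, `(` and `)` for paired bases). The function name and the `region` parameter are invented for the example, and the "double" mode here only checks that one strand of the pattern sits in a paired region; it does not look at the partner strand.

```python
from typing import List


def find_motif(sequence: str, structure: str, motif: str,
               region: str = "single") -> List[int]:
    """Return 0-based start positions where motif lies entirely in an
    unpaired region (region="single") or a paired region (region="double")."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    if region not in ("single", "double"):
        raise ValueError("region must be 'single' or 'double'")
    hits = []
    for start in range(len(sequence) - len(motif) + 1):
        if sequence[start:start + len(motif)] != motif:
            continue
        window = structure[start:start + len(motif)]
        unpaired = all(c == "." for c in window)
        paired = all(c in "()" for c in window)
        if (region == "single" and unpaired) or (region == "double" and paired):
            hits.append(start)
    return hits


# Toy hairpin: a three-pair stem closing a four-base loop.
seq = "GGGAAAACCC"
shape = "(((....)))"
print(find_motif(seq, shape, "AAA", region="single"))
print(find_motif(seq, shape, "GGG", region="double"))
```

Since loops are simply the unpaired stretches of the shape, the "single" mode covers the loop search as well.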

Quasispecies

  • 100 Posts
  • 9 Reply Likes
Could you elaborate on how you would define substructures or elements, mat?

rhiju, Researcher

  • 403 Posts
  • 122 Reply Likes
What all of you are talking about is "real science".

What would you think about players also writing short scientific publications making hypotheses and testing them? I am talking with some people about setting up a track in PLoS Currents here:

http://www.plos.org/

These tracks are for disciplines where so much stuff is happening so fast (e.g., influenza, disaster science) that the normal scientific review process is slowing down progress. If we can get our experimental throughput up and get you all to write micropapers, I think we would merit such a track.

Eli Fisker

  • 2222 Posts
  • 484 Reply Likes
Hi Rhiju!

That will be great. As you say, we are already talking real science. And we sort of have started writing short science articles in the GetSat area. So with a little push, like help on how to structure things and feedback, we should be ok. :) I'm in!

Eli Fisker

  • 2222 Posts
  • 484 Reply Likes
Mat was asking about the writing of science papers. So I asked Jee to hear more about it. Here is the chat for all of you:

Eli Fisker: I was wondering. Rhiju asked if we wanted to write micropapers. Could we get an example to inspire us [16:47 GMT]
Eli Fisker: Telling us what should be in, in which order and so on
jeehyung: http://knol.google.com/k/hilary-placz...# This is one example
jeehyung: We are not entirely sure what exactly topics of micropapers will be. I think they would resemble strategies. For example, you could state a hypothesis "the optimal percentage of GC pairs is 60%"..
jeehyung: then use hundreds of synthesis slots to test designs with wide GC pairs range..
Eli Fisker: Yes, market strategies, but I also have testable theories in my getsat posts
jeehyung: analyze your results, and publish the results if your hypothesis was right.
jeehyung: Yes GetSat post too. Strategy was just an example, players should be able to do whatever experiments they want to do with EteRNA/synthesis pipeline
jeehyung: That's our goal
Eli Fisker: Sounds great :)
Eli Fisker: Ok. But to make it really simple for us. We haven't tried before, so could we have an example with headlines, like hypothesis here, test here and so on, else people will be unsure what to do
jeehyung: And there'll be a big renovation of whole website for that, so it won't be like just using GetSat / Strategy - there'll be more formalized structure to do this
jeehyung: Yes - we are concerned that not many people are familiar with scientific papers
jeehyung: We are brainstorming about that..we even thought of having a dedicated editor who would help players to write papers. But that's quickly going to become overwhelming..
Eli Fisker: Yes, I've been trying to talk a good player (no names) into writing getsat posts, and I would love to have him write science papers too. But he says he is not sure how to start
Eli Fisker: I can see why you would want us to write for our selves :)
jeehyung: I'm thinking of maybe having a template people can start with. It's not optimal since many papers will sound alike,
jeehyung: but it would be a good starting point
Eli Fisker: Yes, that sounds great. Maybe a couple of different. But that will be a big help
Eli Fisker: Just like headlines, and headwords
jeehyung: I think players will catch up though. Good papers will be accepted, and some will get rejected. Soon there'll be player strategy guides to get papers accepted.
Eli Fisker: :)
jeehyung: (I personally can't wait to see that : ] )
Eli Fisker: This is so much fun
Quasispecies: hi all
Eli Fisker: hi Quasispecies :)
Quasispecies: how's it going?
jeehyung: hi Quasispecies
Eli Fisker: good, sort of back again after a short holiday, and you?
Quasispecies: good, about to go out for some soccer in a bit. thought i'd see what excitement is brewing on eterna
Quasispecies: i like that idea for a "strategy guide for getting papers accepted"
Eli Fisker: Yep, that is a great idea
Quasispecies: there was a lab i worked in a couple of years ago that pinned something similar to the wall. it was completely snarky and non-serious, of course
Eli Fisker: :)
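
The kind of experiment jeehyung describes in the chat above (state a hypothesis about the optimal GC-pair percentage, then synthesise designs across a wide GC range) could be analysed along these lines. This is a rough sketch with invented numbers, assuming each result is a (GC-pair percentage, synthesis score) pair; the bin width of 10 percentage points is an arbitrary choice.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def mean_score_by_gc_bin(results: Iterable[Tuple[float, float]],
                         bin_width: int = 10) -> Dict[int, float]:
    """Group (gc_percent, score) pairs into bins and average the scores per bin.

    The keys are the lower edge of each bin, e.g. 40 means 40% <= GC < 50%.
    """
    bins = defaultdict(list)
    for gc_percent, score in results:
        lower_edge = int(gc_percent // bin_width) * bin_width
        bins[lower_edge].append(score)
    return {edge: mean(scores) for edge, scores in sorted(bins.items())}


# Made-up results, for illustration only.
results = [(42, 60), (48, 70), (61, 90), (65, 80)]
by_bin = mean_score_by_gc_bin(results)
print(by_bin)
```

With hundreds of slots per bin, the bin with the highest average score would either support or contradict the stated hypothesis.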

Eli Fisker

  • 2222 Posts
  • 484 Reply Likes
Having our own Eterna RNA fold

Mat mentioned an old idea of his. I liked it so much that I resurface it here. He has earlier suggested having a tool with Eterna energy parameters.

His latest thought on the subject is: If we get 10,000 (slots per) round, we could get enough data to make Eterna energy parameters that could be used with rnafold, in time.

With the new synthesis possibilities, this might be a more realistic scenario.

As Dimension9 says in this post:

I have a feeling that if we are to have a truly predictive tool for synthesis success, we will have to develop it ourselves.

I think these words are very true.

Clollin also said: I think any one tool or even metric within a tool is not enough - u have 2 verify all aspects of many tools and see a trend occurring.

I, like Dimension9, think what we do in Eterna is too different from what the other RNA tools are designed for. We will need a tool specifically designed for Eterna.