Lab submission 'evolutionary tree'

  • 4
  • Idea
  • Updated 5 years ago
I would like to be able to trace the archived parentage of lab submissions further up the chain, to see where 'mods of mods' come from (and, on the flip side, how designs are adjusted down the line).

LFP6, Player Developer

  • 639 Posts
  • 109 Reply Likes

Posted 5 years ago


bekeep, Learning Researcher

  • 98 Posts
  • 20 Reply Likes
Interesting idea.  I think this goes hand in hand with improving the lab interface generally.  The "one-design-at-a-time" approach is a really inefficient way for people to navigate submissions.  A lineage view would also make it clearer where ideas come from (which would be helpful for IP issues).

Astromon

  • 204 Posts
  • 30 Reply Likes
This is a great idea. I would also like to be able to go back to the beginning while looking at someone's designs so I can see the line of thought that helped design the RNA. Just get the site to record all lab submissions and delete unsubmitted work in the labs: the program would automatically record all painted nucleotides in the labs and delete everything except the submitted designs that get synthesized.
(Edited)

rhiju, Researcher

  • 416 Posts
  • 125 Reply Likes
@LFP6 (or other players), do you have a draft of what this looks like for previous labs? We'd also like this kind of figure in publications from Eterna, but we don't have a dev with the cycles to put it together, color it in a pretty way, and interpret the results.

LFP6, Player Developer

  • 639 Posts
  • 109 Reply Likes
Hmmm... I could give that a shot at some point.

rbierman

  • 5 Posts
  • 0 Reply Likes
This reply was created from a merged topic originally titled Ancestry System (Mod-of).

Formalizing the 'mod-of' system.

It could be beneficial to create an ancestry system in labs, where mod-ing information is automatically stored for the player.

The current idea is that when a user presses [view/copy] on a design from the voting screen, their mod will automatically be linked to the original design in a child-parent relationship without any user intervention.

Can anyone comment on this system, propose a new system, spot any problems/difficulties?

Thanks!
-Rob
(Current Das Lab rotation student)

Omei Turnbull, Player Developer

  • 1026 Posts
  • 332 Reply Likes
This could be another useful clue, but it won't be definitive.  The clearest example of this is where a player decides to make a mod to a design from a previous round. AFAIK, there is no way to do this using the UI's [view/copy] button.  What happens instead is that a player will cut and paste the sequence to get the design into the current round.

I certainly use cut/paste a lot when I am submitting, for various reasons.  If I am devoting a chunk of dedicated time to lab submissions, I might or might not start with one invocation of the [view/copy] command, and I might or might not actually use the copied design as a basis for my first submission.  In the past, there was never any reason for me to care about leaving an accurate "ancestry" trace.  But if this is implemented, and I understood the algorithm for deciding how a parent was going to be assigned to a submission, I would at least try to structure my submissions to support that effort. But I'm also sure I wouldn't always succeed.

BTW, if you're going to go ahead with this, you probably ought to also think about sequences other than a straightforward copy/modify/submit.  Certainly, the sequence copy/modify/submit/modify/submit/modify/... is common.  Would you mark all these submissions with the same parent? Or make the parent of each one the previous submission?  Would it change anything if the player resets the sequence at some point?  What if they do modify the sequence with a cut and paste? (For example, I may shift the location of a subsequence by cutting the current sequence, rearranging it, and pasting it back.) No decisions you make are going to get it "right" every time, but the more intuitive the rules to players, the better the data will be.
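To make the two alternatives for copy/modify/submit/modify/submit chains concrete, here is a minimal sketch in Python. The event names ("copy", "submit", "reset"), the `trace_parents` function, and the rule that a reset clears the parent are all assumptions made for illustration; nothing here reflects how the Eterna applet actually records actions.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical event log: ("copy", source_id), ("submit", new_id), ("reset", None).
Event = Tuple[str, Optional[int]]


def trace_parents(events: List[Event], chain: bool) -> Dict[int, Optional[int]]:
    """Assign a parent to every submission in one session.

    chain=False: every submission keeps the originally copied design as parent.
    chain=True:  each submission's parent is the previous submission.
    A reset clears the current parent (one possible rule, not the only one).
    """
    parents: Dict[int, Optional[int]] = {}
    current: Optional[int] = None
    for action, design_id in events:
        if action == "copy":
            current = design_id
        elif action == "reset":
            current = None
        elif action == "submit":
            parents[design_id] = current
            if chain:
                current = design_id
    return parents


session = [("copy", 10), ("submit", 11), ("submit", 12), ("reset", None), ("submit", 13)]
print(trace_parents(session, chain=False))
print(trace_parents(session, chain=True))
```

Cut-and-paste edits are deliberately not modeled, since a pasted sequence carries no information about where it came from.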

Omei Turnbull, Player Developer

  • 1026 Posts
  • 332 Reply Likes
You asked for other ideas.  How about if you broke the designs up into "types" (i.e. all the Exclusion 4 puzzles might be one type) and treated each type separately.  For each type, collect all the designs (across all rounds) and sort them by ID.  Since IDs are strictly increasing over time, this orders them in time.  Now, for each design, go through all the designs that were submitted earlier and find the one with the smallest Levenshtein (i.e. edit) distance.  If that distance is less than some cutoff value you've chosen, assign a parent/child relationship.  If not, consider the design to be "original".

This certainly isn't going to do a perfect job.  But my gut feel is that it would be fairly robust.
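The edit-distance approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: designs are given as (ID, sequence) pairs for a single "type", the `assign_parents` name and the cutoff value are hypothetical, and ties go to the earliest design.

```python
from typing import Dict, List, Optional, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # drop ch_a
                               current[j - 1] + 1,       # drop ch_b
                               previous[j - 1] + cost))  # match or substitute
        previous = current
    return previous[-1]


def assign_parents(designs: List[Tuple[int, str]], cutoff: int) -> Dict[int, Optional[int]]:
    """Map each design ID to its closest earlier design, or None if it counts as original."""
    ordered = sorted(designs)  # IDs increase over time, so this is chronological order
    parents: Dict[int, Optional[int]] = {}
    for index, (design_id, sequence) in enumerate(ordered):
        best_id: Optional[int] = None
        best_distance: Optional[int] = None
        for earlier_id, earlier_sequence in ordered[:index]:
            distance = levenshtein(sequence, earlier_sequence)
            if best_distance is None or distance < best_distance:
                best_id, best_distance = earlier_id, distance
        is_child = best_distance is not None and best_distance < cutoff
        parents[design_id] = best_id if is_child else None
    return parents


# Made-up sequences, purely for illustration.
designs = [(1, "GGGAAACCC"), (2, "GGGAAACCU"), (3, "AUAUAUAUA"), (4, "GGGAAAACU")]
print(assign_parents(designs, cutoff=3))
```

Comparing every design with every earlier one is quadratic in the number of designs, so for labs with thousands of submissions this would probably be run offline rather than in the browser.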

LFP6, Player Developer

  • 639 Posts
  • 109 Reply Likes
Perhaps the copy/paste data could be adjusted to include the design ID that it was copied from? Just a thought.

nando, Player Developer

  • 393 Posts
  • 74 Reply Likes
@LFP6: the pasted sequence may not come from a copy operation from the applet, so it doesn't seem feasible.


As for the lineage crossing rounds, the existing platform would actually have a quite natural way to work with that... if we hadn't stopped using it for reasons quite mysterious to me.

If you carefully check the design browser, you will find a column called 'Round', lately always filled with 1's. A long time ago though, labs used to be "restarted", and their 'internal' round number incremented. This had several advantages: the applet and server could check whether your new design was already synthesized in a previous round. And you could directly start off from some design prepared for round 1 and submit for say round 3.

The only issue I can see is the fact that loading a design browser that already contains 2000 designs for a certain target when you want to create new designs for round 3, might get slowish...

rbierman

  • 5 Posts
  • 0 Reply Likes
I like Omei's idea of using the IDs to ensure parents are actually older than the children.

For Nando's concern: Maybe only the top 100 scoring designs from the previous round would be displayed by default and a player could choose to load more, or search by author/title.
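As a rough sketch of that "top 100 first, load more on demand" idea, the Python below pages through designs in descending score order and filters by author or title. The field names ("author", "title", "score") and both function names are assumptions for illustration, not the actual design browser data model.

```python
from typing import Dict, Iterator, List, Optional


def pages_by_score(designs: List[Dict], page_size: int = 100) -> Iterator[List[Dict]]:
    """Yield designs in descending score order, one page per 'load more' click."""
    ranked = sorted(designs, key=lambda design: design["score"], reverse=True)
    for start in range(0, len(ranked), page_size):
        yield ranked[start:start + page_size]


def search(designs: List[Dict], author: Optional[str] = None,
           title: Optional[str] = None) -> List[Dict]:
    """Filter by exact author and/or case-insensitive title substring."""
    matches = []
    for design in designs:
        if author is not None and design["author"] != author:
            continue
        if title is not None and title.lower() not in design["title"].lower():
            continue
        matches.append(design)
    return matches


# Made-up records, purely for illustration.
previous_round = [
    {"author": "a", "title": "Mod of X", "score": 90},
    {"author": "b", "title": "Original", "score": 95},
    {"author": "a", "title": "Second try", "score": 70},
]
first_page = next(pages_by_score(previous_round, page_size=2))
print([design["title"] for design in first_page])
```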

LFP6, Player Developer

  • 639 Posts
  • 109 Reply Likes
@Nando: Bringing back rounds would definitely seem like a good idea to me.

For the issue with load speed, the current system hasn't seemed like the best option to me. I would not load all designs; instead, I would load, say, 100 at once (or let the user choose) and have a 'load more' button at the bottom (or paginate, or whatever). It's not really useful to 

Omei Turnbull, Player Developer

  • 1026 Posts
  • 332 Reply Likes
@nando: Re losing the use of rounds, I think that happened when projects were introduced, at a time when there wasn't much communication between devs and players.  There may be an assumption in the database schema that labs are nested within projects, i.e. you can't have the same lab in multiple projects.  Or, more likely, there is just no API query right now that can retrieve the data for a lab across multiple projects, but John could create one without much trouble.

LFP6, Player Developer

  • 639 Posts
  • 109 Reply Likes
@Omei: Here's how it's currently dealt with. In the past_labs query, you actually see the "puzzles" (sublabs) list as separated by rounds, like so: "puzzles":[{"round":1,"puzzles":["6116601","6116602"],"playable":true}]
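For anyone who wants to poke at that structure, here is a minimal Python sketch that groups sublab IDs by round. It assumes the quoted fragment sits at the top level of a JSON object (it is wrapped in braces here so it parses on its own); the `sublabs_by_round` name is made up for this example.

```python
import json
from typing import Dict, List

# The fragment quoted above, wrapped in braces so it is a standalone JSON object.
response_fragment = '{"puzzles":[{"round":1,"puzzles":["6116601","6116602"],"playable":true}]}'


def sublabs_by_round(payload: str) -> Dict[int, List[str]]:
    """Return {round number: [sublab IDs]} from the nested "puzzles" lists."""
    data = json.loads(payload)
    rounds: Dict[int, List[str]] = {}
    for entry in data["puzzles"]:
        rounds.setdefault(entry["round"], []).extend(entry["puzzles"])
    return rounds


print(sublabs_by_round(response_fragment))  # {1: ['6116601', '6116602']}
```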

Omei Turnbull, Player Developer

  • 1026 Posts
  • 332 Reply Likes
@LFP6: I'm puzzled.  In response to the past_labs queries I've tried (e.g. http://www.eternagame.org/get/?type=past_labs&skip=0&size=10), I don't get any "puzzles" property.  Can you give me an example query where you do?

LFP6, Player Developer

  • 639 Posts
  • 109 Reply Likes
Oops, apologies! I meant past_projects. past_labs doesn't actually work any more anyhow for newer labs.

Omei Turnbull, Player Developer

  • 1026 Posts
  • 332 Reply Likes
Got it.

The "round" in this response is not the same as the round that Nando mentioned.  In fact, I'm not sure what this "round" represents.

If you look at the past_projects query for the two NG projects, you'll see that there are the same six lab names.  (Labs are called "puzzles" in this response -- the naming is not consistent across the API.)  But the 12 lab IDs are distinct, and all of them are considered to be Round 1.  If we were to use the method Nando alluded to, where the same lab ID could be used in multiple synthesis rounds, and the round was a property of the lab object, we would see the same set of lab(puzzle) IDs in both projects.  In that case, a query for solutions of that lab ID could return all the solutions across all rounds, with each solution having the appropriate round property value.

LFP6, Player Developer

  • 639 Posts
  • 109 Reply Likes
That's not quite my understanding based on what I've seen. From what I gather, there is not a new project per round. Instead, new sublab IDs would be added with the next round number.

There are older labs that use multiple rounds, if you're interested; it's just a matter of finding them.

rbierman

  • 5 Posts
  • 0 Reply Likes
I might be able to help. I have a list of puzzles with the same secondary structures that occur over multiple rounds. Is this what you mean?

Omei Turnbull, Player Developer

  • 1026 Posts
  • 332 Reply Likes
@LFP6: Can you show me an example that illustrates your understanding?

@rbierman: What I think we're discussing right now is what changes it would take in the game code and/or database design to let players easily see all the solutions for a "puzzle" (also called lab or sub-lab in some contexts) in the Eterna UI.  If and when a decision is made to do that, then your list will be very useful for updating the database for puzzles that weren't originally coded that way.

rbierman

  • 5 Posts
  • 0 Reply Likes
Great, thanks for explaining @Omei!

LFP6, Player Developer

  • 639 Posts
  • 109 Reply Likes
@Omei:

Here's a lab that actually uses rounds as they were designed: http://www.eternagame.org/web/lab/3553469/

Omei Turnbull, Player Developer

  • 1026 Posts
  • 332 Reply Likes
Thank you.  This is the first time I have seen both levels of "puzzles" properties used in a project.  I've wondered what they were intended for.  For the discussion, here's a selected section of the response to your query:


So you're right that two rounds are encoded here.  But this way of representing round number in the API doesn't really address the problem of retrieving designs from "equivalent" labs across rounds.  If you expand each of the lower-level "puzzle" properties, you see things like

and

Notice that the two nids for "Motif Assembled GAAA tetraloop binders - Shape 0" are different.  This means there is no existing way to get designs from those two labs into the same browser window.  Before projects were introduced, we had instances of multiple rounds within the same lab nid.
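One client-side workaround, sketched here in Python, would be to treat labs with the same title as "equivalent" and collect their nids so designs could be fetched for each and merged. The nids, round numbers, and the "Shape 1" record below are placeholders invented for the example, and matching on title alone is an assumption that may not hold for every lab.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical records: these nids and rounds are placeholders, not real Eterna IDs.
labs = [
    {"nid": "1001", "title": "Motif Assembled GAAA tetraloop binders - Shape 0", "round": 1},
    {"nid": "2001", "title": "Motif Assembled GAAA tetraloop binders - Shape 0", "round": 2},
    {"nid": "1002", "title": "Motif Assembled GAAA tetraloop binders - Shape 1", "round": 1},
]


def equivalent_labs(records: List[Dict]) -> Dict[str, List[str]]:
    """Group lab nids by title, listing each group's nids in round order."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for lab in sorted(records, key=lambda lab: lab["round"]):
        groups[lab["title"]].append(lab["nid"])
    return dict(groups)


for title, nids in equivalent_labs(labs).items():
    print(title, nids)
```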

For all I know, there is still a reasonable way to set up a new project without requiring a new set of lab nids, in which case we can again have exactly what Nando was suggesting. But I don't think there have been any examples of that since projects were introduced.
(Edited)