How can we handle big amounts of data?

  • Idea
  • Updated 8 years ago
Player Projects are looking really exciting, but the amount of data coming out of them is going to be difficult to assimilate.

So here are some suggestions (that came out of a chat) for ways to let players (and devs) look at the data coming out of them.

1) Search on shape notation with wildcards, for example to find all structures whose shape notation matches ((..(...)*)) or *((...))* or similar.

2) Eli also mentioned searching for base sequences (that search more or less exists already).

3) Edward_Lane: Search on 'loop energy' and overall energy, and dotplot density, and meltplot curve shape/steepness?
Eli Fisker: Yep and I want to combine things [2:38 PM]

Edward_Lane: So 'advanced search' - find me all sequences that have at least one loop with energy exactly equal to "-3", have a total energy of "-30 or more", contain the shape ((...(...))), contain at least 60% adenine, and have a low dot plot error (density of grey not in target area 90%)

Eli Fisker: Yep, I love the idea of advanced search [2:40 PM]
Eli Fisker: But I'm mostly interested in energy at specific spots [2:39 PM]
Eli Fisker: But overall energy would be great to search for too [2:39 PM]

4) Eli Fisker: I hadn't thought of using meltplot, but I could imagine that could be helpful too, to be able to search for a meltplot curve on a certain degree in a certain square [2:40 PM]
Edward_Lane: or 'flat at start' for x boxes [2:41 PM]
Eli Fisker: Exactly [2:41 PM]

5) Edward_Lane: oh and obviously the option to have all the search results appear in a sortable table with A->Z or Z->A "ordered by" options for each of the criteria [2:42 PM]

6) Another option that might let players flag up interesting results would be to let people comment and "vote as interesting" AFTER synthesis results. That gives another searchable value (find results where X people think this result was interesting), though it might just mean some results get overlooked.

7) When you then get a particular set of designs - you also want the option to show all other solutions to the same lab(s).

8) You might want to consider also including a search for 'similar labs' where the 'difference in shape notation' is less than X puzzle builder button clicks (I think that describes adding/subtracting individual/pairs of bases pretty much anywhere).
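
For point 8, here is a rough sketch of how 'similar labs' could be found. It assumes each lab's target shape is stored as a dot-bracket string, and it uses plain Levenshtein edit distance as a stand-in for "puzzle builder button clicks". That is only an approximation: adding a base pair would count as two edits here, and the intermediate strings need not be valid structures. The names `edit_distance` and `similar_labs` are made up for illustration.

```python
from __future__ import annotations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character inserts,
    deletes, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def similar_labs(target: str, labs: dict[str, str], max_clicks: int) -> list[str]:
    """Names of labs whose shape differs from target by fewer than max_clicks edits."""
    return sorted(name for name, shape in labs.items()
                  if edit_distance(target, shape) < max_clicks)
```

A real version would probably want an edit model that matches the puzzle builder's actual buttons; this one just shows that the comparison is cheap to compute for short shapes.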

There are many more options for things that you might search for, but most of those searches are already in use and should be continued: GC percentage, melting point, etc.
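
To make the 'advanced search' idea a bit more concrete, here is a minimal sketch that combines several of the criteria above and returns a sortable result list. Everything in it is an assumption for illustration: the `Design` record and its field names are hypothetical (not the site's real data model), '*' is taken to mean "any run of brackets and dots", and "-30 or more" is read as "-30 or more negative" (flip the comparison for the other reading). Dot plot and melt plot criteria are left out because I don't know what form that data takes.

```python
from __future__ import annotations

import math
import re
from dataclasses import dataclass


@dataclass
class Design:
    # Hypothetical record; field names are assumptions.
    name: str
    sequence: str
    shape: str                  # dot-bracket shape notation
    total_energy: float
    loop_energies: list[float]
    synthesis_score: float


def shape_regex(pattern: str) -> re.Pattern:
    """Compile a wildcard shape pattern; '*' matches any run of '(', ')' or '.'."""
    parts = ["[().]*" if ch == "*" else re.escape(ch) for ch in pattern]
    return re.compile("".join(parts))


def base_fraction(sequence: str, base: str) -> float:
    """Fraction of the sequence made up of the given base (0.0 if empty)."""
    return sequence.count(base) / len(sequence) if sequence else 0.0


def advanced_search(designs, *, shape_pattern=None, max_total_energy=None,
                    loop_energy=None, min_adenine=None,
                    sort_by="name", descending=False):
    """Keep designs that pass every criterion given, sorted by one field."""
    regex = shape_regex(shape_pattern) if shape_pattern is not None else None
    hits = []
    for d in designs:
        # The whole shape must match; wrap a pattern in '*' for "contains".
        if regex is not None and not regex.fullmatch(d.shape):
            continue
        if max_total_energy is not None and d.total_energy > max_total_energy:
            continue
        # "Exactly equal" is checked with a small tolerance for floats.
        if loop_energy is not None and not any(
                math.isclose(e, loop_energy, abs_tol=1e-9)
                for e in d.loop_energies):
            continue
        if min_adenine is not None and base_fraction(d.sequence, "A") < min_adenine:
            continue
        hits.append(d)
    return sorted(hits, key=lambda d: getattr(d, sort_by), reverse=descending)
```

A query like `advanced_search(designs, shape_pattern="*((...(...)))*", max_total_energy=-30, loop_energy=-3, min_adenine=0.6, sort_by="total_energy")` would then correspond roughly to the example in point 3, minus the dot plot part.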

Edward Lane

  • 139 Posts
  • 8 Reply Likes
  • probably slightly over excited

Posted 8 years ago


starryjess

  • 35 Posts
  • 2 Reply Likes
Maybe we should also be able to search based on how well the design synthesized -- if, say, we wanted to see all designs that scored better than 90%, or worse than 80%, etc. This could be combined with other searches to narrow the results.

Edward Lane

I'd kind of assumed that the synthesis score was in the search listings, but I've spotted another criterion I'd like to search on:

whether each particular bot solves the puzzle or not (and, if more bots are created, whether they can solve it too).

starryjess

Oh, you're right. We can already search that way. Hehe, never mind! I guess I thought we were starting over from scratch.

Edward Lane

Another search criterion that might be of interest:

the percentage of the shape notation that matches the synthesised shape results (the blue and yellow plot), and then also the 'length of the longest continuous section of matching values'.

This is a slight variation on the synthesis results: perhaps one section of an otherwise badly folded design synthesised perfectly, so that could be useful somewhere.
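
Both numbers are easy to compute once the synthesis data is reduced to a paired/unpaired call per base. The sketch below assumes exactly that: the target is a dot-bracket string, the observed result is a string of the same length where '.' means "looks unpaired" and anything else means "looks paired", and only that paired/unpaired status is compared. How the blue and yellow plot would actually be turned into such calls is not covered here, and `match_stats` is a made-up name.

```python
from __future__ import annotations


def match_stats(target: str, observed: str) -> tuple[float, int]:
    """Return (percent of positions that agree, longest run of agreeing positions).

    Two positions agree when both are unpaired ('.') or both are paired.
    """
    if len(target) != len(observed):
        raise ValueError("target and observed must have the same length")
    if not target:
        return 0.0, 0
    matches = 0
    longest = 0
    run = 0
    for t, o in zip(target, observed):
        if (t == ".") == (o == "."):
            matches += 1
            run += 1
            longest = max(longest, run)
        else:
            run = 0  # a mismatch ends the current matching section
    return 100.0 * matches / len(target), longest
```

Searching on the second value would surface designs where one long stretch folded as intended even though the overall score is poor.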