New synthesis candidate selection system

  • Announcement
  • Updated 8 years ago
Dear players,

In the last few rounds since the public launch, we have witnessed a few shortcomings of our current voting system for selecting synthesis candidates. Because we now have hundreds of lab submissions, it is extremely hard to go through all the designs to pick out the best ones. In most cases, people are intimidated by the number of designs they have to look through, and often decide to vote on designs that already have lots of votes ("snowballing"), or look at only a single metric (such as free energy). The devs had a meeting about this today and came to the conclusion that the voting system is not suitable for a system like ours, with a massive number of candidates.

Instead, we are now thinking of applying the Elo rating system to synthesis candidate selection. If you saw the movie "The Social Network", you'll recognize this system right away, as it was used for FaceMash. In this system, users are repeatedly asked to pick the better of 2 candidates. Each user decision creates a partial ordering of the candidates, and the system tries to derive a total ordering from all the partial orderings while minimizing inconsistency. In the end, we will synthesize the top 8 designs in the total ordering.
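
For the curious, a single "review" would adjust two ratings with something like the standard Elo update below. This is only a sketch: the K-factor of 32 and the example ratings are common defaults, not numbers from the actual system.

```python
def elo_update(r_winner, r_loser, k=32):
    """Standard Elo update after one pairwise comparison.

    The winner gains, and the loser loses, k * (1 - expected win
    probability); an upset (a low-rated design beating a high-rated
    one) therefore moves both ratings by more.
    """
    expected_win = 1 / (1 + 10 ** ((r_loser - r_winner) / 400))
    delta = k * (1 - expected_win)
    return r_winner + delta, r_loser - delta

# Example: the 1400-rated design upsets the 1600-rated one,
# so both ratings move by a relatively large amount.
new_low, new_high = elo_update(1400, 1600)
```

Note that points are conserved: whatever the winner gains, the loser loses, so the average rating of the candidate pool never changes.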

Instead of "voting", there will be a "review" button, where you'll be asked to pick the better of 2 randomly chosen candidates. In the review interface, you'll be able to do a full comparison of the 2 candidates: you'll be able to see their statistics and interactively play with both designs. You can do as many "reviews" as you want, and you'll be rewarded based on how many "correct reviews" you did.

The system has many great advantages. First, every design will be reviewed by someone. We can set up the system so that designs that haven't been reviewed yet are more likely to be chosen as random review candidates, which will make sure every design gets reviewed. Second, one-to-one comparison will allow players to make more in-depth decisions than having to go through hundreds of designs. Third, the quiz-like quality of the review will stimulate people to learn and ask more before they make decisions.
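
The "under-reviewed designs get picked more often" idea could be as simple as weighted random sampling. A minimal sketch, assuming a hypothetical 1/(1 + review count) weight (any decreasing function of the count would have the same effect):

```python
import random

def pick_review_pair(review_counts):
    """Pick two distinct designs for a review, favoring designs
    that have been reviewed fewer times so far.

    review_counts: dict mapping design id -> number of reviews so far.
    """
    ids = list(review_counts)
    weights = [1.0 / (1 + review_counts[d]) for d in ids]
    first = random.choices(ids, weights=weights, k=1)[0]
    # Pick the second candidate from the remaining designs,
    # with the same fewer-reviews-first weighting.
    rest = [d for d in ids if d != first]
    rest_weights = [1.0 / (1 + review_counts[d]) for d in rest]
    second = random.choices(rest, weights=rest_weights, k=1)[0]
    return first, second
```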

The system does have a few issues. The biggest is that it would reduce the social aspect of candidate selection. For example, it will now be hard to tell people "vote for my design ABCDE, I think it's cool" in chat to promote your design. We plan to address this by keeping the "design browser", so you can still browse through every submitted design, review a specific design, and even start doing "reviews" relevant to that specific design (i.e., fix one candidate of the review to be that specific design). Also, we could allow people to leave comments on each design when they review it, so people who come later can see them.

The details still need to be worked out. For example, how are we going to reward people based on their reviews? Do we say a review is correct or wrong only if both candidates in the review are synthesized and can be compared? If not, how can we rate reviews that involve non-synthesized designs? We are still working on these questions, and it may take some time for us to come up with a final system, but we wanted to throw this idea out to EteRNA players and see what you think.

EteRNA team
Jeehyung Lee, Alum

Posted 9 years ago
Berex NZ

In a perfect world, maybe. But you don't want to discourage people from voting. It's already hard enough to get people to spend all their votes; from my experience, people only use 3 or 4 of their available 7. The Elo system works better the more votes it has access to. Even with purely random voting, the best they will get is 50%. Until more useful factors turn up to evaluate designs, I don't think a negative points system will benefit anyone.

If we were to implement a negative system, I'd make the amount a lot smaller, like 50 or 100.

Anyways, my two cents. Up to the devs.
Chris Cunningham [ccccc]

You don't want to encourage random guessing! If a random guess has an expected value of a positive score, then you are encouraging it.

Edit: maybe you didn't notice "down to a minimum of zero"?
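
The concern can be checked with a quick simulation: with a symmetric reward and penalty but the running total floored at zero, pure random guessing still drifts upward, so its expected score is positive. The reward and penalty values below are illustrative, not the game's actual numbers.

```python
import random

def random_guesser_score(n_reviews, reward=100, penalty=100, seed=0):
    """Score of a player who guesses randomly on every review,
    with the running total floored at zero as proposed."""
    rng = random.Random(seed)
    score = 0
    for _ in range(n_reviews):
        if rng.random() < 0.5:               # random pick happens to be "correct"
            score += reward
        else:                                # wrong guess: lose points...
            score = max(0, score - penalty)  # ...but never go below zero
    return score
```

Averaged over many runs, the score comes out well above zero, which is exactly the incentive problem: the floor turns a fair coin flip into a positive-expected-value bet.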
Berex NZ

Standard game design: people have to feel they are making progress. If you make the risk the same as or higher than the reward, people won't take the risks needed to make judgement calls.

Yes, you are right, I don't want to encourage random guessing, but the greater benefit is more lab participation. A balance has to be struck between the validity of the decisions made and the level of lab participation.

I did notice "down to a minimum of zero"; to me, that would all but guarantee random guessing, because they would have nothing to lose.

It's up to the devs. I'm happy enough with how it's planned at the moment.
JRStern

Well, I have only minimal experience with this kind of pairwise rating system. It has faults, but also virtues, and the faults are not easily fixed. When you are presented with two bad cases, you have little incentive to proceed, and I suggest the system has little reason to reward you; yet maybe those two faults offset each other? And maybe there needs to be a maximum number of votes per round. But then you'd like to be able to rank your comparisons both by certainty and by the quality of the better entry, and submit just your N best. Again, these two shortcomings offset: just vote a lot, and let chance be your friend. But what about when you learn better and would redo your vote if you could? Aha, well, them's the breaks, I guess. But then one has to wonder how much rationality is even involved.

I'm generally skeptical of crowd-sourcing anything. It's a good way to get some random motion, and even a little progress out of that random motion, but it is not an optimizing process.

I'm not even aware of where one goes to see the lab results; I mean yes, the leading model, but not an analysis of why it won. Is there even such a thing? And if not, I wonder what the point is.
Chris Cunningham [ccccc]

I've been lobbying for a place to officially discuss specific designs for a long time, and if I remember correctly, there is talk of them implementing that feature soon.

Other than that, the only places I know of where people discuss what they learn or know are the chat window and the discussion thread for one specific round. But that was an awkward discussion, because GetSatisfaction is a crappy platform for open-ended discussion, so I personally haven't tried to replicate it.
dimension9

Hi JRStern,

I can sympathize with your doubts and misgivings, as I have them too; this is new and strange to all of us. However, I would respectfully counsel adopting, if you can find it within you, an attitude of patience and open-mindedness behind the inevitable and fully understandable skepticism and doubt.

After all, the only way to really see how this all will work out, for either good or ill, is to at least give it a chance, so we might as well put our best foot forward into it.

Once it's in place, the strengths and/or weaknesses will manifest themselves very quickly, I'm sure, and I'm equally certain that whatever does not work for the Players, will quickly be changed or removed by the Devs. After all, their ultimate success depends on us as Players being engaged and happy with the system they create.

So, here's hoping all our fears and doubts turn into pleasant surprises.

Best Regards,

chris_english

The problem, viewed from an empirical point of view, is the throttling down of the number of experiments resulting from the hypotheses. This suggests the problem has little or nothing to do with "voting systems", however constructed, but rather with the underlying economics of the grant funding of this adventure, which limits the number of syntheses to eight per lab round.

It just seems that experiment, writ large, might have a larger failure rate than the results thus far derived, precluded perhaps in this case by economics. So I guess I'd say: if you have limited dough, don't change a thing, but also be wary when making claims about the value of crowd sourcing. As much as we love to participate.
mat747

Preview on new lab 1 - the comparison game

Minor error with the display of the number of picks made: when the number reaches double digits, it covers part of the word "picks".

OS Vista sp2
Firefox 3.6.14

Jeehyung Lee, Alum

We have just posted some preliminary results from the new lab on "The Star".
Berex NZ

Thank you Jee, for posting the Elo results.

Below is the data for the top 40 ranks. I've added the standard information back onto each design line.

If you want to see the whole set for everyone, you can find them at the following link.

PLEASE NOTE: a line of N/A's just means the designer has since deleted/removed their design.
merryskies

Questions about the new voting system:

About how many comparisons per round will a player be expected to complete?

How much time will players have to do the reviewing? One week?
JerryP70

I think the voting should be conducted the same way, with people voting on their favorite designs, but with some changes to the voting screen. For the current round, leave the Designer, Title, and Votes fields empty so voters can't pick designs based on a popularity contest. Also, if a player lists his or her name in the Description field, the design is no longer eligible to be synthesized. Once the voting is complete, those fields can be displayed for further analysis by the other players.
igorcov

To me it's nonsense to make the decision based on voting. Since this is science (not politics), the decision should be made by some algorithm.

What should be the criteria?

The first question that has to be asked is whether all the candidates have the same chance to succeed. If yes: random pick, maybe with the member's Elo added into the mix.

A member's Elo should be updated after one of their samples has passed the lab test and received a score.

If there is relevant data that could be taken into consideration, it should be applied in the selection algorithm as well. For example, if it is known that lower energy means a better chance of a stable RNA at the end, that should be accounted for in the algorithm.

Make the algorithm transparent, allocate something like 50% to randomisation and 50% to some clear criteria, and go with it.
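
The hybrid suggestion above could be sketched like this, assuming free energy stands in for the "clear criterion" (the devs would pick the actual metric) and 8 synthesis slots per round:

```python
import random

def select_candidates(designs, n_slots=8, random_share=0.5):
    """Hybrid selection: part of the slots go to the best designs by
    an explicit criterion, the rest are filled at random.

    designs: list of (design_id, free_energy) pairs.
    """
    n_random = int(n_slots * random_share)
    by_energy = sorted(designs, key=lambda d: d[1])          # lower is better
    chosen = by_energy[:n_slots - n_random]                  # criterion slots
    pool = [d for d in designs if d not in chosen]
    chosen += random.sample(pool, min(n_random, len(pool)))  # random slots
    return [design_id for design_id, _ in chosen]
```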
oolong

The main thing I don't understand about the whole voting thing is the 3-round process that doesn't wait to take the results of the previous round(s) into account before proceeding with the next. There are so many designs now; do we really need 9 from everyone, with no real reason to change from the beginning ones? Maybe we should go with fewer designs each, like one apiece, or let higher point scores or previous lab success buy more design slots. Right now, many of the designs for the current lab look pretty similar to the ones from before.

If we had fewer designs each, would we work harder on individual ones? As a newbie who has little chance of actually getting anything synthesized, is it worth spending a lot of time and thought on a design, since I get the "thanks for coming" points for any design? Is it worth spending a lot of time trying to figure out whether some obscure design looks better than one with a lot of votes, since as far as I can tell I can only get extra voting points for something that gets synthesized?

I'm feeling pretty ambivalent about the labs as they are now. I came in at the end of Sept, when things were going along very quickly, and I thought this was how the labs always went. The way they are now is pretty boring.