Ha, I knew it. :) The ribosome is a switch, and in the small 5s rRNA puzzle I have been seeing it the whole time. Now Jeff’s new arc plot booster (Arc Plot Single State) makes it possible to show it.
5s rrna puzzle as seen in Vienna2
The 5s puzzle has two sets of repeat GC’s that can rearrange themselves plus a crossed GU that is a hallmark of a weaker area - allowing for a switch.
Jeff’s booster currently allows seeing a single state sequence as both its target and its native state.
As he says: It's not 2 states in this case. "State 2" is the puzzle target. State 1 has the MFE/natural + dotplot/pairing probabilities data, but state2 is just the target shape. So far it's a hack of the 2-state switch version to 1-state puzzles.
Arc plot of the 5S puzzle with Jeff’s update of the arc plot booster
Newest script version: https://eternagame.org/web/script/9199920/
Omei: @eli, your observation that the arc plot of 5S looks like a riboswitch has really caught my attention.
At first, it seemed unlikely, as the 2D and 3D structure of the E.coli ribosome is well established, and though it does shift around some internally as it adds an amino acid, a refolding (in the way we think of as switching) didn’t seem plausible.
But the 5s is on the outer surface of the ribosome, so that if it did switch foldings, it would probably fall off. Without the 5S, the ribosome doesn’t work. Hey! A master ON/OFF switch!
Crossed GU’s in switches
The small 5s ribosomal RNA puzzle has one more thing worth noticing. The crossed GU's.
Crossed GU’s popped up at high frequency in our winning switches in the logic gate labs, mainly in really long switching stems. The crossed GU's are a switch maker. In the small 5s puzzle they create an unstable point near some of the repeat C's and G's, which increases the chance that those repeats start moving, while still being stable enough to bind as well.
Crossed GU’s that also reform across different states, image taken from background post below.
Background post: Use of GU in two input labs
Crossed GU’s scattered over the entire ribosome
I was looking at the secondary structures and sequences of two ribosomes from the Noller lab, Escherichia coli and Thermus thermophilus, for both the small and the large subunit. Link to the images from Noller lab.
There were crossed GU's scattered over the whole of the ribosome to a degree I wouldn’t have expected.
I think the ribosomal switch mechanism goes as follows. Watch for crossed GU's. Typically in their vicinity you will find two sets of stems with longer repeat C's and G's. Either the crossed GU will be in between the two GC stem sets (as in the 5s puzzle) or the GU will be at the end of a small hairpin loop with the switching repeat G's and C's nearby.
E.coli - Large subunit
E.coli - small subunit
Amount of GU's following the size of the switch?
Simpler switches are fond of using GU’s in the switching area. Bigger switches with more inputs start to use crossed GU’s or just more GU's. Really big switches may run amok with crossed GU’s. I even found a set of triplet crossed GU’s in the ribosome. :)
Strong sequence and structure conservation across species
Something that has me awestruck is how much of the ribosomal sequence is actually conserved when I compare ribosomal RNA from two different species.
Especially in loops but also in stems. Structurally the ribosomal RNA of the two species was also rather similar.
Excerpt from the Noller lab images of the large ribosomal subunits from Thermus thermophilus and e.coli:
Thermus thermophilus versus e.coli
E.coli has more crossed GU’s than Thermus thermophilus. The crossed GU's in ribosomal RNA typically turn up in two particular places - in small hairpins at the outer periphery (not in Thermus thermophilus) or in long hairpins.
Thermus thermophilus lives in a much harsher environment and as such has a much higher GC content. It also has fewer of the crossed GU's. Since it lives at high temperature, it is probably more vulnerable to stress, so it kind of makes sense. According to Wiki it has an optimal growth temperature of 65 °C.
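To put a number on the GC content difference, the fraction of G and C bases in a sequence can be counted directly. This is only a minimal sketch. The function name gc_content is made up for illustration and the example sequence is a toy, not a real rRNA.

```python
def gc_content(sequence):
    """Return the fraction of G and C bases in an RNA or DNA sequence.

    Characters other than A, C, G, U and T (gaps, N's) are ignored.
    """
    bases = [b for b in sequence.upper() if b in "ACGUT"]
    if not bases:
        raise ValueError("sequence contains no nucleotides")
    return sum(b in "GC" for b in bases) / len(bases)


# Toy example (hypothetical sequence): 4 of 6 bases are G or C.
print(round(gc_content("GGCCAU"), 3))
```

Running it on the E. coli and Thermus thermophilus rRNA sequences side by side would show how large the gap actually is.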
Conservation of crossed GU’s across species
When comparing species, the crossed GU's often turn up in the same places.
However some species have structural differences like with shorter stems at some spots compared to other ribosomal sequences (no crossed GU's) or regions that were prolonged and had longer stems compared to other ribosomal sequences (added crossed GU's).
The ribosome, switches and uneven energy distribution
I have earlier been wondering why there was such an uneven energy distribution (long stretches of repeat C's and G's) in the ribosome - compared to our small single state puzzles.
However, if the ribosome is viewed as a switch, it makes perfect sense. Our previous switches trended towards uneven energy distribution (kcal) as the number of inputs grew, which in itself is a result of more base repeats.
The riboswitch has uneven energy distribution - the ribosome is a switch.
Background, see the section Uneven energy distribution
Ideas for how to go about the ribosome challenge
Try to take a look at both the sequence for e.coli and the sequence from one other species, e.g. Thermus thermophilus as I did above.
Heads up, the Noller drawings of the ribosome seem mirrored in relation to our lab puzzles. Correction - it is rather we who are used to seeing RNA puzzles mirrored in relation to science papers. :)
Then watch for the areas that have most conservation and avoid changing these. Opt instead for the regions with most variation. That way we up our chance for changing something that is less essential, but still raise our chance of making beneficial mutations.
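The idea of sorting positions into conserved and variable can be sketched in a few lines, assuming we already have the two sequences aligned to each other (same length, with "-" for gaps). The function name conservation_map and the toy sequences are hypothetical. Getting a real alignment is a separate step that this sketch does not cover.

```python
def conservation_map(aligned_a, aligned_b, gap="-"):
    """Compare two pre-aligned sequences position by position.

    Returns (conserved, variable): two lists of 1-based positions,
    numbered along sequence A (gaps in A are skipped).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    conserved, variable = [], []
    pos_a = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap:
            continue  # insertion in B, there is no matching base in A
        pos_a += 1
        if a == b:
            conserved.append(pos_a)
        else:
            variable.append(pos_a)  # mismatch, or a deletion in B
    return conserved, variable


# Toy alignment (hypothetical): position 3 of A differs from B.
conserved, variable = conservation_map("GGC-AU", "GGUCAU")
print("leave alone:", conserved)
print("candidates for mutation:", variable)
```

The variable list is then the shortlist of places to consider changing, and the conserved list is what to stay away from.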
The rationale is that I assume Thermus thermophilus is already rather stable, as it has to work under much less favorable conditions than e.coli does.
We could do this in different ways.
- Delete from e.coli some of the crossed GU’s that are not present in Thermus thermophilus, with the intent of making e.coli more stable
- Simply change some of the stem bases in e.coli that are nearly identical to those at the same spot in Thermus thermophilus into something more stable, e.g. GU to AU or GU to GC
- Delete repeat C’s and G’s in loops that are not conserved
- Add more GU’s in longer stems - in the hope of adding more flexibility to the ribosome
- Delete the crossed GU’s in general, although this will probably not be helpful
Following the strategy of choosing the path of least change to get an optimization, I wish to find out if we can transfer some of Thermus thermophilus' good characteristics over into e.coli.
To do our best on the ribosome challenge I think we need to better understand the surroundings of the RNA we are playing with.
If you are all new to ribosomes here is some introductory material that Omei and I have collected.
Getting the structure of RNA and protein
First a little background on one of the methods by which structures of molecules like RNA and proteins are obtained - x-ray crystallography. Here are a couple of fine short videos by The Royal Institution.
I am also fond of this much longer video of theirs that goes more into detail and adds a historical perspective. By no means obligatory, just for fun.
When scientists have managed to get the structure of a molecule, they add them to a structure database. There are many different kinds of databases, some with RNA and others with protein. Some that have the sequence and others that have the structure also.
I will mention a few databases that I from what little I know, consider the most important. The biggest I have found so far is the PDB - Protein Data Bank. Don’t be fooled by its name, it holds RNA sequences too. ;) Then there is the PDBe that is the Protein Data Bank in Europe. Last but not least RFAM because it contains what is dearest to my heart - RNA.
Protein databases and how to view ribosomes in 2D and 3D
I have been playing around with looking at the 5s rRNA puzzle and trying to understand how this small RNA piece relates to its surroundings. Both the other ribosomal RNA it is bound to and the ribosomal proteins that are binding to it.
Here come parts of my journey that I hope will get you started with using database tools to look at the ribosome in a new way. Thx to Omei and jandersonlee for their comments and shared insights along the way.
I found an E. coli ribosome in the PDBe
By clicking the ribosome image (green box) I could browse a long range of visuals of the e.coli ribosome with different components highlighted.
I have found an image of how 5s binds in this ribosome.
It has its neck part sticking out and the two arms hugging the ribosome.
I was wishing to see the 3D models in greater details - being able to move the ribosome around and zoom in on details.
Omei: Eli, go to the main page for the entry, e.g. http://www.ebi.ac.uk/pdbe/entry/pdb/6hrm and click on 3D visualization
Eli: Yay, when I hover over I get base number. Now I need to figure how to get protein name or name on the ribosomal rna.
Lol, I can see through the ribosome
The proteins must be all the colored curls
Omei: The controls for moving, zooming, ... are much different than I'm used to.
Dragging with the right mouse zooms and dragging with the middle mouse button translates.
Eli: Ok, got that now
Omei: BTW, http://www.ebi.ac.uk/pdbe/entry/pdb/5IT8 is the wild type E Coli ribosome.
(I had picked a random E. coli ribosome when I searched the database. However it is smarter to pick the wildtype (here meaning an untampered version) instead of whatever version of the E. coli ribosome that scientists - or eterna players ;) - have messed around with.)
Omei: This is available also at https://www.rcsb.org/3d-view/5IT8/1. I find that 3D viewer a little easier, mostly because I am used to it.
Eli: Thx, Omei. It's pretty. Any way I can see details like 5s or L18?
Omei: Yes -- the trick is how to find them. :)
Please guide me
Omei: I don't have any good technique.
Eli: That's okay. Show me what you got
Omei: OK. Found it. Go to https://www.rcsb.org/3d-view/5IT8/1. It will be just out of view on the lower right. Use the mouse to rotate the ribosome just a little and you'll see it.
Eli: I have the puzzle rotated this way, but I am not sure I understand
ah, it is the 5s
Ok, so it is the loose ends sticking out a lot. At least I should be able to recognize it again for the future :)
And sigh, I wanted the image to stick name tags on (if anyone figures a way to call such a feature - please spill)
And I wish they kept the whole 5s in the same color
Omei: The "strand" name is DB, for what that's worth.
Eli: Thx, now I'm sure I found it
Eli: The ribosomal proteins are like staples around the ribosomal rna
Omei: BTW, you can change the coloring scheme with the choices on the right.
Eli: I like by chain.
Ok, so each part has a 2 letter code. I wonder if there is an overview of that?
There are some educational resources to use their data.
How to see what is binding to a ribosomal RNA
I am interested in which ribosomal protein/ribosomal rna binds where in the 5s rrna.
Here are the ribosomal proteins that I have read should have connection to 5s: L5, L18, L25
Eli: Ha, I managed to dig out domains when it comes to ribosomal proteins.
Some of the ribosomal proteins have an L before their number, but some of them also have an S. I don't know what the S signals.
Omei: I would guess "small" unit.
Eli: Ah, now I know what L means also (Large unit)
Eli: Ok, so from the names of the proteins associated with the e.coli 5s, 5s is bound to the large subunit
Eli: Found an image where I can see that L5, L18 and L25 can bind at the same time:
I also figured how to pull the rRNA components
The trick is to pull the RFAM database when you want to see RNA and the other two databases when you are interested in seeing the protein parts.
I took a look at the InterPro database that provided the ribosomal protein notation. I can look up individual proteins there and get a description of their function. E.g. L25 binds to the 5s rRNA. https://www.ebi.ac.uk/interpro/entry/IPR029751?q=L25%20coli
I also by the way found the following fascinating quote: Many spontaneous E. coli mutants lacking one or two ribosomal proteins have been found and characterized (see for review). However, no mutant lacking any of the 5 S rRNA-binding proteins was found. This was indirect evidence that 5 S rRNA-binding proteins are important for ribosome functioning.
I recall reading somewhere that the function of the ribosomal 5s proteins was to protect the RNA from getting eaten by RNases.
Eli: I find this sequence repeat in the 5s rRNA truly fascinating. The two “arm” regions have a sequence stretch of 8 bases that are entirely identical.
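A repeat like this can also be hunted for automatically. Below is a small sketch that lists every stretch of k bases occurring more than once in a sequence. The name repeated_kmers and the toy sequence are made up for illustration. It is not the real 5s sequence.

```python
from collections import defaultdict


def repeated_kmers(sequence, k=8):
    """Return {kmer: [1-based start positions]} for every stretch of
    k bases that occurs at least twice in the sequence."""
    seq = sequence.upper()
    seen = defaultdict(list)
    for i in range(len(seq) - k + 1):
        seen[seq[i:i + k]].append(i + 1)
    return {kmer: starts for kmer, starts in seen.items() if len(starts) > 1}


# Toy sequence (hypothetical) with the same 8 bases at both ends.
toy = "AAGGCCUU" + "ACGUACGU" + "AAGGCCUU"
for kmer, starts in repeated_kmers(toy, k=8).items():
    print(kmer, "starts at", starts)
```

Lowering k shows shorter repeats too, at the cost of many more chance hits.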
Also I finally found what I wanted. A visual of the binding sites of the different 5s ribosomal proteins.
They cover most of the 5s rRNA. But I find it fascinating that exactly parts of the identical switching areas are available - that is, when the ribosomal proteins are gone.
Image from here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC344668/?page=4
So basically the L5 protein is the final cap on the 5s rRNA. I think you got your master ON/OFF switch for the ribosome with 5s, Omei :)
Now this paper is old and I think I have seen even more of the bases in 5s included in binding with the ribosomal proteins. So this is not the final say in which bases in 5s rRNA are bound somewhere outside.
Eli: I am considering how I can use the fact that some proteins bind specific places to the 5s. My immediate thought is that if I can avoid touching most of the bases that are involved with protein bindings, then that should be better. So like trying to avoid mutating the arms.
Omei: You're certainly right about doing our best to be smart about which bases to not modify.
But the general belief is that it is safer to modify stems than loops. The reason for this is that the loops are not just “hanging” out -- they are the source for most of the tertiary bonds that stabilize the 3D structure of the ribosome.
I think RNA/protein bonds are generally weaker than RNA/RNA bonds or RNA small molecule bonds.
Jandersonlee: Probably true that stems are safer regarding leaving bonding sites unchanged. And 180 degrees out from my prior strategies.
Ribosome Mapping Challenge
We are currently 1000 lab players with access to the lab. Cheers to all the new lab players :)
We are only to have 24 full ribosomes made in lab. I have been thinking about how to raise our chances of success. Here is what I have come up with.
I have tried to understand which bits of the 5s rRNA are covered in protein and as such which bases to avoid modifying. (I’m not sure I got all bases covered correctly yet.) However if I should do this for all the ribosome sub parts, this is going to take a lot of time. Likely till long after the ribosome challenge is over.
On the other hand, if you pick whichever part of the ribosome you fancy and try to figure out where it is bound to proteins, and more players do the same, we have a chance of making a roadmap of the much more limited part of the ribosome where it may best pay off for us to mutate.
Let’s pool our skills and efforts to raise our chances of success with the challenge so we modify where it has the best shot of having an effect.
Zama shared an article with me that mentioned the replication speed of E. coli. It is a fast replicator, it can replicate itself in about 20 minutes. However it isn't the fastest one around. So I decided to check. I found a few candidates. I couldn’t find secondary structures of them though. For one of them I did manage to find a set of ribosomal sequences in NCBI.
I did this search: https://www.ncbi.nlm.nih.gov/nuccore/?term=clostridium+perfringens
After this I limited Molecule type to rRNA which left me with 8 ribosomal sequences.
The particular ribosome I am interested in seeing is that of Clostridium perfringens. This organism should be able to copy itself in 6.3 minutes. I thought it might help inspire us to cut some corners. :)
Ok, Clostridium perfringens should have a really short genome, which of course would help. Still, I would like to see if there is something special about its ribosome. I asked the ribosome expert about ways I could lay my hands on a structure overview.
Rhiju: laying out the ribosome in two dimensions is a hard challenge!
i had to manually do it for E. coli, but this took a lot of time and thought because I was trying to preserve some of the relationships seen in the 3D structure:
Rhiju’s ribosome drawing, for best resolution, follow the link below
It would take about another 4-8 weeks of development to get that RiboDraw package to automatically draw out ribosomes from other organisms.
I'd have to code it to accept sequence alignments that map which nucleotides in the E. coli ribosome map to nucleotides in the ribosomes of other organisms (such alignments are available on the Gutell comparative RNA website, I think). Unfortunately at the moment I don't quite have the time to do that 'last step'. ;)
The package that Noller uses is called xRNA, I think; and the one used by the Georgia Tech groups (http://apollo.chemistry.gatech.edu/RibosomeGallery/) is called RiboVision. They may have the tools you need to get a quick secondary structure diagram.
Eli: All these visuals are beautiful and I'm already a huge fan of spacing the ribosome as Noller does.
What I like about Rhiju’s ribosome drawing is that I can see where everything is connected up.
If that could be a hover over function, so one only got the bases one marked it would be really helpful.
Also I realize that drawing the ribosome in two dimensions is way out of reach with my current skill set. So I pass on the challenge.
Crossed GU’s or mismatched GU’s
I have already mentioned the crossed GU’s in the 5s rRNA puzzle and the scattering of crossed GU’s across the entire ribosome. I wish to share the following discussion that Omei, jandersonlee and I had about it afterwards.
Omei: Looks like the most common, but not universal, term in the scientific literature for crossed GUs is "(symmetric) tandem GU motif". I found a bunch of articles using that, with and without the word symmetric.
I think I saw something with wobble GU
But I think that just means single GU
Omei: Here's a very relevant one, though it isn't obvious from the title. It shows that the crossed GU can be either stabilizing or destabilizing, depending on the adjacent pairs.
(First and most important page available, the rest of the article paywalled)
i.e. it's a non-nearest neighbor effect.
Eli: The crossed GU's don't always have the same surroundings in the ribosome, but they will often have one GC pair or more
Omei: Unfortunately, they didn't measure all the possibilities.
Eli: I think this will be an area that it will be interesting mutating in relation to the challenges
I also like the paper term GU mismatches
Omei: Oh. I just realized this paper is dated 1991, and is undoubtedly incorporated into the 1991 Turner energy model. The Turner 2004 model significantly changed these. (edited)
Eli: Still interesting. Most of the crossed GU's are turning the same way
when viewing 5' to 3' orientation
with the U coming first
When I view the noller images of both small and large subunits for e.coli and Thermus Thermophilus.
They take the orientation that the paper calls the most favorable for free energy
Ha, funny enough the crossed GU's in the logic gate designs do the same. :)
Here is my favorite sample, because the crossed GU's reform across different states.
Jandersonlee: @eli by crossed GU, do you mean 5'...GU...GU...3'? or just 5'...G...U...3'?
Ah I see in the examples above it is mostly UG...UG
Jandersonlee: In Vienna2, UG...UG is the stronger order and in natural mode it is shown as paired even though the UG...UG pair has a positive FE, the surrounding pairs can keep the stack closed:
In NuPACK, UG...UG is even seen as paired:
Not so for GU...GU in either model:
Correction: Vienna2 and NuPACK both consider GGUC...GGUC and GUGC...GUGC to form a paired stem. It's Vienna that predicts a 2-2 inner-loop.
But Vienna pairs CGUG...CGUC
Eli: Thx, Jeff! So Vienna2 and NuPACK favor the same orientation as Omei's GU mismatch paper claims, plus it is the same crossed GU's that mostly turn up in the ribosome and in the logic gate winners.
Jandersonlee: It seems so. Both NuPACK and Vienna2 consider UG...UG to be a weakly paired stack cell with better free energy than GU...GU
Eli: Interesting. Main part of the GU's of same orientation in the ribosome subunits of e.coli and the logic gates lab winners are of the opposite persuasion. Meaning there are more GU...GU than the opposite. However it is not nearly as clear cut as the case with the mismatched GUs. There are just more of both types. I have been wondering if the double sameturned GU's are having a similar function to the mismatched GU's. There are quite a bit of those as well.
Jandersonlee: Both Vienna2 and NuPACK predict that *some* XGUY...Y'GUX' sequences will form a weakly bonded stem section where X pairs with X' and Y pairs with Y'. That may be sufficient if you need an easily switching stem. XUGY...Y'UGX' being slightly stronger is predicted to form a bit more often.
Jandersonlee: And yes, GG...UU and UU...GG can work as well.
Vienna2 seems to rate the strength of AUGU...AUGU, AUUU...AGGU, AGGU...AUUU as the same, and all stronger than AGUU...AGUU. GUGC...GUGC is rated slightly (-0.2 kcal) stronger than GUUC...GGGC and GGGC...GUUC and much stronger (-3.0 kcal) than GGUC...GGUC. NuPACK ranks them in the same order, but with slightly different relative strengths. So most of them are viable alternatives, with GU...GU being the most fragile/switchable.
Eli: Ah, GU...GU being fragile/switchable, makes great sense for the logic gate labs as they turned up in the longer stems that needed to switch/move. And even for the ribosome. :)
Also of consideration is the context in which the crossed GUs occur. In E. Coli 5S, the natural mode (left) has a different crossed GU pair, UG@40+UG@80, than the target shape (right), UG@80+UG@95. Both have "magnet" C* and G* runs nearby. This suggests that the 5S molecule may be switching in response to the presence of other target molecules as part of its function and that the UG...UG...UG...UA sequences are to help make these sections more able to switch. While we can probably play with base pairings to make the natural state *be* the target state (and some designs have done so) it is probably also worth considering some designs that simply "favor" the target shape more strongly without necessarily eliminating the natural form (or altering it too dramatically). This is perhaps also true for the larger 16S and 23S. After all, the E. Coli ribosome *works* as it is with its seemingly "switch-like" behavior, and perhaps it needs some of that switching functionality to actually *make* it work.
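The orientations discussed above (UG...UG versus GU...GU, plus GG...UU and UU...GG) can be spotted automatically when you have a sequence and its dot-bracket structure. The sketch below is only an illustration under my own assumptions: "crossed GU" is taken to mean two GU pairs stacked directly on top of each other in a stem, both strands are written 5' to 3' like in the chat, and the function names and toy hairpins are made up. It does not compute any energies.

```python
def pair_table(structure):
    """Map every paired index to its partner for a dot-bracket string."""
    stack, partner = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def is_gu(a, b):
    """True if the two bases form a G-U wobble pair in either order."""
    return {a, b} == {"G", "U"}


def tandem_gu(sequence, structure):
    """Find two directly stacked GU pairs (a 'crossed GU').

    Returns a list of (i, j, label): i is the 1-based position of the first
    pair on the 5' strand, j is its partner, and label is written as
    '<5' strand>...<3' strand>' with both strands read 5' to 3'.
    """
    seq = sequence.upper().replace("T", "U")
    partner = pair_table(structure)
    hits = []
    for i, j in sorted(partner.items()):
        if i > j:
            continue  # visit each pair once, from its 5' side
        stacked = partner.get(i + 1) == j - 1
        if stacked and is_gu(seq[i], seq[j]) and is_gu(seq[i + 1], seq[j - 1]):
            hits.append((i + 1, j + 1, seq[i:i + 2] + "..." + seq[j - 1:j + 1]))
    return hits


# Toy hairpins (hypothetical sequences), one per orientation.
print(tandem_gu("GCUGCAAAAGUGGC", "(((((....)))))"))
print(tandem_gu("GCGUCAAAAGGUGC", "(((((....)))))"))
```

Counting the labels over a whole subunit would give the UG...UG versus GU...GU tally without having to eyeball the Noller drawings.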
As we begin to look at the ribosome not just in 2D (in game) but also in 3D, we will come to realize that there are many ways RNA and proteins can interact. This is demonstrated with this e. coli PDB entry, where I just changed the visuals of how the ribosome gets displayed and zoomed in.
The green sphere is a magnesium ion
The red dots are water molecules
Omei mentioned the following in one of the posts above, when I was wanting to avoid modifying the RNA bases in areas where ribosomal proteins were covering the ribosomal RNA.
You're certainly right about doing our best to be smart about which bases to not modify. But the general belief is that it is safer to modify stems than loops. The reason for this is that the loops are not just “hanging” out -- they are the source for most the tertiary bonds that stabilize the 3D structure of the ribosome. I think RNA/protein bonds are generally weaker than RNA/RNA bonds or RNA small molecule bonds.
I know when RNA binds with itself - as in a stem - then it does hydrogen bonding.
I asked Rhiju about how RNA and proteins could interact.
Rhiju: "well, RNA and proteins can interact in lots of ways -- hydrogen bonds, van der Waals, 'salt bridges'. depends on which interface you look at. all types occur in the ribosome."
So I think we need to understand what type of bonds and interactions are possible, to better deal with the ribosome challenge. So I have dug up a bunch of videos that will help introduce the topic.
First the finest introduction to element interaction that I have yet got:
Oxygen by Hero Khan
Now to hydrogen bonding
Fun and easy intro to water and hydrogen bonds
Properties of Water with the Amoeba Sisters
This video explains the why behind hydrogen bonding
Dipole-dipole by Bozeman Science
Intro to different types of chemical bonds
Types of chemical bonds by Crash Course Chemistry
Fun and easy protein folding introduction
Protein Structure and Folding by Amoeba Sisters
This video goes more into details on protein folding. It also gets into disulfide bonds
Protein structure by Professor Dave Explains
Here is an excellent visualization of bond types in proteins.
What is a protein by RCSB PDB
Van der Waals forces
How do geckos defy gravity? by Eleanor Nelsen
Besides the option to view ribosomes in 3D with the visualization tools inside a database, there are also tools made just for visualizing molecules.
What I like about RiboVision is that it has a gamelike split screen. First it shows the RNA sequence (primary sequence) at the top, then there is a window showing the fold in 2D (secondary structure) and last a window at the right showing the 3D visual.
One thing I particularly liked in RiboVision was the 1D field that showed if a particular nucleotide was on the surface of a protein and potentially easier to change. The larger the value (top) the closer the base is to the surface.
Image shot from this fine demo video RiboVision by Lan Wang:
There are two more intro videos made by the makers of the program.
Shortcomings of RiboVision:
- The window 3D visualization in Java isn't running for me
- As far as I'm aware it is only showing the RNA part of the structure.
- There are only few ribosomes in the database.
Which tools do the scientists use?
I have asked Rhiju what tools scientists use for looking at ribosomes. He mentioned Chimera and Pymol. They are both very beautiful tools.
An advantage of Chimera is that it is built to work with PDB data, e.g. it can fetch a structure directly by its PDB ID, so it should be compatible with their data formats. (Chimera itself is developed at UCSF, not by the PDB.)
UCSF Chimera: Basics by RCSB ProteinDataBank
It also comes with the advantage that at least a few eterna players have some familiarity with it.
Here some Chimera introduction guides made by EteRNA players.
Quickstart with Chimera 1.7 by Nando
3D RNA visualization. Intro on how to make things look familiar in Chimera.
Introduction to Chimera for EteRNA Players by Omei
Want to watch one of your lab designs in 3D? Here is a complete walkthrough of how to do it.
Bare Bones Guide to Creating 3D-RNA Structure Models Using UCSF's Chimera by AndrewKae
Also check out AndrewKae's guide to Chimera.
From what I see so far, it seems as if Chimera doesn't have nearly as many functions as Pymol and as such may be easier to learn to use. This is just my first impression, so I will be happy to have player inputs on this.
Pymol comes with the downside of not being entirely free to use. There is a free version, but it seems to take some programming skills to make use of. I would prefer that we end up with a solution to our 3D visualization problem that will be accessible to all eterna players.
Visualization programs I: PyMol Tutorial by Kamali Sripathi
I'm not sure what will be the best road for getting our hands on a ribosome 3D visualization tool for EteRNA players. I have shared where I am on that journey in the hope that starting the discussion will help us towards making a good decision.
What I personally would like from a 3D visualization tool
- Be able to call a specific base number position as I find it in the eterna game 2D view and be able to see exactly where this base is in 3D.
- I would like to be able to see the name of whatever protein chain I hover over inside a ribosome. I want its name, like L31 or S14 or whatever it is called.
- I wish to be able to see what alternative ions, etc are bound exactly where. So that we can know what may behave as special places in the ribosome.
Any evidence anywhere to the contrary?
I wish to highlight that Omei has made a most useful booster for summarizing the mutations that differ from the starter sequence of the ribosome puzzles.
Summarize Mutations (version 1)
Here is how to use it:
1) Make your own copy with Start from copy
2) Remember to make script type into booster so you can call it with the lightning symbol when you are in the lab puzzle. And state that this is your copy of the script in the title.
3) Submit script
4) Refresh your lab puzzle page to read in the booster. Make changes to your puzzle and call the booster under the lightning symbol to summarize the mutations:
5) Copy out the mutations and post in your puzzle description
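To show the idea behind summarizing mutations outside the game, here is a minimal sketch that compares a starter sequence with a design of the same length and lists each change as old base, position, new base. This is not Omei's booster and the output format of the booster may differ. The function name summarize_mutations and the toy sequences are my own assumptions.

```python
def summarize_mutations(starter, design):
    """List point mutations as '<old base><1-based position><new base>'.

    Both sequences must have the same length (no insertions or deletions).
    """
    if len(starter) != len(design):
        raise ValueError("starter and design must have the same length")
    pairs = zip(starter.upper(), design.upper())
    return [
        f"{old}{pos}{new}"
        for pos, (old, new) in enumerate(pairs, start=1)
        if old != new
    ]


# Toy example (hypothetical sequences): two bases were changed.
print(", ".join(summarize_mutations("GGAUCC", "GGGUCA")))
```

A short list like this in the puzzle description makes it easy for other players to see what a design changed and where.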
Installing Chimera, loading a ribosome and finding a specific base
I have installed Chimera and it was quite easy (at least for me).
Just go to their Download site and pick whichever package that fits your computer system. When the program has been installed, you will end up with a black screen.
Chimera is made for visualizing molecules from the ProteinDataBank. So to load a ribosome, I just went to PDB and searched with an E. coli ID number I had found earlier (6HRM).
This later turned out to be a version that had its small and large subunits stapled together, and not the laboratory version K12 that Omei had found and that I had intended. Anyway, since this was how I started out, I will do the demonstration with this example but recommend that you look up 5IT8 instead.
Fetch a ribosome
To load the ribosome in Chimera, I opened the File menu and chose Fetch by ID...
Then I entered the PDB ID number here:
Wait for loading...
Then I waited several minutes - remember that the ribosome is a monster in size - and I got this beauty:
I read this bit of Omei’s Chimera intro, page 4, to get familiar with the controls.
Plus watched these fine small video tutorials made by RCSB PDB
Highlighting specific bases with a command
I got stuck on highlighting specific bases with a command instead of hunting for them inside the ribosome on the screen, so I ended up asking Omei for help. Here comes a breakdown.
Eli: I wish to be able to get an RNA base, e.g. 1415, highlighted inside the ribosome somehow, without me having to search around for it. How would I do that? Is there any command I can write?
Omei: Ok, there is a way.
Do you have a command line showing below the big window?
(If you don’t have that, you can call it like this: Tools, Command line, Raise)
Omei: OK, you can use that to select atoms/bases/... and then do whatever you want.
But the syntax is complex enough that I have to look it up each time. I'll find the URL.
http://plato.cgl.ucsf.edu/chimera/docs/UsersGuide/midas/frameatom_spec.html is the doc for how you specify specific sets of atoms.
Omei: Can you make a screenshot with a label describing what you want to select?
Eli: Ok, I highlighted U 1235 chain 1
Omei: Did you want to do something specific with that base? Or did you just select one at random?
Eli: I just selected it at random, as it takes so long to find the ones I want to look at
I like chimera. But one thing I find confusing is that both the small and large subunit are in one big chain
Omei: Shouldn't be unless you loaded a model of one that has been artificially joined.
Eli: I loaded the version you shared with me, I think
Omei: What is the PDB ID?
Omei: 6HRM - E. coli 70S d2d8 stapled ribosome
Eli: oh, hehe
Omei: So it has been joined into one chain.
Also, note that upper/lower case seem to be significant.
We can still use it as an example. Or you can switch first if you want.
Eli: No, it is fine to use as example. It will take minutes to load the right one
The right one is called 5IT8
Omei: OK, since you only have 1 chain, the specification by base number should be something like :1-200
So try select :1-200 in the command line.
Eli: :1 for chain, right?
Omei: No, the : indicates base numbers.
Ok, have put that in the command line
Omei: If you have multiple chains, select :1-200.A would restrict the selection to chain A.
Did you press Return? If so, was anything selected?
Eli: A whole lot was selected
Eli: It looks like proteins were selected too
It looks like much more was selected
Omei: If you want just specific base numbers, try something like select :1-10, 12, 17, 23-30
Ok. It is still selecting a lot of proteins:
Omei: Ah. Each protein is a chain in this model, too. So you do have to specify the chain to get a specific RNA.
Eli: This is how I selected a chain earlier. Only chains 1 and 3 are RNA
Omei: Also, the PDB page where you downloaded the file should also describe what each chain number/letter corresponds to.
Eli: I found this quick reference:
Not sure how to make that symbol
Omei: : followed by .
Ok, found the PDB page with the chain number
Just scrolled down on this page and changed from proteins to nucleic acids: https://www.rcsb.org/structure/6HRM
Omei: Now you can use the command line for things that aren't available through the UI. But for the most part, I use it for selection and then use the GUI menus for whatever operations I want to execute.
Eli: I'll probably do the latter also. Just being able to locate the intended bases faster will be a blast.
Omei: You have enough to do that now?
Eli: I can select the full chain.
I am not sure how to select single bases
(We switched to searching for the 5s rRNA in the process - it is chain 3)
Omei: select :.3 Need the : in front of the .
Eli: Yes, got the 5s RNA
Omei: Ah. We want chain 3 for those positions to make sense.
Got it figured out.
When there are commas in the list of bases, need to add the .1 to each group of bases. Try select :1-10.1, 12.1, 17.1, 23-30.1
Eli: Yay, I think this is it:
The bases fit the marking.
Omei: Looks good.
So you see the more general pattern.
Eli: Yes, I don't need to specify the chain first, I just add it after each new group of bases
Just as it is when I hover over bases in the ribosome (every base number has a dot with a chain number or letter after it)
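Since the chain suffix has to be repeated for every comma-separated group, it can be handy to generate the selection text rather than type it. Below is a minimal Python sketch of that pattern. It only builds the text to paste into the Chimera command line; the helper name chimera_spec and its arguments are made up for illustration and are not part of Chimera.

```python
def chimera_spec(groups, chain):
    """Build a Chimera-style residue spec, repeating the chain after each group.

    groups: single base numbers (int) or (start, end) tuples for ranges.
    chain:  chain ID as shown when hovering over a base, e.g. "1" or "A".
    (Hypothetical helper; it only formats text.)
    """
    parts = []
    for group in groups:
        if isinstance(group, tuple):
            start, end = group
            parts.append(f"{start}-{end}.{chain}")
        else:
            parts.append(f"{group}.{chain}")
    # The leading ':' marks base numbers; each group carries its own '.chain'.
    return ":" + ", ".join(parts)


command = "select " + chimera_spec([(1, 10), 12, 17, (23, 30)], chain="1")
print(command)  # select :1-10.1, 12.1, 17.1, 23-30.1
```

The printed line is the same command Omei suggested above, so it can be pasted into the command line as is.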
Omei: You're probably going to run into more complicated situations, where you want to select bases from multiple chains at once.
I.e. for RNA/protein interactions.
Eli: I bet you are right. However for now I am more than happy
Omei: We can figure out the syntax if/when that happens, but selecting one part from the command line and then tweaking the selection manually is probably the way to go.
You know how to add or subtract from the current selection, don't you?
Eli: We can add additional bases and just add a different chain number
Yes, it takes a comma separating each group of bases or single bases
Omei: Manually, ctrl+shift will add/subtract the new selection, rather than replacing it.
Eli: Ah, you mean changing the bases chosen in the command line afterwards, directly in the ribosome.
This short post is made at the recommendation of Omei:
In the context of the ribosome, there are regions of the secondary structure that are pseudoknotted. However, Eterna currently does not have a method for rendering pseudoknotted secondary structure. Given this, players don't have the ability to reliably modify these regions, which ultimately affects their ability to completely contribute towards the project of redesigning the ribosome. However, there is a method of displaying pseudoknots (and potentially other long range interactions) in the Eterna game UI.
As seen in the above screenshot, the pseudoknotted helix can be shown in the structure of the second state of the game UI. While it's not the most elegant of solutions, it does provide a way to see these structures without needing to further modify the game UI. Furthermore, you can show other tertiary interactions layered into additional states. This would give players the ability to "see" long range interactions, and potentially modify the base-base interactions that occur.
I would recommend these puzzles as an advanced subset, separate from the main puzzles that are published on the site.
It turned out I was lucky to start with the stapled ribosome I first picked, because it loaded fast and without trouble. When I later tried to open the 5it8 that Omei suggested, I ran into a lot of trouble. Here comes the part of my journey about getting the ribosome opened in Chimera.
For the quickest way to load the 5it8 ribosome, just jump to the bottom of the post. If you want alternative ways of watching it, read the rest.
First of all I ran into trouble with opening the ribosome at all. It took a real long time and I got nowhere.
This time however I could load the 5it8 entry - took more than just a couple of minutes. Better to say that it took less than an hour. ;) I like to save a fresh version of the molecule so I can quickly load it again if anything goes wrong when I play around with it.
I loaded the ribosome with this option
However it is looking really weird. It is much more stretched out. It doesn't look like the same version of it that I saw in the inbuilt viewers in PDB or PDBe.
It is like there is something wrong with the display of it.
Ok, I must have picked some other ribosome. I was looking over a list of alternative e.coli ribosomes in hope of getting wiser. I haven't been able to replicate the reading in. It is like this bit, that I see in other molecules that I can load, is missing:
I tried to reload the ribosome to see if I could replicate it. I had trouble reading in the ribosome. It is not counting up as reading in.
Eli: Ok, I did manage to get it to count up by changing the settings to ignore previously cached data.
Also, now I'm absolutely certain that I got the 5it8 entry read - plus I did it exactly as I did for the first stapled ribosome.
However the ribosome is still looking stretched out and weird. Not globular
But I don't know if I can trust the visual. I would be very happy if you tried to load it. However, the good news is that I managed to load the entry I wanted to load. It took a really long time to load. The first time, it took so long compared to what I expected that I aborted it, leaving cached but unfinished downloaded data behind. So that was the problem.
Big lol. I know what happened.
I have got an image from a crystallization of two ribosomes.
I recognized the characteristic long dangling protein tail, twice.
Having a double ribosome would also explain why it was not so easy to load
I have seen somewhere in the menu where one could choose how many ribosomes to see at once.
Omei: Yes, 5IT8 takes a very long time for me to load also. I've started the process now so I can potentially be of more help.
Eli: Thx. I got so happy that I started to play a bit around and it got buggy so I couldn't save. I managed to reload it and do a save. So that will be my first recommendation for you as soon as you have ribosome showing.
Omei: Hi! Did you find a way to do what you needed?
Eli: No, I can't remember where I found the function to reduce what I got from the crystal. I just know that it is there
Did you get the ribosomes loaded?
Let me try something.
Ok, this works. It may not be the simplest.
Start by holding down the Shift key and dragging the mouse to draw a selection box over, say, 2/3 of the ribosome you want to hide.
Eli: It is just moving around when I drag the mouse
Omei: My mistake. Use shift+control
Something like this.
Eli: Ok, working
Omei: (shift+control selects by area on the screen)
I didn't say, but you should start so that the two ribosomes are separated as much as possible, so you don't inadvertently select part of each.
Eli: Got it
Looking similar to yours
Eli: Let me guess, your strategy is hide
Omei: Now, your up arrow key will "expand" the selection.
Give it time to redraw before you expand a second time.
Seems that expanding the selection 3 times selects the rest of the one copy of the ribosome.
Once you have selected what you want to hide, you can use the GUI.
e.g. Actions/Ribbon/Hide and Actions/Atoms/Hide.
Eli: I even got something in the second one colored in after an expand. Some small molecules
Omei: This time when I did it, a couple of proteins didn't get hidden. So I just repeated the process on those.
You'll probably just have to fiddle with it until you have hidden just what you want.
Eli: Ok, I can work with that. Thx. And should I find the easy way sometime again I will tell.
I did it by accident and ended up with a whole row of crystals of one. I could control it and make it 1, 2,
Omei: A lot depends on how the authors have structured the PDB file.
Eli: Ah, so perhaps the entry I played with then made it easy
Omei: That's my guess.
Eli: It was a monomer and I just got the option to choose how much of the crystal I wanted to see.
It annoys me greatly that I can't find it, but I'm happy that you helped me.
Omei: Ah. You're probably thinking of the Select Chains of the Model tool.
If you are only interested in seeing the RNA, that would be simpler, since there are only three chains.
Eli: No, I wasn't thinking of that, but that is an excellent idea too.
I was literally able to decide how much of the crystal I was seeing and I got a long row of ribosomes. But my memory may have played a trick on me, in that it may not have been Chimera, but either PDB or PDBe
Omei: Well, let me know when you find something useful. I'm certainly no expert with the tool.
Eli: I remember the colors different and a white background
And thinking, it probably will be easier seeing just the RNA. I was just so focused on the proteins as they were new to me. But for looking at lab I will mostly need RNA.
Omei: You can use Tools/Depiction/Color Actions to turn the background white.
Eli: Oh, hehe
By the way, how do I select 3 chains at once?
Because then I can invert and get everything else colored for removal
Omei: To add to an existing selection, use shift-control-click.
Omei: So use that plus the up arrow to extend the selection as needed.
... or Select Chains of the Model tool.
Eli: I have trouble selecting the 5s (chain DA). It is so far down my screen that I can't get to it
Omei: You can't scroll?
ah, I think I can use the expand trick. Just highlight a base and expand to grab the structure
Omei: Weird. Here's what I have.
Oh, you were referring to the 3D view?
Eli: Here is what I have
But now I managed to find a triangle that would allow me to go further down
However, this way I can only highlight one at a time
How did you get your window?
Omei: Ah, that's Select/Chain -- different. (I can use my mouse scroll to scroll that one though.)
Eli: Ok, where do I find Select Chains of the Model?
Omei: Favorites/Model Panel
Eli: Oh, I like that I can select both by bases and by component names
Actually, I can see that I can build a full ribosome this way
Lol, it will be messy. They are not split in two, but mixed. Anyway, I would much rather go this way
Wow it is really small now just with the RNA
Naked RNA ribosome...
I wouldn't have ended up with just the RNA, had I just had luck getting the ribosome as I wanted in the first place. I think this simpler view can end up an advantage in the long run
Saving a sample of it.
Eli: By the way, I found how to see more than one ribosome in one go or how to choose between the ribosomes.
There are two entries here.
I may have found a simpler road
We will need to get the specific ID of the Biological Assembly 1 (the structure that the authors of the paper submitted)
So something like 5it8.1 or 5it8.biological assembly 1
I found this background article on biological assemblies in chimera: http://pdb101.rcsb.org/learn/guide-to-understanding-pdb-data/biological-assemblies#Anchor-BioUnit
This is extremely well explained. I will recommend reading this. It will explain why it matters quite a lot which of the ribosomes we delete.
I bugged Rhiju as I wasn’t sure how to get hold of the Biological Assembly 1 data.
Rhiju: hi eli -- the only way I know how to do this is to download the "biological unit" file from the PDB, which is curated by the authors (and not always 'correct'). sorry i don't have better advice!
Eli: Yay! I managed to load just the one ribosome that I wanted to load.
So it turned out there was an option to get it under Download. I downloaded the file and opened it through chimera. It took a while, but not nearly as long as downloading a double ribosome.
Recently Omei made a booster that is really helpful for grabbing all the bases one has changed in one of the ribosome puzzles, so they can be dumped into the title of one's design.
Introduction here: Useful booster counting the mutated bases
I regularly found it tedious locating specific bases in some of the larger ribosome puzzles under layers of overlaps. This made me dream of having a booster that also allowed me to search for specific bases and have them highlighted.
Omei caught my wish and Jandersonlee followed up with a booster capable of searching and marking bases.
I realized that these two scripts could do more when combined. I did a crude fusion of them, and jandersonlee did all the work of making them work better together and of updating the markers when the mutations change.
It allows copying out the mutations for publishing a puzzle, and it visually marks the mutated bases at the same time. This will make it easier to locate the mutations, also when looking through designs submitted by others. So I expect this to help later with analysis.
Example of highlighted mutations in a design, plus a copy grab option.
Here is jandersonlee's latest script version: Mark Mutations (v0.6)
Here is his description of the booster:
When you run the script it computes the current mutations, marks them, and asks if you want to loop repeatedly. If you click OK, it will continue to poll checking for changes every 0.5 seconds and redo the marks. If you click Cancel, it will stop and also stop any current polls. The current mutations are displayed with the prompt in a text field that you can copy/paste. So if you are not auto-marking, you can run it once to mark the current mutated bases and optionally copy them using Cancel; if you are auto-marking and want to continue or want to start, click OK. A one-stop shop for marking and copying the mutations. Note, it *does* change the marks at least once, so better to use Omei's original script if you want a list of the mutated bases without changing the marks.
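To make the "computes the current mutations" step concrete, here is a minimal Python sketch of comparing a design against the original sequence. It is not jandersonlee's script (the booster itself runs inside Eterna); the function names, the example sequences and the "C5G" output style are assumptions made only for illustration.

```python
def list_mutations(original, current):
    """Return (position, old_base, new_base) for every changed base, 1-based."""
    if len(original) != len(current):
        raise ValueError("sequences must have the same length")
    return [
        (position, old, new)
        for position, (old, new) in enumerate(zip(original, current), start=1)
        if old != new
    ]


def format_mutations(mutations):
    """Join mutations into one copyable string (format is an assumption)."""
    return ", ".join(f"{old}{position}{new}" for position, old, new in mutations)


# Made-up toy sequences, not taken from any ribosome puzzle.
original = "GGAUCCAAAGG"
current = "GGAUGCAAACG"

mutations = list_mutations(original, current)
print(format_mutations(mutations))  # C5G, G10C
```

A booster that keeps the marks up to date would simply rerun this comparison on each poll and redraw the marks for the returned positions.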
Omei's original script: https://eternagame.org/web/script/9231373/
The corresponding bases in our puzzles:
56U is 48 in Part 1, already locked
723U is 164 in Part 2, not locked
1306U is 386 in Part 3, not locked
1319A is 399 in Part 3, already locked
1468 is not in our puzzle
If anyone wants to correlate further bases, the puzzles begin at 8, 559, and 920.
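The conversion behind the list above can be sketched in a few lines of Python. This assumes that "begin at" means the number to subtract from the ribosome numbering, which is consistent with the four pairs listed (for example 56 - 8 = 48). The names PART_OFFSETS and puzzle_position are made up for illustration. The lengths of the puzzle parts are not given here, so the sketch cannot tell whether a base such as 1468 falls outside the puzzles.

```python
# Offsets for Part 1, 2 and 3, taken from "the puzzles begin at 8, 559, and 920".
PART_OFFSETS = {1: 8, 2: 559, 3: 920}


def puzzle_position(base_number, part):
    """Convert a ribosome base number to its position in the given puzzle part.

    Assumes position = base_number - offset. No upper bound is checked,
    because the part lengths are not known here.
    """
    position = base_number - PART_OFFSETS[part]
    if position < 1:
        raise ValueError(f"base {base_number} lies before the start of Part {part}")
    return position


for base_number, part in [(56, 1), (723, 2), (1306, 3), (1319, 3)]:
    print(base_number, "is", puzzle_position(base_number, part), "in Part", part)
```

Running it reproduces the four correspondences listed above.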