QUARNA
(QUArtets in RNA)
  1. Overview
  2. Features available
    1. Get Quartets Here
    2. Search quartets with specified residues
    3. Quartets present in different databases
    4. Topology wise searching
  3. Output formats for different topology
    1. Output for linear quartet
    2. Output for star quartet
    3. Output for cyclic quartet
    4. Output for semi-cyclic quartet
A. Overview

QUARNA is a web portal that provides information about quartets in RNA. Nucleotide base quartets are groups of four nucleotide bases that interact with each other via base pairing and are observed recurrently in different functional RNA molecules. In the present work, we have classified quartets according to the topological class to which they belong. A comprehensive nomenclature scheme has been proposed based on the topology, the identity of the four participating nucleotide bases and the geometries of the constituent base pairs.
The web portal helps RNA researchers to identify and annotate base quartets in a given pdb file and provides complete lists of the different types of quartets present in specified datasets of known RNA structures. Details of the different datasets are available here.

B. Features available

Figure 1 shows the home page of the QUARNA portal. QUARNA implements several features that let users search for base quartets in multiple ways. The functionalities of the four sections marked with the numbers 1-4 in Figure 1 are described in detail below. At the top of the QUARNA home page, a navigation bar item named Quartet Atlas is available (highlighted in a red square box in Figure 1), where the user can get an overview of all quartet varieties found in nature. Separate lists are available for the different topologies. Quartet Atlas also gives an idea of the occurrence frequency of each quartet type in all four datasets considered in our study.

...
Figure 1. Home page of the QUARNA portal. The four sections marked with encircled red numbers show the four different features available in QUARNA.

1. Quartet Finder
The user can access the Quartet Finder program through this section. Here the user can enter a pdb file name in the prescribed box (marked as (I) in Figure 2); upon clicking the "Find quartets" button (marked as (III) in Figure 2), the program fetches the pdb file directly from the RCSB-PDB database and searches it for quartets. At the top right corner of the box there is a link for a test example. Clicking on this link selects the default pdb file “3tvf” for testing purposes; Figure 2 shows this test example. The program may take some time to run, depending on the size of the pdb file. Once it has completed, two more options named “View” and “Download” appear beside the "Find quartets" button. If the user already has the pdb file on a local system, the file can instead be uploaded to the server through the browse option (marked as (II) in Figure 2) to get the same results.

...
Figure 2. Enlarged view of section 1 (Quartet Finder) of Figure 1.
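
For users who prefer to download the coordinate file first and then upload it through the browse option, the following is a minimal Python sketch. It assumes the public RCSB download URL pattern https://files.rcsb.org/download/<ID>.pdb; how QUARNA itself fetches the file is not described on this page, and the helper names rcsb_pdb_url and download_pdb are illustrative only.

```python
import re
import urllib.request


def rcsb_pdb_url(pdb_id: str) -> str:
    """Build the download URL for a 4-character PDB id (assumed URL pattern)."""
    pdb_id = pdb_id.strip()
    # Classic PDB ids are one digit followed by three alphanumeric characters.
    if not re.fullmatch(r"[0-9][A-Za-z0-9]{3}", pdb_id):
        raise ValueError(f"not a 4-character PDB id: {pdb_id!r}")
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


def download_pdb(pdb_id: str, target_path: str) -> str:
    """Save the coordinate file locally so that it can be uploaded to the portal."""
    urllib.request.urlretrieve(rcsb_pdb_url(pdb_id), target_path)
    return target_path
```

For the test example of Figure 2, download_pdb("3tvf", "3tvf.pdb") would save the file that can then be selected with the browse option.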

This quartet finder program detects all quartets in the given pdb file and generates four separate output files named linear.txt, star.txt, cyclic.txt and semi_cyclic.txt for the linear, star, cyclic and semi-cyclic topologies respectively. These output files contain all instances of the quartets belonging to the respective topological classes, along with detailed information about their location (nucleotide numbers and chain names) and the base pairing geometries between the participating nucleotide bases. The program also annotates each quartet as per our proposed nomenclature rules and reports the proper name of each quartet in the same output files.
Download option: By clicking the “Download” option, the user can download these generated files in .txt format within a zipped folder. The details of the downloadable output files are described below.
In Figure 2 we have given the pdb file name 3tvf; the output is generated and downloaded as 3tvf.zip, which can be extracted for further use.

3tvf.zip will contain 12 output files:

  1. .pdb file (3tvf.pdb): the original uploaded coordinate file in pdb format
  2. .out file (3tvf.out): the BPFind output file, which provides a list of base pairs
  3. .cor file (3tvf.cor): the modified coordinate file, where residue numbers are reassigned by BPFind
  4. .dat file (3tvf.dat): Information about secondary structure of the RNA chain
  5. .fasta file (3tvf.fasta): RNA sequence written in FASTA format
  6. .nup file (3tvf.nup): Base pairing information, used while running NUPARM software to calculate base pairing parameters.
  7. .hlx file (3tvf.hlx): Information about helical regions present in the RNA structure (3tvf.pdb)
  8. linear.txt: List of linear quartets present in the given RNA structure (3tvf.pdb)
  9. star.txt: List of star quartets present in the given RNA structure (3tvf.pdb)
  10. cyclic.txt: List of cyclic quartets present in the given RNA structure (3tvf.pdb)
  11. semi-cyclic.txt: List of semi-cyclic quartets present in the given RNA structure (3tvf.pdb)
  12. triple.txt: List of triples present in the given RNA structure (3tvf.pdb)

Of these 12 files, six (.out, .cor, .dat, .fasta, .nup, .hlx) are generated by the BPFind program, and the user can use them as required. The .txt files correspond to the triples and quartets detected by our program. The output format of each .txt file is explained in Section C.
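
As a rough illustration of how the downloaded archive might be inspected in a script, the sketch below groups the members of a zip file into the coordinate file, the BPFind files and the quartet/triple lists, using only the file extensions listed above. The function name classify_archive and the group labels are illustrative assumptions, not part of QUARNA.

```python
import posixpath
import zipfile

# Extensions of the files that this page attributes to BPFind.
BPFIND_EXTENSIONS = {".out", ".cor", ".dat", ".fasta", ".nup", ".hlx"}


def classify_archive(zip_source):
    """Group archive members by role, based only on their file extension."""
    groups = {"coordinates": [], "bpfind": [], "lists": [], "other": []}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            extension = posixpath.splitext(posixpath.basename(name))[1].lower()
            if extension == ".pdb":
                groups["coordinates"].append(name)
            elif extension in BPFIND_EXTENSIONS:
                groups["bpfind"].append(name)
            elif extension == ".txt":
                groups["lists"].append(name)
            else:
                groups["other"].append(name)
    return groups
```

Calling classify_archive("3tvf.zip") on the archive described above should report one coordinate file, six BPFind files and five .txt lists.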
View option: If the user clicks the “View” option, the same quartet lists mentioned above appear in a new tab. Before the result page is shown, a dialogue box appears, advising the user to continue the script if any warning comes up while the molecular image is loading in the JSMOL window on the result page. Figure 3 shows the result page generated by selecting the View option. At the top left corner of the result page there are four separate tabs corresponding to the four topologies. In Figure 3 the “Linear” tab is selected, so the list below the tabs contains all instances of linear quartets in the pdb file 3tvf. The quartet instances in the list are named according to the proposed nomenclature rules. Here the nucleotide number of each residue is written before the residue name, and the chain id is written within brackets following the residue name. In the JSMOL window on the right-hand side, the whole pdb file is loaded initially. The user can then visualize any particular quartet(s) by selecting the quartet instance(s) from the list and choosing one of the three options, “Locate”, “Context” and “Restrict”, provided just below the list. These options are highlighted within a red box in Figure 3.

...
Figure 3. The result page generated by selecting the View option after completion of Quartet Finder run on pdb file 3tvf.

Figure 3a explains the functionality of the Locate option. On clicking the “Locate” button, the quartet instances selected from the list are highlighted in cartoon view within the whole structure.

...
Figure 3a. Function of Locate option.

Figure 3b explains the functionality of the Context option. On clicking the Context button, the selected quartets are highlighted in ball and stick view and their surrounding residues within a 10 Angstrom radius are highlighted in wire frame view. This view lets the user understand the structural context of the selected quartet(s) inside the whole RNA structure.

...
Figure 3b. Function of Context option.
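
The idea behind the Context view can be reproduced offline with a few lines of Python. This is only a sketch under stated assumptions: the portal's actual JSMOL selection is not documented here, and the sketch assumes that a residue counts as "surrounding" when any of its atoms lies within the radius of any atom of the selected quartet. The function name residues_in_context and the input layout are hypothetical.

```python
import math


def residues_in_context(atoms, selected, radius=10.0):
    """Return residues having at least one atom within `radius` of the selection.

    atoms    : list of (residue_key, (x, y, z)) tuples, one entry per atom
    selected : set of residue keys that make up the chosen quartet(s)
    """
    selected_coords = [xyz for key, xyz in atoms if key in selected]
    context = set()
    for key, xyz in atoms:
        if key in selected:
            continue  # the quartet itself is not part of its own context
        if any(math.dist(xyz, other) <= radius for other in selected_coords):
            context.add(key)
    return context
```

Here a residue key can be any hashable label, for example a (chain id, nucleotide number) tuple read from the pdb file.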

Figure 3c shows the function of the Restrict option. Here the user can select only one quartet at a time to get a restricted view of that particular quartet. By zooming in on the image, the user can examine the interaction geometry and hydrogen bonding pattern.

...
Figure 3c. Function of Restrict option.

Some of the JSMOL functions are provided below the JSMOL window. Other JSMOL functions are available by right-clicking on the JSMOL window. Users who want to manipulate the structures further can follow a standard JSMOL tutorial. For rotating, shifting or zooming in or out, users can follow the standard JSMOL mouse manual available at http://wiki.jmol.org/index.php/Mouse_Manual.


2. Search quartets with specified residues
We have already identified quartets of different topologies, base combinations and geometries present in the four aforementioned datasets.
In this search option the user can specify any four residues of their choice and can also select the topology and the dataset name from the drop-down lists (Figure 4). The search returns a list of quartets composed of the four specified residues.

...
Figure 4. Enlarged view of section 2 (Search quartets with specified residues) of Figure 1.

Here too, at the top right corner of the box, there is a link for a test example. For testing purposes it automatically fills the options in the box with default values. In Figure 4 we have selected the test example: we are searching for all instances of star type quartets in the X-ray dataset composed of two adenine (A), one cytosine (C) and one guanine (G) residues.
The search returns a list of the different star quartets made up of A, A, G and C residues, along with the pdb file names, chain names, residue numbers and proper annotation. The search result contains quartets such as A(ACG) WWT/ssC/HST, G(AAC) SWT/swT/WWC, C(AAG) hsC/ssC/WWC, A(ACG) HHT/ssC/wsT and many more. All these quartets have the specified base composition, but they have different geometries.
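
The notion of "same base composition, different geometries" can be illustrated with a short Python sketch that works on star quartet names such as those quoted above. It assumes the name layout seen in these examples (central base, three terminal bases in parentheses, then the geometry string) and handles only the standard bases A, C, G and U; the function names are illustrative.

```python
import re
from collections import Counter

# Layout inferred from the examples on this page, e.g. "A(ACG) WWT/ssC/HST".
STAR_NAME = re.compile(r"^([ACGU])\(([ACGU]{3})\)\s+(\S+)$")


def star_composition(name):
    """Count the four bases of a star quartet name, ignoring the geometries."""
    match = STAR_NAME.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognised star quartet name: {name!r}")
    central, terminals, _geometries = match.groups()
    return Counter(central + terminals)


def has_composition(name, residues):
    """True if the quartet consists of exactly the given residues, in any order."""
    return star_composition(name) == Counter(residues)
```

With this, has_composition("G(AAC) SWT/swT/WWC", "AACG") and has_composition("A(ACG) WWT/ssC/HST", "AACG") both hold, although the two names describe different geometries.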
The result can be downloaded as a text file by clicking the "Download" option, or viewed in the web interface by clicking the "View" button. Figure 4a displays the result page generated by selecting the "View" option. At the top, the drop-down list marked as (I) contains all pdb files in which quartets with the A, A, C and G base composition are present.

...
Figure 4a. Result generated in view option of “Search Quartets with specified residues”.

The output format of each type of quartet in the downloaded text files is explained in Section C.


3. Quartets present in different databases
We have considered four different datasets to explore and analyze the varieties of quartets. The four datasets are: (i) HDRNAS non-redundant dataset, (ii) NDB non-redundant dataset, (iii) X-ray structure dataset from RCSB-PDB, (iv) NMR structure dataset from RCSB-PDB. Details of the four datasets are given here.

...
Figure 5. Enlarged view of section 3 (Quartets present in different databases) of Figure 1.

In this section the user can download the complete lists of quartets present in the four datasets mentioned above. Figure 5 shows the enlarged view of section 3 of Figure 1. On clicking a dataset name shown in Figure 5, the user can download a .zip folder named after the selected dataset. This folder contains four text files named linear.txt, star.txt, cyclic.txt and semi-cyclic.txt, corresponding to the four topological classes, each giving the full list of instances of that topological class present in the chosen dataset.
For example, HDRNA_quartets.zip contains all quartet instances present in the HDRNAS non-redundant dataset. The pdb file name, residue numbers, chain names and proper nomenclature of each quartet instance are reported in the downloaded text files.


4. Topology wise searching

...
Figure 6. Enlarged view of section 4 (Topology wise searching) of Figure 1.

Figure 6 is the enlarged view of section 4 shown in Figure 1. This section provides a topology-specific searching option. For example, we can go to the search in linear quartet tab. Figure 7 shows the searching options provided for linear quartets. In a linear quartet there is one central base pair and two terminal base pairs. Through this search option the user can retrieve a list of quartets having a specific base pair in either the central or a terminal position. The user can select two bases from the first two drop-down lists and then specify a particular geometry such as W:W, W:H etc. In this search, too, the user needs to select a dataset; the result covers the selected dataset only.

...
Figure 7. Linear quartet finder. This search option helps to find linear quartets in the selected dataset that contain a specific base pairing geometry.

For example, in Figure 7 we are searching for linear quartets in the HDRNAS dataset that have the A:A W:W Trans base pair as the central base pair. This search returns a list in which the central base pair is A:A W:W Trans but the terminal base pairs can be of any geometry, such as GAAU SHT-WWT-HWT, CAAG ssC-WWT-HST, AAAU HHT-WWT-ssC and many other varieties. The pdb names and locations (nucleotide numbers and chain) are also given in the output.
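
The same kind of filter can be sketched in Python for quartet names that have already been downloaded. The sketch assumes, based only on the three example names above, that a linear name lists the four bases in the order terminal, central, central, terminal and that the middle of the three geometry codes belongs to the central pair. The names parse_linear_name and has_central_pair are illustrative.

```python
from typing import NamedTuple, Tuple


class LinearName(NamedTuple):
    bases: str                    # four letters, e.g. "GAAU"
    geometries: Tuple[str, ...]   # three codes, e.g. ("SHT", "WWT", "HWT")


def parse_linear_name(name: str) -> LinearName:
    """Split a name such as 'GAAU SHT-WWT-HWT' into bases and geometry codes."""
    bases, geometry_part = name.split()
    geometries = tuple(geometry_part.split("-"))
    if len(bases) != 4 or len(geometries) != 3:
        raise ValueError(f"unrecognised linear quartet name: {name!r}")
    return LinearName(bases, geometries)


def has_central_pair(name: str, base_a: str, base_b: str, geometry: str) -> bool:
    """Check the two middle bases and the middle geometry code (assumed central).

    The comparison is order- and case-sensitive; no attempt is made to
    normalise asymmetric pairs such as A:G versus G:A.
    """
    parsed = parse_linear_name(name)
    return (parsed.bases[1], parsed.bases[2]) == (base_a, base_b) and \
        parsed.geometries[1] == geometry
```

All three example names satisfy has_central_pair(name, "A", "A", "WWT"), while their terminal base pairs differ.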

...
Figure 8. Star quartet finder. This search option helps to find star quartets in the selected dataset that contain a specific base pairing geometry.

Similarly, in Figure 8 we are searching for all instances of star type quartets in the HDRNAS dataset where the central base is adenine and at least one of the base pairs is A:G H:S Trans.
In Figure 6 there are two other options, "Search in Cyclic Quartet" and "Search in Semi-cyclic Quartet". Their searching methods are intuitive and closely follow those for linear quartets and star quartets respectively.
After completion of the query run, the "Download" and "View" options appear in these cases also. The user can download the search results as text files by choosing the "Download" option and can view the quartet lists with molecular images in a way similar to that shown in Figure 4a.


C. Output formats for different topology

Output for linear quartet:

...
Figure 9. Output format for linear quartets. Here the blue numbers are line numbers and the encircled red numbers are column numbers.

In Figure 9 the first line “----------- 2GCV-----------” denotes the pdb file name in which the following linear quartets are observed. In this file three quartets are observed, labelled 1, 2 and 3.
Corresponding to each quartet there are two lines, shown as 1a and 1b, 2a and 2b etc. The first line (line a) gives the identity and position (nucleotide number and chain id) of the four participating bases, and the second line (line b) gives the name of the quartet, which is assigned by our program as per the proposed nomenclature scheme.
In Figure 9 column numbers are also given for line 2a; the meaning of each column is described below.

Column 1. Nucleotide number of one of the central bases (let's say base 2). Here 37.
Column 2. Identity of base 2. Here G.
Column 3. Chain id of base 2. Here B.
Column 4. Nucleotide number of the other central base (let's say base 3). Here 52.
Column 5. Identity of base 3. Here C.
Column 6. Chain id of base 3. Here B.
Column 7. Base pairing interaction between base 2 and base 3.
Column 8. Nucleotide number of one of the terminal bases (let's say base 1, which is interacting with base 2). Here 104.
Column 9. Identity of base 1. Here A.
Column 10. Chain id of base 1. Here B.
Column 11. Base pairing interaction between base 1 and base 2.
Column 12. Nucleotide number of the other terminal base (let's say base 4). Here 128.
Column 13. Identity of base 4. Here G.
Column 14. Chain id of base 4. Here B.
Columns 15, 16 and 17 give the nucleotide number, base name and chain id of base 3 (one of the central bases), with which base 4 is interacting.
Column 18. Base pairing interaction between base 4 and base 3.
The name given in line 2b is generated following the nomenclature rules given on the terms and definitions page.
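
The overall file layout described for Figure 9 (a dashed header line per pdb file followed by a fixed number of lines per quartet) can be read with a small Python sketch such as the one below. It assumes that header lines begin and end with at least three dashes and that blank lines can be ignored; the function name group_by_structure is illustrative, and the content of the record lines is left unparsed.

```python
import re

# A header such as "----------- 2GCV-----------" or "----------- 3R8S.pdb-----------".
HEADER = re.compile(r"^-{3,}\s*(.+?)\s*-{3,}$")


def group_by_structure(lines, lines_per_quartet=2):
    """Map each pdb name to a list of quartet records (tuples of raw lines)."""
    groups = {}
    current = None
    pending = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            if current.lower().endswith(".pdb"):
                current = current[:-4]  # drop the optional ".pdb" suffix
            groups.setdefault(current, [])
            pending = []
            continue
        if current is None:
            raise ValueError("record found before any structure header")
        pending.append(line)
        if len(pending) == lines_per_quartet:
            groups[current].append(tuple(pending))
            pending = []
    return groups
```

For the cyclic output, which uses three lines per quartet, the same function could be called with lines_per_quartet=3.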

Output for star quartet:

...
Figure 10. Output format for star quartets. Here the blue numbers are line numbers and the encircled red numbers are column numbers.

In Figure 10 too, the first line “----------- 3R8S.pdb-----------” denotes the pdb file name in which the following star quartets are observed. In this example five quartets are observed, labelled 1, 2, 3, 4 and 5.
Corresponding to each quartet there are two lines, shown as 1a and 1b, 2a and 2b etc. The first line (line a) gives the identity and position (nucleotide number and chain id) of the four participating bases, and the second line (line b) gives the name of the quartet, which is assigned by our program as per the proposed nomenclature scheme.
In Figure 10 column numbers are also given for line 2a; the meaning of each column is described below.

Column 1. Nucleotide number of the central base (let's say base 1). Here 480.
Column 2. Identity of base 1. Here A.
Column 3. Chain id of base 1. Here A.
Column 4. Nucleotide number of one of the terminal bases (let's say base 2). Here 505.
Column 5. Identity of base 2. Here A.
Column 6. Chain id of base 2. Here A.
Column 7. Base pairing interaction between base 1 (central base) and base 2. Here W:WT.
Column 8. Nucleotide number of another terminal base (let's say base 3). Here 476.
Column 9. Identity of base 3. Here G.
Column 10. Chain id of base 3. Here A.
Column 11. Base pairing interaction between base 1 (central base) and base 3. Here H:ST.
Column 12. Nucleotide number of the remaining terminal base (let's say base 4). Here 499.
Column 13. Identity of base 4. Here U.
Column 14. Chain id of base 4. Here A.
Column 15. Base pairing interaction between base 1 (central base) and base 4. Here s:sC.
The name given in line 2b is generated following the nomenclature rules given on the terms and definitions page.
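
The fifteen columns of a star quartet line can be turned into a small data structure as sketched below. The sketch assumes that the columns are separated by whitespace, which cannot be confirmed from this page alone; the class and function names are illustrative. Nucleotide numbers are kept as text rather than converted to integers.

```python
from typing import NamedTuple, Tuple


class Residue(NamedTuple):
    number: str   # nucleotide number, kept as text
    base: str     # base identity, e.g. "A"
    chain: str    # chain id


class StarRecord(NamedTuple):
    central: Residue
    partners: Tuple[Tuple[Residue, str], ...]   # (terminal base, geometry) x 3


def parse_star_line(line: str) -> StarRecord:
    """Parse line 'a' of a star quartet, assuming whitespace-separated columns."""
    fields = line.split()
    if len(fields) != 15:
        raise ValueError(f"expected 15 columns, found {len(fields)}")
    central = Residue(*fields[0:3])
    # Columns 4-7, 8-11 and 12-15 each hold one terminal base plus its geometry.
    partners = tuple(
        (Residue(*fields[start:start + 3]), fields[start + 3])
        for start in (3, 7, 11)
    )
    return StarRecord(central, partners)
```

Using the column values quoted above, parse_star_line("480 A A 505 A A W:WT 476 G A H:ST 499 U A s:sC") yields residue 480 of chain A as the central base, with partners 505, 476 and 499.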

Output for cyclic-4 quartet:

...
Figure 11. Output format for cyclic-4 quartets. Here the blue numbers are line numbers and the encircled red numbers are column numbers.

In Figure 11 too, the first line “----------- 1XJR-----------” denotes the pdb file name in which the following cyclic quartets are observed. In this example one quartet is observed, labelled 1. A cyclic quartet can be represented as the combination of two triples, where the terminal bases are shared by both triples.
Corresponding to each cyclic-4 quartet there are thus three lines, shown as 1a, 1b and 1c. The first two lines (line a and line b) give the identity and position (nucleotide number and chain id) of the participating bases in the two triples respectively, and the third line (line c) gives the name of the quartet, which is assigned by our program as per the proposed nomenclature scheme.
In Figure 11 column numbers are also given for lines 1a and 1b; the meaning of each column is described below.
For line 1a:

Column 1. Nucleotide number of the central base (let's say base 1) of triple 1. Here 19.
Column 2. Identity of base 1. Here G.
Column 3. Chain id of base 1. Here A.
Column 4. Nucleotide number of one of the terminal bases (let's say base 2) of triple 1. Here 31.
Column 5. Identity of base 2. Here C.
Column 6. Chain id of base 2. Here A.
Column 7. Base pairing interaction between base 1 (central base) and base 2. Here W:WC.
Column 8. Nucleotide number of the other terminal base (let's say base 3). Here 20.
Column 9. Identity of base 3. Here C.
Column 10. Chain id of base 3. Here A.
Column 11. Base pairing interaction between base 1 (central base) and base 3. Here S:SC.

For line 1b:

Column 1. Nucleotide number of the central base of triple 2 (let's say base 4 of the quartet). Here 18.
Column 2. Identity of base 4. Here G.
Column 3. Chain id of base 4. Here A.
Column 4. Nucleotide number of one of the terminal bases of triple 2. Here 20. This terminal base is shared with triple 1 (base 3 of triple 1).
Column 5. Identity of base 3. Here C.
Column 6. Chain id of base 3. Here A.
Column 7. Base pairing interaction between base 4 (central base of triple 2) and base 3. Here W:WC.
Column 8. Nucleotide number of the other terminal base of triple 2. Here 31. This terminal base is shared with triple 1 (base 2 of triple 1).
Column 9. Identity of base 2. Here C.
Column 10. Chain id of base 2. Here A.
Column 11. Base pairing interaction between base 4 (central base of triple 2) and base 2. Here S:SC.
The name given in line 1c is generated following the nomenclature rules given on the terms and definitions page.
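
The statement that the two triples of a cyclic-4 quartet share their terminal bases can be checked programmatically. The sketch below assumes whitespace-separated columns in lines a and b, as in the column descriptions above; the function names parse_triple_line and is_consistent_cyclic4 are illustrative.

```python
def parse_triple_line(line):
    """Return (central, terminals) for an 11-column triple line.

    central   : (number, base, chain) of the central base
    terminals : dict mapping each terminal (number, base, chain) to its geometry
    """
    fields = line.split()
    if len(fields) != 11:
        raise ValueError(f"expected 11 columns, found {len(fields)}")
    central = tuple(fields[0:3])
    terminals = {
        tuple(fields[3:6]): fields[6],
        tuple(fields[7:10]): fields[10],
    }
    return central, terminals


def is_consistent_cyclic4(line_a, line_b):
    """True if the two triples have different centres but the same two terminals."""
    central_a, terminals_a = parse_triple_line(line_a)
    central_b, terminals_b = parse_triple_line(line_b)
    return central_a != central_b and set(terminals_a) == set(terminals_b)
```

With the values quoted above, is_consistent_cyclic4("19 G A 31 C A W:WC 20 C A S:SC", "18 G A 20 C A W:WC 31 C A S:SC") holds, since residues 20 and 31 appear as terminals in both triples.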

Output for cyclic-3 (semi-cyclic) quartet:

...
Figure 12. Output format for cyclic-3 quartets.

In Figure 12 too, the pdb file names are given at the top. Each quartet is then represented by two lines, the first starting with “quad:” and the second starting with “triplet:”.
The “quad:” line is the same as line 2b of the star quartet output described above. The first residue of the “quad:” line is the central base of the semi-cyclic quartet.
The “triplet:” line represents the cyclic triple part of the semi-cyclic quartet, in which one of the terminal bases of the star quartet (given in the “quad:” line) is the central base of the triple. The first residue of the “triplet:” line is therefore the central base of the cyclic triple part, which is otherwise one of the terminal bases of the quartet.
The third line is the name of the semi-cyclic quartet, generated following the nomenclature rules given on the terms and definitions page.


QUARNA version 1.0 © CCNSB, IIIT Hyderabad