EIYBrowse.tracks.genes module

The genes module defines a track for plotting the positions of genes.

class EIYBrowse.tracks.genes.GeneRows

Bases: object

Class for assigning genes to vertical rows without overlapping.

Create a new GeneRows object.

add_gene(gene, start, stop)

Add a gene to the internal store.

Calls the get_gene_row() method to find a row in which the gene does not overlap any gene already added. The gene is then added to the returned row, and the row’s ‘stop’ key is updated to reflect the new rightmost position in the row.

Parameters:
  • gene (pybedtools.Interval) – Gene object to be plotted
  • start (float) – Start position of the gene in axis co-ordinates
  • stop (float) – Stop position of the gene in axis co-ordinates
get_gene_row(start)

Go through all of the existing rows and return the first row where the start position of the new gene would not overlap the stop position of any existing genes in that row.

If no existing row is found, make a new empty row and return that.

Parameters:
  • start (float) – Start position of the gene in axis co-ordinates
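
The first-fit behaviour described for add_gene() and get_gene_row() can be illustrated with a minimal sketch. This is not the EIYBrowse source: the class name GeneRowsSketch, the ‘genes’ key and the strict comparison are assumptions made for illustration; only the ‘stop’ key is mentioned in the documentation above.

```python
class GeneRowsSketch:
    """Illustrative first-fit row assignment (hypothetical, not the EIYBrowse code)."""

    def __init__(self):
        # Each row is a dict holding its genes and the rightmost occupied position.
        self.rows = []

    def get_gene_row(self, start):
        # Return the first existing row whose rightmost gene ends before `start`.
        for row in self.rows:
            if start > row["stop"]:
                return row
        # No existing row has room, so create a new empty row and return it.
        new_row = {"genes": [], "stop": float("-inf")}
        self.rows.append(new_row)
        return new_row

    def add_gene(self, gene, start, stop):
        # Place the gene in the first free row and record the new rightmost position.
        row = self.get_gene_row(start)
        row["genes"].append(gene)
        row["stop"] = stop


rows = GeneRowsSketch()
# Plain strings stand in for pybedtools.Interval objects here.
for name, start, stop in [("A", 0.0, 0.4), ("B", 0.3, 0.6), ("C", 0.5, 0.9)]:
    rows.add_gene(name, start, stop)

layout = [row["genes"] for row in rows.rows]
print(layout)
```

In this example gene B overlaps gene A and is pushed to a second row, while gene C starts after A ends and so shares the first row with it.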
class EIYBrowse.tracks.genes.GeneTrack(datafile, color='#377eb8', name=None, name_rotate=False, **kwargs)

Bases: EIYBrowse.tracks.base.FileTrack

Track for displaying the position of genes and their introns/exons.

The genes track needs enough vertical space to display all of the genes in the requested region, so the list of genes and their positions must be retrieved before the figure axes are set up. The track does this in the get_config() method, which is always called before the plot is initiated.

Once the list of genes is retrieved, we need to decide how to arrange them without them overlapping. The most difficult part of this is ensuring that the name labels don’t overlap.

To create a new gene track:

Parameters:
  • datafile – Object providing access to the names and locations of genes. At the moment only GffutilsDb objects are supported.
  • color (str) – Color specifier for the gene icons
  • name (str) – Optional name label
  • name_rotate (bool) – Whether to rotate the name label 90 degrees
get_config(region, browser)

Calculate the number of vertical rows needed in the axis that will be assigned to this track.

Genes are retrieved by calling the EIYBrowse.filetypes.gffutils_db.GffutilsDb.get_genes() method of the backend. The private _get_gene_extents() method then iterates over the genes found and returns the start and stop of each gene as plotted (including the name label). Each gene is added to self.gene_rows, a GeneRows object that assigns each gene to a vertical row so that none of them overlap.

Once all the genes have been added, we return the total number of rows needed to arrange them without overlaps by calling total_rows().

Parameters:
  • region (pybedtools.Interval) – Genomic interval to plot genes over.
  • browser (Browser) – Parent browser object that will create the new plotting axis.
plot_exon(plot_ax, exon, row_index=0)

Plot the exon to the plotting axes.

First we get the extent of the exon in axis co-ordinates.

We then determine the vertical position of the exon. First, the vertical span of each row is calculated by dividing the extent of the whole axis (which is always 1) by the total number of rows. The top of the exon is then given by the top position of the current row (row_index) minus one fifth of the row span, and the bottom is given by the top position of the current row minus three fifths of the row span.

Parameters:
  • plot_ax (matplotlib.axes.AxesSubplot) – Axes to plot the exon on
  • exon (pybedtools.Interval) – Exon object to be plotted
  • row_index (float) – Vertical position of the row on which the exon is to be plotted
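
The vertical placement arithmetic for exons can be written out as a short sketch. The function name exon_vertical_extent and the explicit total_rows argument are hypothetical; the fractions are those given in the description of plot_exon(), and row_index is taken to be the top of the row in axis co-ordinates.

```python
def exon_vertical_extent(row_index, total_rows):
    """Return (bottom, top) of an exon box in axis co-ordinates (illustrative only)."""
    # The whole axis has a vertical extent of 1, shared equally between rows.
    row_span = 1.0 / total_rows
    # The exon box runs from one fifth to three fifths of a row span below the row top.
    top = row_index - row_span / 5
    bottom = row_index - 3 * row_span / 5
    return bottom, top


# With four rows, each row spans 0.25 of the axis.
bottom, top = exon_vertical_extent(row_index=1.0, total_rows=4)
print(bottom, top)
```

For the topmost of four rows this gives a box running from roughly 0.85 to 0.95, so each exon occupies two fifths of its row’s height.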
plot_gene_body(plot_ax, gene, row_index=0)

Plot the body of the gene to the plotting axes.

First we get the extent of the gene in axis co-ordinates.

We then determine the vertical position of the gene. First, the vertical span of each row is calculated by dividing the extent of the whole axis (which is always 1) by the total number of rows. The position of the gene is then given by the top position of the current row (row_index) minus two fifths of the row span.

Parameters:
  • plot_ax (matplotlib.axes.AxesSubplot) – Axes to plot the gene body on
  • gene (pybedtools.Interval) – Gene object to be plotted
  • row_index (float) – Vertical position of the row on which the gene is to be plotted
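
As a worked example of the gene body position, the following sketch applies the two-fifths offset described above. The function name gene_body_height and the total_rows argument are hypothetical and not part of the EIYBrowse API.

```python
def gene_body_height(row_index, total_rows):
    """Return the vertical position of the gene body line (illustrative only)."""
    # Each row takes an equal share of the axis, whose full extent is 1.
    row_span = 1.0 / total_rows
    # The gene body sits two fifths of a row span below the top of the row.
    return row_index - 2 * row_span / 5


# With two rows, a row whose top is at 0.5 has its gene body near 0.3.
y = gene_body_height(row_index=0.5, total_rows=2)
print(y)
```

Two fifths is the midpoint of the one-fifth and three-fifths offsets given for plot_exon(), so the gene body line passes through the vertical centre of the exon boxes.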
plot_gene_dict(plot_ax, gene_dict, row_index=0, plot_exons=True)

Plot the gene dictionary to the plotting axes.

The ‘gene’ key of gene_dict should contain a pybedtools.Interval representing the entire gene, which is passed to plot_gene_body().

The ‘exons’ key should contain a list of pybedtools.Interval objects representing each individual exon of the gene’s longest isoform, which are passed to plot_exon() if plot_exons is True.

Finally, the name label of the gene is plotted by plot_name().

Parameters:
  • plot_ax (matplotlib.axes.AxesSubplot) – Axes to plot the gene on
  • gene_dict (dict) – Dictionary containing the details of the gene to be plotted
  • row_index (float) – Vertical position of the row on which the gene is to be plotted
  • plot_exons (bool) – Whether to plot the exons
plot_name(plot_ax, start, name, row_index=0)

Plot the name label of the gene.

Parameters:
  • plot_ax (matplotlib.axes.AxesSubplot) – Axes to plot the gene name label on
  • start (float) – Start position of the gene in axis co-ordinates
  • name (str) – Name of the gene to be plotted
  • row_index (float) – Vertical position of the row on which the gene is to be plotted
total_rows()

The number of rows needed to plot the current set of genes.

If there is no current set of genes, just return 1 (as we can’t take up 0 vertical space).
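
The fallback can be sketched as follows. The function name total_rows_sketch and the representation of the rows as a list (or None when no genes have been retrieved) are assumptions for illustration.

```python
def total_rows_sketch(gene_rows):
    """Return the number of rows to allocate, never fewer than 1 (illustrative only)."""
    # With no genes retrieved yet (None or an empty list), reserve one row anyway,
    # because a track cannot occupy zero vertical space.
    if not gene_rows:
        return 1
    return len(gene_rows)


print(total_rows_sketch(None), total_rows_sketch([]), total_rows_sketch(["row0", "row1"]))
```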

EIYBrowse.tracks.genes.get_start_stop_on_axes(axes, interval)

Given a set of axes and a genomic interval, return the start and stop of the interval in axes co-ordinates.

Parameters:
  • axes (matplotlib.axes.AxesSubplot) – Axes in whose co-ordinates the interval is expressed
  • interval (pybedtools.Interval) – Genomic interval for which to determine the extent in axis co-ordinates
Returns:

Interval start and stop in axes co-ordinates (pair of floats).
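
One way to picture this conversion is as a linear rescaling of genomic positions onto the 0 to 1 range of the axes. The sketch below assumes a linear x-axis whose limits are genomic co-ordinates; the function name start_stop_on_axes and its xlim argument are hypothetical, and the actual implementation may instead use matplotlib transforms.

```python
def start_stop_on_axes(xlim, start, stop):
    """Map genomic start/stop onto 0-1 axes co-ordinates (illustrative only)."""
    # xlim is the (left, right) data range of the x-axis, for example the
    # pair returned by matplotlib's Axes.get_xlim().
    x_min, x_max = xlim
    width = x_max - x_min
    # Linear rescaling: the left edge maps to 0 and the right edge to 1.
    return (start - x_min) / width, (stop - x_min) / width


# An interval covering 1250-1500 on an axis that shows 1000-2000.
axis_start, axis_stop = start_stop_on_axes((1000, 2000), 1250, 1500)
print(axis_start, axis_stop)
```

Here the interval starts a quarter of the way along the axis and ends halfway along it.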