oggmap.of2orthomap module

Author: Kristian K Ullrich
Date: February 2025
Email: ullrich@evolbio.mpg.de
License: GPL-3

oggmap.of2orthomap.add_argparse_args(parser: ArgumentParser)

This function attaches individual argument specifications to the parser.

Parameters:

parser (argparse.ArgumentParser) – An argparse.ArgumentParser.

oggmap.of2orthomap.define_parser()

A helper function for using of2orthomap.py via the terminal.

Returns:

An argparse.ArgumentParser.

Return type:

argparse.ArgumentParser
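The split between define_parser() and add_argparse_args() follows a common argparse pattern: one function builds the parser, another registers the arguments on it. The sketch below illustrates that pattern only. The function names and the option names (--seqname, --qt, --out, --overwrite) are hypothetical and chosen for illustration; they are not confirmed to match the real command-line options of of2orthomap.

```python
import argparse


def add_example_args(parser: argparse.ArgumentParser) -> None:
    """Attach illustrative arguments (hypothetical names, not oggmap's real CLI)."""
    parser.add_argument("--seqname", required=True, help="query species name")
    parser.add_argument("--qt", required=True, help="query species taxID")
    parser.add_argument("--out", default=None, help="output file path")
    parser.add_argument("--overwrite", action="store_true", help="overwrite output")


def define_example_parser() -> argparse.ArgumentParser:
    """Build a parser and delegate argument registration to the helper."""
    parser = argparse.ArgumentParser(
        prog="example",
        description="Sketch of a two-step parser setup.",
    )
    add_example_args(parser)
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = define_example_parser().parse_args(
    ["--seqname", "7955.danio_rerio.pep", "--qt", "7955"]
)
print(args.seqname, args.qt, args.out, args.overwrite)
```

Keeping argument registration in its own function lets the same arguments be attached to a subparser of a larger command-line tool without duplicating their definitions.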

oggmap.of2orthomap.get_continuity_score(og_name, youngest_common_counts_df)

This function calculates a continuity score for a given orthologous group and its corresponding LCA counts.

Parameters:
  • og_name (str) – Orthologous group name.

  • youngest_common_counts_df (pandas.DataFrame) – DataFrame with LCA counts.

Returns:

Continuity score.

Return type:

float
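The exact formula behind the continuity score is not stated here. As a rough illustration of what such a score could measure, the sketch below assumes it is the fraction of phylostrata, from the oldest stratum in which the orthologous group is present down to the query species, that contain at least one member. This definition, the function name continuity_sketch, and the input layout (a plain list of counts ordered from oldest to youngest stratum) are assumptions and may differ from what get_continuity_score actually computes.

```python
def continuity_sketch(counts_per_stratum):
    """Fraction of strata with members, starting at the oldest occupied stratum.

    counts_per_stratum: member counts ordered from oldest to youngest
    phylostratum (hypothetical input layout, for illustration only).
    """
    present = [count > 0 for count in counts_per_stratum]
    if not any(present):
        # The group has no members in any stratum.
        return 0.0
    # Ignore strata older than the first one in which the group appears.
    first_occupied = present.index(True)
    window = present[first_occupied:]
    return sum(window) / len(window)


# Group first appears in the second stratum and is missing from the third.
score = continuity_sketch([0, 3, 0, 2, 1])
print(score)
```

Under this assumed definition, a score of 1.0 means the group is found in every stratum since its first appearance, and lower values indicate gaps along the lineage.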

Example

>>>
oggmap.of2orthomap.get_counts_per_ps(omap_df, psnum_col='PSnum', pstaxid_col='PStaxID', psname_col='PSname')

This function returns counts per phylostratum.

Parameters:
  • omap_df (pandas.DataFrame) – DataFrame with orthomap results.

  • psnum_col (str) – Specify PSnum column name.

  • pstaxid_col (str) – Specify PStaxID column name.

  • psname_col (str) – Specify PSname column name.

Returns:

DataFrame with counts per phylostratum.

Return type:

pandas.DataFrame

Example

>>> from oggmap import datasets, of2orthomap, qlin
>>> datasets.ensembl113_last(datapath='.')
>>> query_orthomap = of2orthomap.get_orthomap(
...     seqname='7955.danio_rerio.pep',
...     qt='7955',
...     sl='ensembl_113_orthofinder_last_species_list.tsv',
...     oc='ensembl_113_orthofinder_last_Orthogroups.GeneCount.tsv',
...     og='ensembl_113_orthofinder_last_Orthogroups.tsv',
...     out=None,
...     quiet=False,
...     continuity=True,
...     overwrite=True)
>>> of2orthomap.get_counts_per_ps(
...     omap_df=query_orthomap[0],
...     psnum_col='PSnum',
...     pstaxid_col='PStaxID',
...     psname_col='PSname')
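To make the idea of counting per phylostratum concrete without downloading any data, here is a dependency-free sketch. The real function operates on a pandas.DataFrame; this version uses a list of dictionaries instead. The function name counts_per_ps, the geneID key, the counts key, and the toy rows are all invented for illustration.

```python
from collections import Counter

# Toy orthomap rows; gene IDs and stratum assignments are invented.
rows = [
    {"geneID": "g1", "PSnum": 0, "PStaxID": "131567", "PSname": "cellular organisms"},
    {"geneID": "g2", "PSnum": 0, "PStaxID": "131567", "PSname": "cellular organisms"},
    {"geneID": "g3", "PSnum": 1, "PStaxID": "2759", "PSname": "Eukaryota"},
]


def counts_per_ps(rows, psnum_col="PSnum", pstaxid_col="PStaxID", psname_col="PSname"):
    """Count rows per (PSnum, PStaxID, PSname) combination, sorted by PSnum."""
    counter = Counter(
        (row[psnum_col], row[pstaxid_col], row[psname_col]) for row in rows
    )
    return [
        {psnum_col: key[0], pstaxid_col: key[1], psname_col: key[2], "counts": n}
        for key, n in sorted(counter.items())
    ]


result = counts_per_ps(rows)
for entry in result:
    print(entry)
```

The three column-name parameters mirror those of get_counts_per_ps, so the same call shape works if the input uses different column names.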
oggmap.of2orthomap.get_orthomap(seqname, qt, sl, oc, og, out=None, quiet=False, continuity=True, overwrite=True, ncbi=None, dbname=None)

This function returns an orthomap for a given query species and OrthoFinder input data.

Parameters:
  • seqname (str) – Sequence name of the query species used for OrthoFinder comparison.

  • qt (str) – Query species taxID.

  • sl (str) – Path to species list file containing <OrthoFinder name><tab><species taxID>.

  • oc (str) – Path to OrthoFinder result <Orthogroups.GeneCount.tsv> file.

  • og (str) – Path to OrthoFinder result <Orthogroups.tsv> file.

  • out (str) – Path to output file.

  • quiet (bool) – Specify if output should be quiet.

  • continuity (bool) – Specify if continuity score should be calculated.

  • overwrite (bool) – Specify if output should be overwritten.

  • ncbi (dict) – The NCBI taxonomic database.

  • dbname (str) – Specify taxadb.sqlite file.

Returns:

A list containing, in order: the orthomap, the species list, and the youngest common counts.

Return type:

list

Example

>>> from oggmap import datasets, of2orthomap, qlin
>>> datasets.ensembl113_last(datapath='.')
>>> query_orthomap, orthofinder_species_list, of_species_abundance = of2orthomap.get_orthomap(
...     seqname='7955.danio_rerio.pep',
...     qt='7955',
...     sl='ensembl_113_orthofinder_last_species_list.tsv',
...     oc='ensembl_113_orthofinder_last_Orthogroups.GeneCount.tsv.zip',
...     og='ensembl_113_orthofinder_last_Orthogroups.tsv.zip',
...     out=None,
...     quiet=False,
...     continuity=True,
...     overwrite=True,
...     dbname='taxadb.sqlite')
>>> query_orthomap
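The sl parameter expects a species list with one <OrthoFinder name><tab><species taxID> pair per line. The sketch below shows how such a file could be read with the standard library, which can help when checking that seqname matches an entry in the list. The helper name read_species_list and the second entry (9606.homo_sapiens.pep) are invented for illustration, and the sketch assumes the file has no header line.

```python
import csv
import io

# Invented two-line species list; a real file would be opened from disk instead.
species_list_text = "7955.danio_rerio.pep\t7955\n9606.homo_sapiens.pep\t9606\n"


def read_species_list(handle):
    """Map each OrthoFinder species name to its taxID (kept as a string)."""
    reader = csv.reader(handle, delimiter="\t")
    # Skip blank or malformed lines that lack the two expected fields.
    return {row[0]: row[1] for row in reader if len(row) >= 2}


species_to_taxid = read_species_list(io.StringIO(species_list_text))
print(species_to_taxid)
```

Keeping taxIDs as strings matches the way qt is passed to get_orthomap in the example above (qt='7955').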
oggmap.of2orthomap.get_youngest_common_counts(qlineage, species_list)

This function returns LCA counts for a given query species lineage.

Parameters:
  • qlineage (list) – Query lineage information.

  • species_list (pandas.DataFrame) – Species list.

Returns:

DataFrame with LCA counts.

Return type:

pandas.DataFrame
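As an illustration of what LCA counts along a query lineage can look like, the sketch below assumes that each species is assigned to the youngest node it shares with the query lineage, and that the counts report how many species fall on each node. That interpretation, the helper name youngest_common, the species labels, and the heavily abbreviated lineages are assumptions made for this example; the real function works on a pandas.DataFrame.

```python
from collections import Counter


def youngest_common(query_lineage, other_lineage):
    """Return the last taxID shared by two root-to-tip lineages, or None."""
    shared = None
    for query_node, other_node in zip(query_lineage, other_lineage):
        if query_node != other_node:
            break
        shared = query_node
    return shared


# Abbreviated root-to-tip lineages of NCBI taxIDs (many ranks omitted).
query_lineage = [1, 131567, 2759, 33208, 7742, 7955]  # Danio rerio
species_lineages = {
    "zebrafish": [1, 131567, 2759, 33208, 7742, 7955],
    "human": [1, 131567, 2759, 33208, 7742, 9606],
    "fly": [1, 131567, 2759, 33208, 7227],
    "yeast": [1, 131567, 2759, 4932],
}

# Count how many species share each node as their youngest common node.
lca_counts = Counter(
    youngest_common(query_lineage, lineage) for lineage in species_lineages.values()
)
# Report every node of the query lineage, including those with zero species.
counts_by_node = [(taxid, lca_counts.get(taxid, 0)) for taxid in query_lineage]
print(counts_by_node)
```

Nodes with a count of zero are kept in the result so that every phylostratum of the query lineage is represented.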

Example

>>>
oggmap.of2orthomap.main()

The main function, called when of2orthomap is run from the terminal.