`diamond()`, `set_diamond()`, `diamond_best()`, and `diamond_rec()` enable a massive speed-up in pairwise sequence alignment functionalities.
- a logo is now set for `orthologr`
- DIAMOND2 is now used by default in `dNdS()` and `divergence_stratigraphy()` unless BLAST is specified (`aligner = "blast"`).
- the `divergence_stratigraphy()` and `divergence_map()` functions now include the parameter `n_quantile`, which lets users choose the number of quantiles to generate for the divergence map. Setting `n_quantile` to a value greater than 10 yields a higher-resolution divergence map. Alternatively, this parameter can resolve the issue of empty divergence strata when deciling the dNdS values of closely related organisms with dNdS = 0 for over 10% of the genes.
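
The effect of the number of quantiles on zero-inflated dNdS values can be illustrated with a minimal base R sketch. It does not call `orthologr` and is not the package's internal implementation; the `dnds` vector and the helper `quantile_breaks()` are hypothetical and exist only for this illustration. When more than 10% of the values are zero, several decile boundaries coincide, whereas a smaller number of quantiles gives distinct boundaries.

```r
# Hypothetical dNdS values: 30 of 100 genes have dNdS = 0
dnds <- c(rep(0, 30), seq(0.01, 0.70, length.out = 70))

# Hypothetical helper (not part of orthologr): quantile boundaries for n bins
quantile_breaks <- function(x, n_quantile) {
  stats::quantile(x,
                  probs = seq(0, 1, length.out = n_quantile + 1),
                  names = FALSE)
}

# Deciles: the lowest boundaries are all 0, so some bins collapse
decile_breaks <- quantile_breaks(dnds, 10)
has_tied_deciles <- anyDuplicated(decile_breaks) > 0

# Three quantiles: all boundaries are distinct, so every bin receives genes
tercile_breaks <- quantile_breaks(dnds, 3)
strata <- cut(dnds, breaks = tercile_breaks,
              include.lowest = TRUE, labels = FALSE)
table(strata)
```

Passing the tied decile boundaries to `cut()` directly would fail because the breaks are not unique, which is one way empty strata can show up in practice.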
- new function `check_annotation()` helps to detect corrupt GFF or GTF annotation files and removes such outlier lines
- the `generate_ortholog_tables()` and `retrieve_longest_isoforms()` functions now include the new `check_annotation()` function to capture corrupt GFF or GTF files and fix them
- adding a new argument `of_path` to `orthofinder2()` to allow users to specify the path to their locally installed `orthofinder` executable
- adding a new argument `task` to `map_generator_lnc()` and `orthologs_lnc()` to allow users to use the full `blastn` range provided by `blast_nucleotide_to_nucleotide()`
- adding a new argument `path` to `map_generator_lnc()` to allow users to specify their local installation path of BLAST
- new function `plot_pairwise_orthologs()` allows users to plot pairwise ortholog tables for multiple pairwise comparisons
- new function `retrieve_core_orthologs()` allows users to retrieve a core set of orthologous gene loci from several pairwise ortholog tables
- new functions `generate_ortholog_tables()` and `generate_ortholog_tables_all()` allow users to generate ortholog tables by gene locus and splice variant for a set of species
- new function `retrieve_longest_isoforms_all()` allows users to specify folders and retrieve the longest splice variants for all proteomes stored in a folder
- new functions `translate_cds_to_protein()` and `translate_cds_to_protein_all()` translate coding sequences into amino acid sequences for single or multiple files
- in `orthologs()` the default value of `delete_corrupt_cds` changed from `delete_corrupt_cds = TRUE` to `delete_corrupt_cds = FALSE` to be consistent with `dNdS()` and `divergence_stratigraphy()`
- the `divergence_stratigraphy()` function received a new argument `dnds_est.method`, which allows users to select different dNdS estimation methods when running `divergence_stratigraphy()` (suggested by Momir Futo)
- the `divergence_stratigraphy()` function now allows users to change the `eval` argument, which previously wasn't passed down to the `dNdS()` call within the function (many thanks to Momir Futo)
- the function `map.generator()` was renamed to `map_generator_dnds()` to be more consistent with the notation of other functions
- the function `map.generator.lnc()` was renamed to `map_generator_lnc()` to be more consistent with the notation of other functions
- the function `DivergenceMap()` was renamed to `divergence_map()` to be more consistent with the notation of other functions
- the function `orthologs.lnc` was renamed to `orthologs_lnc` to be more consistent with the notation of other functions
- the function `OF2CoreOrthologs()` was renamed to `orthofinder2_retrieve_core_orthologs()`
- the function `advanced_blast()` is no longer supported and thus no longer available to users (please consult the metablastr package in case you need this functionality)
- the function `advanced_makedb()` is no longer supported and thus no longer available to users (please consult the metablastr package in case you need this functionality)
- the function `blast.nr()` is no longer supported and thus no longer available to users (please consult the metablastr package in case you need this functionality)
- the function `delta.blast()` is no longer supported and thus no longer available to users (please consult the metablastr package in case you need this functionality)
- the function `ProteinOrtho()` is no longer supported and thus no longer available to users
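
The idea behind retrieving a core set of orthologous gene loci from several pairwise ortholog tables, as described for `retrieve_core_orthologs()` above, can be sketched in base R as an intersection over the query gene IDs. This is only an illustration under assumptions: the tables, the column names `query_id` and `subject_id`, and the species labels are hypothetical, and the actual function may work differently.

```r
# Hypothetical pairwise ortholog tables (query species vs. three subject species)
pairwise_tables <- list(
  species_b = data.frame(query_id = c("g1", "g2", "g3", "g4"),
                         subject_id = c("b1", "b2", "b3", "b4"),
                         stringsAsFactors = FALSE),
  species_c = data.frame(query_id = c("g2", "g3", "g4", "g5"),
                         subject_id = c("c2", "c3", "c4", "c5"),
                         stringsAsFactors = FALSE),
  species_d = data.frame(query_id = c("g1", "g3", "g4"),
                         subject_id = c("d1", "d3", "d4"),
                         stringsAsFactors = FALSE)
)

# Core set: query genes that have an ortholog in every pairwise comparison
core_ids <- Reduce(intersect,
                   lapply(pairwise_tables, function(tbl) tbl$query_id))

# Restrict each pairwise table to the core set
core_tables <- lapply(pairwise_tables,
                      function(tbl) tbl[tbl$query_id %in% core_ids, ])
```

Here only genes present in all three tables survive, so every restricted table has the same set of query genes.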
- new function `retrieve_longest_isoforms()` enables retrieval of the longest isoforms from a proteome file and saves the results as a fasta file for downstream analyses
- new function `OF2CoreOrthologs()` to retrieve core orthologs across multiple species from Orthofinder2 output
- new function `extract_features()`: helper function to extract gene loci and splice variant IDs from GFF files
- new function `filter_best_hits()`: helper function to select the best BLAST hit based on minimum e-value
- new function `generate_ortholog_tables()`: generate ortholog tables by gene locus and splice variant
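
Selecting the best hit per query by minimum e-value, the task described for `filter_best_hits()` above, can be sketched in base R as follows. The hit table and its column names (`query_id`, `subject_id`, `evalue`) are hypothetical, and this is not the package's own code.

```r
# Hypothetical BLAST hit table with several hits per query
hits <- data.frame(
  query_id   = c("q1", "q1", "q2", "q2", "q2"),
  subject_id = c("s1", "s2", "s3", "s4", "s5"),
  evalue     = c(1e-30, 1e-50, 1e-5, 1e-12, 1e-8),
  stringsAsFactors = FALSE
)

# Sort by query and ascending e-value, then keep the first row per query
ordered_hits <- hits[order(hits$query_id, hits$evalue), ]
best_hits <- ordered_hits[!duplicated(ordered_hits$query_id), ]
```

Ties in e-value are resolved by input order in this sketch; a real pipeline might break ties by bit score or alignment length instead.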
- `read.cds()` now trims corrupted CDS (= CDS whose length is not divisible by 3) when `delete_corrupt_cds = FALSE` is specified
- the new default value of the argument `delete_corrupt_cds` in `dNdS()` is now `FALSE`. Thus, given the new trimming feature in `read.cds()`, corrupted CDS sequences will be trimmed before being translated.
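
What trimming a corrupted CDS means can be shown with a short base R sketch. It assumes that surplus nucleotides are dropped from the 3' end so that the length becomes a multiple of 3; whether `read.cds()` trims at that end is an assumption here, and the helper `trim_cds()` and the example sequences are hypothetical.

```r
# Hypothetical helper (not part of orthologr): drop trailing nucleotides
# so that each sequence length is divisible by 3
trim_cds <- function(cds) {
  usable_length <- nchar(cds) - (nchar(cds) %% 3)
  substr(cds, 1, usable_length)
}

# Hypothetical coding sequences with lengths 9, 8, and 7
cds <- c(gene_a = "ATGGCCTAA", gene_b = "ATGGCCTA", gene_c = "ATGGCCT")

is_corrupt <- nchar(cds) %% 3 != 0
trimmed <- trim_cds(cds)
```

After trimming, every sequence consists of complete codons and can be translated without a dangling partial codon.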
- The default setting of the `BLAST` argument `max_target_seqs 1` was removed from `blast_best()` and `blast_rec()` due to the misunderstood functionality of this `BLAST` argument (see details here and #9; many thanks to @armish)
- Users can now control via the new `delete_corrupt_cds` argument in `dNdS()` and related downstream functions whether or not corrupted input coding sequences are removed prior to dN/dS inference. If corrupted CDS exist, the `dNdS()` function now generates a separate fasta file that stores all corrupted CDS so that they can be investigated. See issue #8 for details.
- `dNdS()` receives new argument `delete_corrupt_cds` to remove corrupted input coding sequences (`delete_corrupt_cds` is set to `TRUE` as default)
- `read.cds()` receives new argument `delete_corrupt_cds` to remove corrupted input coding sequences (`delete_corrupt_cds` is set to `TRUE` as default)
- `cds2aa()` receives new argument `delete_corrupt_cds` to remove corrupted input coding sequences (`delete_corrupt_cds` is set to `TRUE` as default)
- `set_blast()` receives new argument `delete_corrupt_cds` to remove corrupted input coding sequences (`delete_corrupt_cds` is set to `TRUE` as default)
- `blast()` receives new argument `delete_corrupt_cds` to remove corrupted input coding sequences (`delete_corrupt_cds` is set to `TRUE` as default)
- `blast_best()` receives new argument `delete_corrupt_cds` to remove corrupted input coding sequences (`delete_corrupt_cds` is set to `TRUE` as default)
- `blast_rec()` receives new argument `delete_corrupt_cds` to remove corrupted input coding sequences (`delete_corrupt_cds` is set to `TRUE` as default)
- Fixing an internal path bug that caused wrong pal2nal paths to be generated when using multiple sequence aligners (see issue #5; many thanks to Dr. Mario López-Pérez)
- Fixing a major bug that caused KaKs_Calculator to not be able to correctly parse the kaks computation output (Many thanks to Hongyi Li who spotted the bug and found a solution).