Average Nucleotide Identity (ANI) and Aligned Fraction (AF) are useful metrics for quantifying the similarity of genomes. This article will explain them.
ANI answers the question "how similar are the two genomes?". There are various ways to compute it, since it relies on first performing a pairwise alignment, and there are many different alignment algorithms. Regardless, the formula is the same once you have aligned your genomes.
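Written out, that formula is just a mean over the fragments that aligned. The notation here is introduced for illustration and is not taken from any particular tool: $F$ is the set of fragments kept after alignment, and $\mathrm{id}_i$ is the percent identity of fragment $i$.

$$
\mathrm{ANI} = \frac{1}{|F|} \sum_{i \in F} \mathrm{id}_i
$$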
Before we use this, we must split the genome into fragments of length $N$ (say 1000bp). Maybe we have 5 fragments:
Fragment | Identity | Note |
---|---|---|
1 | 98% | |
2 | 96% | |
3 | 80% | |
4 | 15% | Discard |
5 | 98% | |
The average of the four retained fragments is (98 + 96 + 80 + 98) / 4 = 93%, so the ANI is 93%.
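The same arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not how any particular ANI tool works: the function name `average_nucleotide_identity` and the 50% cutoff `min_identity` are assumptions made for the example (any cutoff between 15% and 80% discards the same fragment here).

```python
def average_nucleotide_identity(identities, min_identity=50.0):
    """Mean percent identity over fragments at or above min_identity.

    min_identity is a hypothetical cutoff chosen for illustration.
    """
    kept = [i for i in identities if i >= min_identity]
    if not kept:
        raise ValueError("no fragment passed the identity threshold")
    return sum(kept) / len(kept)


# Percent identities of fragments 1-5 from the table above.
fragment_identities = [98.0, 96.0, 80.0, 15.0, 98.0]

ani = average_nucleotide_identity(fragment_identities)
print(f"ANI = {ani:.1f}%")  # ANI = 93.0%
```

Fragment 4 never enters the average, which is why a single poorly aligned region leaves the result unchanged.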
ANI only measures the identity of sequence that aligned, so it has an obvious problem: we completely ignored the 4th fragment! Imagine if fragments 1, 2, 3 and 5 were all 100% identical. The ANI would be 100%, giving the illusion that the genomes are identical, which isn't the case.
Aligned Fraction (AF) addresses this. It tells us how much of the genomes aligned.
This would work out to 80% for the above example: 4 out of the 5 fragments aligned with a high enough identity (based on whatever threshold we choose). We don't consider how much of each fragment aligned, just the number of fragments that did.
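As a rough sketch, the fragment-counting version of AF described here could be computed like this. The function name `aligned_fraction` and the 50% cutoff `min_identity` are assumptions made for the example, not values from a specific tool.

```python
def aligned_fraction(identities, min_identity=50.0):
    """Fraction of fragments whose identity is at or above min_identity.

    min_identity is a hypothetical cutoff chosen for illustration.
    """
    if not identities:
        raise ValueError("no fragments given")
    aligned = sum(1 for i in identities if i >= min_identity)
    return aligned / len(identities)


# Percent identities of fragments 1-5 from the table above.
fragment_identities = [98.0, 96.0, 80.0, 15.0, 98.0]

af = aligned_fraction(fragment_identities)
print(f"AF = {af:.0%}")  # AF = 80%
```

Reporting AF alongside ANI exposes the case described earlier: if fragments 1, 2, 3 and 5 were all 100% identical, the ANI would be 100% but the AF would still be 80%.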