BioC++ core-0.7.0
The Modern C++ libraries for Bioinformatics.
 

Provides (semi-)alphabets for representing elements in CIGAR strings.

Classes

class  bio::alphabet::cigar
 The cigar semialphabet pairs a counter with a bio::alphabet::cigar_op letter.
 
class  bio::alphabet::cigar_op
 The cigar operation alphabet.
 
struct  std::tuple_element< i, bio::alphabet::cigar >
 Obtains the type of the specified component.
 
struct  std::tuple_size< bio::alphabet::cigar >
 Provides access to the number of components in a tuple as a compile-time constant expression.
 

Detailed Description

Provides (semi-)alphabets for representing elements in CIGAR strings.

Introduction

A CIGAR string is a sequence of (count, operation) pairs that represents an alignment as a series of edit operations. This submodule provides two types. bio::alphabet::cigar_op is a regular alphabet (it models bio::alphabet::alphabet) whose letters are all the operation symbols that are valid in a CIGAR string. bio::alphabet::cigar is a semialphabet with a tuple-like interface that pairs a bio::alphabet::cigar_op letter with a count value, so an entire CIGAR string can be represented as a std::vector of bio::alphabet::cigar values.
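To make the "vector of (count, operation) pairs" representation concrete, the following is a minimal standalone sketch. It deliberately does not use the BioC++ types: cigar_element and parse_cigar are hypothetical names introduced only for this illustration, and the operation is stored as a plain char rather than a bio::alphabet::cigar_op letter. Counts are assumed to fit into 32 bits; overflow is not checked.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string_view>
#include <vector>

// Hypothetical stand-in for one CIGAR element; NOT the bio::alphabet::cigar type.
struct cigar_element
{
    std::uint32_t count; // how often the operation is repeated
    char          op;    // one of the letters "MIDNSHP=X"
};

// Parses text such as "3M1I2D" into a vector of (count, operation) pairs.
// Throws std::invalid_argument if a count or a valid operation is missing.
inline std::vector<cigar_element> parse_cigar(std::string_view text)
{
    constexpr std::string_view valid_ops = "MIDNSHP=X";
    std::vector<cigar_element> result;

    std::size_t i = 0;
    while (i < text.size())
    {
        // Read the decimal count that precedes every operation letter.
        std::size_t const start = i;
        std::uint32_t     count = 0;
        while (i < text.size() && std::isdigit(static_cast<unsigned char>(text[i])))
        {
            count = count * 10 + static_cast<std::uint32_t>(text[i] - '0');
            ++i;
        }
        if (i == start)
            throw std::invalid_argument{"CIGAR operation without a count"};

        // The count must be followed by exactly one valid operation letter.
        if (i == text.size() || valid_ops.find(text[i]) == std::string_view::npos)
            throw std::invalid_argument{"missing or invalid CIGAR operation"};

        result.push_back({count, text[i]});
        ++i;
    }
    return result;
}
```

Under these assumptions, parse_cigar("3M1I2D") yields three elements: (3, M), (1, I) and (2, D). With the library itself, the element type would be bio::alphabet::cigar instead of the hypothetical struct.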

The following table outlines the valid characters in the bio::alphabet::cigar_op alphabet.

Letter  Description
M       Alignment match (can be a sequence match or mismatch, used only in basic CIGAR representations)
I       Insertion to the reference
D       Deletion from the reference
N       Skipped region from the reference
S       Soft clipping (clipped sequences present in SEQ)
H       Hard clipping (clipped sequences NOT present in SEQ)
P       Padding (silent deletion from padded reference)
=       Sequence match
X       Sequence mismatch
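As an illustration of how the operations in the table differ, the sketch below computes how many query and reference positions a CIGAR covers. It relies on the convention from the SAM specification, which is not stated on this page: M, I, S, = and X advance along the query, while M, D, N, = and X advance along the reference; H and P advance along neither. The names cigar_element, alignment_span and span_of are hypothetical and are not part of BioC++.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one CIGAR element; NOT the bio::alphabet::cigar type.
struct cigar_element
{
    std::uint32_t count;
    char          op; // one of the letters "MIDNSHP=X"
};

// True for operations that advance along the query (read) sequence.
constexpr bool consumes_query(char op)
{
    switch (op)
    {
        case 'M': case 'I': case 'S': case '=': case 'X': return true;
        default: return false;
    }
}

// True for operations that advance along the reference sequence.
constexpr bool consumes_reference(char op)
{
    switch (op)
    {
        case 'M': case 'D': case 'N': case '=': case 'X': return true;
        default: return false;
    }
}

// Number of query and reference positions covered by a CIGAR.
struct alignment_span
{
    std::uint64_t query_length;
    std::uint64_t reference_length;
};

// Sums the counts of all elements, separately for query and reference.
inline alignment_span span_of(std::vector<cigar_element> const & cigar)
{
    alignment_span span{0, 0};
    for (cigar_element const & element : cigar)
    {
        if (consumes_query(element.op))
            span.query_length += element.count;
        if (consumes_reference(element.op))
            span.reference_length += element.count;
    }
    return span;
}
```

For example, for the CIGAR 2S5M1I7D4=2H this gives a query length of 12 (2 + 5 + 1 + 4) and a reference length of 16 (5 + 7 + 4); the hard-clipped bases count towards neither.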