The R package ssw offers an R interface for SSW, a fast implementation of the Smith-Waterman algorithm for sequence alignment using SIMD, described in Zhao et al. (2013). The package is currently built on ssw-py.
library("ssw")A short read sequence:
read <- "ACGT"Align the read against the reference sequence
(TTTTACGTCCCCC) and print the results:
read |> align("TTTTACGTCCCCC")CIGAR start index 4: 4M
optimal_score: 8
sub-optimal_score: 0
target_begin: 4 target_end: 7
query_begin: 0
query_end: 3
Target:        4    ACGT    7
                    ||||
Query:         0    ACGT    3Get specific results, such as the alignment scores:
a <- read |> align("TTTTACGTCCCCC")a$alignment$optimal_score[1] 8a$alignment$sub_optimal_score[1] 0read |> align("TTTTACAGTCCCCC")CIGAR start index 4: 2M1D2M
optimal_score: 5
sub-optimal_score: 0
target_begin: 4 target_end: 8
query_begin: 0
query_end: 3
Target:        4    ACAGT    8
                    ||*||
Query:         0    AC-GT    3read |> align("TTTTACTCCCCC", gap_open = 3)CIGAR start index 4: 2M
optimal_score: 4
sub-optimal_score: 0
target_begin: 4 target_end: 5
query_begin: 0
query_end: 1
Target:        4    AC    5
                    ||
Query:         0    AC    1read |> align("TTTTACTCCCCC", gap_open = 0)CIGAR start index 4: 2M1I1M
optimal_score: 6
sub-optimal_score: 0
target_begin: 4 target_end: 6
query_begin: 0
query_end: 3
Target:        4    AC-T    6
                    ||*|
Query:         0    ACGT    3a <- align("ACTG", "ACTCACTG", start_idx = 4)
aCIGAR start index 0: 4M
optimal_score: 8
sub-optimal_score: 0
target_begin: 0 target_end: 3
query_begin: 0
query_end: 3
Target:        0    ACTC    3
                    |||*
Query:         0    ACTG    3Print the results from position 4:
a |> print(start_idx = 4)CIGAR start index 0: 4M
optimal_score: 8
sub-optimal_score: 0
target_begin: 0 target_end: 3
query_begin: 0
query_end: 3
Target:        0    ACTG    3
                    ||||
Query:         0    ACTG    3Enforce no gaps by increasing the penalty:
a <- force_align("ACTG", "TTTTCTGCCCCCACG")
aThe results are truncated:
CIGAR start index 4: 3M
optimal_score: 6
sub-optimal_score: 0
target_begin: 4 target_end: 6
query_begin: 1
query_end: 3
Target:        4    CTG    6
                    |||
Query:         1    CTG    3Use formatter() to avoid the truncation:
b <- a |> formatter()
b[[1]]
[1] "TTTTCTGCCCCCACG"
[[2]]
[1] "   ACTG"Or pretty-print the formatted results directly:
a |> formatter(print = TRUE)TTTTCTGCCCCCACG
   ACTGssw-r is built upon the work of two outstanding projects:
We extend our sincere gratitude to Mengyao Zhao for developing the original SSW library and to Nick Conway for maintaining the ssw-py package. Their work forms the foundation of ssw-r. While ssw-r does not directly incorporate code from these projects, it serves as an R interface to their functionality. We encourage users to visit the original repositories for more information about the underlying implementation and to consider citing these works in publications that use ssw-r.