International Conference Papers
Permanent URI for this collectionhttp://dl.cerist.dz/handle/CERIST/4
Browse
3 results
Search Results
Item Smaller Fully-Functional Bidirectional BWT Indexes(Springer Nature, 2020-09-17) Belazzougui,, Djamal; Cunial, FabioBurrows-Wheeler indexes that support both extending and contracting any substring of the text $T$ of length $n$ on which they are built, in any direction, provide substantial flexibility in traversing the text and can be used to implement several algorithms. The practical appeal of such indexes is contingent on them being compact, and current designs that are sensitive to the compressibility of the input take either $O(e+\REV{e})$ words of space, where $e$ and $\REV{e}$ are the number of right and left extensions of the maximal repeats of $T$, or $O(r\log(n/r)+\REV{r}\log(n/\REV{r}))$ words, where $r$ and $\REV{r}$ are the number of runs in the Burrows-Wheeler transform of $T$ and of its reverse. In this paper we describe a fully-functional bidirectional index that takes $O(m+r+\REV{r})$ words, where $m$ is the number of maximal repeats of $T$, as well as a variant that takes $O(r+\REV{r})$ words.Item Fully-Functional Bidirectional Burrows-Wheeler Indexes and Infinite-Order De Bruijn Graphs(Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik, 2019-06-18) Belazzougui, Djamal; Cunial, FabioGiven a string T on an alphabet of size σ, we describe a bidirectional Burrows-Wheeler index that takes O(|T| log σ) bits of space, and that supports the addition and removal of one character, on the left or right side of any substring of T, in constant time. Previously known data structures that used the same space allowed constant-time addition to any substring of T, but they could support removal only from specific substrings of T. We also describe an index that supports bidirectional addition and removal in O(log log |T|) time, and that takes a number of words proportional to the number of left and right extensions of the maximal repeats of T. We use such fully-functional indexes to implement bidirectional, frequency-aware, variable-order de Bruijn graphs with no upper bound on their order, and supporting natural criteria for increasing and decreasing the order during traversal.Item Fast Label Extraction in the CDAWG(Springer, 2017-09-06) Belazzougui, Djamal; Cunial, FabioThe compact directed acyclic word graph (CDAWG) of a string T of length n takes space proportional just to the number e of right extensions of the maximal repeats of T, and it is thus an appealing index for highly repetitive datasets, like collections of genomes from similar species, in which e grows significantly more slowly than n. We reduce from O(m log log n) to O(m) the time needed to count the number of occurrences of a pattern of length m, using an existing data structure that takes an amount of space proportional to the size of the CDAWG. This implies a reduction from O(m log log n+occ) to O(m+occ) in the time needed to locate all the occocc occurrences of the pattern. We also reduce from O(k log log n) to O(k) the time needed to read the k characters of the label of an edge of the suffix tree of T, and we reduce from O(m log log n) to O(m) the time needed to compute the matching statistics between a query of length m and T, using an existing representation of the suffix tree based on the CDAWG. All such improvements derive from extracting the label of a vertex or of an arc of the CDAWG using a straight-line program induced by the reversed CDAWG.