Browsing by Author "Raffinot, Mathieu"
Now showing 1 - 2 of 2
Results Per Page
Sort Options
- ItemFlexible Indexing of Repetitive Collections(Springer, 2017-06-07) Belazzougui, Djamal; Cunial, Fabio; Gagie, Travis; Prezza, Nicola; Raffinot, MathieuHighly repetitive strings are increasingly being amassed by genome sequencing experiments, and by versioned archives of source code and webpages. We describe practical data structures that support count- ing and locating all the exact occurrences of a pattern in a repetitive text, by combining the run-length encoded Burrows-Wheeler transform (RLBWT) with the boundaries of Lempel-Ziv 77 factors. One such vari- ant uses an amount of space comparable to LZ77 indexes, but it answers count queries between two and four orders of magnitude faster than all LZ77 and hybrid index implementations, at the cost of slower lo- cate queries. Combining the RLBWT with the compact directed acyclic word graph answers locate queries for short patterns between four and ten times faster than a version of the run-length compressed suffix ar- ray (RLCSA) that uses comparable memory, and with very short pat- terns our index achieves speedups even greater than ten with respect to RLCSA.
- ItemIndexing and querying color sets of images(Springer Varlag, 2016-09-27) Belazzougui, Djamal; Kolpakov, Roman; Raffinot, MathieuWe aim to study the set of color sets of continuous regions of an image given as a matrix of m rows over n ≥ m columns where each element in the matrix is an integer from [1,σ] named a color . The set of distinct colors in a region is called fingerprint. We aim to compute, index and query the fingerprints of all rectangular regions named rectangles. The set of all such fingerprints is denoted by F. A rectangle is maximal if it is not contained in a greater rectangle with the same fingerprint. The set of all locations of maximal rectangles is denoted by L. We first explain how to determine all the |L| maximal locations with their fingerprints in expected time O(n⋅m^2⋅σ) using a Monte Carlo algorithm (with polynomially small probability of error) or within deterministic O(n⋅m^2⋅σ⋅log(|L|/(n⋅m^2)+2)) time. We then show how to build a data structure which occupies O(n⋅m⋅log n+|L|) space such that a query which asks for all the maximal locations with a given fingerprint f can be answered in time O(|f|+log log n+k), where k is the number of maximal locations with fingerprint f. If the query asks only for the presence of the fingerprint, then the space usage becomes O(n⋅m⋅log n+|F|) while the query time becomes O(|f|+log log n). We eventually consider the special case of squared regions (squares).