(aside image)

Publications of the String Algorithms Group (2000–)

2016


T. Chhabra, J. Tarhio: A filtration method for order-preserving matching. Information Processing Letters 116, 2 (2016), 71–74.

2015


E. Giaquinta: Run-length encoded nondeterministic KMP and suffix automata. In: Proc. CIAA '15, Implementation and Application of Automata, 20th International Conference, Lecture Notes in Computer Science 9223, Springer 2015, 102–113.

T. Chhabra, E. Giaquinta, J. Tarhio: Filtration algorithms for approximate order-preserving matching. In: Proc. SPIRE '15, String Processing and Information Retrieval, 22nd International Symposium, Lecture Notes in Computer Science 9309, Springer 2015, 177–187.

T. Chhabra, S. Ghuman, J. Tarhio: Tuning algorithms for jumbled matching. In: Proc. PSC '15, 14th Prague Stringology Conference, 2015, 36–46.

T. Chhabra, M. O. Kulekci, J. Tarhio: Alternative algorithms for order-preserving matching. In: Proc. PSC '15, 14th Prague Stringology Conference, 2015, 57–66.

T. Flouri, E. Giaquinta, K. Kobert, E. Ukkonen: Longest common substrings with k mismatches. Information Processing Letters 115, 6–8 (2015), 643–647.

E. Giaquinta, A. Mishra, L. Pozzi: Maximum convex subgraphs under I/O constraint for automatic identification of custom instructions. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 34, 3 (2015), 483–494.

2014


H. Peltola, J. Tarhio: String matching with lookahead. Discrete Applied Mathematics 163, 3 (2014) 352–360.

S. Gog, K. Karhu, J. Kärkkäinen, V. Mäkinen, N. Välimäki: Multi-pattern matching with bidirectional indexes. Journal of Discrete Algorithms 24, 1 (2014), 26–39.

T. Chhabra, J. Tarhio: Order-preserving matching with filtration. In: Proc. SEA '14, 13th International Symposium on Experimental Algorithms. LNCS 8504, Springer, 2014, 307–314.

T. Hirvola, J. Tarhio: Approximate online matching of circular strings. In: Proc. SEA '14, 13th International Symposium on Experimental Algorithms. LNCS 8504, Springer, 2014, 315–325.

B. Ďurian, T. Chhabra, S. Ghuman, T. Hirvola, H. Peltola, J. Tarhio: Improved two-way bit-parallel search. In: Proc. PSC '14, 13th Prague Stringology Conference, 2014, 71–83.

K. Pollari-Malmi, J. Rautio, J. Tarhio: Speeding up compressed matching with SBNDM2. In: Proc. PSC '14, 13th Prague Stringology Conference, 2014, 110–123.

S. Ghuman, E. Giaquinta, J. Tarhio: Alternative algorithms for Lyndon factorization. In: Proc. PSC '14, 13th Prague Stringology Conference, 2014, 169–178.

A. Farzan, T. Gagie, G. Navarro: Entropy-bounded representation of point grids. Computational Geometry: Theory and Applications 41, 1 (2014) 1–14.

J. Barbay, F. Claude, T. Gagie, G. Navarro, Y. Nekrich: Efficient fully-compressed sequence representations. Algorithmica 69, 1 (2014), 232–268.

2013


T. Gagie, J. Kärkkäinen, G. Navarro, S. J. Puglisi: Colored range queries and document retrieval. Theoretical Computer Science 483 (2013), 36–50.

T. Gagie: On the value of multiple read/write streams for data compression. Information Theory, Combinatorics, and Search Theory 2013, 284–297.

P. Gawrychowski, T. Gagie: Minimax trees in linear time with applications. European Journal of Combinatorics 31, 1 (2013) 82–90.

T. Gagie, K. Karhu, G. Navarro, S. J. Puglisi, J. Sirén: Document listing on repetitive collections. Proc. Combinatorial Pattern Matching, 24th Annual Symposium (CPM 2013), LNCS 7922, Springer, 2013, 107–119.

H. Peltola: Towards Faster String Matching. Doctoral dissertations 78/2013, Aalto University, 2013.

K. Karhu: String Searching Methods for Bioinformatics. Doctoral dissertations 130/2013, Aalto University, 2013.

2012


F. Claude, G. Navarro, H. Peltola, L. Salmela, J. Tarhio: String matching with alphabet sampling. Journal of Discrete Algorithms 11 (2012), 37–50.

E. Czeizler, A. Mizera, El. Czeizler, R.-J. Back, J. E. Eriksson, I. Petre: Quantitative analysis of the self-assembly strategies of intermediate filaments from tetrameric vimentins. Transactions on Computational Biology and Bioinformatics 9, 3 (2012), 885–898.

El. Czeizler, A. Mizera, I. Petre: A Boolean approach for disentangling the roles of submodels to the global properties of a model. Fundamenta Informaticae, 112, 1–4 (2012), 51–63.

A. Mizera, El. Czeizler, I. Petre: Computational methods for quantitative submodel comparison. In: Biomolecular Information Processing. From Logic Systems to Smart Sensors and Actuators, Wiley-VCH, 2012, 323–346.

El. Czeizler, E. Czeizler, B. Iancu, I. Petre: Quantitative model refinement as a solution to the combinatorial state explosion of biomodels. Electronic Notes in Theoretical Computer Science 284 (2012), 35–53.

J. Fischer, T. Gagie, T. Kopelowitz, M. Lewenstein, V. Mäkinen, L. Salmela, N. Välimäki: Forbidden patterns. In: Proc. 10th Latin American Theoretical Informatics Symposium (LATIN), 2012, 327–337.

T. Gagie, K. Karhu, Juha Kärkkäinen, V. Mäkinen, L. Salmela, J. Tarhio: Indexed Multi-pattern matching. In: Proc. 10th Latin American Theoretical Informatics Symposium (LATIN), 2012, 399–407.

T. Gagie, P. Gawrychowski, J. Kärkkäinen, Y. Nekrich, S. J. Puglisi: A faster grammar-based self-index. In: Proc. 6th Conference on Language and Automata Theory and Applications (LATA), 2012, 240–251.

T. Gagie, G. Navarro, S. J. Puglisi: New algorithms on wavelet trees and applications to information retrieval. Theoretical Computer Science 426–427 (2012), 25–41.

P. Ferragina, T. Gagie, G. Manzini: Lightweight data indexing and compression in external memory. Algorithmica 63, 3 (2012), 707–730.

S. Gog, K. Karhu, J. Kärkkäinen, V. Mäkinen, N. Välimäki: Multi-pattern matching with bidirectional indexes. In: Proc. 18th Annual International Computing and Combinatorics Conference (COCOON), LNCS 7434, Springer, 2012, 384–395.

T. Gagie: A note on sequence prediction over large alphabets. Algorithms 5, 1 (2012), 50–55.

H. Bannai, T. Gagie, I. Tomohiro, S. Inenaga, G. M. Landau, M. Lewenstein: An efficient algorithm to test square-freeness of strings compressed by straight-line programs. Information Processing Letters 112, 19 (2012), 711–714.

T. Gagie: Bounds from a card trick. Journal of Discrete Algorithms 10, 1 (2012), 2–4.

B. Iancu, El. Czeizler, E. Czeizler, I. Petre: Quantitative refinement of reaction models. International Journal of Unconventional Computing 8, 5–6 (2012), 529–550.

2011


T. Gagie, P. Gawrychowski, S. J. Puglisi: Faster approximate pattern matching in compressed repetitive texts. In: Proc. 22nd International Symposium on Algorithms and Computation (ISAAC), 2011, 652–662.

T. Gagie, M. He, J. I. Munro, P. K. Nicholson: Finding frequent elements in compressed 2D arrays and strings. In: Proc. 18th Symposium on String Processing and Information Retrieval (SPIRE), 2011, 295–300.

E. Rivals, L. Salmela, J. Tarhio: Exact search algorithms for biological sequences. In: Algorithms in Computational Molecular Biology: Techniques, Approaches and Applications. Wiley, 2011, 91–111.

F. Cicalese, T. Gagie, M. Milanic, E. Laber: Competitive Boolean function evaluation: beyond monotonicity, and the symmetric case. Discrete Applied Mathematics 159, 11 (2011), 1070–1078.

T. Gagie, Y. Nekrich: Tight bounds for online stable sorting. Journal of Discrete Algorithms 9, 2 (2011), 176–181.

K. Karhu, J. Mäkinen, J. Rautio, H. Salamon, J. Tarhio: GAST, a genomic alignment search tool. In: Proc. International Conference on Bioinformatics Models, Methods and Algorithms, SciTePress, 2011, 82–90.

H. Peltola, J. Tarhio: Variations of Forward-SBNDM. In: Proc. PSC ’11, Prague Stringology Conference, 2011, 3–14.

K. Karhu: Improving exact search of multiple patterns from a compressed suffix array. In: Proc. PSC ’11, Prague Stringology Conference, 2011, 226–231.

T. Gagie, J. Kärkkäinen: Counting colours in compressed strings. In: Proc. 22nd Symposium on Combinatorial Pattern Matching (CPM), Springer, 2011, 197–207.

2010


L. Salmela, J. Tarhio: Approximate string matching with reduced alphabet. In: Algorithms and Applications. LNCS 6060, Springer, 2010, 210–220.

K. Karhu, S. Khuri, J. Mäkinen, J. Tarhio: Identifying human miRNA targets with a genetic algorithm. In: Proc. ISB 2010, International Symposium on Biocomputing, ACM, 2010.

J. Mäkinen, J. Tarhio, S. Khuri: PMSGA: A fast DNA fragment assembler. In: Proc. BIOINFORMATICS 2010, First International Conference on Bioinformatics, INSTICC, 2010, 77–82.

B. Ďurian, J. Holub, H. Peltola, J. Tarhio: Improving practical exact string matching. Information Processing Letters 110, 4 (2010), 148–152.

L. Salmela, J. Tarhio, P. Kalsi: Approximate Boyer-Moore string matching for small alphabets. Algorithmica 58, 3 (2010), 591–609.

T. Gagie, G. Manzini: Move-to-front, distance coding, and inversion frequencies revisited. Theoretical Computer Science 411, 31–33 (2011), 2925–2944.

J. Barbay, T. Gagie, G. Navarro, Y. Nekrich: Alphabet partitioning for compressed rank/select and applications. In: Proc. 21st International Symposium on Algorithms and Computation (ISAAC), LNCS 6507, Springer, 2010, 315–326.

A. Farzan, T. Gagie, G. Navarro: Entropy-bounded representation of point grids. In: Proc. 21st International Symposium on Algorithms and Computation (ISAAC), LNCS 6507, Springer, 2010, 327–338.

T. Gagie, G. Navarro, S. J. Puglisi: Colored range queries and document retrieval. In: Proc. 17th Symposium on String Processing and Information Retrieval (SPIRE), LNCS 6393, Springer, 2010, 67–81.

2009


L. Salmela: Improved algorithms for string searching problems. Doctoral dissertation, TKK Research Reports in Computer Science and Engineering A, TKK-CSE-A1/09, Department of Computer Science and Engineering, Helsinki University of Technology, 2009.

B. Ďurian, J. Holub, H. Peltola, J. Tarhio: Tuning BNDM with q-grams. In: Proc. ALENEX '09, Tenth Workshop on Algorithm Engineering and Experiments, SIAM, 2009, 29–37.

E. Rivals, L. Salmela, P. Kiiskinen, P. Kalsi, J. Tarhio: MPSCAN: fast localisation of multiple reads in genomes. In: Proc. WABI '09, 9th Workshop on Algorithms in Bioinformatics, Lecture Notes in Bioinformatics 5724, Springer, 2009, 246–260.

G. Navarro, L. Salmela: Indexing variable length substrings for exact and approximate matching. In: Proc. SPIRE '09, String Processing and Information Retrieval, LNCS 5721, Springer, 2009, 214–221.

N. Philippe, A. Boureux, L. Bréhélin, J. Tarhio, T. Commes, E. Rivals: Using reads to annotate the genome: influence of length, false locations, and sequence errors on prediction capacity. Nucleic Acids Research 37, 15 (2009), e104.

J. Karlgren, J. Tarhio, H. Hyyrö (eds.): String Processing and Information Retrieval, 16th International Symposium, SPIRE 2009, Proceedings. LNCS 5721, Springer, 2009.

2008


L. Salmela, J. Tarhio: Fast parameterized matching with q-qrams. Journal of Discrete Algorithms 6, 3 (2008), 408–419

F. Claude, G. Navarro, H. Peltola, L. Salmela, J. Tarhio: Speeding up pattern matching by text sampling. In: Proc. SPIRE '08, String Processing and Information Retrieval. LNCS 5280, Springer, 2008, 87–98.

P. Kalsi, H. Peltola, J. Tarhio: Comparison of exact string matching algorithms for biological sequences. In: Proc. BIRD '08, 2nd International Conference on Bioinformatics Research and Development. Communications in Computer and Information Science 13, Springer, 2008, 417–426.

2007


L. Salmela, J. Tarhio: Algorithms for weighted matching. In: Proc. SPIRE '07, String Processing and Information Retrieval, LNCS 4726, Springer, 2007, 276–286.

P. Kalsi, L. Salmela, J. Tarhio: Tuning approximate Boyer-Moore for gene sequences. In: Proc. SPIRE '07, String Processing and Information Retrieval, LNCS 4726, Springer, 2007, 173–183.

H. Peltola, J. Tarhio: On string matching in chunked texts. In: Proc. CIAA '07, Conference on Implementation and Application of Automata, LNCS 4783, Springer, 2007, 157–167.

E. Rivals, A. Boureux, M. Lejeune, F. Ottones, O. Pérez, J. Tarhio, F. Pierrat, F. Ruffle, T. Commes, J. Marti: Transcriptome annotation using tandem SAGE tags. Nucleic Acids Research 35, 17 (2007), e108.

2006


L. Salmela, J. Tarhio, J. Kytöjoki: Multi-pattern string matching with q-grams. ACM Journal of Experimental Algorithmics 11, 1 (2006).

L. Salmela, J. Tarhio: Sublinear algorithms for parameterized matching. In: Proc. CPM '06, Combinatorial Pattern Matching (ed. M. Lewenstein et al.), LNCS 4009, Springer, 2006, 354–364.

2005


G. Navarro, J. Tarhio: LZgrep: a Boyer-Moore string matching tool for Ziv-Lempel compressed text. Software: Practice and Experience 35, 12 (2005), 1107–1130.

J. Rautio: Context-dependent stopper encoding. In: Proc. Prague Stringology Conference '05. Czech Technical University, 2005, 143–152.

G. Navarro, E. Sutinen, J. Tarhio: Indexing text with approximate q-grams. Journal of Discrete Algorithms 3, 24 (2005), 157–175.

J. Rautio: Speeding up string matching with text compression (in Finnish). In: Tietojenkäsittelytieteen päivät 2005. University of Oulu, 2005, 141–144.

2004


K. Fredriksson, J. Tarhio: Efficient string matching in Huffman compressed texts. Fundamenta Informaticae 62, 1 (2004), 116.

E. Sutinen, J. Tarhio: Approximate string matching with ordered q-grams. Nordic Journal of Computing 11, 4 (2004), 321–343.

2003


J. Kytöjoki, L. Salmela, J. Tarhio: Tuning string matching for huge pattern sets. In: Proc. CPM '03, Combinatorial Pattern Matching, LNCS 2676, Springer, 2003, 211–224.

K. Lemström, J. Tarhio: Transposition invariant pattern matching for multi-track strings. Nordic Journal of Computing 10, 3 (2003), 185–205.

H. Peltola, J. Tarhio: Alternative algorithms for bit-parallel string matching. In: Proc. SPIRE '03, 10th Symposium on String Processing and Information Retrieval, LNCS 2857, Springer, 2003, 80–94.

K. Fredriksson, J. Tarhio: Processing of Huffman compressed texts with a super-alphabet. In: Proc. SPIRE '03, 10th Symposium on String Processing and Information Retrieval, LNCS 2857, Springer, 2003, 108–121.

2002


J. Rautio, J. Tanninen, J. Tarhio: String matching with stopper encoding and code splitting. In: Proc. CPM '02, Combinatorial Pattern Matching, LNCS 2373, Springer, 2002, 42–52.

J. Rautio, J. Tanninen, J. Tarhio: String matching with stopper compression. Poster. In: Proc. DCC '02, Data Compression Conference, IEEE Computer Society Press, 2002, p. 469.

2001


J. Tarhio: On compression of parse trees. In: SPIRE 2001, Eight Symposium on String Processing and Information Retrieval, IEEE Computer Society, 2001, 205–211.

G. Navarro, R. Baeza-Yates, E. Sutinen, J. Tarhio: Indexing methods for approximate string matching. Bulletin of the Technical Committee on Data Engineering 24, 4 (2001), 19–27.

2000


K. Lemström, J. Tarhio: Searching monophonic patterns within polyphonic sources. In: Proc. RIAO '00, Content-Based Multimedia Information Access (Vol. 2), C.I.D., 2000, 1261–1279.

G. Navarro, J. Tarhio: Boyer-Moore string matching over Ziv-Lempel compressed text. In: Proc. CPM '00, Combinatorial Pattern Matching, 11th Annual Symposium, LNCS 1848, Springer, 2000, 166–180.

G. Navarro, E. Sutinen, J. Tanninen, J. Tarhio: Indexing text with approximate q-grams. In: Proc. CPM '00, Combinatorial Pattern Matching, 11th Annual Symposium, LNCS 1848, Springer, Berlin, 2000, 350–363.