Bioinformatics software -- the most recent version of the source code is available on GitHub: https://github.com/lucian-ilie
Protein interaction site prediction
- Seq-InSite: Sequence supersedes structure for protein Interaction Site prediction
- Employs the best protein embeddings with ensemble architecture
- The only sequence-based program to match or surpass the performance of structure-based methods
- Web server available.
- S. Hosseini, G.B. Golding, L. Ilie, Seq-InSite: sequence supersedes structure for protein interaction site prediction, Bioinformatics (2023), to appear.
- PITHIA: Predicting Interaction siTes using Highly Informative Alignments and Attention
- Combines multiple sequence alignments and learning attention to predict interaction sites
- Web server available.
- S. Hosseini, L. Ilie, PITHIA: protein interaction site prediction using multiple sequence alignments and attention, International Journal of Molecular Sciences 23(21) (2022) 12814.
- DELPHI: DEep Learning Prediction of Highly probable protein Interaction sites
- Interaction site prediction using a CNN-RNN ensemble and three novel features: High Scoring Pairs (HSPs), position, ProtVec
- Y. Li, G.B. Golding, L. Ilie, DELPHI: accurate deep ensemble model for protein interaction sites prediction, Bioinformatics 37(7) (2021) 896 -- 904.
Protein-protein interaction prediction
- SPRINT: Scoring PRotein INTeractions
- Homology-based PPI prediction program
- The only sequence-based program able to effectively predict the entire human interactome
- Y. Li, L. Ilie, SPRINT: Ultrafast protein-protein interaction prediction of the entire human interactome, BMC Bioinformatics 18 (2017) 485.
Sequence similarity search
- ALeS: Adaptive-length spaced-seed design
- A. Mallik, L. Ilie, ALeS: Adaptive-length spaced-seed design, Bioinformatics 37(9) (2021) 1206 -- 1210.
- E-MEM: Efficient Computation of Maximal Exact Matches
- N. Khiste, L. Ilie, E-MEM: Efficient computation of Maximal Exact Matches for very large genomes, Bioinformatics 31(4) (2015) 509 -- 514.
- SpEED: Spaced SEEDs
- Fast computation of highly sensitive spaced seeds for effective local similarity detection
- L. Ilie, S. Ilie, and A. Mansouri Bigvand, SpEED: fast computation of sensitive spaced seeds, Bioinformatics 27(17) (2011) 2433 -- 2434.
- L. Ilie and S. Ilie, Multiple spaced seeds for homology search, Bioinformatics 23(22) (2007) 2969 -- 2977.
Error correction in DNA sequencing data
- Correcting Illumina Data
- New methods and tools for thorough evaluation of error correcting performance
- M. Molnar, L. Ilie, Correcting Illumina data, Briefings in Bioinformatics 16(4) (2015) 588 -- 599.
- RACER: Rapid and Accurate Correction of Errors in Reads
- Accurate and efficient correction of errors in NGS data
- L. Ilie, M. Molnar, RACER: Rapid and Accurate Correction of Errors in Reads, Bioinformatics 29 (2013) 2490 -- 2493.
- HiTEC: High-Throughput Error Correction
- Accurate error correction in NGS data
- L. Ilie, F. Fazayeli, and S. Ilie, HiTEC: accurate error correction in high-throughput sequencing data, Bioinformatics 27(3) (2011) 295 -- 302.
Genome Assembly
- SAGE: String Graph Assembly of GEnomes
- A genome assembler that is based on the overlap graph and that works well on short and medium-size genomes.
- L. Ilie, B. Haider, M. Molnar, R. Solis-Oba, SAGE: String-graph Assembly of GEnomes, BMC Bioinformatics 15 (2014) 302.
- LASER: Large genome ASsembly EvaluatoR
- A tool for genome assembly evaluation based on E-MEM and Quast; it produces the same evaluation as Quast but is 5.6 times faster and uses half the memory.
- N. Khiste, L. Ilie, LASER: Large genome ASsembly EvaluatoR, BMC Research Notes 8 (2015) 709.
- SAGE2: String Graph Assembly of GEnomes 2
- An improved parallel algorithm and implementation of SAGE that works well on Illumina data for human genomes.
- M. Molnar, E. Haghshenas, L. Ilie, SAGE2: Parallel Human Genome Assembly, Bioinformatics 34(4) (2018) 678 -- 680.
- HISEA: HIerarchical SEed Aligner
- A read overlapper for PacBio data that has higher sensitivity and precision than the state of the art. It produces the best PacBio assemblies in the Canu assembly pipeline.
- N. Khiste, L. Ilie, HISEA: HIerarchical SEed Aligner for PacBio data, BMC Bioinformatics 18(1) (2017) 564.
DNA Oligonucleotide design
- BOND: Basic OligoNucleotide Design
- Computation of perfect DNA oligonucleotides
- L. Ilie, H. Mohamadi, G.B. Golding, W.F. Smyth, BOND: Basic OligoNucleotide Design, BMC Bioinformatics 14 (2013) 69.
- L. Ilie, S. Ilie, S. Khoshraftar, and A. Mansouri-Bigvand, Seeds for effective oligonucleotide design, BMC Genomics 12 (2011) 280.
Short read mapping
- SHRiMP: Short Read Mapping Program
- Accurate mapping of short color-space reads
- SHRiMP2 uses seeds computed by SpEED
- M. David, M. Dzamba, D. Lister, L. Ilie, and M. Brudno, SHRiMP2: Sensitive yet practical short read mapping, Bioinformatics 27(7) (2011) 1011 -- 1012.