Textual Data Compression

58 papers

306 followers

About this topic

Textual data compression is the process of encoding textual information using fewer bits than the original representation, aiming to reduce the size of data for storage or transmission while preserving the original content. This involves algorithms that exploit redundancy and patterns in the text to achieve efficient encoding.

Key research themes

1. How can locality of reference and dictionary-based heuristics improve static text compression efficiency and parallelizability?

This research theme explores text compression schemes that exploit locality of reference (where certain words or patterns appear frequently within short intervals) to achieve better compression than classic Huffman coding. It also investigates the greedy approach to dictionary-based static text compression, particularly its execution in distributed systems. The focus is on heuristics for word caching, factorization, and dictionary design, balancing compression effectiveness with computational efficiency and scalability on parallel architectures.

Compression Scheme

by Ian Munro

2021

Key finding: Introduces a defined-word compression scheme using a move-to-front heuristic to exploit locality of reference in text. The scheme dynamically organizes a sequential word list, encoding recently used words with shorter... Read more
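
The move-to-front idea mentioned in this key finding can be illustrated with a minimal sketch. This is not the paper's actual scheme: the function names, the use of 0 to flag a new word, and sending new words literally are assumptions made for illustration. Recently used words sit near the front of the list, so they get small position numbers, which a later variable-length code could represent with fewer bits.

```python
def mtf_encode(words):
    """Encode a word sequence as (position, literal) pairs using move-to-front.

    Position 0 flags a word not seen before (sent literally, an assumed
    convention); otherwise the 1-based position in the current list is sent.
    """
    table = []  # most recently used words sit at the front
    out = []
    for w in words:
        if w in table:
            i = table.index(w)
            out.append((i + 1, None))
            table.pop(i)
        else:
            out.append((0, w))
        table.insert(0, w)  # move (or add) the word to the front
    return out


def mtf_decode(codes):
    """Invert mtf_encode by replaying the same list updates."""
    table = []
    words = []
    for pos, literal in codes:
        w = literal if pos == 0 else table.pop(pos - 1)
        table.insert(0, w)
        words.append(w)
    return words
```

With locality of reference, repeated words produce small positions: in the sequence a, a, b, a the second "a" is encoded as position 1 and the last "a" as position 2.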

The greedy approach to dictionary-based static text compression on a distributed system

by S. Agostino

2021, Journal of Discrete Algorithms

Key finding: Presents the implementation of a greedy factorization method for dictionary-based static text compression that can be executed efficiently via finite state machines and parallelized across distributed systems with minimal... Read more

2. What are the advancements and challenges in achieving fully-compressed suffix trees supporting dynamic updates for text indexing?

Suffix trees are fundamental data structures for string processing with widespread applications but traditionally suffer from large space requirements. This theme investigates the development of fully compressed suffix trees (FCSTs) that achieve space usage close to the entropy of the text while supporting efficient queries. It further studies dynamic FCSTs that can handle text updates and their complexity trade-offs, aiming to attain polylogarithmic query times within optimal compressed space.

Dynamic Fully-Compressed Suffix Trees

by Luis Russo

2023, Lecture Notes in Computer Science

Key finding: Develops a framework for dynamic fully compressed suffix trees occupying asymptotically optimal space proportional to the entropy of the text and supporting all suffix tree operations in polylogarithmic time. This extends... Read more

3. How can combined techniques of Burrows-Wheeler transform, pattern matching, and Huffman coding enhance lossless text compression?

This theme addresses the integration of statistical transforms and coding techniques to improve lossless text compression ratios. It focuses on leveraging the Burrows-Wheeler transform (BWT) to cluster repeated characters for efficient run-length representation, followed by pattern matching to detect frequently occurring substrings, and applying Huffman coding based on character frequencies. These combined methods aim to achieve superior compression performance compared to classical schemes while maintaining decompression efficiency.
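
As a rough illustration of the first stage described above, the following sketch computes a Burrows-Wheeler transform by sorting all rotations and inverts it with the textbook table-rebuilding method. It is a naive version for short strings only, and it does not reproduce any paper's key-based variant, pattern matching, or Huffman stage. The function names and the choice of sentinel character are assumptions; the sentinel must not occur in the input.

```python
def bwt(text, sentinel="\0"):
    """Naive Burrows-Wheeler transform: last column of the sorted rotations."""
    s = text + sentinel  # sentinel marks the end so the transform is invertible
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column, sentinel="\0"):
    """Rebuild the sorted rotation table column by column, then pick the original."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        # Prepending the last column and re-sorting adds one correct column per pass.
        table = sorted(last_column[i] + table[i] for i in range(n))
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]
```

For example, with "$" as the sentinel, "banana" becomes "annb$aa": the repeated letters end up adjacent, which is what makes a following run-length or entropy coding stage more effective.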

Burrows-Wheeler Transform Based Lossless Text Compression Using Keys and Huffman Coding

by Md. Atiqur Rahman

2025

Key finding: Proposes a novel lossless compression algorithm that applies the Burrows-Wheeler transform with a two-key approach to reduce consecutive character repetitions. Subsequently, frequent patterns are identified for additional... Read more

All papers in Textual Data Compression

Federated learning

by Valar Mathi

2026, International Journal of Innovative Research in Information Security

Federated Learning (FL) is a machine learning technique that helps safeguard data privacy by allowing financial institutions to collaborate on training AI models without exchanging real data. In traditional machine learning, all the data is sent to a single location for training; in FL, the AI model is distributed to each institution, each bank or institution uses its own private financial data to train the model, and only model updates are transmitted to a central server. For activities including identifying fraud, evaluating credit risk, executing automated transactions, providing individualized services, and forecasting market trends, this method assists in identifying helpful patterns in financial data. Crucially, it adheres to significant laws and regulations and protects data privacy. Our research focuses on developing a privacy... address the unique difficulties of handling data dispersed across several locations, guaranteeing security, and adhering to legal requirements... create intelligent, safe, and cooperative AI systems. In today's financial landscape, privacy and security of sensitive data have... and increasing cyber threats. Traditional machine learning approaches require pooling data from multiple financial institutions into a centralized system, which raises significant privacy concerns and risks of... Federated Learning (FL) emerges as a groundbreaking solution to address these challenges by enabling decentralized, collaborative model training across multiple entities without sharing raw data... participating institutions such as banks and fintech companies, where the model is locally trained on private financial data. Instead of sending actual data, only encrypted updates of the model are aggregated centrally to form an improved global model. This privacy... accurate and robust AI applications while adhering to legal and ethical standards. Within the financial sector, FL finds critical applications in... trading, personalized services, and market trend forecasting.
Fraud detection is a particularly vital area where FL enables multiple banks to collaboratively identify fraudulent activities using de... reducing false positives. However, deploying FL in finance introduces unique challenges such as dealing with heterogeneous and imbalanced data, ensuring secure communication, and defending against adversar... focuses on developing and analyzing a federated learning framework specifically tailored for the finance domain. It covers the architecture of FL systems, privacy and security measures, implementation strategies, and evaluation metr... world case studies demonstrate the efficacy and benefits of FL in financial applications. Lastly, the paper explores future directions in federated finance AI, emphasizing explain... illustrate how FL can harness financial data responsibly and collaboratively to build intelligent, secure, and privacy... AI systems in modern finance.

Blackwell SHD-CCP Architecture - BioChain AI -Schism Labs REVIEW

by SHD-CCP 64-bit Packet and

2026, SHD-CCP Architecture Group

The "XOR-Torus" Implementation This document outlines the systematic approach to establishing the Blackwell Block, pairing the SHD-CCP Stream, and optimizing the Linguistic Crystallization pipeline for benchmarking on NVIDIA SM100... more

Understanding Meme Coin Trends Through Sentiment Analysis

by IJRASET Publication

2025, International Journal for Research in Applied Science & Engineering Technology (IJRASET)

This study explores the use of sentiment analysis and machine learning models to predict the market trends of meme coins. By analyzing social media sentiment and financial metrics, the research achieved a 74% accuracy rate in forecasting... more

Real Estate Price Prediction Using Machine Learning Models

by IJRASET Publication

2025, International Journal for Research in Applied Science & Engineering Technology (IJRASET)

Accurate real estate price prediction is crucial in today's market to aid buyers, sellers, and investors in making informed decisions. This study employs machine learning algorithms-specifically Linear Regression, Decision Tree... more

Design and performance measurements of a parallel machine for the unification algorithm

by Fadi N Sibai

2025, ACM SIGMICRO Newsletter

Unification is known to be the most repeated operation in logic programming and PROLOG interpreters. To speed up the execution of logic programs, the performance of unification must be improved. We propose a parallel unification machine... more

Developing Highly Resilient Architecture for Critical Systems to Mitigate Operational Risks

by IJSES Editor

2025, IJSES

The development of a highly resilient architecture for mission-critical systems is an integrated approach aimed at minimizing operational risks and ensuring the continuity of vital services. In the face of growing threats, including... more

An Efficient Text Compression Technique Based on Using Bitwise Lempel-Ziv Algorithm

by Ayman Al Dmour

2024

This paper presents an efficient data compression technique based on using Lempel-Ziv coding algorithms such as the LZ-78 algorithm. The conventional LZ-78 algorithm was applied directly to a non-binary information source (i.e., original... more
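
For orientation, a conventional LZ-78 coder can be sketched as below. This is the classic algorithm, not the paper's bitwise variant, and the function names and the (index, symbol) output format are assumptions. Because it works on any string, it can be applied to a string of "0" and "1" characters to mimic a binary source.

```python
def lz78_encode(data):
    """Classic LZ-78: emit (dictionary index, next symbol) pairs.

    Index 0 is the empty phrase; each emitted pair also defines a new phrase.
    """
    dictionary = {"": 0}
    out = []
    phrase = ""
    for ch in data:
        if phrase + ch in dictionary:
            phrase += ch  # keep extending the longest known phrase
        else:
            out.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:
        # Input ended inside a known phrase: emit its prefix plus its last symbol.
        out.append((dictionary[phrase[:-1]], phrase[-1]))
    return out


def lz78_decode(pairs):
    """Rebuild the text by replaying the phrase definitions."""
    phrases = [""]
    out = []
    for index, ch in pairs:
        phrase = phrases[index] + ch
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)
```

On the input "abababab" the encoder emits five pairs, (0, a), (0, b), (1, b), (3, a), (0, b), showing how later pairs reuse longer and longer earlier phrases.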

Improving Compression Efficiency of Data Warehouse

by Dr. Ajay Indian

2024, International Journal of Scientific & Engineering Research

Data compression has a major effect on data warehouses, reducing data size and improving query processing. Distinct compression techniques are feasible at different levels; each type either gives a good compression ratio or... more

Teaching the Principles of Text Compression and Exploring Gender Differences in Eradicating Information Redundancy

by Maia Chkheidze

2024

The research on the phenomenon of text compression lies in response to the ever-increasing demands of the modern information society. These demands are intricately tied to the efficient utilization of knowledge and the continuous pursuit... more

Teaching the Principles of Text Compression and Exploring Gender Differences in Eradicating Information Redundancy

by Revaz Tabatadze

2024, International Journal of Engineering Research in Computer Science and Engineering (IJERCSE)

Prolong Lifespan of Wireless Sensor Network with Optimized Information Compression Algorithm and Magnetic Resonant Concept

by Prof Anil Ahlawat

2024, Proceedings of the 2018 International Conference on Computer Science, Electronics and Communication Engineering (CSECE 2018)

Remote sensor systems maintain numerous applications in various fields. Sparing vitality in such systems is continuously a basic issue that should be considered to delay the network lifespan. Bunching in the systems is additionally... more

Study on Data Compression Technique

by Md Jayedul Haque

2024, International Journal of Computer Applications

Today, both communication and generic file compression technologies make extensive use of different kinds of efficient data compression methods. This paper surveys a variety of data compression methods. The aim of data... more

A NAM Representation Method for Data Compression of Binary Images

by Mudar Sarem

2023, Tsinghua Science & Technology

A representation method using the non-symmetry and anti-packing model (NAM) for data compression of binary images is presented. The NAM representation algorithm is compared with the popular linear quadtree and run length encoding... more

Data Compression Techniques for Wireless Sensor Network

by hemlata dakhore

2023

Data compression is an art used to reduce the number of bits required to transmit the data of particular information. The goal of data compression is to eliminate the redundancy in a data in order to reduce its size. Data compression can... more

Semi-lossless text compression

by Shmuel Tomi Klein

2023

A new notion, that of semi-lossless text compression, is introduced, and its applicability in various settings is investigated. First results suggest that it might be hard to exploit the additional redundancy of English texts, but the new... more

Table 1 summarizes some of the results. The first column gives the size of the raw files, the second after having applied simple Huffman coding on the individual letters. All compression figures are given in bits per character (bpc). The next columns deal with bigrams and trigrams, first in a standard fragmentation of the text into bi- or trigrams, then using the reordering for those k-grams that can be changed. For the bigrams the variant with the flag-bit has been applied, for the trigrams, triples including the first or last letter of a word have not been reordered. The figures include the overhead of storing the bi- or trigrams. As can be seen, there is a slight improvement, though not a significant one. In fact, even with better parsing strategies than the simple one we used, one should not expect large savings for English text: the average word length being less than 5, and the two corner letters being fixed, the reordering will affect on the average less than 3 letters. However, with schemes going beyond word boundaries, like LZSS, or for other languages and other reordering rules, better results might be expected.

On the Randomness of Compressed Data

by Shmuel Tomi Klein

2023

It seems reasonable to expect from a good compression method that its output should not be further compressible, because it should behave essentially like random data. We investigate this premise for a variety of known lossless... more

Forward Looking Huffman Coding

by Shmuel Tomi Klein

2023, Lecture Notes in Computer Science

Huffman coding is known to be optimal, yet its dynamic version may yield smaller compressed files. The best known bound is that the number of bits used by dynamic Huffman coding in order to encode a message of n characters is at most... more

Fig. 2. Example for which classical dynamic Huffman coding produces a file twice the size of that constructed by FORWARD-HUFFMAN. Dynamic Huffman coding repeatedly changes the shape of the tree, but there is a delay between the occurrence of a change and when such a change starts to influence the encoding. For encoding the current character we use the tree built in the previous stage, and the changes implied by the processed character do only affect the encoding in the subsequent stages, if at all. This behavior is demonstrated in the following extreme example, comparing the performances of the two dynamic variants. The example shows that the file constructed by traditional dynamic Huffman may be about twice as large as that produced by the FORWARD-HUFFMAN algorithm.

Table 1. Compression performance. Our goal was to compare the compression performance of the three methods: static Huffman, the proposed FORWARD-HUFFMAN algorithm, and the traditional dynamic Huffman. The results are presented in Table 1. The second column gives the original file sizes in MB. The third column gives the size of the encoded alphabet, m. The following three columns, entitled STATIC, FORWARD and DYNAMIC, show the compression ratios achieved by the compared algorithms. The compression ratio is defined as the size of the compressed file divided by the size of the original file. As mentioned, we included the overhead of the description of the model in the size of the compressed file. As can be seen, our method is consistently better than static Huffman, as...
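
To make the compression ratio definition above concrete, here is a small sketch that builds static Huffman code lengths and computes compressed size divided by original size. It covers only plain static Huffman coding, not FORWARD-HUFFMAN or dynamic Huffman, and unlike the table it ignores the overhead of describing the model. The function names and the default of 8 bits per input character are assumptions, and the input must be non-empty.

```python
import heapq
from collections import Counter


def huffman_code_lengths(text):
    """Return {symbol: codeword length in bits} for a static Huffman code."""
    freq = Counter(text)
    if len(freq) == 1:
        return {next(iter(freq)): 1}  # a lone symbol still needs one bit
    # Heap items: (weight, tie-breaker, symbols in this subtree).
    heap = [(f, i, [sym]) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freq, 0)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, syms1 = heapq.heappop(heap)
        f2, _, syms2 = heapq.heappop(heap)
        for sym in syms1 + syms2:
            lengths[sym] += 1  # every merge pushes these symbols one level deeper
        heapq.heappush(heap, (f1 + f2, next_id, syms1 + syms2))
        next_id += 1
    return lengths


def compression_ratio(text, bits_per_input_char=8):
    """Compressed size / original size, excluding any model description."""
    freq = Counter(text)
    lengths = huffman_code_lengths(text)
    compressed_bits = sum(freq[sym] * lengths[sym] for sym in freq)
    return compressed_bits / (bits_per_input_char * len(text))
```

For the string "aaaabbc" the code lengths are 1, 2 and 2 bits for a, b and c, giving 10 compressed bits against 56 original bits. A smaller ratio means better compression under this definition.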

Simulation and Comparison of Various Lossless Data Compression Techniques based on Compression Ratio and Processing Delay

by Alan Janson

2023, International journal of computer applications

With increasing need to store data in lesser memory several lossless compression techniques are developed. This paper intends to provide the performance analysis of lossless compression techniques over various parameters like compression... more

Text Mining of Stocktwits Data for Predicting Stock Prices

by Usman Naseem

2023, Applied system innovation

This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY

Comparison of Huffman Algorithm and Lempel Ziv Welch Algorithm in Text File Compression

by Muhammad Alif

2023, IT Journal Research and Development

Data storage hardware has developed very rapidly over time. In line with the development of storage hardware, the amount of digital data shared on the internet is increasing every day. That way no matter how big the size of the... more

Figure 12. Space Saving Comparison of Huffman and LZW Algorithms on DOCX Files. Figure 12 shows the percentage of space saving for the two algorithms compared, namely the Huffman algorithm and the Lempel Ziv Welch (LZW) algorithm. With the LZW algorithm, compression of all 12 files failed. With the Huffman algorithm, 7 of the 12 test files were compressed successfully, with space savings below 3%, while 5 test files failed and yielded negative space savings.
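
The excerpt does not state how space saving is computed. A common definition, assumed here, is the percentage by which the file shrinks; under it, a file that grows during compression yields a negative value, which matches the failed cases reported for DOCX files. The function name and the size units (any consistent unit, such as bytes) are assumptions.

```python
def space_saving(original_size, compressed_size):
    """Percentage of space saved; negative if the 'compressed' file is larger."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return (1 - compressed_size / original_size) * 100.0


# A file reduced from 1000 to 250 units saves 75%.
print(space_saving(1000, 250))   # 75.0
# A file that grows from 1000 to 1250 units has a negative saving.
print(space_saving(1000, 1250))  # -25.0
```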

Figure 3 is a page view of the selection of actions to be performed on the compression process with the Lempel Ziv Welch algorithm. On this page there are two types of actions that can be selected, namely compressing data or decompressing data.

Figure 1. Research Workflow. This research starts by conducting a literature review to determine the algorithms to be compared, then proceeds to the problem identification stage, namely formulating the problems found in the form of questions. The next step is the determination of the algorithms to be compared; at this stage the Huffman algorithm and the Lempel Ziv Welch algorithm are proposed. Then the Huffman algorithm and the Lempel Ziv Welch algorithm were implemented into an application using a programming language. This implementation takes place in 2 processes, namely the process on the Huffman algorithm and the process on the Lempel Ziv Welch algorithm. The next step is to test the 2 compared algorithms to see their performance. The last step is to draw the conclusion as to which of the 2 compared algorithms is superior.

Figure 10 is a comparison diagram of the space savings produced by the Huffman algorithm and the LZW algorithm. Huffman algorithm space saving is indicated by blue, LZW algorithm space saving is indicated by red. Huffman's algorithm shows the highest space saving value of 61.73 and the lowest space saving value of 34.83. LZW algorithm shows the highest space saving value of 93.89 and the lowest space saving value of 45.

Figure 9 is a comparison diagram of the compression time generated by the Huffman algorithm and the LZW algorithm. Huffman's algorithm compression time is indicated by blue, LZW algorithm compression time is indicated by red. Based on the diagram, it can be concluded that the LZW algorithm is faster in compressing TXT files than Huffman algorithms.

Figure 8 is a comparison diagram of the space savings produced by the Huffman algorithm and the LZW algorithm. Huffman algorithm space saving is indicated by blue, LZW algorithm space saving is indicated by red. Huffman's algorithm shows the highest space saving value of 46.96 and the lowest space saving value of 28.87. LZW algorithm shows the highest space saving value of 90.86 and the lowest space saving value of 8.65.

Figure 5 is a display image for inputting a file to be compressed using the Lempel Ziv Welch algorithm. Similar to Figure 4, a statement on this page states that the allowed file type is a text file type. Screen label: "Hasil Kompresi Algoritma LZW" (LZW Algorithm Compression Results).

are translated into a programming language that was later created into a web-based application to carry out the compression process. Figure 2 is a page view of the selection of actions to be performed on the compression process with the Huffman algorithm. On this page there are two types of actions that can be selected, namely compressing data or decompressing data.

Table 1 shows the comparison of the results of the compression process for 12 test files with the TXT extension. The information displayed includes the file name; the size of the 12 files before compression; a comparison of the sizes of the 12 files compressed using the Huffman algorithm and the Lempel Ziv Welch algorithm; a comparison of the space-saving values of the 12 files compressed using the two algorithms; and a comparison of the compression times of the 12 files compressed using the two algorithms. The test was carried out in 2 stages. The first stage tested the 12 test files using the Huffman algorithm. The second stage tested the 12 test files using the Lempel Ziv Welch algorithm. In each stage, every test file (from test1.txt to test12.txt) was input into the application in turn, the algorithm was selected for the compression process, the application carried out the compression, and it then displayed the resulting information: file name, file size before compressing, file size after compressing, space saving value, and compression time. The compression results of the 12 test files for both algorithms are summarized in Table 1. From Table 1, the average space saving by the Huffman algorithm was 38.45, the average space saving by the Lempel Ziv Welch algorithm was 93.85, the average compression time of the Huffman algorithm was 72.44, and the compression time of the Lempel Ziv Welch algorithm was 2.33. IT Jou Res and Dev, Vol.7, No.2, March 2023 : 155 - 169

Table 2. CSV File Test Result. Table 2 shows the comparison of the results of the compression process for 12 test files with the CSV extension. The information displayed includes the file name, the size of the 12 files before compression, a comparison of the sizes of the 12 files compressed using the Huffman algorithm and the Lempel Ziv Welch algorithm, a comparison of the space-saving values of the 12 files compressed using the two algorithms, and a comparison of the compression times of the 12 files compressed using the two algorithms. The test was carried out in 2 stages. The first stage tested the 12 test files using the Huffman algorithm. The second stage tested the 12 test files using the Lempel Ziv Welch algorithm. In each stage, every test file (from test1.csv to test12.csv) was input into the application in turn, the algorithm was selected for the compression process, the application carried out the compression, and it then displayed the resulting information: file name, file size before compressing, file size after compressing, space saving value, and compression time. The compression results of the 12 test files for both algorithms are summarized in Table 2. From Table 2, the average space saving by the Huffman algorithm was 40.04, the average space saving by the Lempel Ziv Welch algorithm was 77.56, the average compression time of the Huffman algorithm was 119.62, and the compression time of the Lempel Ziv Welch algorithm was 1.83.

Table 3. DOCX File Test Result. Table 3 shows the comparison of the results of the compression process for 12 test files with the DOCX extension. The information displayed includes the file name, the size of the 12 files before compression, a comparison of the sizes of the 12 files compressed using the Huffman algorithm and the Lempel Ziv Welch algorithm, a comparison of the space-saving values of the 12 files compressed using the two algorithms, and a comparison of the compression times of the 12 files compressed using the two algorithms. The test was carried out in 2 stages. The first stage tested the 12 test files using the Huffman algorithm. The second stage tested the 12 test files using the Lempel Ziv Welch algorithm. In each stage, every test file (from test1.docx to test12.docx) was input into the application in turn, the algorithm was selected for the compression process, the application carried out the compression, and it then displayed the resulting information: file name, file size before compressing, file size after compressing, space saving value, and compression time. The compression results of the 12 test files for both algorithms are summarized in Table 3. From Table 3, the average space saving by the Huffman algorithm was 0.46, the average space saving by the Lempel Ziv Welch algorithm was -21.40, the average compression time of the Huffman algorithm was 150.80, and the compression time of the Lempel Ziv Welch algorithm was 4.98.

Survey of Lossless Data Compression Algorithms

by Dr. Kruti Dangarwala

2023, International Journal of Engineering Research and

The main goal of data compression is to reduce redundancy in stored or communicated data, thereby increasing effective data density. It is a common necessity for most applications. Data compression is very important relevancy in the... more

Survey of Text Compression Algorithms

by Dr. Kruti Dangarwala

2023, International Journal of Engineering Research and

Data compression is now almost a common requirement for every application, as it is a means of saving channel bandwidth and storage space. Data compression is an art of allowing a technique to reduce the volume of data i.e. excess... more

Comparative Analysis of the Compression of Text Data Using Huffman, Arithmetic, Run-Length, and Lempel Ziv Welch Coding Algorithms

by peter Baidoo

2023, Journal of Advances in Mathematics and Computer Science

The purpose of the study was to compare the compression ratios of file size, file complexity, and time used in compressing each text file in the four selected compression algorithms on a given modern computer running Windows 7. The... more

Table 1. Selected Sample Files for the Research

Fig. 4. The effect of number of words on compression ratios of selected files. Research Question Two: What is the effect of file size on compression ratios of text compression algorithms? A bar graph was used to determine the effect of file size on compression ratios of text compression algorithms. The results are shown in Fig. 4.

Fig. 1. Designed Interface. Because manipulating files at the bit level is often not friendly to interact with, the researcher created a user interface for simple interactivity. The researcher created the interface depicted in Fig. 1 using the NetBeans development environment's Graphical User Interface (GUI) designer.

This research question was analysed using a grouped bar graph to determine how the compressed sizes produced by the text compression algorithms differ. The results are shown in Fig. 3.

Fig. 5. Compression ratio difference of selected files. Research Question Three: Does the complexity of a text file affect the compression ratios of text compression algorithms? To determine whether the complexity of a text file affects the compression ratios of text compression algorithms, a bar graph was used, as shown in Fig. 5.

Fig. 2. Samples of compressed file. Fig. 2 displays an example of a Huffman compression used in the study.

Fig. 6. Compression time for the selected files. Furthermore, "goldenbaton", with a file size of 254360, took much longer (53191 milliseconds) than "Atomic", which has a larger file size of 260376 and took at most 19710 milliseconds.

Forward Looking Huffman Coding

by Dana Shapira

2023, Theory of Computing Systems

Security Measures of Textual Data

by Sumanth J Samuel

2023, International Journal of Innovative Research in Information Security

As the volume and importance of textual data in data science continues to grow, combined with advancements in its techniques, it has created numerous opportunities for extracting valuable insights from textual information. However,... more

Fig. 2. Three nodes in a consistent hash ring. Suppose we have several nodes in a distributed system, represented by points on the circle. To determine which node is responsible for storing a particular data item, we take the hash of the item and find its position on the circle. We then move clockwise around the circle until we reach the first node. That node becomes the owner of the data item.
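The clockwise lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the hash function (MD5 truncated to 32 bits), the ring size, the `HashRing` class, and the node names are all hypothetical choices made for the example, and virtual nodes are left out.

```python
import bisect
import hashlib

RING_SIZE = 2**32  # assumed ring size; the source does not specify one


def ring_position(key: str) -> int:
    """Map a key to a point on the circle (MD5 is an assumed choice)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE


class HashRing:
    """Hypothetical minimal consistent hash ring without virtual nodes."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("at least one node is required")
        # Each node becomes a point on the circle, kept in sorted order.
        self._points = sorted((ring_position(n), n) for n in nodes)
        self._positions = [p for p, _ in self._points]

    def owner(self, item: str) -> str:
        """Hash the item, then walk clockwise to the first node."""
        pos = ring_position(item)
        idx = bisect.bisect_left(self._positions, pos)
        if idx == len(self._positions):
            idx = 0  # passed the last node: wrap around the circle
        return self._points[idx][1]


ring = HashRing(["node-a", "node-b", "node-c"])  # illustrative node names
print(ring.owner("document-42"))
```

Keeping the node positions sorted lets the clockwise walk be done with a binary search, so a lookup costs O(log n) in the number of nodes.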

Fig. 2. (a) Three Centralized orchestration versus (b) Decentralized orchestration using triggers. [6] Decentralized replication and orchestration, as an approach for achieving fault tolerance, can be compared with centralized replication techniques commonly used in fault-tolerant systems. Distributed, centralised, and decentralised architectures are used to build centralised versus decentralised systems. We analyse these systems for clarity's sake because the definitions of fault-tolerance and resilience of a system can vary based on the (small or extensive) implementation [5]. Let's compare these two approaches in terms of scalability, fault tolerance, consistency, and coordination overhead:

Energy-Efficient Data Collection in Clustered Wireless Sensor Networks employing Distributed DCT

by Tiến Nguyễn

2023, International Journal of Wireless & Mobile Networks

In this paper, an energy-efficient data collection method is proposed in which an integration between Discrete Cosine Transform (DCT) matrix and clustering in wireless sensor networks (WSNs) is exploited. Based on the fact that sensory data... more

Security Measures of Textual Data

by IJIRIS Journal Division and

2023, IJIRIS:: AM Publications

Effective Compression of Digital Video using LZW

by Anil Lohar

2023

Video compression reduces the size of a video and involves its audio format as well. In other words, video compression can be described as an encoding of a video that requires less memory than the... more

- To compare the quality of our system with other apps, the same input video was given to the different apps as well as to the proposed system. The results are shown below. The proposed system gives better results than the other apps with respect to both quality and compression ratio. The system extracts the frames and then converts RGB into the YUV color space. After color conversion, spatial redundancy is handled within each frame and temporal redundancy between frames. The LZW algorithm is then applied to each frame, reducing the data according to its redundancy. Because no data is lost, the system can provide better quality, as the output still carries the original data after compression. The proposed system provides compressed video with a good compression ratio and quality similar to the original video.
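To make the per-frame LZW step concrete, here is a minimal byte-oriented sketch of textbook LZW in Python. It is not the authors' code: the function names and the synthetic "frame" are assumptions made for illustration, the dictionary grows without bound, and codes are returned as plain integers rather than packed into bits.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Encode bytes as a list of LZW codes (dictionary starts with all 256 bytes)."""
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate  # keep extending the current match
        else:
            codes.append(table[current])
            table[candidate] = next_code  # learn the new phrase
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the original bytes by regrowing the same dictionary."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # The code refers to the phrase that is being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)


# Synthetic, highly redundant "frame" data (an assumption for illustration).
frame = bytes([16] * 64 + [128] * 64)
codes = lzw_compress(frame)
assert lzw_decompress(codes) == frame  # lossless round trip
print(len(frame), "bytes ->", len(codes), "codes")
```

The round-trip assertion reflects the lossless property claimed in the text: only redundancy is removed, and the decoder recovers every byte of the input.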

In this paper, the proposed system performs effective video compression. The LZW algorithm is used for effective compression of digital video while providing quality similar to the original video. LZW is suitable wherever data redundancy is present. In video, frames are repeated with minor differences, so it is effective for video compression. It is a lossless algorithm: only redundancy is eliminated, not data. In this software, the above metrics are implemented in OpenCV (C++) based on the original Matlab implementations provided by their developers. The source code of this software can be compiled on any platform and only requires the OpenCV library (core and imgproc modules). This software allows performing video quality assessment without using Matlab and shows better performance than Matlab in terms of run time.

To measure the quality of the compressed video with respect to the original video, VQMT (Video Quality Measurement Tool) is used. This software provides fast implementations of the following objective metrics:

Rescore in a Flash: Compact, Cache Efficient Hashing Data Structures for n-Gram Language Models

by Gautam Tiwari

2023, Interspeech 2020

We introduce DashHashLM, an efficient data structure that stores an n-gram language model compactly while making minimal trade-offs on runtime lookup latency. The data structure implements a finite state transducer with a lossless... more

Self-Organized Distributed Compressive Projection in Large Scale Wireless Sensor Networks

by Eduardo Morgado

2023, 2013 IEEE 24th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC)

The optimal configuration for a Large Scale Wireless Sensor Networks (LS-WSN) is the one that minimizes the sampling rate, the CPU time and the channel accesses (thus maximizing the network lifetime), with a controlled distortion in the... more

A Survey Paper on Hard Disk Failure Prediction Using Machine Learning

by HARSH DUBEY

2023, International Journal for Research in Applied Science and Engineering Technology

Hard disk failure is something most companies and people fear. People are concerned about data loss. Therefore, predicting HDD failure is important for ensuring the storage security of the data center. There... more

Noise Reduction in Data Communication Using Compression Technique

by Brian Usibe

2023

Noise is an ever present phenomenon while dealing with recording devices, be it digital or analog, be it specks in images or background hiss in music recordings. Therefore, this paper aims at ways of reducing the effects of these forms of... more

Lossless Image Compression Techniques: A State-of-the-Art Survey

by mohamed hamada

2023, Symmetry

Modern daily life activities result in a huge amount of data, which creates a big challenge for storing and communicating them. As an example, hospitals produce a huge amount of data on a daily basis, which makes a big challenge to store... more

A Survey Paper on Hard Disk Failure Prediction Using Machine Learning

by IJRASET Publication

2023, International Journal for Research in Applied Science & Engineering Technology (IJRASET)

Lossless Compression Tool for Limited Number of Colors

by Radu Rădescu

2023

The objective of the ICompress utility, described in this article, is the real-world study of the compression of massive data sequences with a limited number of colors. Because current standards do not offer sufficient... more

Comparison of Source Coding Techniques for the Vehicle to Vehicle Communication

by Subrahmanya Gunaga

2023, arXiv (Cornell University)

Autonomous driving is gaining its importance due to the advancements in technology. With the intention of safety during human driving and with the longer-term aim to act as a communication enabler for autonomous driving, vehicle to... more

An improved plagiarism detection scheme based on semantic role labeling

by Mohammed Salem Binwahlan

2023, Applied Soft Computing

Plagiarism occurs when the content is copied without permission or citation. One of the contributing factors is that many text documents on the internet are easily copied and accessed. This paper introduces a plagiarism detection... more

Implementation of Flight Fare Prediction System Using Machine Learning

by Neel Bhosale

2023, International Journal for Research in Applied Science and Engineering Technology

Flight ticket prices increase or decrease frequently depending on various factors such as the timing of the flights, the destination, and the duration of the flights. In the proposed system a predictive model will be created by applying machine... more

Image preprocessing for compression: Attribute filtering

by Florence Tushabe

2023

This work proposes a preprocessing method for image compression based on attribute filtering. This method is completely shape preserving and computationally cheap. Three filters were investigated, including one derived from the power... more

Self-Organized Distributed Compressive Projection in Large Scale Wireless Sensor Networks

by Julio Ramiro

2023, 2013 IEEE 24th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC)

Medical Insurance Cost Prediction using Machine Learning

by IJRASET Publication

2022, International Journal for Research in Applied Science & Engineering Technology (IJRASET)

Insurance is a policy that helps to cover all losses, or to reduce losses, in terms of expenses incurred by various risks. A number of variables affect how much insurance costs. These considerations of different factors contribute to the... more

Fig. 6. Box plot of Medical Charges per children

Fig. 4. Box plot of Medical Charges by diabetic status

The dataset includes nine variables, as shown in Table I [3]. Each of these attributes contributes to estimating the cost of the insurance, which is our dependent variable. In this stage, the data is scrutinized and updated so that it can be applied efficiently to the ML algorithms. The categorical variables are converted into numeric or binary values (0 or 1). For example, instead of a "SEX" variable with the values male or female, a male is coded as false (0) and a female as 1 (see Table II). After this phase, the data can be applied to all regression models used in this study.

Table I. Dataset overview

Now we examine the other independent variables with the dependent variable (charges).

Fig. 1. Box plot of Medical Charges per Region

Fig. 2. Box plot of Medical Charges by Smoking status

Fig. 3. Box plot of Medical Charges for alcoholic person

Fig. 5. Box plot of Medical Charges per Gender

Table II. Categorical variables after translation into numeric or binary values
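The conversion of categorical variables into binary values can be sketched as follows. This is a hedged illustration only: the helper `encode_binary`, the column names, and the sample records are hypothetical and not taken from the dataset. The 0 for male and 1 for female coding follows the text, while the smoker coding is an assumption.

```python
def encode_binary(records, column, positive_value):
    """Return copies of the records with `column` replaced by 0 or 1."""
    encoded = []
    for record in records:
        row = dict(record)  # copy so the input records stay unchanged
        row[column] = 1 if row[column] == positive_value else 0
        encoded.append(row)
    return encoded


# Made-up sample records, not values from the paper's dataset.
rows = [
    {"sex": "male", "smoker": "yes"},
    {"sex": "female", "smoker": "no"},
]
rows = encode_binary(rows, "sex", "female")   # male -> 0, female -> 1 (as in the text)
rows = encode_binary(rows, "smoker", "yes")   # assumed coding: yes -> 1, no -> 0
print(rows)
```

After this step every column is numeric, which is what regression models generally expect as input.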

The Hybrid Compressive Sensing Data Collection Method in Cluster Structure for Efficient Data Transmission in WSN

by mallanagouda biradar

2022

A wireless sensor network consists of a large number of wireless nodes that are responsible for sensing, processing, and monitoring the environment. These sensor nodes are battery operated. Clustering is a standard approach for achieving efficient... more

A Survey on Different Compression Techniques Algorithm for Data Compression I

by Jeegar Trivedi

2022
