Papers by Ojus Thomas Lee

Phishing Detection using Extra Trees Classifier
2021 5th International Conference on Information Systems and Computer Networks (ISCON), Oct 22, 2021
With the rapid growth of global networking, online users are vulnerable to many kinds of attacks, phishing being prevalent among them. Phishing is a type of attack in which the attacker aims to steal critical information by tricking the user into clicking on phishing links. Several anti-phishing tools and computational methods already exist for actively detecting phishing activity. However, attackers keep evolving new methods of cybercrime that evade existing detection models, so there is a constant need to research and improve the ways of detecting phishing. The proposed system is a web-based application that detects phishing URLs using a machine learning model. Two ensemble classifiers, Random Forest (RF) and Extra Trees (ET), are compared to find the one with the better performance measures. The models are trained on the UCI dataset with 30 features. Hyperparameter tuning is performed on the models to check whether it enhances their predictive performance. The Extra Trees classifier without tuning achieved the highest accuracy, 97.47% on the test dataset, with the lowest false positive rate.
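The two measures named in the abstract above, accuracy and false positive rate, can be illustrated with a minimal sketch. The confusion-matrix counts below are hypothetical placeholders, not results from the paper, and treating "phishing" as the positive class is an assumption made for the example.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all URLs that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def false_positive_rate(fp: int, tn: int) -> float:
    """Fraction of legitimate URLs that were wrongly flagged as phishing."""
    return fp / (fp + tn)


# Hypothetical counts for illustration only (not taken from the paper).
counts = {"tp": 90, "tn": 100, "fp": 4, "fn": 6}

acc = accuracy(**counts)
fpr = false_positive_rate(counts["fp"], counts["tn"])
print(f"accuracy={acc:.3f}, false positive rate={fpr:.3f}")
```

A low false positive rate matters in this setting because each false positive is a legitimate URL reported to the user as phishing.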

Zenodo (CERN European Organization for Nuclear Research), Jun 7, 2023
Braille is a vital means of communication for blind people: a system of touch reading and writing in which raised dots represent the letters of the alphabet. It is an extremely important tool for blind people to educate themselves, and a critical component that supports not only educational advancement but also, subsequently, better employment prospects. Blind people should be taught Braille in order to become literate, which is a necessity in today's world. Braille is much harder to learn than sign language, as the many combinations of the six raised dots are not easy to memorize. Visually impaired people must master the skills needed to communicate through Braille text, which is itself a time-consuming and cumbersome task. In addition, other people need to learn the same set of skills to understand and respond to a visually impaired person. Devices exist that convert text to Braille, as well as devices that convert Braille to speech in real time using a Raspberry Pi and a Raspberry Pi camera. Other devices use FPGAs or Arduinos to convert speech to Braille. This paper surveys the different techniques that have been used for converting text to Braille and vice versa, and evaluates the accuracy of these methods.
With the evolution of computing technology in applications such as human-robot interaction, human-computer interaction and health-care systems, 3D human body models and their dynamic motions have gained popularity. Human performance accompanies human body shapes and their relative motions. Research on human activity recognition is structured around how the complex movement of a human body is identified and analyzed. Vision-based action recognition from video is one such task, in which actions are inferred by observing the complete sequence of actions performed by a human. Many techniques have been revised over recent decades in order to develop a robust and effective framework for action recognition. In this survey, we summarize recent advances in human action recognition, namely the machine learning approach and the deep learning approach, together with an evaluation of these approaches.
Identification of Indian Herbs Using Stepwise Transfer-Learning Technology
Social Science Research Network, 2023

An Abridged Review of Transfer Learning Technology
2022 Second International Conference on Next Generation Intelligent Systems (ICNGIS)

Virtual Character Animation based on Data-driven Motion Capture using Deep Learning Technique
2020 IEEE 15th International Conference on Industrial and Information Systems (ICIIS)
Interest in motion capture (mocap) technology is increasing every day as the variety of applications using it multiplies. By leveraging the resources offered by mocap technology, human activity characteristics are captured and can be used as the source for animation. The devices involved in the technology are, however, very costly and hence not practical for personal use. In this scenario, we implement a framework that produces mocap data from standard RGB video and uses it, with the help of deep learning techniques, to animate a character in 3D space based on the actions of the person in the original video. The Human Mesh Recovery (HMR) scheme is used to extract mocap data from the input video: using 2D pose estimation, it determines where the joints of the person in the video are located in 3D space. The 3D joint locations are used as mocap data and transferred to Blender, where they animate a simple 3D character. A subjective evaluation of our framework, based on a metric called the observation factor, yielded an accuracy value of 73.5%.
International journal of engineering research and technology, Aug 2, 2021
Hyperspectral imaging (HSI) with high spectral resolution often suffers from low spatial resolution because of image sensor limitations. Image fusion is a convenient and practical method for enhancing the spatial resolution of HSI. It combines an HSI with a multispectral image (MSI) of higher spatial resolution captured over a comparable scene. In earlier years, various HSI and MSI fusion algorithms were introduced to obtain high-resolution HSI. However, no large-scale study has been conducted on the recently proposed HSI and MSI fusion methods. These methods are divided into four categories: pan-sharpening, matrix decomposition, tensor representation, and methods based on deep convolutional neural networks.

International journal of engineering research and technology, Aug 2, 2021
In addition to maintaining the earth's ecosystem, plants provide us with oxygen, food, medicine, and fuel. Accurate identification of plant species is a very challenging task because it requires specialized knowledge and in-depth training in botany. Even for botanists, species identification is often difficult. Therefore, there is an urgent need to develop an automatic plant leaf recognition system. Much research focuses on leaf-based identification, since leaves are easier to access than other parts of a plant. This paper surveys the methods and classifiers used to identify various plants in recent years. Moreover, the survey includes a comparative study of those methods according to the accuracy achieved by the classifiers. This review is intended as a guideline to help beginners in the research field understand and analyse the methods.

International journal of engineering research and technology, Aug 2, 2021
As we have moved most of our financial, work-related and other daily activities to the internet, we are exposed to greater risks in the form of cybercrime. URL-based phishing attacks are one of the most common threats to internet users. In this type of attack, the attacker exploits human vulnerability rather than software flaws. The attack targets both individuals and organizations, induces them to click on URLs that look secure, and steals confidential information or injects malware into their systems. Different machine learning algorithms are used for the detection of phishing URLs, that is, to classify a URL as phishing or legitimate. Researchers are constantly trying to improve the performance of existing models and increase their accuracy. In this work we review various machine learning methods used for this purpose, along with the datasets and URL features used to train the models. The performance of different machine learning algorithms and the methods used to increase their accuracy are discussed and analysed. The goal is to create a survey resource that helps researchers learn about current developments in the field and contribute to phishing detection models that yield more accurate results.

Improved Epoch Expiry and Load Handling Mechanism for RAPID - The Fast Data Update Protocol in Erasure Coded Storage Systems
2018 International Conference on Data Science and Engineering (ICDSE), 2018
Erasure coding techniques provide significant fault tolerance with reduced storage overhead in comparison with replication. Fast data update strategies in erasure coded storage systems are needed to handle big data and cloud applications that involve frequent data updates. The research work presented here focuses on enhancing the performance of the RAPID protocol for fast data update. In this paper we present a method to identify low load for epoch firing. In addition, a method is put forward for firing an epoch under heavy load. As a third contribution, we implement a scheme to reduce the heavy load placed on the storage system by simultaneous epoch firing for multiple files.

A Method for Storage Node Allocation in Erasure Code Based Storage Systems
2017 IEEE 3rd International Conference on Collaboration and Internet Computing (CIC), 2017
Fault tolerance is a major issue for all storage service providers. Currently, storage service providers use data replication to ensure fault tolerance. In the big data era, relying on data replication for fault tolerance reduces storage efficiency. Most modern applications use erasure code based storage systems as an alternative to data replication. In erasure code based storage systems, storage nodes must be allocated with care so that the load on the nodes of the storage system is always balanced. In this paper, we propose a greedy solution for the storage node allocation problem in a big data environment with load balancing. Other major contributions discussed in the paper are a graph-theoretic model of the problem and an integer linear program formulation for it.
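To make the idea of load-balanced greedy allocation concrete, here is a minimal sketch. It is not the algorithm from the paper: it assumes that each block of an erasure-coded file must go to a distinct node and that "load" is simply the number of blocks a node already stores. The function name `allocate_nodes` and the cluster data are hypothetical.

```python
import heapq


def allocate_nodes(loads: dict[str, int], blocks: int, block_size: int = 1) -> list[str]:
    """Pick `blocks` distinct nodes, taking the least-loaded nodes first."""
    if blocks > len(loads):
        raise ValueError("need at least one distinct node per block")
    # The (load, name) key breaks ties deterministically by node name.
    chosen = heapq.nsmallest(blocks, loads, key=lambda n: (loads[n], n))
    for node in chosen:
        loads[node] += block_size  # record the new load so later files see it
    return chosen


# Hypothetical cluster: node name -> number of blocks currently stored.
cluster = {"n1": 5, "n2": 2, "n3": 7, "n4": 2, "n5": 4}
placement = allocate_nodes(cluster, blocks=3)
print(placement)
print(cluster)
```

Updating the loads in place means that repeated calls spread successive files across the cluster instead of always choosing the same nodes.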

ECSim-2: A Performance Evaluator for Erasure Code based Storage Systems
2018 First International Conference on Secure Cyber Computing and Communication (ICSCCC), 2018
In today’s world, cloud storage systems built on distributed technology are used to store, manage and access massive amounts of data in real time. The data replication based storage method that storage service providers use to ensure fault tolerance, although simple, results in storage overhead. With erasure code based storage systems, the additional storage required to ensure fault tolerance can be reduced while ensuring reliability equivalent to data replication. Several erasure coding schemes exist today; however, evaluating the performance of such schemes on real distributed storage systems is costly and time consuming. A feasible alternative is the use of simulators. In this research paper, we present a framework that simulates the behavior of an erasure code based storage system. This framework is implemented as an extension to CloudSim, thus making it a platform capable of performance evaluation of erasure coding schemes. The s...

Text Line Detection in Camera Captured Images Using Matlab GUI
Author affiliations: (1) Postgraduate Student, Dept. of CSE, College of Engineering Kidangoor, Kerala, India; (2) Associate Professor & HOD, Dept. of CSE, College of Engineering Kidangoor, Kerala, India.
In today’s digital world, camera-based text pre-processing has become an important task that finds applications in many fields, such as text-to-speech conversion and information retrieval. This has led to the development of many pre-processing methods. This paper proposes a system for text line identification in images containing printed and handwritten text. The Maximally Stable Extremal Regions (MSER) algorithm is used to determine the character regions in the image. Based on the geometric properties of the regions, unwanted or dissimilar regions are removed. Optical Character Recognition (OCR) is used to recognize each character and detect the word regi...

Modelling multi level consistency in erasure code based storage systems
Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, 2019
In the internet era, cloud storage services are in high demand. Because of the huge volumes of present-day big data, cloud service providers are looking for alternatives to data replication, which has traditionally been used to provide fault tolerance and availability. If three copies of data are maintained to ensure availability, there is a 200% storage overhead in replication based storage systems. Erasure code based storage is proving to be the most suitable alternative to replication schemes. Hardware-assisted encoding and decoding, together with research outcomes on data updates from various research communities, indicate a promising future for erasure code based storage systems for hot data storage. In this paper, a new protocol is proposed that shows how to provide different types of consistency in erasure code based storage systems in concurrent data access scenarios where component failures are anticipated. We have developed protocols and data structures for implementing strong, eventual and monotonic consistency. The proposed consistency model has been tested successfully and the results are promising.
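The 200% figure quoted in the abstract above follows from simple arithmetic, sketched below alongside the corresponding figure for an erasure code. The (k, m) notation for k data blocks plus m parity blocks and the 6+3 example are assumptions chosen for illustration; they are not parameters taken from the paper.

```python
def replication_overhead(copies: int) -> float:
    """Extra storage, as a percentage of the original data, for `copies` full replicas."""
    return (copies - 1) * 100.0


def erasure_code_overhead(k: int, m: int) -> float:
    """Extra storage, as a percentage, for a code with k data and m parity blocks."""
    return m / k * 100.0


print(replication_overhead(3))      # three copies, as in the abstract
print(erasure_code_overhead(6, 3))  # hypothetical 6+3 code
```

With three copies, two of them are redundant, which gives the 200% overhead. The hypothetical 6+3 code stores three parity blocks for every six data blocks, which is a 50% overhead.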

RAPID: A Fast Data Update Protocol in Erasure Coded Storage Systems for Big Data
2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), 2017
Erasure codes are nowadays used extensively in distributed storage systems that handle big data, since they offer significant fault tolerance with low storage overhead. Even though erasure coded systems are space efficient, these involve higher network bandwidth and computational complexity in their operations. In this paper, we present RAPID, a protocol for fast data updates, which works by choosing a subset of code blocks for updates and adapts the strength of the subset based on the predicted number of failures. The proposal uses a prediction based heuristic in which the set of failures that may happen in the near future is represented as a function of past failures. A hybrid protocol that uses both locking and buffering mechanisms is adopted in the solution to maintain the consistency on the data and code blocks updates. Our experimental results demonstrate improvement in the performance of data updates by 30% and the failure prediction mechanism proposed shows an accuracy of 80%.