Papers by Chaitali Choudhary

IACR Cryptology ePrint Archive, 2021
Most existing Secure Multi-Party Computation (MPC) protocols for privacy-preserving training of decision trees over distributed data assume that the features are categorical. In real-life applications, features are often numerical. The standard "in the clear" algorithm to grow decision trees on data with continuous values requires sorting of training examples for each feature in the quest for an optimal cut-point in the range of feature values in each node. Sorting is an expensive operation in MPC, hence finding secure protocols that avoid such an expensive step is a relevant problem in privacy-preserving machine learning. In this paper we propose three more efficient alternatives for secure training of decision tree based models on data with continuous features, namely: (1) secure discretization of the data, followed by secure training of a decision tree over the discretized data; (2) secure discretization of the data, followed by secure training of a random forest over the discretized data; and (3) secure training of extremely randomized trees ("extratrees") on the original data. Approaches (2) and (3) both involve randomizing feature choices. In addition, in approach (3) cutpoints are chosen randomly as well, thereby alleviating the need to sort or to discretize the data up front. We implemented all proposed solutions in the semi-honest setting with additive secret sharing based MPC. In addition to mathematically proving that all proposed approaches are correct and secure, we experimentally evaluated and compared them in terms of classification accuracy and runtime. We privately train tree ensembles over data sets with 1000s of instances or features in a few minutes, with accuracies that are at par with those obtained in the clear. This makes our solution orders of magnitude more efficient than the existing approaches, which are based on oblivious sorting.
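The random cut-point idea in approach (3) can be illustrated with a minimal sketch. This is an "in the clear" illustration only, not the paper's secure protocol: the function names, the toy data, and the choice of a uniform draw between the feature's minimum and maximum are assumptions made for the example. It shows why no sorting is needed: a threshold is drawn at random from the feature's range, and the range is found with a linear scan.

```python
import random

def random_cut_point(values, rng):
    """Draw a split threshold uniformly between the feature's min and max.

    Hypothetical helper for illustration; min() and max() are linear scans,
    so the feature values are never sorted.
    """
    lo, hi = min(values), max(values)
    return rng.uniform(lo, hi)

def split_labels(rows, labels, feature, threshold):
    """Partition the labels by comparing one feature against the threshold."""
    left = [y for x, y in zip(rows, labels) if x[feature] < threshold]
    right = [y for x, y in zip(rows, labels) if x[feature] >= threshold]
    return left, right

# Toy data (made up for the example): four rows, two continuous features.
rng = random.Random(0)
rows = [[2.5, 1.0], [0.3, 4.2], [1.7, 3.1], [3.9, 0.4]]
labels = [1, 0, 0, 1]

feature = rng.randrange(len(rows[0]))           # random feature choice
column = [r[feature] for r in rows]
threshold = random_cut_point(column, rng)       # random cut-point, no sorting
left, right = split_labels(rows, labels, feature, threshold)
```

In an extremely randomized tree, several such random (feature, threshold) candidates would typically be drawn per node and the best-scoring one kept; the sketch draws only one to keep the example short.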

Existing secure Multi-Party Computation (MPC) protocols for the training of decision trees over distributed data are only capable of handling categorical attributes. This is an enormous restriction on the practicality of their use, as attributes in data sets used in practice are often numerical. The standard “in the clear” algorithm to train decision trees on real-valued data sets requires sorting training examples for each feature at each node to find an optimal cut point, a prohibitively expensive operation in MPC. In this paper we propose an alternative method for securely training tree-based models on data with continuous attributes, namely the secure training of extremely randomized trees (“Extra Trees”). In addition to randomizing feature choices, as is done in random forest training, feature value thresholds are chosen randomly as well, thereby removing the need for sorting. We implement our solution in the semi-honest majority setting with additive secret sharing base...
An Empirical Comparison of Community Detection Techniques for Amazon Dataset
Lecture notes on data engineering and communications technologies, 2023

Proceedings on Privacy Enhancing Technologies, 2022
Most existing Secure Multi-Party Computation (MPC) protocols for privacy-preserving training of decision trees over distributed data assume that the features are categorical. In real-life applications, features are often numerical. The standard “in the clear” algorithm to grow decision trees on data with continuous values requires sorting of training examples for each feature in the quest for an optimal cut-point in the range of feature values in each node. Sorting is an expensive operation in MPC, hence finding secure protocols that avoid such an expensive step is a relevant problem in privacy-preserving machine learning. In this paper we propose three more efficient alternatives for secure training of decision tree based models on data with continuous features, namely: (1) secure discretization of the data, followed by secure training of a decision tree over the discretized data; (2) secure discretization of the data, followed by secure training of a random forest over the discret...
Ensemble Deep Learning Approach with Attention Mechanism for COVID-19 Detection and Prediction
Smart innovation, systems and technologies, Nov 23, 2022
SARWAS: Deep ensemble learning techniques for sentiment based recommendation system
Expert Systems with Applications

Blockchain for IoT Security and Privacy: Challenges, Application Areas and Implementation Issues
Cross-Industry Blockchain Technology: Opportunities and Challenges in Industry 4.0
Blockchain and IoT are among the most exciting technologies in the current world, and combining the two may resolve a lot of issues. In the current scenario, we are using IoT devices in nearly everything. By the end of this era, we can presume that all of our day-to-day devices will be smart. But with this, various issues may arise, such as safety, security, and performance concerns of smart devices. To resolve these issues, blockchain technology has emerged as a very powerful tool. In this chapter, the basics of blockchain, along with its architecture and the algorithms involved, are discussed. IoT challenges and related literature are also discussed, along with blockchain as an efficient technology to resolve these issues. The chapter also covers the challenges in using blockchain in IoT devices.
Community detection algorithms for recommendation systems: techniques and metrics
Computing
& Sciences Publication Pvt. Ltd. Secure of Face Authentication using Visual Cryptography
Abstract: Visual Cryptography is a process of creating shares from an image so that it becomes unreadable to an intruder or unauthenticated person. The performance of a visual cryptography scheme depends on various measures, such as pixel expansion, contrast, security, accuracy, computational complexity, whether the generated shares are meaningful or meaningless, and the type of secret image. This technique encrypts a secret image into shares such that stacking a sufficient number of shares reveals the secret image. This paper implements visual cryptography for color images in a biometric application. The project modules have a strong authentication and robustness scheme. In this project, the face authentication scheme helps in achieving robustness by locating a face in an input image.

Sentiment-based recommendation systems are growing very fast nowadays, as users cannot fully express their opinion on a Likert scale from 1 to 5. Most of the current techniques work on only one of the parameters (ratings or reviews), mainly on reviews. This paper explores a new algorithm, SARWAS, involving both ratings and reviews for the recommendation system. It proposes a deep learning model using a sentiment and rating weighted association score (SARWAS) framework for combining ratings and reviews. We scraped reviews from e-commerce sites and calculated polarity and subjectivity for each review. A neural network model is then applied to calculate the weights and determine a combined score for a product. We evaluated the proposed model in terms of correlation between rating, review, and recommendation. The experiment shows that the proposed method produced satisfactory results in terms of accuracy. The correlation of reviews and recommendations ...
Community Detection Techniques and Metrics: A State-of-the-Art Survey
Futuristic Sustainable Energy and Technology, Jun 6, 2022
A Real-Time Fault Tolerant and Scalable Recommender System Design Based on Kafka
2022 IEEE 7th International conference for Convergence in Technology (I2CT)
International Journal of Scientific Research in Computer Science, Engineering and Information Technology, May 19, 2017
Configuration of firewalls is a task which every System Administrator needs to perform from time to time. Every firewall comes with its own set of default rules, which need to be updated from time to time. A thorough analysis of user behavior in an organization or institution helps the Administrator to understand the needs of the organization much better. This paper aims at understanding the user behavior of a particular network over a period of time and redefining the rules accordingly. This updating of rules helps in efficient utilization of resources in the organization. It also ensures even distribution of the organization's resources, giving equal usage opportunities to all users.
Classification and analysis of data streams are the most promising fields of research and development in data stream mining. The ensemble-based classification approach is one of the most challenging ways of developing an efficient classifier, due to the large number of available base classifiers and the increase in the computational time required for training and classification. This paper emphasizes various factors that affect the accuracy of an ensemble-based classifier.

Intelligent Computing and Communication, 2020
New challenges have emerged in data mining as the traditional techniques have floundered with real-time data streams. The traditional techniques need refurbishing so as to acclimatize to concept-drifting data streams. Thus, dealing with concept changes is the most imperative task of stream data mining. Ensemble classifiers have the ability to automatically adapt to incoming drifts and are, therefore, the most interesting research area in data stream mining. Bagging, boosting, and random forest generation are the common ensemble techniques and are the most popular machine learning approaches in the current scenario for static data (Gomes HM, Bifet A, Read J, Barddal JP, Enembreck F, Pfahringer B, Abdessalem T (2017) Adaptive random forests for evolving data stream classification. Mach Learn 106(9-10):1469-1495, [1]). A large number of base classifiers in an ensemble can cause computational overhead. Data mining classifiers for real-time data streams, therefore, need to be updated constantly and retrained with the labeled instances of newly arrived novel classes to cope with concept drift; otherwise, the mining models will become less and less accurate as time passes. For data streams, however, adaptive random forest algorithms have been widely used for ensemble generation due to their competence in handling different types of drifts. This paper proposes a modified adaptive random forest with a meta-level learner algorithm and a concept-adaptive very fast decision tree to overcome the concept drift problem in real-time data streams. The proposed algorithm is experimentally compared with the state-of-the-art adaptive random forest algorithm on several real and synthetic datasets. Results indicate its efficiency in terms of accuracy and processing time.
A Survey on Classification Algorithm for Real Time Data Streams using Ensembled Approach
Classification and analysis of data streams are the most promising fields of research and development in data stream mining. The ensemble-based classification approach is one of the most challenging ways of developing an efficient classifier, due to the large number of available base classifiers and the increase in the computational time required for training and classification. This research emphasizes developing an efficient ensemble-based classification algorithm for data streams.

International journal of engineering research and technology, 2018
Medical Data Mining is the process of extracting hidden patterns from medical data. This paper develops an Android application using IF-THEN rules extracted from a Decision Tree (a classification technique), which is constructed using classification algorithms such as C4.5. Once the tree is constructed, the production rules are obtained from it. In order to improve coverage and accuracy, rules from one or more trees are used to develop a predictive model on Android. The app gives the user an easy tool that can predict diabetes, so that the user can maintain his/her diet and take up regular activities accordingly; it also provides some emotional support to users. The complete rule set extracted from the decision trees was manually checked, and conflicts, which arise when rules with the same antecedent lead to different consequents (class labels), were resolved. Keywords: Diabetes; Decision Tree; Classification Rules; C4.5 algorithm.

A mobile ad hoc network (MANET) consists of a network area with nodes. In a mobile ad hoc network, each node has to rely on others to relay its data packets. Since most mobile nodes are typically constrained by power and computing resources, some nodes may choose not to cooperate, refusing to relay while still using the network to forward their own packets. Most previous works focus on data forwarding. However, dropping control packets is a better strategy for selfish nodes to avoid being asked to forward data packets, and hence to conserve resources for their own use. In this paper, we present a new system to detect those selfish nodes and simulate the results using the NS2 tool. Each node is expected to contribute to the network on a continual basis within a time frame. Those which fail will undergo a test for suspicious behaviour. In this paper we only present the review and the proposed system for selfish node detection using the NS2 tool. Currently ...

Single level frequent pattern Finding an Efficient Approach for Generating Frequent Patterns in Large Database
Data mining is the process of finding interesting patterns in different data sets, and is used in market basket analysis, cross marketing, and fraud detection. Data collection and storage technology has made it possible for organizations to accumulate huge amounts of data at lower cost. Exploiting this stored data in order to extract useful and actionable information is the overall goal of the generic activity termed data mining. Data mining functionalities are used to specify the kind of patterns to be found in data mining tasks. Frequent pattern mining is extensively used in market basket analysis. Each record consists of the transactions of an individual customer for different products at different levels. There is a need for a monitoring and effective suggestion mechanism that lets merchants develop a strategy for attracting customers to their shop. The proposed methodology uses the FP (Frequent Pattern) growth algorithm for customers' behavior which describe a hierarchical schem...

The key to the recommendation system is to predict user performance. Day by day we see huge growth in e-commerce, and this continuous growth in the e-commerce field gave birth to the recommendation system. Recommendation is done by different methods, such as user similarity, content-based methods, collaborative filtering, hybrid models, and many more, with various algorithms and accuracy. This essay critically examines how various algorithms perform and which of them achieves the best accuracy. With the rise of recommendation systems, there are various types of recommendation available in the market nowadays, and each recommendation system works on distinct aspects such as the interest of users, history of users, location of users, and many more. In this process, we will generally discuss the movie recommendation system. The movie recommendation engine will recommend movies to the users on the basis of the...