
Design and Analysis of Prediction Model Using Machine Learning In Agriculture

International Journal of Innovative Research in Computer Science & Technology

https://doi.org/10.55524/IJIRCST.2022.10.3.14

Abstract

Worldwide population growth and climate change demand that agricultural production be increased. Traditional study findings are difficult to extend to all conceivable fields, since they depend on particular soil types, climatic circumstances, and background management combinations that are not appropriate or transferable to all farms. There is no way to evaluate the effect of the endless cropping system interactions (including many management practices) on crop production across the world. Using massive databases and artificial intelligence, we demonstrate that dynamic interactions, which cannot be examined in repeated trials, are linked with considerable crop output variability and therefore with the possibility of big yield gains. Our method can help to speed up agricultural research, discover sustainable methods, and meet future food demands. This paper attempts crop yield prediction using machine learning techniques with historic ...

Key takeaways

  1. Machine learning enhances crop yield prediction by analyzing complex interactions in agriculture.
  2. The study utilizes historical datasets from 1997 to 2015 for predictive modeling across India.
  3. Random Forest outperforms other algorithms like Multiple Linear Regression for agricultural predictions.
  4. Supervised machine learning techniques inform better decision-making in farming practices.
  5. The research aims to address food demands amid population growth and climate challenges.
International Journal of Innovative Research in Computer Science & Technology (IJIRCST)
ISSN: 2347-5552, Volume-10, Issue-3, May 2022
https://doi.org/10.55524/ijircst.2022.10.3.14
Article ID IRPV1039, Pages 72-75
www.ijircst.org

Design and Analysis of Prediction Model Using Machine Learning In Agriculture

Diksha Gupta1, Dr. Yojna Arora2, and Dr. Aarti Chugh3
1 Student, Amity School of Engineering and Technology, Gurugram, India
2,3 Associate Professor, Amity School of Engineering and Technology, Gurugram, India

Correspondence should be addressed to Diksha Gupta; [email protected]

Copyright © 2022 Diksha Gupta et al. This is an open-access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT- Worldwide population growth and climate change demand that agricultural production be increased. Traditional study findings are difficult to extend to all conceivable fields, since they depend on particular soil types, climatic circumstances, and background management combinations that are not appropriate or transferable to all farms. There is no way to evaluate the effect of the endless cropping system interactions (including many management practices) on crop production across the world. Using massive databases and artificial intelligence, we demonstrate that dynamic interactions, which cannot be examined in repeated trials, are linked with considerable crop output variability and therefore with the possibility of big yield gains. Our method can help to speed up agricultural research, discover sustainable methods, and meet future food demands. This paper attempts crop yield prediction using machine learning techniques with historic crop production data. For this, data has been collected from data.gov.in and data.world.

KEYWORDS- Machine learning, Big Data Analysis, Forecasting, Artificial Intelligence, Algorithms, Prediction and Analysis.

I. INTRODUCTION

The modern agricultural market has huge potential in a nation like India. Farmers in India play a major role in feeding the growing population of the country. This makes crop analysis and prediction as important as crop production. Farmers can use crop yield prediction data to make their decisions about crops. Agricultural yield prediction is one of the major challenges in machine learning. Many factors affect crop yield, such as crop genotype, environmental factors like soil conditions, and farmers' efforts such as proper irrigation and timely plantation. This forecasting relies specifically on climate features to predict crop yield. Based on the considered datasets, we have taken into consideration the climate features specific to the agricultural seasons in India, such as Rabi, Kharif and whole year, and have then tried to make crop yield predictions for the upcoming year.

The meteorological department's data sets of temperature, humidity, rainfall, and soil are analysed using big data analytics techniques. This type of study is carried out with the help of certain software tools, many of which are open source. The system will have information thanks to these tools and processes, and it will be able to make better decisions because of this processed information. As a result, better outcomes can be expected.

Farmers usually predict the eventual crop yield based on their previous experience with a specific crop, but such predictions are often inaccurate and ineffective. It is critical to adopt contemporary farming methods employing technology rather than traditional farming methods in order to meet the food needs of the entire population of the country and to export some agricultural goods to other countries. Modern farming practices enable farmers to plant crops in small areas with minimal water, fertilisers, and pesticides, resulting in a high yield and profit for the farmers.

II. LITERATURE REVIEW

Ashwani Kumar Kushwaha [2] outlines crop yield prediction methods and suggests a suitable crop to boost the farmer's profit and the agriculture sector's quality. That study uses the Hadoop platform and an agro algorithm to acquire large volumes of data, also known as big data (soil and meteorological data), for crop yield prediction. From the repository data, crop suitability for certain conditions may be predicted, and crop quality can be improved.

Random forests for global and regional crop yield prediction are discussed in [4], published in PLoS ONE. The reported results suggest that, because of its high accuracy and precision, ease of use, and value in data analysis, RF is a viable and flexible machine-learning method for agricultural production projections at regional and global scales. The most efficient technique is Random Forest, which outperforms multiple linear regression (MLR).

Rahul Katarya and Ashutosh Raturi [5] provide various methodologies for crop prediction for the states of Uttar Pradesh and Karnataka. These employ models such as Naive Bayes, Random Forest, KNN, and others. Cross validation, accuracy, RMSE, precision, and recall are some of the measures used to evaluate performance on the data. The model with the best performance is chosen and then used to classify and recommend crops.

Another study investigates the use of machine learning in the production of wheat crops. The method employs digital image processing techniques to extract features for crop maturity and classifies the stage of growth using a supervised Machine Learning (ML) technique. Data mining techniques [5] such as K-Means Clustering, KNN, SVM, and Bayesian network algorithms were used to estimate agricultural yield with great accuracy.

Crop yield forecasts based on climate parameters using supervised Machine Learning (ML) were presented in a paper at the International Conference on Computer Communication and Informatics (ICCCI). Crop Advisor, a user-friendly online portal for estimating the impact of climatic conditions on crop yield, was developed as part of that project [6]. The C4.5 method is used to determine the most influential climatic parameter on agricultural production in selected districts of Madhya Pradesh. The work in [1] is implemented using a Decision Tree.

III. PROPOSED SYSTEM

By studying the historical data of the farming area, the proposed method tries to predict or forecast crop yield. The system uses machine learning techniques to develop a predictive model by taking into consideration many elements such as soil conditions, rainfall, temperature, and yield. We use a number of different machine learning techniques here, including Random Forest, Linear Regression, and Decision Tree. The prediction accuracy is used to assess performance.

Using various data inputs, we create, design, and implement a learning model. Using machine learning techniques, the system learns the characteristics of the data and predicts the crop production from it.

Figure 1: Block Diagram of a proposed model

IV. METHODOLOGY AND IMPLEMENTATION

A. Dataset

To perform our predictions, two kinds of data were required: historic data related to crop yield and data related to climatic conditions. For crop yield production data, we used a dataset that had crop data for all districts of all states of India, for all crops grown and their seasons, from the year 1997 to 2015. The dataset was gathered from data.gov.in [7]. This dataset can be viewed using the dashboard that we created for demonstration purposes. Our main concern was then to make use of this available data to make the predictions. The following attributes must be included in this dataset, and they are used for crop prediction:

i) State Name
ii) District Name
iii) Crop Year
iv) Season
v) Crop
vi) Area
vii) Production

We have state-wise data with district names, and with this data we can analyse the production of crops depending on area and season.

Figure 2: Screenshot of Dataset

B. Data Pre-processing

After the data has been gathered from a variety of sources, the dataset must be pre-processed before training the model. Data pre-processing can be divided into several steps, starting with reading the acquired dataset and progressing to data cleaning. The datasets contain certain redundant attributes that are not considered for crop prediction. So, in order to improve accuracy, during data cleaning we must eliminate unnecessary attributes and either remove records with missing values or fill in the unwanted NaN values. Then we decide on the model's goal. After data cleaning, the dataset is separated into training and test sets using the sklearn library.

Figure 3: Screenshot of Data Pre-processing of UP Model

For the implementation, the entire code is in Python and makes use of the following packages: numpy, pandas, pickle.

C. Prediction Algorithm Using Machine Learning

Machine learning predictive algorithms require highly efficient estimation based on the data they were previously trained on. Data, statistical algorithms, and machine learning techniques are used in predictive analytics to determine the likelihood of future outcomes based on historical data. The purpose is to provide the best judgement of what will happen in the future, rather than only knowing what has happened. We employed a supervised machine learning technique, with classification and regression as subcategories, in our system. Our system will benefit from a classification method.

Figure 4: Screenshot of Accuracy of UP Model

V. ALGORITHMS

A. Linear Regression

Linear Regression is a supervised Machine Learning technique that observes continuous features and predicts a result. It is called simple linear regression or multiple linear regression depending on whether it operates on a single variable or on many features. It is one of the most common Machine Learning (ML) methods in Python, although it is often overlooked. It fits a line a*X + b to anticipate the output by assigning optimal weights to the variables [3]. We frequently use linear regression to estimate actual values based on continuous variables, such as the number of calls and housing costs. The regression line is the best-fitting line Y = a*X + b that denotes the link between the independent and dependent variables.

B. Decision Tree

Decision trees employ a greedy strategy: the attribute chosen in the first phase cannot be used subsequently to improve data classification. If a Decision Tree is employed in the following phases, it may overfit the training dataset, resulting in unsatisfactory outcomes. To address this flaw, an ensemble model is used, and ensemble models produce promising outcomes.

C. Random Forest

An ensemble of decision trees is known as a random forest. To categorise each new object based on its attributes, every tree gives a classification, that is, the trees vote for a class. The classification with the most votes in the forest wins. Random forests, also known as random decision forests, are an ensemble learning method for classification, regression, and other tasks that works by training a large number of decision trees and then outputting the class that is the mode of the classes (classification) or the mean prediction (regression) of the individual trees.

VI. RESULT AND DISCUSSION

To evaluate the performance of the algorithms that we have implemented, we checked the accuracy of our predictions; the accuracy for various states is shown in the table below. This indicates that our implementation gives the prediction in terms of positive or negative direction, i.e., whether, for the chosen parameters, the yield would increase or decrease. Comparing the two algorithms, Random Forest and Linear Regression, Random Forest gave higher accuracy in six of the seven states; Tamil Nadu is the exception.

Table 1: Accuracy Using Random Forest and Linear Regression

S.No.  State Name      Random Forest (Accuracy)  Linear Regression (Accuracy)
1      Uttar Pradesh   97.38                     77.46
2      Rajasthan       88.54                     65.34
3      Punjab          98.08                     97.21
4      Maharashtra     81.20                     69.22
5      Madhya Pradesh  87.23                     77.81
6      West Bengal     92.63                     92.26
7      Tamil Nadu      47.74                     85.97

VII. CONCLUSION

We conclude that accurate yield, rainfall, and soil nutrient prediction systems are getting closer. We can forecast with excellent accuracy using ensemble learning algorithms. For greater performance, we can apply big data analysis and mining techniques for large-scale forecasts. The data is one of the most important components here; we must analyse it by crop, season, and productivity. It is also necessary to educate farmers about such procedures. Other factors to consider for future study are on-farm fertiliser consumption and the terrain of the area. A smartphone application that notifies farmers via text message when it is time to seed and harvest could also be developed. It is necessary to make the technology accessible to everyone.

CONFLICTS OF INTEREST

The authors declare that they have no conflicts of interest.
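The workflow described in Sections IV and V (clean the data, split it with sklearn, fit Random Forest and Linear Regression, compare them) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data is synthetic, the column names (`Crop_Year`, `Season`, `Area`, `Production`, `Notes`) are hypothetical, and the paper does not say how its accuracy figures were computed, so the sketch reports R^2, which is what scikit-learn's `score` returns for regressors.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for one state's records; none of these values come from the paper.
rng = np.random.default_rng(42)
n = 400
data = pd.DataFrame({
    "Crop_Year": rng.integers(1997, 2016, size=n),
    "Season": rng.choice(["Kharif", "Rabi", "Whole Year"], size=n),
    "Area": rng.uniform(10.0, 1000.0, size=n),
    "Notes": "unused",  # a redundant attribute, dropped during cleaning
})
season_factor = data["Season"].map({"Kharif": 2.0, "Rabi": 3.0, "Whole Year": 5.0})
data["Production"] = data["Area"] * season_factor + rng.normal(0.0, 20.0, size=n)
data.loc[data.index % 50 == 0, "Production"] = np.nan  # simulate missing values

# Pre-processing: drop the redundant column and incomplete rows, one-hot encode Season.
clean = data.drop(columns=["Notes"]).dropna()
X = pd.get_dummies(clean[["Crop_Year", "Season", "Area"]], columns=["Season"], dtype=float)
y = clean["Production"]

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "Linear Regression": LinearRegression(),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # R^2 on the held-out rows
    print(f"{name}: R^2 = {scores[name]:.3f}")
```

In this synthetic setup production depends on the product of area and a season factor, an interaction that a purely additive linear model cannot represent exactly, which is one reason a tree ensemble can score higher on such data.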

References (7)

  1. B M Sagar, NK Cauvery, P Abbi, N Vismita, B Pranava, Pranav A Bhat. "Chapter 105 Analysis and Prediction of Cotton Yield with Fertilizer Recommendation Using Gradient Algorithm", Springer Science and Business Media, 2022
  2. Ashwani kumar Kushwaha, Swetabhattachrya, "Crop Prediction using Machine Learning", International Journal of Engineering Research & Technology (IJERT) ISSN: 2278-0181, 08 August-2020.
  3. Jeevan Kumar, Rajesh Kumar Tiwari, Vijay Pandey. "Diabetes prediction using machine learning tools", 2021 4th International Conference on Recent Trends in Computer Science and Technology (ICRTCST), 2022.
  4. Jig Han Jeong, Jonathan P. Resop, Nathaniel D. Mueller, David H. Fleisher et al. "Random Forests for Global and Regional Crop Yield Predictions", PLOS ONE, 2016.
  5. Rahul Katarya, Ashutosh Raturi, Abhinav Mehndiratta, Abhinav Thapper, "Impact of Machine Learning Techniques in Precision Agriculture", 3rd International Conference on Emerging Technologies in Computer Engineering: Machine Learning and Internet of Things (ICETCE-2020), 07-08 February 2020.
  6. Pragathi Tummala, M Sobhana, Sruthi Kakumani. "Predicting crop yield with NDVI and Backscatter Networks", 2022 International Mobile and Embedded Technology Conference (MECON), 2022.
  7. Data.gov.in, https://data.gov.in.

FAQs


What factors significantly affect crop yield predictions in machine learning?

The paper names climate features such as temperature, humidity, and rainfall, together with soil conditions, as key inputs to crop yield prediction, alongside crop genotype and farming practices such as irrigation and timely plantation. Its models use climate features specific to the Indian agricultural seasons (Rabi, Kharif, and whole year).

Which machine learning techniques yield the highest accuracy for crop yield prediction?

The study reports that Random Forest generally outperforms Linear Regression. In its Table 1, Random Forest is more accurate in six of the seven states listed; Tamil Nadu is the exception (47.74 for Random Forest against 85.97 for Linear Regression).

What datasets are essential for accurate agricultural yield forecasting?

The proposed system utilizes a comprehensive dataset comprising historic crop yield data and climate conditions from 1997 to 2015 across all Indian districts. This data includes attributes such as state name, district name, crop year, season, crop type, area, and production.
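A minimal sketch of how a table with these seven attributes might be loaded and summarised with pandas. The column headers (`State_Name`, `District_Name`, and so on) and the three sample rows are invented for illustration; the actual data.gov.in file may use different headers and will contain far more rows.

```python
import io

import pandas as pd

# Hypothetical sample in the seven-attribute layout; the values are made up.
CSV_TEXT = """State_Name,District_Name,Crop_Year,Season,Crop,Area,Production
Uttar Pradesh,Agra,2000,Rabi,Wheat,100.0,300.0
Uttar Pradesh,Agra,2000,Kharif,Rice,50.0,
Punjab,Amritsar,2001,Rabi,Wheat,200.0,800.0
"""


def load_crop_data(csv_source):
    """Read the table, tidy the Season labels, and drop rows with no Production."""
    df = pd.read_csv(csv_source)
    df["Season"] = df["Season"].str.strip()
    df = df.dropna(subset=["Production"]).copy()
    # Yield per unit area is a derived column, not one of the source attributes.
    df["Yield"] = df["Production"] / df["Area"]
    return df


crops = load_crop_data(io.StringIO(CSV_TEXT))

# Total production per state and season, one way to analyse output by season.
by_state_season = crops.groupby(["State_Name", "Season"])["Production"].sum()
print(by_state_season)
```

Passing a file path instead of the `io.StringIO` object to `load_crop_data` would read a downloaded CSV in the same way.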

How does ensemble modeling enhance prediction outcomes in crop yield forecasts?

Ensemble modeling, particularly through the Random Forest approach, mitigates the overfitting issues commonly associated with single decision trees. This method combines the predictions of multiple trees to improve classification accuracy and robustness.
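The combining step can be made concrete with a short sketch. For regression, scikit-learn's `RandomForestRegressor` returns the mean of its individual trees' predictions, which the code below checks by hand. The data is synthetic and the two input columns are only loosely labelled as rainfall and area; nothing here reproduces the paper's experiments.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data (not the paper's): the target depends non-linearly on two inputs.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 2))  # e.g. scaled rainfall and area
y = 10.0 * X[:, 0] ** 2 + 3.0 * X[:, 1] + rng.normal(0.0, 0.1, size=200)

forest = RandomForestRegressor(n_estimators=25, random_state=0).fit(X, y)

X_new = rng.uniform(0.0, 1.0, size=(5, 2))

# Each fitted tree makes its own prediction for every new sample ...
per_tree = np.stack([tree.predict(X_new) for tree in forest.estimators_])

# ... and the forest's regression output is the mean over the trees.
manual_mean = per_tree.mean(axis=0)
forest_pred = forest.predict(X_new)

print(np.allclose(manual_mean, forest_pred))
```

Averaging many trees, each grown on a different bootstrap sample, smooths out the idiosyncrasies of any single tree, which is the mechanism behind the reduced overfitting.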

What innovative software tools facilitate data analysis for crop yield predictions?

The paper's own implementation is written in Python and uses the numpy, pandas, and pickle packages, with the sklearn library for splitting the data into training and test sets. Hadoop is mentioned only in connection with the related work cited as [2], which uses it to gather large volumes of soil and meteorological data.
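The paper lists pickle among its packages without saying what it is used for. One common use, assumed here, is saving a fitted model so that it can be reloaded without retraining. The sketch below fits the line Y = a*X + b by ordinary least squares on made-up points and round-trips the result through pickle; `fit_line` and `predict` are hypothetical helper names, and only the standard library is used.

```python
import pickle


def fit_line(xs, ys):
    """Ordinary least squares for one input variable, in closed form."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    return {"a": a, "b": mean_y - a * mean_x}


def predict(model, x):
    """Evaluate the fitted line Y = a*X + b at x."""
    return model["a"] * x + model["b"]


# Made-up points lying exactly on Y = 2*X + 1.
model = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])

blob = pickle.dumps(model)     # serialise to bytes; pickle.dump writes to a file instead
restored = pickle.loads(blob)  # later: load the model without refitting

print(predict(restored, 5.0))
```

Fitted scikit-learn estimators can be passed to `pickle.dumps` in the same way, although pickles should only be loaded from trusted sources.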
