Detection and Classification of Caterpillar using Image Processing with K-Nearest Neighbor Classification Technique

1 Dr. Mukta Jagdish, 2 Andres Medina Guzman, 3 Gerber F. Incacari Sancho, 4 Aura Guerrero Luzuriaga
1 Lecturer, Jawaharlal Nehru Rajkeeya Mahavidyalaya, Government College, Port Blair, affiliated to Pondicherry Central University, South Andaman, Andaman and Nicobar Islands, India
2 Universidad de la Costa, Barranquilla, Colombia
3 Universidad Nacional del Callao, Lima, Perú
4 Universidad Católica de Cuenca, Ecuador
mukta.jagdish13@gmail.com, amedina10@cuc.edu.co, gfincacaris@unac.edu.pe, ing.auragl@gmail.com


Introduction
Caterpillars are the larval stage of members of the order Lepidoptera (butterflies and moths). Most caterpillars are herbivorous, but about 1% are insectivorous. They have soft bodies that can grow rapidly between molts [1]. Their size varies between species from 1 mm up to 14 cm. Caterpillars have defenses against physical conditions such as cold, hot, or dry environments. The appearance of a caterpillar can often deter a predator: markings on certain body parts can make it look poisonous or larger than it is, which helps to ward off threats [2]-[3]. Some caterpillars are poisonous or distasteful and advertise this through bright, aposematic coloring, while others mimic dangerous caterpillars. Many caterpillars are cryptically colored and resemble the plants on which they feed, which protects them from predators [4]. In some species this camouflage varies with the season: caterpillars that hatch in the spring appear green, and those that hatch in the summer appear dark-colored. The difference in color is due to the tannin content of the diet. Some caterpillars have spines or growths that resemble plant parts such as thorns, and a few look like objects in the environment such as bird droppings [5]. Some species cover themselves in plant parts, while others construct and live in a bag covered in sand, pebbles, or plant material. Some caterpillars have spiny bristles or long fine hairs that irritate by lodging in the skin or mucous membranes, whereas others acquire toxins from their host plants [6]. The defensive urticating hairs of many caterpillars are associated with venom glands. In some species the venom is powerful enough to cause a human to hemorrhage to death. Caterpillars are preyed upon by birds such as the European pied flycatcher and by paper wasps [7]. Many caterpillars feed in protected environments, such as inside silk galleries or rolled leaves, or by mining between the surfaces of leaves.
Caterpillars are also called eating machines because they eat leaves voraciously [8]. They grow very quickly, shedding their skin 4 to 5 times as they grow, and their weight increases very fast. Here we specifically discuss three caterpillars: the Buck Moth Caterpillar, the Saddleback Caterpillar, and the Io moth caterpillar [9]-[10].
1.1 Buck Moth Caterpillar: Buck moth caterpillars belong to the genus Hemileuca in the family Saturniidae. They are mostly found in the oak forests of the United States, from peninsular Florida to New England and west to Texas and Kansas. The buck moth was described by Dru Drury. The larvae emerge in spring and are covered with hollow spines that are attached to poison sacs. Contact with these spines causes symptoms ranging from itching and a burning sensation to nausea. The larvae feed on live oak, white oak, blackjack oak, scrub oak, and many other plants. The moths lay their eggs in spiral clusters on oak twigs. At the end of July, mature larvae move into the leaf litter to pass through their pupal stage. The mating month of these insects is February. The caterpillars are infamous for stinging people.
1.2 Saddleback Caterpillar: These are the larvae of a moth species in the genus Acharia, family Limacodidae. They are native to eastern North America and are also reported from Mexico, including Yucatán. The caterpillars are leaf green, with brown coloring and fleshy horns at both ends and a prominent white-ringed brown spot in the center that looks like a riding saddle. The other parts of the body are covered with urticating hairs that secrete an irritating venom when touched, which protects the caterpillar from predators. This venom causes a swollen and painful rash on the skin and even nausea. In some cases it leads to migraine, asthma, gastrointestinal symptoms, and sometimes hemorrhaging. The larvae feed on a variety of plants, including ornamental palms such as the Manila palm. The moths mate from January to February. The female lays her eggs about 3 days after mating, on the underside of a leaf, in clusters of 30-50 eggs. The eggs are 1-2 mm in size. Over its life cycle the larva passes through several stages: the first instars, the middle instars, and the late instars. The adults emerge in June and July.
1.3 Io moth caterpillar: Io moth caterpillars belong to the genus Automeris in the family Saturniidae. These insects are mostly found in the United States, in North Dakota, South Dakota, New Mexico, Montana, and southern Florida. Some are also found in southern Manitoba, Ontario, and Quebec in Canada. The adult moths are famous for their beauty and have a wingspan of 2.5 to 3.5 inches. Males have yellow forewings, body, and legs, whereas females have reddish forewings, body, and legs. They have feathery antennae and a bluish eyespot with a white mark at its center on each hindwing, which serves as a defense mechanism against predators. The moths lay their eggs on leaves in clusters of about 20. As the eggs develop they turn black, and when the orange larva hatches it eats its eggshell. In the later stages the larva turns from orange to green, with urticating hairs and spines all over the body for protection. The larvae develop through 5 instars. The spines on the body of the caterpillar release a very painful venom that causes rashes and swelling of the skin. It can also lead to nausea, migraine, and gastrointestinal problems. When the caterpillars are ready to pupate, they spin a flimsy, soft cocoon of dark, coarse silk among leaves. They depend on plant leaves for food and nutrients and eat them voraciously during their growth period. The adult moths of this species are strictly nocturnal.

Methodology
This research applies classification methods based on image processing and machine learning to a dataset of 100 caterpillar images. The method is divided into the following stages: preprocessing, segmentation, feature extraction, and classification with a K-Nearest Neighbor classifier. The data were collected from multiple sources during the investigation. In this research, we discuss the classification of different types of caterpillars. Nowadays, various image processing techniques are used for the detailed study of different species. This research detects and classifies specific categories of caterpillar using image processing with the K-Nearest Neighbor classification technique.

Preprocessing
Acquiring caterpillar images that are suitable for classification is a challenging task. Depending on how a picture is taken, the image may be degraded by noise, so it is important to remove noise from the image background during the research process in order to extract detailed features from the images. In this research, image preprocessing plays an important role in removing unwanted background noise to achieve better results. Preprocessing produces a noise-free image through digital filtering. In this work, a median filter is used during preprocessing.
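The median filtering step can be illustrated with a minimal sketch. The window size (3x3), the border handling (edge replication), and the helper name `median_filter` are assumptions made for illustration; the paper does not state the filter parameters it used.

```python
from statistics import median


def median_filter(image, size=3):
    """Apply a size x size median filter to a 2-D grayscale image (list of rows).

    Assumption: border pixels are handled by replicating the nearest edge pixel.
    """
    h, w = len(image), len(image[0])
    r = size // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = []
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp row index to the image
                    xx = min(max(x + dx, 0), w - 1)  # clamp column index to the image
                    window.append(image[yy][xx])
            out[y][x] = median(window)
    return out


# Toy image: uniform background with two isolated "salt and pepper" pixels.
noisy = [
    [10, 10, 10, 10, 10],
    [10, 10, 255, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 0, 10, 10, 10],
    [10, 10, 10, 10, 10],
]
clean = median_filter(noisy)
```

Because each 3x3 window in this toy image contains at most two outliers among nine values, the median replaces both isolated noise pixels with the background value while leaving the rest unchanged.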

Segmentation
In this research, region-based image segmentation is used for caterpillar classification. Image segmentation partitions the image into multiple segments, or groups of pixels, according to some criterion. Its purpose is to give the image a more meaningful representation that is easier to understand and analyze. The segments produced by the process collectively cover the entire caterpillar image. In this work, thresholding is used as the segmentation method: the caterpillar region is extracted based on threshold values.
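A thresholding-based segmentation can be sketched as follows. The paper does not say how its threshold values were chosen, so this example assumes Otsu's method (a standard automatic choice) and assumes that the caterpillar is brighter than the background; `otsu_threshold` and `threshold_mask` are hypothetical helper names.

```python
def otsu_threshold(image, levels=256):
    """Return the gray level that maximizes the between-class variance (Otsu's method)."""
    hist = [0] * levels
    for row in image:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of pixel values <= t
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def threshold_mask(image, t):
    """Binary mask: 1 where the pixel is above the threshold (assumed foreground)."""
    return [[1 if v > t else 0 for v in row] for row in image]


# Toy image: dark background (20, 30) with a bright 2x2 object (200, 210).
image = [
    [20, 30, 20, 30],
    [20, 200, 210, 30],
    [30, 210, 200, 20],
    [20, 30, 20, 30],
]
t = otsu_threshold(image)
mask = threshold_mask(image, t)
```

If the object were darker than the background, the comparison in `threshold_mask` would need to be inverted.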

Feature Extraction
The features extracted for caterpillar classification are color-based analysis (RGB colors), mean, standard deviation, ellipticity, entropy, skewness, intensity, and correlation coefficient, combined with wavelet analysis using symlet1, symlet2, symlet3, symlet4, and symlet5. The symlet analysis incurs a high computational overhead.
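Some of the statistical features listed above can be computed as in the following sketch. It covers only the mean, standard deviation, skewness, entropy, and mean RGB color; ellipticity, the correlation coefficient, and the symlet wavelet analysis are not shown. The formulas used here (population standard deviation, Shannon entropy in bits over the intensity histogram) and the function names are assumptions, since the paper does not give its exact definitions.

```python
import math
from collections import Counter


def intensity_features(pixels):
    """Compute simple statistics over a flat list of grayscale pixel values."""
    n = len(pixels)
    mean = sum(pixels) / n
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / n)  # population std
    # Skewness: third standardized moment (defined as 0 for a constant image).
    skew = 0.0 if std == 0 else sum((p - mean) ** 3 for p in pixels) / (n * std ** 3)
    # Shannon entropy (bits) of the empirical intensity distribution.
    counts = Counter(pixels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {"mean": mean, "std": std, "skewness": skew, "entropy": entropy}


def mean_rgb(rgb_pixels):
    """Average each color channel over a flat list of (R, G, B) tuples."""
    n = len(rgb_pixels)
    return tuple(sum(p[c] for p in rgb_pixels) / n for c in range(3))


features = intensity_features([0, 0, 255, 255])
color = mean_rgb([(255, 0, 0), (0, 0, 0)])
```

In a full pipeline these values would be computed over the segmented caterpillar region and concatenated into one feature vector per image.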

Classification
In this research, the k-Nearest Neighbor classifier is investigated with three classes, Buck Moth Caterpillar, Saddleback Caterpillar, and Io moth Caterpillar, and the accuracy is reported for each class. k-Nearest Neighbor is a classification method that predicts the class of a target instance from its nearest neighbors: it calculates the distance between the instance to be classified and the training instances, and assigns the class chosen by the majority vote of the nearest neighbors. The method takes three inputs: the value of k, the training set, and the test instance. The dataset consists of 100 caterpillar images in the three classes Buck Moth Caterpillar, Saddleback Caterpillar, and Io moth Caterpillar. In this investigation, 70% of the data is used for training and 30% for testing, with k=4.
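The classification step can be sketched as a small k-Nearest Neighbor implementation with a 70/30 split and k=4. The Euclidean distance, the tie-breaking rule (needed because an even k can produce tied votes), the random seed, and the synthetic 2-D feature vectors are all assumptions for illustration; they are not the paper's data or its exact procedure.

```python
import math
import random
from collections import Counter


def split_dataset(X, y, train_fraction=0.7, seed=0):
    """Shuffle the samples and split them into train and test parts."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = round(train_fraction * len(X))
    train, test = idx[:cut], idx[cut:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])


def knn_predict(train_X, train_y, x, k=4):
    """Predict the label of x by majority vote among its k nearest training samples."""
    neighbors = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )[:k]
    votes = Counter(label for _, label in neighbors)
    top = max(votes.values())
    # Tie-break (assumption): among the most-voted labels, take the closest neighbor's.
    for _, label in neighbors:
        if votes[label] == top:
            return label


# Synthetic, made-up feature vectors: three well-separated clusters, 20 per class.
labels = ["Buck Moth", "Saddleback", "Io moth"]
centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
rng = random.Random(1)
X, y = [], []
for center, label in zip(centers, labels):
    for _ in range(20):
        X.append((center[0] + rng.uniform(-1, 1), center[1] + rng.uniform(-1, 1)))
        y.append(label)

train_X, train_y, test_X, test_y = split_dataset(X, y)
predictions = [knn_predict(train_X, train_y, x, k=4) for x in test_X]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"toy accuracy: {accuracy:.2f}")
```

Real feature vectors usually mix scales (for example RGB means and entropy), so normalizing each feature before computing distances is a common design choice with k-NN.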

Result and Discussion
In this investigation, an advanced study of caterpillar classification using image processing and machine learning methods was carried out on a dataset of 100 caterpillar images. The method is broadly divided into the following stages: preprocessing, segmentation, feature extraction, and classification with a K-Nearest Neighbor classifier, with 70% of the data used for training and 30% for testing and k=4. The features extracted are color-based analysis (RGB colors), mean (M_Value), standard deviation (SD_Value), ellipticity (E1), entropy (E2), skewness (S1), intensity (I1), and correlation coefficient (CC), combined with wavelet analysis using symlet1, symlet2, symlet3, symlet4, and symlet5. Each result reports the classification accuracy for the three classes: Buck Moth Caterpillar, Saddleback Caterpillar, and Io moth Caterpillar.
Figure 4 shows the detailed classification result using the K-NN classifier with symlet1 analysis. Tables 2, 3, 4, and 5 present the classification results using the K-NN classifier with symlet2, symlet3, symlet4, and symlet5 analysis, respectively. Table 6 presents the classification results using the K-NN classifier with SYMLET analysis. The results show that symlet5 analysis works well for classifying the caterpillar categories, with an accuracy of 96% using the K-Nearest Neighbor classifier.
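Per-class and overall accuracy, as reported in the results, can be computed from true and predicted labels as in the sketch below. The label lists are made-up placeholders to show the calculation, not the paper's results, and the function names are hypothetical.

```python
from collections import defaultdict


def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly classified samples for each true class."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return {label: correct[label] / total[label] for label in total}


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Placeholder labels (not the paper's data).
y_true = ["Buck Moth", "Buck Moth", "Io moth", "Io moth", "Saddleback"]
y_pred = ["Buck Moth", "Io moth", "Io moth", "Io moth", "Saddleback"]
by_class = per_class_accuracy(y_true, y_pred)
overall = overall_accuracy(y_true, y_pred)
```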

Conclusion
In this investigation, a caterpillar detection and classification system using image processing and machine learning methods was developed. This research helps to characterize caterpillar images in three classes, Buck Moth Caterpillar, Saddleback Caterpillar, and Io moth Caterpillar, with accuracy reported for each class. The stages considered are preprocessing, segmentation, feature extraction, and classification using a K-Nearest Neighbor classifier. The results show that SYMLET5 analysis works well for caterpillar classification, with an accuracy of 96% using the K-Nearest Neighbor classifier, compared with the other measures considered during the investigation and analysis.