Gridding and Segmentation Method for DNA Microarray Images

Abstract: This article explores gridding and segmentation techniques for microarray image analysis. Gridding refers to dividing an image into sub-grids of spots and then dividing each sub-grid into individual spot regions (spot detection). Most existing methods depend on input parameters such as the number of rows and columns, the number of spots in each row and column, and the size of the sub-arrays. This article proposes a fully automatic gridding algorithm that requires no initial parameters and no manual intervention. In the segmentation step, clustering algorithms are used because they do not depend on the size and shape of the spots or on the initial state of the pixels, and they do not require post-processing. A new method is also proposed to estimate the initial parameters (centroids and number of clusters) required by any clustering algorithm. Qualitative and quantitative analysis shows that the algorithm grids microarray images well and improves the performance of the clustering algorithms.


INTRODUCTION
Microarray technology is used to analyze the gene expression values of thousands of spots in parallel [1]. The output of a microarray experiment is an image extracted from the hybridized microarray slide using a sensor with two different wavelengths, Cy3 and Cy5. The analysis of the microarray image is carried out in three stages: gridding, segmentation and quantification. Gridding is itself divided into two stages. First, the image is divided into sub-arrays, called sub-grids (sub-gridding); then each sub-array is divided into gene spot regions (spot detection) [1]. Segmentation is the process of separating the spot pixels from the background pixels in each spot region. Quantification is the process of finding the gene expression value, which is equal to the logarithmic ratio of the green and red intensity values of a spot [2]. The gene expression value depends on the intensity values of the foreground pixels (spot area) and the background pixels. Most existing algorithms for microarray image analysis are semi-automatic, which means that manual intervention is required to initialize the parameters of the gridding algorithm [3]. This article presents a fully automatic gridding algorithm. Microarray technology is widely used in genetics, disease diagnosis, drug development, and pharmacology [4]. The microarray image analysis pipeline is shown in Figure 1.
The rest of this article is organized as follows. The second part presents the gridding algorithm, the third part presents an algorithm for estimating the parameters required by the clustering algorithms, the fourth part presents the clustering algorithms, and the fifth part presents the experimental results and conclusions.
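As an illustration of the quantification stage described above, the following minimal Python sketch computes a log ratio for one spot. Several details are assumptions rather than statements from this article: the logarithm base (2), the order of the ratio (red over green), background correction by subtracting the mean background intensity, and the hypothetical `floor` parameter that keeps the corrected intensity positive.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty list of pixel intensities."""
    return sum(values) / len(values)


def log_ratio(red_fg, red_bg, green_fg, green_bg, floor=1.0):
    """Background-corrected log2(red / green) for a single spot.

    Assumptions (not taken from the article): base-2 logarithm, red over
    green, mean-based background subtraction, and a lower bound `floor`
    so that the logarithm is always defined.
    """
    red = max(mean(red_fg) - mean(red_bg), floor)
    green = max(mean(green_fg) - mean(green_bg), floor)
    return math.log2(red / green)


# Hypothetical pixel intensities for one spot region.
ratio = log_ratio(
    red_fg=[900, 880, 920], red_bg=[100, 100, 100],
    green_fg=[300, 290, 310], green_bg=[100, 100, 100],
)
print(ratio)
```

The sketch makes the dependence on both foreground and background pixels explicit: a gridding or segmentation error that moves pixels between the two groups changes both means and therefore the ratio.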

MICROARRAY GRIDDING
Gridding is the most important step in microarray image analysis. Because the analysis is sequential, accurate gridding improves the efficiency of the subsequent segmentation and quantification steps. To isolate each gene spot in a microarray image, the image is first divided into sub-arrays, and these sub-arrays are then divided into spot regions. The first step is called global gridding (sub-gridding), and the second step is called local gridding (spot detection). The end result of gridding is a set of regions, each containing one spot and its background. Existing gridding algorithms are either manual or semi-automatic. In manual gridding, the user must specify all the parameters needed to build the grid, such as the number of sub-arrays, the number of spots and the size of the spots. In semi-automatic gridding, the user provides some of the parameters, such as the number of rows and columns and the number of spots in each row and column. This paper proposes an algorithm that grids the image automatically using the horizontal and vertical contours of the microarray image. Figures 2 and 3 show the algorithms used to detect sub-grids and spots.
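The idea of gridding from horizontal and vertical contours can be sketched as follows. This is not the algorithm of Figures 2 and 3, whose details are not reproduced in the text; it is a minimal illustration that assumes the contours are row and column intensity sums and that a grid line is placed at the centre of each low-intensity gap between spots. The function names and the `threshold` parameter are hypothetical.

```python
def profiles(image):
    """Return (row_sums, column_sums) for a 2-D list of pixel intensities."""
    row_sums = [sum(row) for row in image]
    col_sums = [sum(col) for col in zip(*image)]
    return row_sums, col_sums


def grid_lines(profile, threshold):
    """Place one grid line at the centre of each interior gap.

    A gap is a run of consecutive profile values <= threshold. Gaps that
    touch the image border are ignored, since they are margins rather
    than separators between spots.
    """
    lines = []
    start = None  # index where the current gap began, if any
    for i, value in enumerate(profile):
        if value <= threshold:
            if start is None:
                start = i
        else:
            if start is not None and start > 0:
                lines.append((start + i - 1) // 2)
            start = None
    return lines


# Synthetic 7x7 image with four 2x2 spots separated by empty rows/columns.
spot_rows = {1, 2, 4, 5}
image = [[10 if r in spot_rows and c in spot_rows else 0 for c in range(7)]
         for r in range(7)]

row_sums, col_sums = profiles(image)
row_lines = grid_lines(row_sums, threshold=0)
col_lines = grid_lines(col_sums, threshold=0)
print(row_lines, col_lines)
```

On real images the profiles are noisy, so the threshold would have to be derived from the data (for example from the profile itself) rather than fixed at zero as in this toy example.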

MICROARRAY SEGMENTATION
Image segmentation is the process of dividing an image into regions. For microarray images, it is the process of dividing each spot region of a sub-grid into foreground and background regions, where the foreground corresponds to the spot area. Gene expression levels are estimated from the spot area of the segmented image. Segmentation is a difficult task, since the intensities within a spot are heterogeneous and the sizes and shapes of the spots differ from one another. Several segmentation methods have been proposed for microarray images. Shape-based circular segmentation [6] uses a circular mask with a specific radius to identify a spot, but not all spots have the same shape. The region growing method [5] separates the spot area from the background area by selecting seed pixels in both areas and growing regions according to predefined criteria; selecting the seed pixels is a difficult task. The thresholding method [6] estimates a suitable threshold from the histogram of the image, using the Mann-Whitney test, and divides the image into two regions. Morphology-based segmentation [7] uses morphological operations such as the hit-or-miss transform to segment the image; here the result depends on the mask selected for the morphological operations. Supervised learning-based segmentation [8] uses the support vector machine algorithm to segment the image. In this paper, clustering algorithms are used to segment the image. These algorithms are more efficient than the other existing segmentation methods, as they do not depend on the shape and size of the spot and do not require a mask or seed pixels. Every clustering algorithm depends on the number of clusters and the initial values of the centroids. If these values are estimated well, the number of iterations an algorithm needs to segment an image can be reduced.
For microarray image segmentation, the required number of clusters, and therefore of centroids, is two, since every algorithm divides the spot region into a background area and a spot area. Existing algorithms use the minimum intensity value as the initial centroid of the background area and the maximum intensity value as the initial centroid of the foreground area. This article presents an algorithm for estimating the centroids and the number of clusters using empirical mode decomposition. A block diagram of the procedure for estimating the number of clusters is shown in Figure 4. The clustering algorithms used in this article are k-means, moving k-means, and FCM.
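The role of the initial centroids can be illustrated with a minimal two-cluster k-means on pixel intensities. The sketch below uses the minimum/maximum initialization that the text attributes to existing algorithms; it does not implement the proposed estimation based on empirical mode decomposition, whose steps are not given here. The function name, the pixel values, and the returned iteration count are illustrative assumptions.

```python
def kmeans_two(pixels, c_bg, c_fg, max_iter=100):
    """Two-cluster k-means on 1-D intensities.

    Returns (c_bg, c_fg, labels, iterations), where label 0 marks
    background pixels and label 1 marks spot (foreground) pixels.
    """
    labels = []
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Assignment step: each pixel joins the nearer centroid.
        labels = [1 if abs(p - c_fg) < abs(p - c_bg) else 0 for p in pixels]
        bg = [p for p, lab in zip(pixels, labels) if lab == 0]
        fg = [p for p, lab in zip(pixels, labels) if lab == 1]
        # Update step: keep the old centroid if a cluster is empty.
        new_bg = sum(bg) / len(bg) if bg else c_bg
        new_fg = sum(fg) / len(fg) if fg else c_fg
        if new_bg == c_bg and new_fg == c_fg:
            break  # centroids stopped moving
        c_bg, c_fg = new_bg, new_fg
    return c_bg, c_fg, labels, iterations


# Hypothetical intensities of one extracted spot region.
pixels = [10, 12, 11, 200, 210, 190, 13, 205]
c_bg, c_fg, labels, iterations = kmeans_two(pixels, min(pixels), max(pixels))
print(c_bg, c_fg, labels, iterations)
```

The closer the initial centroids are to the final cluster means, the fewer assignment and update rounds are needed, which is the motivation for estimating them before clustering.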

CLUSTERING ALGORITHMS
In this article, the k-means [9], moving k-means [10], and FCM [11] algorithms are used to segment microarray images. The two parameters required by these algorithms, the number of clusters and the initial centroids, are estimated using the algorithm described in Section 3.
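For reference, a minimal fuzzy c-means (FCM) sketch on pixel intensities is given below. It follows the standard FCM membership and centroid updates and is not taken from [11]; the fuzzifier `m = 2`, the tolerance `tol`, the pixel values, and the minimum/maximum starting centroids are assumptions chosen for illustration.

```python
def fcm(pixels, centroids, m=2.0, tol=1e-4, max_iter=100):
    """Fuzzy c-means on 1-D intensities.

    Returns (centroids, memberships, iterations); memberships[i][j] is
    the degree to which pixel i belongs to cluster j.
    """
    centroids = list(centroids)
    power = 2.0 / (m - 1.0)
    memberships = []
    iterations = 0
    for iterations in range(1, max_iter + 1):
        memberships = []
        for p in pixels:
            dists = [abs(p - c) for c in centroids]
            if 0.0 in dists:
                # Pixel coincides with a centroid: full membership there.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(row)
                row = [r / total for r in row]
            else:
                row = [1.0 / sum((d / other) ** power for other in dists)
                       for d in dists]
            memberships.append(row)
        new_centroids = []
        for j, old in enumerate(centroids):
            weights = [row[j] ** m for row in memberships]
            total = sum(weights)
            # Keep the old centroid if no pixel has weight in this cluster.
            new_centroids.append(
                sum(w * p for w, p in zip(weights, pixels)) / total
                if total > 0 else old
            )
        shift = max(abs(a - b) for a, b in zip(new_centroids, centroids))
        centroids = new_centroids
        if shift < tol:
            break
    return centroids, memberships, iterations


# Hypothetical intensities of one extracted spot region.
pixels = [10, 12, 11, 200, 210, 190, 13, 205]
centroids, memberships, iterations = fcm(pixels, [min(pixels), max(pixels)])
# Hard labels: 0 = background, 1 = spot (cluster with highest membership).
labels = [row.index(max(row)) for row in memberships]
print(centroids, labels, iterations)
```

Unlike k-means, each pixel keeps a graded membership in both clusters, and a hard foreground/background decision is taken only at the end.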

EXPERIMENTAL RESULTS
The qualitative and quantitative analysis of the proposed sub-gridding and spot detection algorithms is performed on two microarray images of breast cancer a CHG tumor tissue. The qualitative analysis of the proposed sub-gridding algorithm is shown in Figure 5. The qualitative analysis of the proposed spot detection algorithm is shown in Figure 6. Table 1 shows a quantitative comparison of the proposed gridding algorithms with existing methods. The accuracy of the gridding algorithm is estimated by the formula Percentage accuracy = *100. After the sub-gridding and spot detection algorithms are applied, individual spots are extracted. These individual spots are segmented using clustering algorithms whose initial values are estimated by the proposed algorithm for estimating the number of clusters and the centroids. A sample spot is extracted from the image and segmented using each clustering algorithm. Three different clustering algorithms are used in the presented work. Table 2 shows the number of iterations used by the different clustering algorithms to segment the spot image with and without estimation of the initial parameters. Table 3 shows the MSE values [11] of the different segmentation algorithms. The segmentation of a single spot is shown in Figure 7.
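The MSE comparison can be illustrated with a short sketch. The exact MSE definition used for Table 3 is not stated in the text; the version below assumes the mean squared difference between each pixel intensity and the centroid of the cluster it was assigned to. The pixel values, labels, and centroids are hypothetical.

```python
def segmentation_mse(pixels, labels, centroids):
    """Mean squared error between pixels and their assigned centroids.

    Assumed definition (not confirmed by the article): the segmented image
    replaces every pixel by the centroid of its cluster, and the MSE is
    taken between the original and the segmented intensities.
    """
    squared = [(p - centroids[lab]) ** 2 for p, lab in zip(pixels, labels)]
    return sum(squared) / len(pixels)


# Hypothetical spot region, its segmentation, and the cluster centroids.
pixels = [10, 12, 11, 200, 210, 190, 13, 205]
labels = [0, 0, 0, 1, 1, 1, 0, 1]   # 0 = background, 1 = spot
centroids = [11.5, 201.25]

mse = segmentation_mse(pixels, labels, centroids)
print(mse)
```

Under this definition a lower MSE means that the two cluster centroids represent the pixels of the spot region more faithfully.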

CONCLUSIONS
Microarray image analysis is a sequential process consisting of gridding, segmentation and quantification. Any error in the gridding or segmentation stage affects the gene expression value. This paper presents an automatic gridding algorithm and a method for estimating the parameters required for clustering in the segmentation stage. The experimental results show that the proposed algorithm grids the image with 96% accuracy and that estimating the required parameters reduces the number of iterations the clustering algorithms need to segment the image.