Research on density-based sparse representation and its application to tobacco leaf classification

The construction of the dictionary in sparse representation classification (SRC) is crucial to the efficiency and accuracy of grading. A density-based dictionary construction method is proposed, and the resulting DSRC (density-based SRC) is used to classify tobacco leaves. The method applies the density-based center-selection idea of the subtractive clustering algorithm to dictionary construction in the sparse algorithm. By determining appropriate clustering radii k_ia, k_ib and constraint conditions to select the dictionary atoms, it not only reduces the number of atoms but also yields a dictionary with better representation ability. Tobacco leaves from 2013 (13 grades), 2014 (6 grades) and 2015 (42 grades) were graded with the dictionary selected by this method. The test results show that this method not only improves the accuracy of tobacco leaf grading, but also effectively increases the grading speed.

At this stage, tobacco leaf purchasing in China still relies mostly on manual grading. This highly subjective grading method produces large errors when manpower and material resources are limited, which in turn affects cigarette quality. In recent years, computer and artificial intelligence technology has been increasingly applied to agricultural product inspection, and non-destructive classification of tobacco leaves based on computer vision and infrared spectrum analysis has attracted growing attention [1-2].

Research on tobacco leaf grading based on computer vision mainly focuses on identification methods and the screening of digital image features [1-2]. Many methods have been used for intelligent classification of tobacco leaves, such as the nearest neighbor classifier, radial basis function neural networks, support vector machines, AdaBoost, rough sets, random forests [3] and sparse representation [4]. In [4], 2/3 of the tobacco leaves of each grade were simply selected at random as dictionary atoms to build the sparse representation dictionary. A dictionary selected in this way not only contains a large number of atoms, which increases the grading time, but may also include unsuitable samples as atoms, which degrades grading accuracy. A suitable dictionary therefore has an important influence on both the accuracy and the speed of tobacco leaf grading, so this study proposes a density-based sparse representation algorithm to classify tobacco leaves.

The subtractive clustering algorithm was proposed by Chiu in 1994 as an improvement on the mountain clustering algorithm. The method computes a density value (potential) for each sample point under the Euclidean distance criterion and selects the point with the highest density as a cluster center [5]. The densities of the remaining samples are then updated, and the sample with the largest remaining density is selected repeatedly until a stopping condition is reached. This study applies this density-based center-selection idea from subtractive clustering to the construction of dictionary atoms for sparse representation and proposes a density-based sparse representation method [6]. By determining appropriate clustering radii k_ia, k_ib and constraint conditions within each class, the number and choice of dictionary atoms are determined, and the tobacco leaves are then classified by solving an L1-norm minimization problem and taking the minimum residual term. The results show that this method effectively increases the grading speed while maintaining the recognition rate.
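To make the center-selection step concrete, the following Python sketch illustrates density-based center selection in the spirit of subtractive clustering. It is a minimal sketch under stated assumptions, not the paper's code: the radii r_a and r_b stand in for the per-class radii k_ia and k_ib, and the stopping ratio is a hypothetical parameter used here in place of the paper's constraint conditions.

```python
import numpy as np

def subtractive_cluster_centers(X, r_a=0.5, r_b=0.75, stop_ratio=0.15):
    """Select high-density samples as cluster centers (subtractive clustering sketch).

    X          : (n_samples, n_features) training samples of one class
    r_a, r_b   : neighbourhood radii (r_b > r_a); assumed stand-ins for the
                 per-class radii k_ia, k_ib used in the paper
    stop_ratio : assumed stopping condition; stop when the best remaining
                 density falls below stop_ratio * (density of the first center)
    Returns the indices of the selected samples (candidate dictionary atoms).
    """
    # pairwise squared Euclidean distances
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # initial density (potential) of every sample
    density = np.exp(-d2 / (r_a / 2.0) ** 2).sum(axis=1)

    centers = []
    first_peak = density.max()
    while True:
        c = int(np.argmax(density))
        if density[c] < stop_ratio * first_peak:
            break
        centers.append(c)
        # suppress density around the chosen center with the larger radius r_b
        density -= density[c] * np.exp(-d2[:, c] / (r_b / 2.0) ** 2)
    return centers
```

Selecting the centers of each class in this way, rather than a random 2/3 of the samples, is what allows the dictionary to stay small while still covering the dense regions of each grade.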

1 Density-based sparse representation (DSRC)

1.1 Sparse representation (SRC) principle [7]

The sparse representation algorithm first constructs a dictionary from the training samples and then performs pattern recognition from the projection of the test sample onto the dictionary. The dictionary is commonly constructed as follows. Assume there are C classes and that the training sample set of class i is D_i = [D_i1, D_i2, …, D_ik, …, D_iKi], where D_ik is the k-th training sample of class i and Ki is the number of training samples of class i; the sub-dictionaries form the dictionary matrix D = [D_1, D_2, …, D_C]. Once the dictionary is formed, the projection X of the test sample Y onto the dictionary is computed from Y = DX. X can be solved with greedy pursuit methods (such as matching pursuit and orthogonal matching pursuit), but these do not give the optimal solution. At present, the most commonly used approach is to solve the coefficient matrix X based on the L1 norm, with the formula as follows:

X̂ = arg min ||X||_1, subject to Y = DX
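As an illustration of the SRC decision rule described above, the Python sketch below stacks the class sub-dictionaries into D = [D_1, …, D_C], solves an L1-regularised approximation of the minimization with scikit-learn's Lasso (a surrogate for the constrained L1 problem, not necessarily the solver used in the paper), and assigns the test sample to the class with the minimum class-wise residual. The function name and parameters are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(D_blocks, y, alpha=0.01):
    """Minimal SRC sketch (assumed interface, not the paper's exact code).

    D_blocks : list of arrays; D_blocks[i] has shape (d, K_i) and holds the
               K_i training samples (columns) of class i, so their
               concatenation is the dictionary D = [D_1, D_2, ..., D_C]
    y        : (d,) test sample
    alpha    : l1 penalty; Lasso is used as a surrogate for the constrained
               problem  min ||X||_1  subject to  Y = DX
    Returns the predicted class index (minimum class-wise residual).
    """
    D = np.hstack(D_blocks)                              # full dictionary
    D = D / (np.linalg.norm(D, axis=0, keepdims=True) + 1e-12)  # unit columns

    # sparse coefficient vector of the test sample over the dictionary
    x = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000).fit(D, y).coef_

    residuals, start = [], 0
    for Di in D_blocks:
        k = Di.shape[1]
        xi = np.zeros_like(x)
        xi[start:start + k] = x[start:start + k]         # keep class-i coefficients
        residuals.append(np.linalg.norm(y - D @ xi))     # class-wise residual
        start += k
    return int(np.argmin(residuals))
```

In the DSRC setting described in this paper, each D_i would contain only the density-selected atoms of class i rather than all (or a random subset of) training samples, which shrinks D and speeds up the L1 solve.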
