The main idea behind CVC is to identify changes in plant health from pictures. The most basic approach is to define indices based on the presence of selected wavelengths in the available samples. Plant identification methods can also rely on quantitative analysis of the histograms of the examined pictures in the separate R, G and B channels of visible light. In our method, we analyse data derived from pictures and represent it as multi-dimensional sets of points described by values in the HSL colour space. This allows precise identification of the value concentrations characteristic of known defects.
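As an illustration of the two representations mentioned above, the following minimal Python sketch converts RGB pixels into HSL points and builds a per-channel histogram. It is not the CVC implementation: the function names, the bin count and the sample pixels are assumptions introduced for this example, and only the standard library module colorsys is used.

```python
import colorsys

def rgb_to_hsl_points(pixels):
    """Map 8-bit (R, G, B) tuples to (H, S, L) points, each component in [0, 1]."""
    points = []
    for r, g, b in pixels:
        # colorsys returns (hue, lightness, saturation); reorder to H, S, L.
        h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
        points.append((h, s, l))
    return points

def channel_histogram(pixels, channel, bins=8):
    """Count the 8-bit values of one channel (0=R, 1=G, 2=B) in equal-width bins."""
    width = 256 // bins
    counts = [0] * bins
    for px in pixels:
        # Clamp to the last bin in case `bins` does not divide 256 evenly.
        counts[min(px[channel] // width, bins - 1)] += 1
    return counts

# Hypothetical sample: two green pixels and one brownish pixel.
sample = [(40, 160, 40), (50, 170, 60), (150, 100, 40)]
hsl = rgb_to_hsl_points(sample)
green_hist = channel_histogram(sample, channel=1)
```

In this sketch each pixel becomes one point in a three-dimensional HSL space, so a picture yields a point cloud in which concentrations of values can be searched for, while the histogram summarises a single channel only.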
To analyse the data more efficiently, we need a classification method that is robust and at the same time allows fast comparison of the obtained results with a large database of patterns. Many modern approaches are available for classifying large data clusters, such as K-means, the CLASS method, SOM and GnG. The K-means method assumes a fixed number of classes into which the cluster is divided, a parameter that is not known in the examined case, whereas the SOM and GnG methods do not fix this number in advance. Moreover, in the selected approach the identified classes can be interpreted as multi-dimensional sets of vectors that identify the value concentrations.
Data set representation using CLASS and GnG methods
There are numerous variations and adaptations of the GnG method, either tailored to the characteristics of the input data or extending the original method (FGNG, AING, AGiNG and others). For our studies we chose our own implementation of the GnG method, extended with algorithms that determine the expected quality of the representation of the input data set generated by the network.
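For readers unfamiliar with the method, the sketch below shows a minimal version of the basic Growing Neural Gas procedure: the network starts with two units, adapts the winning unit and its neighbours towards each input, removes old edges, and periodically inserts a new unit where the accumulated error is largest. It is a textbook-style illustration only, not the implementation used in CVC, and it omits the quality-estimation algorithms mentioned above. All parameter names and values, as well as the synthetic input data, are assumptions made for this example.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_gng(data, steps=2000, max_nodes=20, eps_b=0.05, eps_n=0.006,
              max_age=50, insert_every=100, alpha=0.5, decay=0.995, seed=0):
    rng = random.Random(seed)
    nodes = {0: list(rng.choice(data)), 1: list(rng.choice(data))}
    error = {0: 0.0, 1: 0.0}
    edges = {}  # frozenset({i, j}) -> age
    next_id = 2
    for step in range(1, steps + 1):
        x = rng.choice(data)
        # Winner s1 and runner-up s2 are the two units closest to the input.
        s1, s2 = sorted(nodes, key=lambda i: sq_dist(nodes[i], x))[:2]
        error[s1] += sq_dist(nodes[s1], x)
        nodes[s1] = [w + eps_b * (xi - w) for w, xi in zip(nodes[s1], x)]
        # Age the winner's edges and pull its neighbours slightly towards x.
        for e in list(edges):
            if s1 in e:
                edges[e] += 1
                (n,) = e - {s1}
                nodes[n] = [w + eps_n * (xi - w) for w, xi in zip(nodes[n], x)]
        edges[frozenset((s1, s2))] = 0
        # Drop edges that are too old, then units left without any edge.
        edges = {e: a for e, a in edges.items() if a <= max_age}
        connected = set().union(*edges)
        for i in [i for i in nodes if i not in connected]:
            del nodes[i], error[i]
        # Periodically insert a unit between the worst unit and its worst neighbour.
        if step % insert_every == 0 and len(nodes) < max_nodes:
            q = max(error, key=error.get)
            nbrs = [next(iter(e - {q})) for e in edges if q in e]
            f = max(nbrs, key=error.get)
            nodes[next_id] = [(a + b) / 2 for a, b in zip(nodes[q], nodes[f])]
            del edges[frozenset((q, f))]
            edges[frozenset((q, next_id))] = 0
            edges[frozenset((f, next_id))] = 0
            error[q] *= alpha
            error[f] *= alpha
            error[next_id] = error[q]
            next_id += 1
        for i in error:
            error[i] *= decay
    return nodes, edges

# Synthetic HSL-like input: two hypothetical concentrations of points.
gen = random.Random(1)
cloud = [(0.33 + gen.gauss(0, 0.02), 0.60 + gen.gauss(0, 0.02), 0.40 + gen.gauss(0, 0.02))
         for _ in range(100)]
cloud += [(0.15 + gen.gauss(0, 0.02), 0.70 + gen.gauss(0, 0.02), 0.55 + gen.gauss(0, 0.02))
          for _ in range(100)]
nodes, edges = train_gng(cloud)
```

The number of units is not fixed in advance but grows up to the max_nodes limit, and the resulting units and edges form the kind of network map that can be stored alongside a picture and its histograms.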
The pattern database created with the CVC system is composed of pictures, the corresponding histograms and maps of the created networks. All elements have a graphical representation. Each pattern can include a text describing the health change in question. Patterns are classified by text attributes that describe the species, crop variety and other variables where applicable.
The results of the test set analysis present the same elements for the test set, along with a list of identified patterns that correspond to it. Similarities are described with a deviation indicator, and for each study the best corresponding pattern or group of patterns is presented.
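The deviation indicator is not defined in this section. Purely to illustrate how a test network could be ranked against stored patterns, the sketch below uses an assumed measure: the symmetric mean distance between each unit of one network and the nearest unit of the other. The measure, the function names, the pattern names and the coordinates are all hypothetical and should not be read as the indicator used in CVC.

```python
import math

def one_sided(a, b):
    """Mean distance from each point in a to its nearest point in b."""
    return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)

def deviation(test_nodes, pattern_nodes):
    """Assumed symmetric deviation; 0.0 means the two node sets coincide."""
    return 0.5 * (one_sided(test_nodes, pattern_nodes)
                  + one_sided(pattern_nodes, test_nodes))

def rank_patterns(test_nodes, patterns):
    """Return (name, deviation) pairs sorted from best to worst match."""
    scored = ((name, deviation(test_nodes, nodes)) for name, nodes in patterns.items())
    return sorted(scored, key=lambda item: item[1])

# Hypothetical pattern database: network units stored as HSL points.
patterns = {
    "healthy": [(0.33, 0.60, 0.40), (0.35, 0.55, 0.45)],
    "chlorosis": [(0.16, 0.70, 0.55), (0.14, 0.65, 0.60)],
}
test_nodes = [(0.34, 0.58, 0.42), (0.32, 0.61, 0.41)]
ranking = rank_patterns(test_nodes, patterns)
best_name, best_deviation = ranking[0]
```

The first entry of the ranking plays the role of the best corresponding pattern. Note that plain Euclidean distance treats hue as a linear quantity, so a practical measure would probably need to account for the circular nature of the hue component.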