Incomplete Information And Columns-Based Intelligent Systems

The paper considers column-based intelligent systems that work under conditions of incomplete information, that is, when input patterns are represented only by their sub-patterns. Definitions of the direct and inverse problems under incomplete information are given. The solution of these problems is shown using the method of element-wise comparison of patterns and the intersection method. A relation between a system's ability to work under incomplete information and prediction is shown.


Introduction
Column-based intelligent systems are an example of systems whose main function is to memorize patterns. Such systems have not become widespread, since at first glance it seems that, apart from memorizing patterns, they cannot solve any other problems. Column-based intelligent systems show that this is not the case at all.
Column-based intelligent systems are the systems considered within the following model [1,2].
There is a very large, yet finite, set of names $U$ intended for naming objects of arbitrary nature. Without loss of generality, the set of names $U$ is considered to be a subset of the set of integers $\mathbb{Z}$.
In a set of names U disjoint subsets are distinguished, called name domains. The number of name domains allocated is not constant. New name domains can be introduced at any time, and additional elements can be added to any name domain. The allocation of name domains in real subject areas can be caused by various reasons (for example, typification). One of the main reasons is the need to make sure that there are no random name conflicts in different parts of a large-scale system.
Any finite set of names belonging to one or another name domain is called a pattern.
Patterns of any set of patterns $P$ can be renumbered using the names of some name domain of $U$. An ordered pair $(i, p_i)$ is called a column. A column is denoted by $(i \mid p_i)$, where $i$ is the column name and $p_i$ is the column pattern. The notation $i \to p_i$ is also used. In this case, the column name $i$ is said to be a reference, or a pointer, to the column pattern $p_i$. In turn, the pattern $p_i$ itself is said to have the name $i$ or to be known by the name $i$.
The mapping $i \to p_i$ is called the name mapping. By default, the name mapping is considered to be one-to-one. All cases when this is not so are discussed separately.
To name a pattern $p$ with a name $i$, or to assign a name $i$ to the pattern $p$, means that the definition of the name mapping is extended so that $i$ is mapped to $p$.
A name $i$ that has not yet been used for naming patterns is called a pure, or empty, name. It can be thought of as a column with an empty pattern, i.e. a column $(i \mid \emptyset)$ or $i \to \emptyset$, where $\emptyset$ is the empty set.
Column patterns include other column names as well as pure names. Therefore, a column pattern is entirely composed of the names of other columns, each of which serves as a pointer to the corresponding pattern, possibly empty. In turn, any name from a non-empty pattern also points to its own pattern, etc. The result is a complex column structure.

An index is any finite set of columns. The composition of any index can be changed by adding or removing columns. These operations are called index addition and subtraction and are denoted by + and −. For example, $l$ columns $(i_k \mid p_k)$, $k = 1, \dots, l$, can be added to an index $A$. Obviously, the index can be represented in the form of a table consisting of records of the form "column name: names included in the column pattern". Such a table in vertical form is composed of columns of variable height. In the bottom row of the table, under the line, are the names of the columns. All names included in the column pattern are listed above the name of each column. By default, column names and names in patterns are assumed to belong to different name domains.
If the patterns are unordered sets of names, then the order of the names in the column patterns can be arbitrary. As a simple example, consider an index $A$ with patterns in the form of unordered sets, consisting of three columns $(1 \mid \{1, 3\})$, $(2 \mid \{2, 3, 4\})$ and $(3 \mid \{4, 5\})$. If the patterns are ordered, then the names in the column patterns are recorded in a certain order, for example, from the bottom up, i.e. the first name of the pattern in the first row above the line, the second in the second row, etc. This writing order is adopted in this article and in other works devoted to column-based intelligent systems.
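The index and its two operations can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that names are integers, that an index is a dictionary from column name to an unordered pattern, and that the three example columns carry the distinct names 1, 2 and 3. The helper names `add_columns` and `subtract_columns` are hypothetical and do not come from the cited works.

```python
from typing import Dict, FrozenSet, Iterable

# Assumed representation: column name -> column pattern (unordered set of names).
Index = Dict[int, FrozenSet[int]]


def add_columns(index: Index, columns: Index) -> Index:
    """Index addition: return a new index that also contains `columns`."""
    clash = index.keys() & columns.keys()
    if clash:
        # The name mapping is one-to-one by default, so a name cannot be reused.
        raise ValueError(f"column names already in use: {sorted(clash)}")
    return {**index, **columns}


def subtract_columns(index: Index, names: Iterable[int]) -> Index:
    """Index subtraction: return a new index without the named columns."""
    removed = set(names)
    return {i: p for i, p in index.items() if i not in removed}


# Three illustrative columns with patterns {1, 3}, {2, 3, 4} and {4, 5}.
A = add_columns({}, {
    1: frozenset({1, 3}),
    2: frozenset({2, 3, 4}),
    3: frozenset({4, 5}),
})
```

Returning a new dictionary rather than mutating the argument keeps the sketch close to the algebraic reading of + and −; an actual column engine may well update its indexes in place.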
A column-based intelligent system is one or more indexes and a mechanism that operates on them, called a column engine. Receiving information about the external world in the form of patterns, the column engine forms new columns, modifies existing ones, deletes unnecessary ones, and performs all other operations with columns.
Knowledge in the systems under consideration is represented using columns, and the process of accumulating knowledge is based on memorizing new patterns under certain names. Obviously, elementary basic problems, without the solution of which the functioning of such a system is impossible, are the direct problem (given a pattern, obtain its name) and the inverse problem (given a name, obtain the corresponding pattern). Memorization of new patterns is carried out as a part of the direct problem. If when solving this problem a nameless pattern is found, then the column engine assigns a certain name to it and saves the corresponding data.
From a formal point of view, memorizing any pattern under a certain name always means the formation of the corresponding column $(i \mid p_i)$. This does not mean, however, that the data will be stored in this form inside the system. The internal representation of the stored data is determined only by the method of solving basic problems and the way it is implemented, and it may differ significantly from the formal column description $(i \mid p_i)$. An example of a method for which the formal description coincides with the internal data representation is the method based on element-wise comparison of patterns [1,2]. In other cases, a formed column $(i \mid p_i)$ will most probably be stored in an implicit form, and solving basic problems will not only confirm its existence but also make it possible to obtain its pattern from the column name, and the column name from its pattern.
By solving basic problems, the column engine actually implements the links $p_i \to i$ in the direct problem and $i \to p_i$ in the inverse problem. This provides the basis on which to build the solution to all other problems. The solution to any such problem is essentially a chain of links followed until the result is obtained.
Since everything in the considered model is finite, a solution to the basic problems always exists: a universal method of solving them is the aforementioned method of element-wise comparison of patterns. From the theoretical standpoint, this is enough to estimate the possibilities of solving different problems with the help of column-based intelligent systems. However, for practical application of such systems, more efficient methods of solving basic problems are needed, especially for high-dimensional problems.
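The two basic problems and the universal element-wise method can be sketched as follows. This is only an illustration under stated assumptions: patterns are unordered sets of integer names, new names are issued from a simple counter, and the class `ColumnEngine` with its methods `direct` and `inverse` is a hypothetical naming, not the implementation from [1,2].

```python
from typing import Dict, FrozenSet, Iterable


class ColumnEngine:
    """Toy column engine that stores columns explicitly as name -> pattern."""

    def __init__(self) -> None:
        self.index: Dict[int, FrozenSet[int]] = {}
        self._next_name = 1  # assumed naming policy: consecutive integers

    def direct(self, pattern: Iterable[int]) -> int:
        """Direct problem: given a pattern, obtain its name (link p_i -> i)."""
        p = frozenset(pattern)
        # Compare the input with every stored pattern, one by one.
        for name, stored in self.index.items():
            if stored == p:
                return name
        # The pattern is nameless: memorize it under a fresh name.
        name = self._next_name
        self._next_name += 1
        self.index[name] = p
        return name

    def inverse(self, name: int) -> FrozenSet[int]:
        """Inverse problem: given a name, obtain its pattern (link i -> p_i)."""
        return self.index[name]


engine = ColumnEngine()
first = engine.direct({1, 3})   # unknown pattern, memorized under a new name
again = engine.direct({3, 1})   # same unordered set, so the same name is returned
```

The linear scan makes the cost of the direct problem grow with the number of stored columns, which is the practical limitation that motivates more efficient methods.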

One of the possible methods for more efficient solution of basic problems is the intersection method. The idea behind the intersection method goes back to the book index. In it, for each heading, there are many pointers to those pages of the book where this heading can be found. A query from several headings obviously corresponds to intersection of pointer sets for these headings.
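The book-index idea can be made concrete with a short Python sketch. The page contents below are invented purely for illustration, and the function names `build_book_index` and `query` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def build_book_index(pages: Dict[int, List[str]]) -> Dict[str, Set[int]]:
    """For each heading, collect the set of pages on which it occurs."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for page, headings in pages.items():
        for heading in headings:
            index[heading].add(page)
    return index


def query(index: Dict[str, Set[int]], headings: Iterable[str]) -> Set[int]:
    """Pages containing all requested headings: intersect their pointer sets."""
    pointer_sets = [index.get(h, set()) for h in headings]
    if not pointer_sets:
        return set()
    return set.intersection(*pointer_sets)


# Illustrative book: page number -> headings found on that page.
pages = {
    10: ["pattern", "index"],
    11: ["index"],
    12: ["pattern", "index", "column"],
}
book_index = build_book_index(pages)
hits = query(book_index, ["pattern", "index"])  # pages 10 and 12
```

No page is ever compared with the query heading by heading; the answer is obtained only from set intersections, which is the property the intersection method relies on.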
In the early 2000s, A.M. Mikhailov showed that the intersection method can be used to work with patterns [6,7]. Within the framework of the emerging direction, called the index approach, the intersection method is used mainly in solving problems of pattern recognition [8][9][10].
Based on the results of [6,7], the author proposed a variant of the intersection method intended for research in the field of column-based intelligent systems [1,2]. For this version, necessary and sufficient conditions for the existence of a solution to the direct problem were obtained, the fulfillment of which has little effect on the universality of the method. This variant of the intersection method is also characterized by the complete absence of the need for element-wise comparison of patterns.
It should be emphasized that the intersection method is not a necessary component of column-based intelligent systems. This is just one of the possible methods for solving basic problems. Instead of it, any other methods and means can be used, in particular, software-hardware, providing high efficiency in solving basic problems of one type or another.
In works [1][2][3], the solution of various basic problems was considered for patterns in the form of unordered finite sets, for patterns in the form of vectors or finite sequences, and for patterns in the form of finite multisets. In [1,2], the possibility of implementing arbitrary Boolean functions was proved. In [4], a much stronger result was proved: arbitrary functions of the kind $f: U^n \to U^m$, including arbitrary Boolean functions, can be implemented in the systems under consideration. In addition, it was shown in [5] that arbitrary relations (predicates) $r \subseteq U^n$ can be realized in column-based intelligent systems, where $r$ is an $n$-ary relation over the set $U$ and $U^n$ is the $n$-th Cartesian power of $U$.
As mentioned earlier, in column-based intelligent systems, basic problem solving serves as the basis for solving all other problems. The solution to any such problem is a chain of links followed until the result is obtained. When solving the problem of implementing functions, such a chain consists of three links: two links of the direct problem and one link of the inverse problem [4]. When solving the problem of realizing relations, the chain is even shorter. It consists of only two links: one direct problem link and one inverse problem link [5]. The simplicity of the solution scheme allows functions and relations to be formed dynamically while the column-based intelligent system is operating. As a result, as knowledge accumulates, such a system can form an increasingly accurate and complex internal representation of the functional dependencies and relations that exist in the real world.
Working in the real world means, among other things, that the system must operate with incomplete information, when, for one reason or another, only a part of the original pattern (a sub-pattern) enters the system. Therefore, in order for the system to work in the real world, it must solve basic problems under conditions of incomplete information. This paper addresses that task.
The next section gives the definition of basic problems under incomplete information. Further, the existence of a solution to these basic problems for all types of patterns is shown. Then, for patterns in the form of finite unordered sets and patterns in the form of finite sequences or vectors, the solution of basic problems under incomplete information is considered using the intersection method. Finally, in conclusion, the relation between a system's ability to work under incomplete information and prediction is shown.

Basic problems under incomplete information
We denote by $p_0$ the original full pattern, and by $p$ the pattern that came into the system and is only a part of the original pattern $p_0$ (a sub-pattern). For patterns in the form of finite unordered sets, this means that $p \subseteq p_0$.
In what follows, we will assume that the column engine has additional information with which it can distinguish between complete and incomplete patterns. For patterns in the form of finite unordered sets, the number of elements of the original pattern $n_0 = |p_0|$ can be used, where $|\cdot|$ denotes the number of elements (cardinality) of a set. The sign of completeness in this case is the equality $|p| = n_0$. For other types of patterns, the simplest way to distinguish between complete and incomplete patterns is to replace the missing pattern elements with special service names. An obvious sign that the pattern $p$ is complete is then the absence of these service names in its composition.
A simple definition of basic problems will be considered, in which parts of patterns are not memorized. Consider first the direct problem under incomplete information.
Let $p$ be a complete pattern, i.e. $p = p_0$. In this case, the usual direct problem is solved [1,2]: it is necessary to obtain the name $i$ of the pattern $p$. If the column engine manages to do this, then the name $i$ is the solution to the direct problem. Otherwise, the pattern $p$ is new and is memorized under a certain name $i_p$, which in this case is the solution to the direct problem.
Now let $p$ be an incomplete pattern. In this case, the column engine must specify the names of all those known patterns of which the pattern $p$ is a part. If it succeeds, then the set $S_0$ of names of such patterns is the solution to the direct problem. Otherwise, the pattern $p$ under consideration is a part of some pattern unknown to the system.
The inverse problem under conditions of incomplete information remains unchanged: given the name $i$ of a pattern, it is necessary to obtain the pattern $p$ known by this name.

Solving basic problems under incomplete information using the method of element-wise comparison
A general universal method for solving basic problems is a method based on element-wise comparison of patterns. To solve basic problems under incomplete information, one index $A$ is used, which consists of columns $(i \mid p_i)$.
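For patterns in the form of finite unordered sets, the direct problem under incomplete information can be sketched by brute-force comparison as follows. The sketch assumes that the cardinality `n0` of the original pattern accompanies each input, that names are integers, and that a new name is chosen as one more than the largest name in use; the function names and the naming policy are hypothetical and serve only as an illustration.

```python
from typing import Dict, FrozenSet, Iterable, Set, Union

Index = Dict[int, FrozenSet[int]]  # assumed form: column name -> pattern


def direct_problem(index: Index, pattern: Iterable[int],
                   n0: int) -> Union[int, Set[int]]:
    """Return a name for a complete pattern, or the set S0 for an incomplete one."""
    p = frozenset(pattern)
    if len(p) == n0:
        # Complete pattern: the usual direct problem with memorization.
        for name, stored in index.items():
            if stored == p:
                return name
        new_name = max(index, default=0) + 1  # assumed naming policy
        index[new_name] = p
        return new_name
    # Incomplete pattern: names of all known patterns that contain p.
    # Nothing is memorized; an empty result means p belongs to an unknown pattern.
    return {name for name, stored in index.items() if p <= stored}


def inverse_problem(index: Index, name: int) -> FrozenSet[int]:
    """The inverse problem is unchanged: look the pattern up by its name."""
    return index[name]


A: Index = {1: frozenset({1, 3}), 2: frozenset({2, 3, 4}), 3: frozenset({4, 5})}
S0 = direct_problem(A, {3}, n0=2)  # {3} is a part of the patterns named 1 and 2
```

The mixed return type (a single name or a set of names) mirrors the two cases of the definition; a real implementation might prefer to return a set in both cases.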

Solving basic problems under incomplete information using the intersection method
Intersection method for patterns in the form of finite unordered sets
We will assume that the patterns of the set $P$ are finite unordered sets of names. In addition, we will assume that the true cardinality $m_0$, i.e. the cardinality of the complete pattern $p_0$, is known for each input pattern $p$. For complete patterns, the usual intersection method can therefore be used [1,2]. So, let it be necessary to solve the direct problem for a complete pattern [...]
As already mentioned, in column-based intelligent systems, the solution to any problem is a chain of links followed until the result is obtained. To make a prediction, such a chain consists of only two links: one direct problem link and one inverse problem link. [...]
Therefore, the incomplete pattern $p = (3, 1, 0)$ coincides in its nonzero coordinates with the patterns of the same dimension that are known to the system under the names 2 and 3. Solving the inverse problem, we find that these patterns are the patterns of the columns $(2 \mid b_2), (3 \mid b_3) \in B$, i.e. the patterns $(3, 1, 3)$ and $(3, 1, 2)$. For time sequences, this means that for the sequence $p = (3, 1, 0)$ available at time 2, the possible scenarios are the sequences $(3, 1, 3)$ and $(3, 1, 2)$.
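The prediction example with the pattern (3, 1, 0) can be reproduced with a small Python sketch. Several details are assumptions made only for illustration: 0 is treated as the service name marking a missing coordinate, the index B contains one extra column named 1 with the invented pattern (1, 2, 3), and the inverted index keyed by (position, value) pairs is just one simple way to realize an intersection-style lookup, not necessarily the variant developed in [1,2].

```python
from typing import Dict, Set, Tuple

Vector = Tuple[int, ...]
MISSING = 0  # assumed service name for an unknown coordinate

# Columns 2 and 3 follow the example in the text; column 1 is invented.
B: Dict[int, Vector] = {1: (1, 2, 3), 2: (3, 1, 3), 3: (3, 1, 2)}


def build_inverted(index: Dict[int, Vector]) -> Dict[Tuple[int, int], Set[int]]:
    """Map each (position, value) pair to the names of the patterns containing it."""
    inverted: Dict[Tuple[int, int], Set[int]] = {}
    for name, vector in index.items():
        for position, value in enumerate(vector):
            inverted.setdefault((position, value), set()).add(name)
    return inverted


def direct_incomplete(index: Dict[int, Vector], p: Vector) -> Set[int]:
    """Direct link: names of same-length patterns agreeing with p where p is known."""
    inverted = build_inverted(index)
    names = {name for name, vector in index.items() if len(vector) == len(p)}
    for position, value in enumerate(p):
        if value != MISSING:
            names &= inverted.get((position, value), set())
    return names


def predict(index: Dict[int, Vector], p: Vector) -> Dict[int, Vector]:
    """Direct link followed by the inverse link: possible completions of p."""
    return {name: index[name] for name in sorted(direct_incomplete(index, p))}


scenarios = predict(B, (3, 1, 0))
```

For the input (3, 1, 0), the intersection leaves the names 2 and 3, and the inverse link returns the patterns (3, 1, 3) and (3, 1, 2), matching the scenarios described above.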

Results
In the real world, the system should be able to work under conditions of incomplete information, when, for one reason or another, only a part of the original pattern (sub-pattern) enters the system. In column-based intelligent systems, basic problems form the basis for all other problems. Therefore, in order for such systems to work in the real world, they must solve basic problems under incomplete information.
The paper proposes a simple definition of basic problems under conditions of incomplete information. It is assumed that the system, in addition to the received pattern, has additional information that makes it possible to distinguish between complete and incomplete patterns. In this case, incomplete patterns in the system are not memorized.
For complete patterns, the setting of basic problems does not change. In the direct problem, the name of a complete pattern $p$ must be obtained. If $p$ is a nameless complete pattern, then it must be memorized. In the inverse problem, given a pattern name $i$, the complete pattern known by this name must be obtained.
For incomplete patterns, the direct problem changes. For an incomplete pattern $p$, it is necessary to obtain the set of names $S_0$ consisting of the names of all complete patterns known to the system of which the pattern $p$ is a part.
Using the method based on element-wise comparison of patterns, a solution to basic problems under incomplete information was obtained for all types of patterns. Using the more efficient intersection method, the solution of basic problems under conditions of incomplete information was shown for patterns in the form of finite unordered sets. The methods for solving basic problems presented in this work are a more general version of the methods from [1,2], one that includes the ability of the system to work under incomplete information. The most important consequence is that the ability of the system to work under incomplete information also means the ability of the system to predict, since from the currently available part of a time sequence the system can restore the possible future developments. Moreover, the part of the sequence available at the present moment may itself be known only partially, i.e. the prediction is made under incomplete information. Thus, an elementary baseline prediction is an intrinsic property of a system capable of solving basic problems under conditions of incomplete information. This demonstrates one of the main features of the column-based model: its versatility, where the same mechanism serves different purposes.