Notion System - The system network

Ronald Poell <rapoell@notionsystem.com>
Revised 2001/06/24


Introduction

Within Notion System we can distinguish two different networks.

The first one is the knowledge network (very close to a semantic network) and is formed by all the notions and relations between them. This network is the basic idea behind Notion System and is worked out in the other articles.

The second one is the network of the systems themselves. The ideas behind it are detailed in this paper.

For completeness I must say that the ideas mentioned here are not all implemented in the actual system. Some are partially realized; others aren't, but the basics to do so are available.

From the beginning it was clear that the power of Notion System could only reach its full potential if it handled a large amount of knowledge. The more knowledge is stored, the better the newly added knowledge will be. This is particularly true if that knowledge is generated by software (the key issue is how a user, human or agent, perceives what another user made available). Keeping all this in one system and in one place would require very powerful resources and would be very risky with regard to availability. Just keep in mind the recent Denial of Service (DoS) attacks.

We presume in this paper that the difficulties related to security and privacy of the information (knowledge) are solved (this will be handled in another paper).

The system network

I think one of the better solutions is a network of systems that each handle a subpart of the available knowledge (distributed data and collaborative computing). Backups of notions (clones) should be kept on different systems in different places. So you need a network of interconnected systems.

One of the system network topologies I think might be a good solution is the hypercube. In an n-hypercube, drawn as nested cubes, each corner (node) has at most 5 connections to other nodes: three to nodes of the same cube, one to the inner cube, and one to the outer cube. In this kind of topology there is a good equilibrium between the number of connections per node, the path length to reach a particular node, and the availability of the other nodes in case one of them goes off-line. There are other network topologies available, and I don't think the choice of a particular topology is really important as long as the criteria for maintainability and availability are respected.
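For illustration, here is a minimal sketch (my own, not part of Notion System) of a binary hypercube in Python. Nodes are numbered 0 to 2^d - 1 and two nodes are connected when their numbers differ in exactly one bit, so every node has d links and any node can be reached in at most d hops:

    def neighbours(node: int, dimensions: int) -> list[int]:
        """All nodes one hop away: flip each of the d address bits in turn."""
        return [node ^ (1 << bit) for bit in range(dimensions)]

    def path_length(a: int, b: int) -> int:
        """Shortest path between two nodes: the number of differing bits."""
        return bin(a ^ b).count("1")

    # In a 4-dimensional hypercube (16 systems) every node has 4 links and
    # is at most 4 hops from any other; if one neighbour goes off-line, the
    # remaining links still offer alternative routes.
    print(neighbours(0b0000, 4))        # [1, 2, 4, 8]
    print(path_length(0b0000, 0b1111))  # 4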

The DoS attacks I mentioned above always remain possible, but making a topology like a hypercube big enough will make them very difficult.

Cloning

Within Notion System the smallest entity that will exist in several places is a notion with its names, relations etc. There will be one "master" system responsible for maintaining the information about a particular notion.

This master system is also responsible for maintaining at least one link to another system for that particular notion.

This second system maintains a clone of the original notion. The master system will send incremental update orders to the second system when the original notion is updated. This looks like a normal incremental backup system. So why do we call this a clone and not a backup?

Throughout all the concepts behind Notion System the good - better - best idea (GBB) is used. When a user requests information about a particular notion from the network, the most up-to-date information comes from the master system of that particular notion. But what if that system is not available? A system maintaining a clone of that notion can deliver the requested information. Without a clone that can be consulted, you get either everything or nothing. A clone, on the other hand, might not be "complete", because the cloning activity is always somewhat delayed in time (even if only by a few seconds). But the clone will at least give an almost complete image of the requested notion.

In my humble opinion, system failure is not to be considered as "if there is a failure" but as "when there is a failure". Within Notion System, delivery failure is not an exception but is expected to occur from time to time.

So if we can't get the "best", we should accept the "better" or even the "good (enough)".
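A minimal sketch of this good - better - best lookup, assuming hypothetical system objects with a get_notion method returning a dictionary (this is not the actual Notion System interface):

    def fetch_notion(notion_id, master, clones):
        """Try the master first ("best"), then each clone ("better"/"good")."""
        for system in [master] + clones:
            try:
                notion = system.get_notion(notion_id)
                # Tell the caller when the answer comes from a clone, since
                # it may lag slightly behind the master's state.
                notion["from_clone"] = system is not master
                return notion
            except ConnectionError:
                continue  # failure is expected, not exceptional: try the next
        raise LookupError(f"no system could deliver notion {notion_id}")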

Cloning topology

So far I have mentioned only one clone, but that remains rather risky. Cloning a particular notion several times might be a good idea.

We can imagine several cloning topologies for this: a star, a cascade, a hypercube, or others.

In a star topology the master system maintains, for each notion, a number of links to other systems holding clones. The clones know at least who the master system is; they may or may not know where the other clones are. When a master system fails, the cloning systems have to negotiate who is going to take over the master's task for the necessary period. The primary negotiation argument will be the last update date, but there is a big chance that some of the clones will have exactly the same state. Other arguments will then be used, such as the workload of the system, the physical network load of the segment the system resides on, etc. If the master remains unavailable over a long (?) period, one of the clones can become the new master by default. If this new master doesn't know where the other clones are, it can request this information from the network.
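A sketch of this negotiation, with invented field names and values: the last update date is the primary argument, and workload and network load break ties between equally up-to-date clones:

    def elect_master(clones):
        """Pick the clone that will act as master while the real one is down."""
        return max(
            clones,
            key=lambda c: (c["last_update"],      # primary: most recent state
                           -c["workload"],        # secondary: least loaded system
                           -c["network_load"]),   # tertiary: quietest segment
        )

    candidates = [
        {"name": "sys-a", "last_update": 1056975, "workload": 0.7, "network_load": 0.2},
        {"name": "sys-b", "last_update": 1056975, "workload": 0.3, "network_load": 0.4},
    ]
    print(elect_master(candidates)["name"])  # sys-b: same state, lower workload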

In a cascading topology each clone maintains a clone of itself, so you end up with a clone of a clone of a clone, etc. This can stop at, let's say, 5 or 10 levels of cloning, and may even vary for different notions. If the master has to be (temporarily) replaced by a clone, there will be one request for clones on the network, and from all the answers the highest clone in the chain will take that role. There will be no negotiation (or hardly any). If this clone becomes the new master, it doesn't have to send out a request for clones, because it already knows where its unique clone is.
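A sketch of the cascading takeover, again with hypothetical names: the highest surviving link in the chain simply steps up, without any negotiation:

    def take_over(chain, alive):
        """Return the first surviving system in a master -> clone -> clone chain."""
        for system in chain:
            if system in alive:
                return system   # highest surviving clone; its own clone is known
        return None             # the whole chain is gone

    chain = ["master", "clone-1", "clone-2", "clone-3"]
    print(take_over(chain, alive={"clone-2", "clone-3"}))  # clone-2 steps up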

Other topologies combine the star and the cascading topologies. In a hypercube topology, for example, the initial master has links to 5 other systems (a star), each of which in turn has 4 links to other systems (one link being used by the original master-clone link): a chain followed by a new star.

System topology and cloning topology

The question remains where the notions should be cloned. There are basically two ways to do the cloning: either a defined group of notions (e.g. all persons) of a particular master system is cloned as a whole on the same cloning system(s), or each notion has its own cloning path.

In both cases, in my opinion, the cloning topology need not be the same as the system network topology, neither in topology type nor in the systems participating in a particular part of the network.

Naturally this is all about spreading the risks. The risk of having a particular set of notions unavailable is a bit higher with one cloning schema for the whole set than it would be if each notion had its own cloning schema. The overhead in data and operations is, however, somewhat bigger in the latter case.

I think both mechanisms should be available. The choice is up to the "owner" of the notion.
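Both mechanisms could look something like this sketch, where the names and the dictionary layout are invented for illustration:

    GROUP_SCHEMAS = {"person": ["sys-b", "sys-c"]}   # one path for all persons

    def cloning_path(notion):
        """Owner's choice: a per-notion path if defined, else the group default."""
        if "cloning_path" in notion:                 # per-notion schema
            return notion["cloning_path"]
        return GROUP_SCHEMAS[notion["kind"]]         # group schema

    print(cloning_path({"kind": "person"}))                              # group path
    print(cloning_path({"kind": "person", "cloning_path": ["sys-x"]}))   # own path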

The user and cloning

Does the use of clones complicate the way a user can work with the available knowledge?

I don't think that the user should know what is coming from where unless he needs this information for a particular action.

We can, however, distinguish two different kinds of user actions.

If a user consults or searches for knowledge, it is of no interest where the knowledge physically comes from. He must have some indication when a clone provides the information, because it might not be the "best" available. But he will not be able to do anything about it.

If a user creates knowledge, we can distinguish two different kinds of actions: he either adds a new relation to an existing notion or he creates a new notion. If he adds a new relation and one of the notions involved is a clone, some things must happen behind the scenes with the system maintaining the clone. This might end in a rejection of the update (when the request to upgrade the clone to the level of master is refused). So the user might experience the problem of working with a clone.

If he is creating a new notion, he can choose a cloning mechanism from a set of available standard mechanisms, define a specific one by hand, or let the system define a random schema for that notion. But that is the only moment he ever has to deal with cloning concepts. Naturally the system will provide a default solution if the user doesn't bother about it.
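As a sketch of this creation-time choice (all names here are hypothetical, not the actual Notion System interface):

    import random

    STANDARD_SCHEMAS = {"star-3": ["sys-b", "sys-c", "sys-d"],
                        "cascade-5": ["sys-e"]}   # each clone adds the next link
    ALL_SYSTEMS = ["sys-b", "sys-c", "sys-d", "sys-e", "sys-f"]

    def create_notion(name, schema="star-3", custom_path=None, randomize=False):
        if custom_path:                   # defined by hand
            path = custom_path
        elif randomize:                   # let the system pick a random schema
            path = random.sample(ALL_SYSTEMS, 3)
        else:                             # a standard (or the default) schema
            path = STANDARD_SCHEMAS[schema]
        return {"name": name, "cloning_path": path}

    # The default solution: the user never touches a cloning concept.
    print(create_notion("Ada Lovelace"))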

Knowledge exchange format

As this paper is all about exchanging knowledge between systems and between users and systems, it's a good place to say something about the format.

When I developed the communication part of Notion System I called the exchange format the Notion System Data Structure (NSDS). This was a tag-based structure quite similar to pre-1.0 XML (without a DTD).

When XML 1.0 was defined I adopted that standard and made a specific DTD for Notion System.

The use of XML (with XPath and XLink) makes it rather easy to provide import and export facilities to other (knowledge) systems through XSLT.

One of the things that makes import and export towards e.g. databases quite easy is a very simple relationship:

<some notion>[has a record ID](the ID)<database>.

Once a record from a particular database is integrated in Notion System (and that might cause some problems in "identifying" what is what), future updates to and from Notion System are straightforward.
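A sketch of how such a record link could be built and rendered in the notation above; the function and field names are invented, and the actual Notion System DTD is not shown here:

    def record_to_relation(notion_name, record_id, database_name):
        """Build the notion-relation-notion triple for a database record."""
        return {"source": notion_name,
                "relation": "has a record ID",
                "value": str(record_id),
                "target": database_name}

    rel = record_to_relation("Ronald Poell", 42, "staff-db")
    # The record ID identifies the row, the database notion identifies the
    # source, so later updates can flow in either direction along this link.
    print(f"<{rel['source']}>[{rel['relation']}]({rel['value']})<{rel['target']}>")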

