On data and information

In Akinity, a distinction is drawn between data and information.

This distinction is not merely a matter of definition, but describes a crucial, intrinsic property of the Akinity system. This definition may also be applied to other systems.

 

Interpreting data

In any system, data alone has no semantic value until it is interpreted by an agent. Rather than raw data, participants in the system require information, which we define as data under an interpretation by a subject in a domain of knowledge relevant to the agent.

Conditions for interpretation

An agent may interpret data if the following conditions are met:

  1. The agent must have access to a schema, which has some relevance to the agent's information domain.

  2. The schema must have some applicability to the data.

  3. The agent must make a judgement over the significance of a probabilistic interpretation of complex data. Simple data, though, has a single standard interpretation, so no judgement is required.

Whenever these conditions are met, an agent may be considered the subject of an interpretation of data.
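
The three conditions can be sketched as a small Python model. Everything in it is an illustrative assumption rather than a description of Akinity itself: the `Schema` and `Agent` classes, the key-overlap test for applicability, and the probability threshold standing in for an agent's judgement are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Schema:
    """Hypothetical schema: maps data keys to candidate meanings in one domain."""
    domain: str
    meanings: dict  # key -> list of (meaning, probability) candidates

    def applies_to(self, data: dict) -> bool:
        # Condition 2: the schema has some applicability to the data.
        return any(key in self.meanings for key in data)


@dataclass
class Agent:
    domain: str
    schemas: list = field(default_factory=list)
    threshold: float = 0.5  # assumed stand-in for the agent's judgement

    def interpret(self, data: dict) -> dict:
        """Return information: the data under this agent's interpretation."""
        information = {}
        for schema in self.schemas:
            # Condition 1: the schema is relevant to the agent's domain.
            if schema.domain != self.domain or not schema.applies_to(data):
                continue
            for key in data:
                candidates = schema.meanings.get(key, [])
                if len(candidates) == 1:
                    # Simple data: one standard reading, no judgement needed.
                    information[key] = (candidates[0][0], data[key])
                    continue
                # Condition 3: judge which probabilistic reading is significant.
                accepted = [m for m, p in candidates if p >= self.threshold]
                if accepted:
                    information[key] = (accepted[0], data[key])
        return information


weather = Schema("weather", {"t": [("temperature_celsius", 1.0)],
                             "c": [("cloud_cover", 0.7), ("ceiling", 0.3)]})
agent = Agent("weather", [weather])
info = agent.interpret({"t": 21, "c": 4, "x": 9})
```

In this sketch the key `x` yields no information because no schema covers it, and an agent whose domain differs from the schema's would interpret nothing at all.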

Communication

Akinity distinguishes between dissemination of data and communication. Dissemination solely entails distributing data to another (potential) recipient. Typically, this involves verbatim copying or moving of data.

Communication entails first interpreting data to produce new information, then disseminating this information by means of a freshly minted data item, on the assumption that any subsequent recipient(s) of the disseminated information will be capable of interpreting it, at least in part.
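
The difference between the two operations can be made concrete with a minimal Python sketch. The `DataItem` class, its `derived_from` field and both function names are assumptions introduced for illustration; the source does not specify how Akinity represents data items.

```python
import copy
from dataclasses import dataclass
from typing import Callable


@dataclass
class DataItem:
    """Hypothetical data item with an id and a record of what it interprets."""
    payload: dict
    item_id: str = ""
    derived_from: tuple = ()


def disseminate(item: DataItem) -> DataItem:
    """Dissemination: a verbatim copy, with no interpretation involved."""
    return copy.deepcopy(item)


def communicate(item: DataItem,
                interpret: Callable[[dict], dict],
                new_id: str) -> DataItem:
    """Communication: interpret first, then mint a fresh data item."""
    information = interpret(item.payload)
    return DataItem(payload=information, item_id=new_id,
                    derived_from=(item.item_id,))


original = DataItem({"t": 21}, item_id="a1")
copied = disseminate(original)
summary = communicate(original, lambda p: {"warm": p["t"] > 18}, new_id="b1")
```

Here `copied` is equal to the original in every field, whereas `summary` is a new item whose payload exists only because an interpretation took place.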

Exformation

Interpretation of the same complex data is invariably different between subjects because, in general, neither schemas nor judgements are identical. Many systems consider exformation to be a problem. Their designers seek to exclude the possibility of variable interpretation by using standardised schemas and formal interpretation methods. Akinity, though, treats exformation as both an inevitable and a desirable property of large-scale heterogeneous systems.

Exformation has important implications for applications, where it may sometimes be desirable to communicate different information to different recipients, while disseminating a single, consistent data item. Below we explore some of the ways in which Akinity's information design can be useful for applications.

Access control

Steganography is a term from the field of secure communications. Here, we use it in a more generic sense: concealing information in data so that some, but not all, of the possible recipients may interpret different parts of the information contained therein.

Steganography vs. encryption

An advantage over encryption is that, for some applications, steganography can offer highly granular access to information. Encryption is boolean: either the user can access all the clear content or they can access none of it. There is no in-between state. Once a piece of encrypted data has been widely disseminated in clear form, the originator of the content effectively loses any control over it. Read once, write many.

Depending on the degree of subtlety with which information has been encoded into data, however, Akinity's security model can be read some, write some.

We do not claim that steganography offers better security than any particular encryption method. Rather, through its layered opacity, Akinity makes it infeasible for a user to knowingly reveal all of the information that has been encoded in a piece of data, which encryption alone cannot manage. Encryption and steganography serve different use cases and are not incompatible with each other.

Variable information quality

Akinity allows for many separate pieces of information to be encoded into a single data item. Variants may, for instance, be more specialised, more detailed, more localised, more timely or better translated. Through any conceivable variation, a higher quality version may be disseminated alongside a lower quality form of the same information content, within the same data item.

Like television channels, quality variations may be bundled into a few packages or they may be offered in combinations of any complexity. Each dimension of variation may have different degrees, e.g. a simple high/low quality split or a more differentiated scale. Such variations are effectively limitless, depending only upon the art and imagination of content producers.

Opacity

A subject cannot know whether they have already accessed all the information present in Akinity's data. Since no single agent need have access to all of the information encoded in an item of data, it would usually not be worth anybody's effort to try to gather all the information together. This is partly because they would not even know if they had achieved their goal, and partly because, as we shall see, the data itself (and consequently the information therein) can be constantly evolving.

Information value

Akinity allows information quality to be different for different users of the same data. Users can each determine the value that they individually place on particular combinations of quality variation.

Whilst some customers might value superior content in their own language and details, others might demand specific domain relevance and timeliness.

Pricing

Responding to demand, higher valued information might be made available at a higher price, whilst less valued information might be distributed at no charge.

Note that value is multi-dimensional: something that is of high value to you might not be as valuable to me, whereas something else might be valued highly by me, but not by you. We might even value the same variation inversely.

Up to the point that customers tolerate it, pricing may be allowed to approach the value that each recipient places upon particular variations of information quality.

Information production, distribution and control

Akinity can be a way for producers of information to use existing public data distribution channels, whilst maintaining some control over the locus of that data's contained information value.

Public dissemination

By using public data distribution channels, applications can make use of the many and varied downstream distribution channels. Social and ever-expanding channels of data distribution on the internet and elsewhere, with their burgeoning revenue models, need not be affected at all by an information producer's attempts to extract value through its implementation of Akinity.

It is decidedly unnatural, and in any case ineffective, to attempt to prevent or restrict data flow. Instead, the interests of applications of Akinity are served only by encouraging dissemination of their data in any distribution channel, all the while retaining some private control over the means of the data's interpretation.

Private interpretation

Through use of a private channel, an application may attempt to control precisely what and how much schematic data it makes available to different (groups of) users, and hence what quality of information those users may interpret.

Public data items, meanwhile, may advertise those private data source(s) of quality enhancement. Thus users of the public data might be enticed to add value to their interpretation of the information by accessing the private application channel. Of course, it is through private channel(s) that value can be extracted by information producer(s).

Under this public data / private schema distribution model, information gets to where it is most valued, rather than, as is generally the case at present, all information going everywhere.
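
As a toy illustration of the public data / private schema split, the Python sketch below interprets one public item under two schemas of differing richness. The item, its opaque keys, the two schemas and the `interpret` function are all invented for this example. The sketch makes no attempt at the opacity described earlier, since the uninterpreted keys remain plainly visible.

```python
# One public data item, disseminated identically to every recipient.
# The keys "a7" and "z2" are hypothetical opaque codes.
PUBLIC_ITEM = {"hdl": "Storm warning", "a7": 3, "z2": "0600-0900"}

# Schematic data released through the private channel (assumed tiers).
FREE_SCHEMA = {"hdl": "headline"}
PREMIUM_SCHEMA = {
    "hdl": "headline",
    "a7": "severity_level",
    "z2": "expected_window_utc",
}


def interpret(data: dict, schema: dict) -> dict:
    """Return only the information that the given schema makes readable."""
    return {schema[key]: value for key, value in data.items() if key in schema}


free_view = interpret(PUBLIC_ITEM, FREE_SCHEMA)
premium_view = interpret(PUBLIC_ITEM, PREMIUM_SCHEMA)
```

Both recipients hold exactly the same data, but the holder of the richer schema obtains three items of information where the other obtains one.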

Personalisation

It could be objected that the private channel is only private so long as the schematic data itself does not become widely distributed. This objection is answered with reference to the heart of the matter. People's interests are unique, but not wholly unique in every dimension. So, while it may well be worthwhile to distribute certain schematic data to like-minded people, this same distribution structure will be less than wholly satisfactory for different information, because a different set of agents would be interested in the new information.

The proposed model requires that information producers should actively manage their relationship with their customers and provides a data model for them to do so. Personalisation is currently a vigorous trend in the information economy. It is only just getting started but the opportunities are immense. However, personalisation does have an insidious undertow which, until properly addressed, is going to prevent personalisation achieving its potential.

Privacy

Web sites are able to add value by collecting personal data about their customers and using this to target advertising. The more data they collect, the greater their opportunities to exploit it. But a negative aspect of this trend is loss of privacy to specific sites and in general as personal information is gathered and shared.

Self-declared interest

The Akinity model of personalisation is not based on a person, but upon the self-declared interest of an agent (person or process). At every session, an agent can represent their relevant interests to a provider. The provider does not need to store details of the user, since they are given what they need to know. In fact, it is not necessary for them to associate personal interests with a single entity, such as a person, machine address or user account.

Architecturally, this model is more true to a stateless REST model of personalisation than the current model, where a 'session' remains persistent through stored data at the server and some token such as a cookie at the client.

To the extent that they are comfortable, it is in the interest of the user to faithfully represent their interests, so as to be offered content that best suits those interests. Beyond their comfort threshold, a user may prefer to withhold information about their interests. How much personalisation information is offered to the provider is explicitly in the control of the user. The provider can in turn offer a non-retain, non-share privacy policy without necessarily losing the business opportunities offered by personalisation.
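
A minimal Python sketch of this stateless exchange follows. The sensitivity scores, the `declare` and `serve` functions and the catalogue are hypothetical; the only point illustrated is that the user chooses what to reveal at each session and the provider's response depends on nothing but that declaration.

```python
def declare(interests: dict, comfort: int) -> list:
    """User side: reveal only interests at or below the comfort threshold.

    `interests` maps an interest label to a sensitivity score on an
    assumed scale, where a higher score means more private.
    """
    return sorted(label for label, sensitivity in interests.items()
                  if sensitivity <= comfort)


def serve(declared: list, catalogue: dict) -> list:
    """Provider side: a pure function of the request; nothing is stored."""
    wanted = set(declared)
    return [title for title, topics in catalogue.items()
            if wanted & set(topics)]


my_interests = {"cycling": 1, "jazz": 2, "health": 5}
catalogue = {
    "Tour route guide": ["cycling"],
    "Clinic finder": ["health"],
    "Club listings": ["jazz", "blues"],
}

declared = declare(my_interests, comfort=2)
offered = serve(declared, catalogue)
```

Raising `comfort` to 5 would also reveal the health interest and bring the clinic finder into the offer; the choice stays with the user from one session to the next.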

Schema Problem

Traditionally, a self-declared interest model has not been workable because of the schema problem. If I self-declare an interest category such as entertainment on a portal, this category was designed to suit the data structure of the portal provider. But such categories do not scale over massive numbers of heterogeneous providers, nor even to a large offering from a single provider having many heterogeneous users.

Akinity is a solution to the schema problem, which will facilitate new business opportunities and could help re-invigorate some moribund information providers, whose traditional business is locked up in assets whose value is not fully realisable in the current information economy.

Interestingly, the property of Akinity that every cTag has a measurable distance from every other cTag enables any information provider to make a reasonable attempt at satisfying the requirements of any declared interest from any agent, with a mutually transparent measurement of the likelihood of success.
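
How a provider might exploit such a distance can be sketched in Python under strong assumptions. This text does not define how cTags are represented or how their distance is measured, so the coordinate tuples, the Euclidean metric and the `1 / (1 + d)` score below are placeholders chosen purely for illustration.

```python
import math


def distance(a: tuple, b: tuple) -> float:
    """Assumed metric: Euclidean distance between two cTag coordinates."""
    return math.dist(a, b)


def best_offer(interest: tuple, offers: dict) -> tuple:
    """Pick the offer nearest to the declared interest.

    Returns (name, distance, score), where the score is a hypothetical
    likelihood-of-success figure that both parties can recompute.
    """
    name = min(offers, key=lambda n: distance(interest, offers[n]))
    d = distance(interest, offers[name])
    return name, d, 1.0 / (1.0 + d)


declared_interest = (0.0, 0.0)
offers = {"A": (3.0, 4.0), "B": (1.0, 0.0)}
choice = best_offer(declared_interest, offers)
```

Because both the agent and the provider can compute the same distance, neither has to take the other's estimate of relevance on trust.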

Information Economy

The private channel can be characterised as a discourse, or information exchange. The provider is also a consumer of information and vice versa. This highlights a key point about Akinity: value is usually added as a consequence of the process of interpretation, not only for the subject, for whom the data becomes information, but also for subsequent users of data representing the subject's interpretation of that original data. As successive generations of subjects interpret data that is disseminated to them, the distance of the newly created data from 'the' originator becomes ever-greater. At the same rate, the number of additional potential claimants to be 'the' originator also increases. From the point of view of one originator, this appears to be a process of dilution. But from the point of view of a new subject it is about potentially recognising attribution to multiple originators.

Akinity thus offers a means to validate quantifiably a claim by an upstream content producer to have added value to some information. A contractual framework, constructed from the principles and utility of the system, might facilitate an information economy based on revenue distribution, according to a producer's claim to have added value.
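
The dilution effect and a possible revenue split can be illustrated with a short Python sketch. The derivation map, the per-item added-value figures and the proportional sharing rule are all assumptions made for this example; the source proposes a contractual framework but does not specify any such rule.

```python
def upstream(item: str, parents: dict) -> list:
    """Collect the item and every item it was derived from."""
    seen, stack = [], [item]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(parents.get(current, []))
    return seen


def shares(item: str, parents: dict, added_value: dict) -> dict:
    """Hypothetical rule: split revenue in proportion to claimed added value."""
    chain = upstream(item, parents)
    total = sum(added_value[i] for i in chain)
    return {i: added_value[i] / total for i in chain}


# "b" interprets "a", and "c" interprets "b".
parents = {"b": ["a"], "c": ["b"]}
added_value = {"a": 2.0, "b": 1.0, "c": 1.0}

second_generation = shares("b", parents, added_value)
third_generation = shares("c", parents, added_value)
```

In this example the originator of `a` holds two thirds of the claim on `b` but only half of the claim on `c`, which is the dilution described above seen from the originator's side.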