The purpose of the Akinity system is to issue a common data format, known as cTag, for use in applications. Information encoded in this way can be communicated between different applications, despite potentially massive heterogeneity between the data schemas native to those applications.
Kinship similarity effect
Akinity's first foundation is a metaphor from living systems.
People (and other sexually reproducing species) generally resemble others with whom they share a closer kin relation more than they resemble those with whom a kin relation is more distant.
This effect is readily apparent for observable characteristics such as facial features. It is still more evident when a large number of discrete genetic markers, SNPs say, are compared in the laboratory. In general, the greater the quantity of SNPs in the sample (and qualitatively, the more entropy in the selected SNPs for the chosen population), the greater the accuracy of the test in discerning the degree of kinship between individuals. Likewise, the greater the quantity and quality of the markers used, the greater the number of generations through which the test is able to discern kinship.
The second foundation of Akinity is Information Theory, which is concerned with discerning signal in a noisy communication channel.
Akinity's theory is elaborated here.
Akinity takes the kinship similarity effect described above as an abiding metaphor for kin between items of digital information.
The system produces a comparable effect to the biological kinship effect. The closer the kin of two items of digital information, the greater their similarity should be expected to be. cTags are specially designed to support this kin effect.
Akinity's key algorithm is known as meiosis. Meiosis is analogous to sexual reproduction for binary data. The algorithm combines exactly two existing cTags to produce a new cTag child, whose resulting similarity to each parent is quantifiably and equally strong. In Akinity, parents are not required to be one each of two sexes.
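As a minimal sketch of how such a combination could behave, the Python below models meiosis as a uniform bitwise crossover. This is an assumption made purely for illustration: the text above does not specify the actual algorithm, and the names `meiosis`, `agreement` and `CTAG_BITS` are hypothetical. Under this model, a child of two unrelated (random) parents is expected to agree with each parent on roughly three quarters of its bits.

```python
import random

CTAG_BITS = 1024  # hypothetical width; the source only says the bit count is variable


def meiosis(parent_a: int, parent_b: int, rng: random.Random, bits: int = CTAG_BITS) -> int:
    """Assumed model: each child bit is copied from one parent, chosen by a fair coin."""
    mask = rng.getrandbits(bits)  # 1 -> take the bit from parent_a, 0 -> from parent_b
    return (parent_a & mask) | (parent_b & ~mask)


def agreement(x: int, y: int, bits: int = CTAG_BITS) -> float:
    """Fraction of bit positions at which two cTags hold the same value."""
    return 1.0 - bin(x ^ y).count("1") / bits


rng = random.Random(0)
mother = rng.getrandbits(CTAG_BITS)  # random stand-ins for two unrelated cTags
father = rng.getrandbits(CTAG_BITS)
child = meiosis(mother, father, rng)

# On average: about 0.75 with each parent, about 0.5 between the unrelated parents.
print(agreement(child, mother), agreement(child, father), agreement(mother, father))
```

Because the coin is fair, neither parent is favoured, which is one simple way to obtain a similarity that is equally strong towards both.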
Distance and direction
Similarity to more distant ancestors or descendants invariably decays in such slow and predictable measure that the results of meiosis remain apparent and estimable even after many generations. Just as with SNPs, the number of generations of meiosis through which the similarity effect is measurable depends on the variable number of bits used in the cTags and on the quantity of uncertainty acceptable to the designers or users of the application.
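The dependence on bit count can be made concrete under a simplifying assumption that is not taken from the source: meiosis behaves like a uniform bitwise crossover and every co-parent is unrelated. In that model the expected agreement with an ancestor g generations away is 1/2 + 2^-(g+1), while chance agreement between unrelated cTags of n bits fluctuates with a standard deviation of 0.5/sqrt(n). The hypothetical helper `max_detectable_generations` below finds the largest g whose excess agreement still exceeds z standard deviations of that noise.

```python
import math


def expected_agreement(generations: int) -> float:
    """Expected fraction of matching bits with a direct ancestor (assumed crossover model)."""
    return 0.5 + 0.5 ** (generations + 1)


def max_detectable_generations(bits: int, z: float) -> int:
    """Largest g with 2^-(g+1) >= z * 0.5 / sqrt(bits), i.e. g <= log2(sqrt(bits) / z).

    `z` expresses the acceptable uncertainty: a larger z means fewer false positives.
    """
    return max(0, math.floor(math.log2(math.sqrt(bits) / z)))


for n in (1024, 4096, 65536):
    print(n, max_detectable_generations(n, z=3.0))
```

In this model, each additional generation of reach costs about four times as many bits, so the decay is predictable but the price of depth is exponential in tag width.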
The effect is omni-directional within kinship. For example, two cTags in a grandparent relation (two up) are approximately as similar to each other as two cTags in a half-sibling relation (one up, one down) or in a grandchild relation (two down).
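The omni-directional claim can be checked with a small simulation, again assuming (hypothetically) that meiosis is a uniform bitwise crossover and that all other parents are unrelated random cTags. All names here are illustrative. In this model both relations are expected to agree on about 5/8 of their bits, against 1/2 for unrelated cTags.

```python
import random

BITS = 16384  # hypothetical cTag width, chosen large to keep sampling noise small
rng = random.Random(1)


def fresh() -> int:
    """A random cTag standing in for an unrelated individual."""
    return rng.getrandbits(BITS)


def meiosis(a: int, b: int) -> int:
    """Assumed model: each child bit comes from either parent with equal probability."""
    mask = rng.getrandbits(BITS)
    return (a & mask) | (b & ~mask)


def agreement(x: int, y: int) -> float:
    """Fraction of bit positions on which two cTags agree."""
    return 1.0 - bin(x ^ y).count("1") / BITS


# Two steps along one line of descent. Grandparent and grandchild are the same
# pair viewed from opposite ends, so one measurement covers both directions.
grandparent = fresh()
parent = meiosis(grandparent, fresh())
grandchild = meiosis(parent, fresh())

# One step up, one step down: two children sharing exactly one parent.
shared_parent = fresh()
half_sibling_1 = meiosis(shared_parent, fresh())
half_sibling_2 = meiosis(shared_parent, fresh())

two_up_or_down = agreement(grandparent, grandchild)
one_up_one_down = agreement(half_sibling_1, half_sibling_2)
unrelated = agreement(fresh(), fresh())
print(two_up_or_down, one_up_one_down, unrelated)
```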
In Akinity, there is no systematic distinction between classifier and classified. Anything may be classified with respect to anything else. Conversely, anything may be a classifier for anything else.
The meaning of classification in Akinity is a probabilistic determination of the apparent existence or non-existence of a kinship relation between cTags, up to a specified maximum distance and within a specified error margin.
akin is the algorithm used to discern a classification. Based on Information Theory, this algorithm produces as output the measure of similarity between two cTags, quantified in units of entropy. An application can use akin's result to perform classification of any input data, within the threshold given by its user's level of acceptable uncertainty.
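The exact definition of akin belongs to the system specification. As a hedged illustration only, the sketch below assumes that akin can be approximated by the binary entropy of the fraction of differing bits, so that identical cTags score 0 and unrelated cTags score close to 1 bit per position (the maximum). The function names, the 8-bit width and the threshold value are all hypothetical.

```python
import math


def akin(x: int, y: int, bits: int) -> float:
    """Assumed measure: binary entropy (bits per position) of the disagreement rate."""
    p = bin(x ^ y).count("1") / bits  # fraction of positions that differ
    if p in (0.0, 1.0):
        return 0.0  # no uncertainty at all
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def is_akin(x: int, y: int, bits: int, threshold: float) -> bool:
    """Classify two cTags as kin when their entropy falls below the user's threshold."""
    return akin(x, y, bits) < threshold


tag = 0b10110010
same = akin(tag, tag, 8)                # identical tags
close = akin(tag, tag ^ 0b00000011, 8)  # 2 of 8 bits differ
far = akin(tag, tag ^ 0b00001111, 8)    # 4 of 8 bits differ, as expected by chance
print(same, close, far, is_akin(tag, tag ^ 0b00000011, 8, threshold=0.9))
```

One limitation of this particular measure is that it is symmetric about one half, so a pair that disagrees on almost every bit also scores low. A real implementation would need to decide how to treat that case.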
A universal schema is expected to emerge bottom-up from the Akinity system as heterogeneous data items are maintained discretely and combined according to local, subjective selection processes. Emergence occurs even in the absence of any over-arching top-down schema.
The reasons why Akinity has this property of emergence are not easy to describe without reference to specific examples, which unfortunately are not yet available. Emergence can be caused by a combination of factors:
- A generic, schema-agnostic algorithm to combine elements to produce new elements
- The synthesis algorithm assimilates elements (key-words, IDs, signatures etc.) from other information schemas into Akinity format
- Uneven distribution of elements across the user base
- User preference for some elements over others drives local selection pressures
- Element selections are iterated by many users through many generations
Akinity enjoys a relationship with alternative systems that is both complementary and competitive. With respect to its objectives, Akinity is similar to the Semantic Web, which aims to be a universal medium for the exchange of data. Other semantic tagging systems, such as folksonomy, are used for broadly similar purposes. In human-to-human communication, mediated without use of digital processing, natural language tends to play an equivalent role.
Akinity will interoperate effectively with any existing schema or communication convention. Akinity's synthesis algorithm enables it to readily assimilate other data schemas.
Though the objectives of alternative systems or conventions of communication may be similar, the design of Akinity is substantially different to others. Its unique design imbues Akinity with properties that are unavailable to any alternative system.
Network effects and maturation
The Akinity system delivers its greatest benefit when it is used at both the sending and receiving ends of a data communication. Furthermore, the quality of communication is enhanced through frequent and diverse past inter-communications.
Nonetheless, Akinity can provide significant benefits under conditions of asymmetric adoption. For instance, an application processing a one-time data transaction from a non-Akinity external source could still benefit from Akinity so long as information from the source had been encoded in an informal schema such as natural language or a formal schema like RDF and that schema's semantics are, to some extent, in common use between receiver and sender.
License to use Akinity is automatically granted to anyone, free of charge, forever. The only restriction on how Akinity may be used is concerned with attribution.
The system is delivered as an open specification which is owned and maintained by a non-commercial organisation. Membership is open to any party who is interested in collaborating to improve the system.
A singleton system
In order to accrue benefit from network effects, it is necessary that Akinity be one-of-a-kind. For the same reasons, it is just as undesirable for two Akinity-like systems to co-exist as it would be for, say, two competing specifications of the World Wide Web.
Akinity's key components are outlined below.
- The data structure of a special kind of tag called a cTag
- A pair of algorithms for creating a cTag which are known collectively as conception
- A method to read a pair of cTags to infer their relative semantic distance
For the comprehensive treatment of this subject, see the system specification document.
As with any system that implements tags, there is an explicit or implicit relationship between a cTag and the item that it tags.
Either of two different conception algorithms may be used to produce a cTag.
Synthesis is used to produce a cTag from any digital data source such as text, url or image file.
Meiosis is used to produce a single child cTag from exactly two existing parent cTags, whether the parents were themselves created through Meiosis, Synthesis or one of each.
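For the Synthesis route, one simple stand-in is to derive a fixed-width bit string from the raw bytes of the source. The sketch below uses Python's standard `hashlib.shake_256` for this. It is an assumption for illustration and not the specified algorithm; `synthesis` and `CTAG_BITS` are hypothetical names. A cryptographic hash maps similar inputs to unrelated outputs, so this shows only the "any data in, cTag out" shape of the operation.

```python
import hashlib

CTAG_BITS = 1024  # hypothetical width; must be a multiple of 8 in this sketch


def synthesis(data: bytes, bits: int = CTAG_BITS) -> int:
    """Assumed stand-in: derive a deterministic bit string of `bits` bits from any bytes."""
    digest = hashlib.shake_256(data).digest(bits // 8)  # extendable-output hash
    return int.from_bytes(digest, "big")


# Text, a URL or the bytes of an image file can all be fed in the same way.
url_tag = synthesis("https://example.org/page".encode("utf-8"))
text_tag = synthesis("a short passage of text".encode("utf-8"))
print(url_tag.bit_length() <= CTAG_BITS, url_tag != text_tag)
```

The resulting cTags could then serve as parents for Meiosis alongside cTags that were themselves produced by Meiosis.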
Every cTag inherently has a quantitative distance from every other. This distance can be readily measured using the akin function. Akin's unit of measurement is entropy.
In many cases, the relative distance of two cTags approaches maximum entropy, indicating that there appears to be no close kin relation between them. If a subject observes a distance that is (for them) significantly lower than maximum entropy, they may infer the likely existence of a kin relation between those two cTags and hence the likely existence of a semantic relationship between the items that were tagged. In short, they may determine that two such items are akin.
Complexity and scalability
The existence of an inherent relationship between any pair of cTags reduces the system's overall complexity and thereby greatly improves its scalability. Alternative graph-based systems, such as the Semantic Web, must maintain a complex graph of assigned relationships between tags.
Consider a functional graph such as the map of the London Underground and contrast it with a spatial map such as a geographic map of London.
In order to calculate the (logical) distance between two stations on the Underground map, it is necessary to have access to information about all the intermediate stations and their connections. If any of that data is missing, the calculation may become impossible. The more stations are added to the map, the greater the system's complexity becomes.
In the case of the geographic map though, one requires only information about the co-ordinates of the two stations and the ability to use a standard algorithm for calculating (physical) distance. Points can be added to the map indefinitely without increasing the complexity of the system. The inherent relationship between cTags in Akinity is comparable to the spatial relationship on a geographical map.
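The contrast between the two kinds of map can be sketched in a few lines of Python. The four-station network and its coordinates are invented for illustration. The graph distance needs a breadth-first search over every intermediate link and fails when a record is missing, whereas the spatial distance needs only the two end points.

```python
import math
from collections import deque

# Hypothetical toy network: stations A-B-C-D in a line, plus coordinates for A and D only.
LINKS = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
COORDS = {"A": (0.0, 0.0), "D": (3.0, 4.0)}


def hops(links, start, goal):
    """Graph distance by breadth-first search; returns None if no route is known."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        station, dist = queue.popleft()
        if station == goal:
            return dist
        for nxt in links.get(station, []):  # a missing record yields no onward links
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def straight_line(coords, p, q):
    """Spatial distance from the two end points alone."""
    return math.dist(coords[p], coords[q])


without_b = {name: nbrs for name, nbrs in LINKS.items() if name != "B"}
print(hops(LINKS, "A", "D"))           # route found through B and C
print(hops(without_b, "A", "D"))       # B's record is missing, so no route is found
print(straight_line(COORDS, "A", "D")) # needs no knowledge of B or C
```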
Useable classification process
For a semantic system to grow, it is important that the process for classifying data items should be easy to use. Akinity's classification process is not set apart from simply using the system. Rather, classification is woven into the fabric of the system's normal use. Classification is thus an emergent property of the Akinity system.
Using Akinity to classify data involves few steps, can be automated and is intuitive.
Akinity's principle is to (re)classify some relatively unknown data item in terms familiar to the subject. Specifically, the item is classified with a single cTag from the subject's own culture.
The process of semantic transition from a relatively obscure term to one more familiar presents fewer impediments than, say, requiring a subject to internalise a new term.
It is normal for an external data item to be reclassified at its point of use. For instance, when browsing the web I may come across a cTagged item whose cTag is foreign to my culture.
My Akinity-enabled application would interpret the information encoded in the foreign cTag by creating a new native cTag within my culture. It would use Meiotic Conception to combine the foreign cTag with exactly one extant, native cTag. The foreigner need never be introduced into my culture. Its legacy, including its item of reference, will live on in my culture through the newly created child cTag.
Minimise classification activity
Classification of a cTag from an external source can be done fully or semi-automatically.
Non-cTag data items may be assigned rich classification by making just one or two choices.
Using the system delivers benefits both to the user and to other users of the system. In a single transaction, the subject receives a data classification service and the data is also enhanced for any subsequent users of it.
Akinity explicitly distinguishes between data and information. Information is rendered to a subject when data undergoes interpretation through a culture. Two different subjects would usually glean different information from the same data.
Coherence and emergence
Whenever a cTag is communicated and then interpreted by some external party, infinitesimally subtle characteristics of the originating subject's culture are transmitted along with the cTag's data. Not all of these subtleties are useful to the other party, of course. But because of the principle of coherence, those characteristics which are useful are more likely to survive into ensuing generations. This principle is similar to the notion of fitness in the theory of evolution. It differs from a fitness function in that a genetic algorithm's designer contrives a fitness function to achieve explicit goals.
Akinity has no explicit objectives but it does nevertheless tend to produce cTags that approximate to the contours of a fitness landscape described by the communication intended by its participants.