This specification document is under version control. The version number of this document is 0.1.
A complete list of past and present versions of the specification is here.
The most recently published version of the specification is here.
This specification document extensively references a glossary, which is also under version control. Version 0.1 of the glossary SHOULD be considered as part of this specification.
If a system component is said to be compatible with version p.q of Akinity, what is meant is that the component wholly conforms to version p.q of the specification document.
draft / released
Version 0.1 is the first draft of the Akinity specification.
Versions from 0.1 up to, but not including, 1.0 are drafts. Versions 1.0 and higher are released.
Released versions of Akinity are to be fully compatible with all earlier released versions. The same is not necessarily true for draft versions, nor are released versions necessarily compatible with draft versions.
stable / unstable
Stable versions of the specification have been through a more rigorous quality control process than unstable versions. Unstable versions are marked with version identifiers such as Alpha or Beta.
All draft versions should be considered unstable.
This specification is written in English. Translations should make reference to the English master.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14, [RFC2119].
This document specifies how to implement the Akinity system. It is in two sections. The first section specifies Akinity's four main components: cTag; synthesis; meiosis; akin. Implementation of cTag is REQUIRED in every implementation of Akinity. Implementation of any or all of the other three components is OPTIONAL.
The second section covers general matters that are not specifically dealt with in the first section.
Akinity expects that any application which supports the technology in which a cTag has been implemented can recognise the cTag and process it in accordance with the specified cTag data format. It is therefore crucial that developers implementing Akinity follow this specification when creating or processing cTags in any technology.
cTag is a generic data structure, independent of the underlying technology substrate of its implementation.
In principle, a cTag may be implemented in any digital data format that supports text and (some proxy for) binary data. In practice, XML is the only significant implementation of cTag to be released with this version of the specification. JSON has also been implemented, but only as an automatic translation from XML.
It is anticipated that for future versions of this specification, akinity.org will continue to maintain a reference implementation of cTag in one substrate only, which will be XML. The XML implementation will continue to be referenced by the specification. However, the predominance of XML will diminish as additional technologies are deployed in the field. akinity.org maintains a process so that developers implementing cTag in other technology substrates can submit their reference implementation for dissemination and recommendation by akinity.org as a best practice.
Developers planning to implement cTag in XML, JSON or another substrate should study both this specification and its reference implementation in XML. In particular, the XML Schema, which constrains a cTag implemented in XML, should be properly understood. If available, any best practices for the applicable substrate published by akinity.org should also be studied.
Whenever a non-XML substrate is to be implemented, all cTag functionality available in XML MUST be translated to that substrate.
This specification relies heavily on the XML reference implementation to specify cTag.
The XML schema should be studied in order to grasp the cTag data structure.
This schema should be used either directly for XML implementations or as a template for implementations in any alternative substrates.
XML Schema Documentation for the reference implementation provides a readable web interface to the xsd specification. It includes comments and notes intended to aid developers' understanding.
A versioned instance of the XML Schema constrains every valid cTag implemented in XML for that version of Akinity.
The cTag validity section below applies to every cTag contained in an XML document, whether the cTag is stand-alone, in a discrete list or in a linked list of ancestral relations.
For any cTag implemented in XML to be valid the containing XML document MUST:
- reference exactly one cTag.xsd file at the akinity.org domain, according to the schema location specified below
- reference a cTag.xsd file whose version is at least as high as the highest version of any cTag in the document
- contain only XML that is well-formed and valid with respect to the cTag.xsd file
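The second rule above, comparing the schema version with the cTag versions found in a document, can be sketched as follows. This is a minimal illustration, assuming versions are plain "p.q" strings with numeric parts and no Alpha or Beta suffix; the function names are hypothetical and are not part of the specification.

```python
def parse_version(text):
    """Turn a 'p.q' version string into a comparable (p, q) tuple."""
    major, minor = text.split(".")
    return int(major), int(minor)


def schema_version_is_sufficient(schema_version, ctag_versions):
    """True when the referenced cTag.xsd version is at least as high as
    the highest version of any cTag in the document."""
    # Assumption: a document without cTags trivially satisfies the rule.
    highest = max((parse_version(v) for v in ctag_versions), default=(0, 0))
    return parse_version(schema_version) >= highest


print(schema_version_is_sufficient("0.2", ["0.1", "0.2"]))
print(schema_version_is_sufficient("0.1", ["0.1", "0.2"]))
```

Comparing tuples of integers rather than strings keeps "0.10" ordered after "0.9".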
The absolute path to a file is given as:
where [version#] is a substitution variable indicating a version of Akinity (not necessarily a stable release).
The example below is a valid reference to an XML schema because the path format is valid and the specified version exists:
It is RECOMMENDED that an XML cTag reference the xsd using a version path (as above).
The alternative path below will always contain the latest stable version of the XML schema document. If no released version is available yet, it will contain the latest draft.
This path will work for all stable released versions, but it is not guaranteed to work with draft versions of the Akinity specification. Any cTag using this path risks breaking compatibility with draft cTags if some version of the specification does not maintain full backward compatibility with them.
Sample cTags in XML
Sample cTags in JSON
Binary in XML
According to the XML Schema document (XSD) which constrains this reference application, binary components of an XML cTag may be encoded in either hex or base64 text, so as to conform to XML, whose specification does not support binary data.
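As an illustration of the two text encodings mentioned above, the sketch below converts raw bytes to hex or base64 text and back using the Python standard library. The function names and the "hex" / "base64" labels are hypothetical; how the cTag.xsd file names or selects the encoding is not shown here and should be taken from the schema itself.

```python
import base64


def encode_binary(data, encoding):
    """Encode raw bytes as text suitable for embedding in XML."""
    if encoding == "hex":
        return data.hex()
    if encoding == "base64":
        return base64.b64encode(data).decode("ascii")
    raise ValueError("unsupported encoding: " + encoding)


def decode_binary(text, encoding):
    """Recover the raw bytes from hex or base64 text."""
    if encoding == "hex":
        return bytes.fromhex(text)
    if encoding == "base64":
        return base64.b64decode(text)
    raise ValueError("unsupported encoding: " + encoding)


sample = b"\x00\xff\x10"
print(encode_binary(sample, "hex"))
print(encode_binary(sample, "base64"))
```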
If an application supports the relevant technical formats, cTags implemented in different substrates must be no less interoperable than cTags implemented in a common substrate. This implies that Akinity must meet a minimum standard of functionality in any substrate and must not exceed that minimum. Minimum functionality is defined by the capabilities of the XML implementation.
If additional functionality is required, Akinity may be extended. Since extension is not part of the core specification, it is the responsibility of an application's designers and not of Akinity.org.
Akinity.org is the master repository for an open reference implementation for any officially supported technology.
cTag is denoted by a MIME media type of the form:
where the implementation technology, such as XML, can be appended, e.g.
This media type is currently informal (indicated by the /x-). However, there is a plan to push for standardisation at an appropriate time.
Conception and Akin are the key functions in Akinity. Respectively they produce and use cTags.
Synthesis takes data input and produces a new cTag.
Meiosis takes two cTags as input and produces a new cTag.
Akin takes two cTags as input and returns a distance measured in entropy.
Separately, each function is OPTIONAL for any application. However, at least one of these three functions MUST be implemented, since intercourse is defined as an application performing one of the key functions.
All three of the key functions produce deterministic results. An application can and SHOULD validate its implementation of the functions against the results of the reference implementation for the same inputs.
Rule of consistency
Under the same version of Akinity, the contents of two synthetic cTags, which were independently created by synthesis from exactly the same source data, SHOULD always be consistent. This rule allows common input to synthesis to be recognisable, even where synthesis occurred in different applications in different cultures.
To ensure that the consistency rule always holds, developers of Akinity applications MUST adhere closely to the specification. Developers can validate adherence to the consistency rule by testing their application's synthetic cTags against the sample synthetic cTags provided.
Method of Synthesis
Version 0.1 of Akinity specifies a single method of synthesis (MoS). It is expected that the method will be unchanging. However, until version 1.0 it is possible that MoS could be re-specified.
The MoS in Version 0.1 is SHA-256.
An important corollary to the Method of Synthesis is Expansion, which follows the Pattern of Expansion (PoE). Expansion is necessary because applications and their users may have different requirements for precision. A cTag of breadth n+1 has twice as many bits in its contents as a cTag of breadth n and is consequently of higher precision. Two synthetic cTags can still be consistent despite having different breadths.
PoE takes as input an array of n bits, where n is the length of the output of MoS. For output, PoE gives an array of 2^p bits, where p is the required cTag breadth.
In version 0.1 MoS is the SHA-256 algorithm, which produces output of length 256 bits. PoE therefore takes 256 bits and expands these contents to the required breadth.
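The relationship between breadth and length, and the 256-bit output of MoS, can be illustrated with a short sketch. It covers only the SHA-256 step and does not implement PoE, whose details are in the Pattern of Expansion appendix. The function names, the assumption that source data arrives as bytes, and the most-significant-bit-first ordering are all illustrative choices rather than statements of the specification.

```python
import hashlib


def breadth_to_length(breadth):
    """A cTag of breadth p holds 2^p bits of contents."""
    return 2 ** breadth


def method_of_synthesis(source):
    """Hash the source bytes with SHA-256 and return the digest as 256 bits.

    Assumption: bits are listed most significant bit first within each byte.
    """
    digest = hashlib.sha256(source).digest()
    return [(byte >> shift) & 1 for byte in digest for shift in range(7, -1, -1)]


bits = method_of_synthesis(b"example source data")
print(len(bits), breadth_to_length(8))
```

Because SHA-256 is deterministic, two applications hashing the same bytes obtain the same 256 bits, which is the behaviour the rule of consistency relies on.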
PoE is a deterministic component of the deterministic MoS algorithm. For any organelle there is only one correct output of breadth=p.
Like a daisy chain, PoE takes the (non-cumulative) output of the previous discrete expansion step as input to the current step. Unlike a daisy chain, the PoE algorithm allows a degree of parallelism. Each previous step's output is input to two current steps of expansion. At the discretion of the application designer, these steps may be run in separate process threads.
In version 0.1, PoE is SHA-256. For more details on the specifics of this algorithm, refer to the Pattern of Expansion appendix to this specification.
Meiosis arguably is the most novel algorithm in Akinity. It is the function that simulates sexual reproduction for binary data.
At pos 213, the X-parent and the Y-parent are each either one or zero. And the Z-child, likewise can have one of two values. This makes eight possible input/output scenarios, which are numbered below. The final column in the table below is the number of similarities scored to both parents under meiosis.
If inputs and outputs were all random, we could expect, over many iterations of meiosis and many pos, to get high entropy between all cTags. However, these eight scenarios are not all favoured equally by meiosis. Meiosis produces low entropy between output and each input. It also tries to improve the likelihood of high entropy between future inputs by controlling its own output.
For scenarios where both parents at a pos have the same value, the outcome of meiosis is straightforward:
Accept: (scenarios 0,7) output is the same as both inputs. Meiosis will always produce these outputs for these inputs.
Reject: (scenarios 1,6) output is different to both inputs. Meiosis will never produce output as in these scenarios.
The other scenarios (2,3,4,5), where the two parents each have a different value, are called ambivalent scenarios. Here, the outcome is determined by another scheme. Before going into this scheme, we can clearly see that whatever the scheme is, Z will inevitably resemble one of its two parents and not the other one, since the parents are different at this pos.
For all eight scenarios, there is a total of 8 * 2 = 16 opportunities for Z to resemble one of its two parents. Under a purely random output, the similarity scores (over many runs) would approach 8/16. But due to its discrimination for some scenarios, meiosis produces a lower entropy outcome than random. In the table below, rejected outcomes are replaced by the equivalent accepted ones.
Meiosis scores 12/16 similarity to both parents.
It is the gap between 8/16 (max entropy) and 12/16 (meiosis) which the akin function uses to discern apparent kin relation. This works because over many pos, the law of large numbers ensures a very low statistical chance of 12/16 similarity (many times over) being caused by chance alone.
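The 8/16 and 12/16 figures can be checked by enumerating the eight scenarios. The sketch below assumes the scenario number is the three-bit value formed by X, Y and Z in that order, which matches the accept (0, 7), reject (1, 6) and ambivalent (2, 3, 4, 5) groupings described above; the function names are hypothetical.

```python
from itertools import product


def similarity(x, y, z):
    """Number of parents (0, 1 or 2) that the child matches at one pos."""
    return int(z == x) + int(z == y)


def meiosis_outcome(x, y, z):
    """Replace a rejected outcome with the equivalent accepted one."""
    if x == y:
        return x  # parents agree, so the child always takes their value
    return z      # ambivalent: the child matches exactly one parent either way


random_total = 0
meiosis_total = 0
for x, y, z in product((0, 1), repeat=3):
    scenario = 4 * x + 2 * y + z  # assumed numbering: bits X, Y, Z
    random_score = similarity(x, y, z)
    meiosis_score = similarity(x, y, meiosis_outcome(x, y, z))
    random_total += random_score
    meiosis_total += meiosis_score
    print(scenario, x, y, z, random_score, meiosis_score)

print(random_total, "of 16 for random output")
print(meiosis_total, "of 16 for meiosis")
```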
To protect the base case (high entropy when no kin relation), Meiosis must strive to ensure that all input scenarios are equally likely to occur. This is the main purpose of reversal, which keeps the system-wide probability of zero / one at any pos close to maximum entropy.
Relative to each other, the two inputs to meiosis must be more similar, less similar or have exactly 50% similarity.
If the inputs are more similar to each other, the initial condition is deemed 'alike'. If the inputs are less similar, initial condition is 'unalike'. If neither 'alike' nor 'unalike' then initial condition is 'neutral'. See the median line in a binomial distribution chart.
e.g. breadth=8 denotes length=256 bits. From 0 to 256, there are 257 possible input similarity conditions. So 127 bits or fewer is 'unalike', 128 bits is 'neutral' and 129 bits or more is 'alike'.
When initial condition is 'unalike':
- One of the inputs (Y-Parent) is reversed. This action is for the benefit of this meiosis process only; it does not affect the underlying cTag data.
- Output is not treated.
When initial condition is 'neutral':
- Inputs are not treated.
- Output is not treated.
When initial condition is 'alike':
- Inputs are not treated. Meiosis proceeds with the inputs in their original polarity.
- After meiosis selects values for the new cTag at every pos, those values are all reversed. The new cTag is represented in this reversed polarity.
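The classification of the initial condition can be sketched as below. Contents are modelled as equal-length lists of 0 and 1, and "reversal" is assumed to mean inverting every bit (flipping polarity), since that is what turns an 'unalike' pair into an 'alike' one; both the representation and the function names are illustrative.

```python
def initial_condition(x_bits, y_bits):
    """Classify two equal-length inputs as 'unalike', 'neutral' or 'alike'."""
    length = len(x_bits)
    score = sum(1 for x, y in zip(x_bits, y_bits) if x == y)
    if 2 * score < length:
        return "unalike"
    if 2 * score == length:
        return "neutral"
    return "alike"


def reverse(bits):
    """Assumed meaning of reversal: invert the polarity of every bit."""
    return [1 - bit for bit in bits]


x_parent = [0] * 256
y_parent = [0] * 127 + [1] * 129   # 127 matching pos out of 256

print(initial_condition(x_parent, y_parent))
print(initial_condition(x_parent, reverse(y_parent)))
```

With breadth 8, 127 matching bits is 'unalike'; reversing the Y-Parent leaves 129 matching bits, which is 'alike'.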
Akinity maintains relatively low entropy between related cTags, even through many generations of meiosis.
Meiosis achieves this low entropy by means of the scheme that is used to decide which of the two parents to resemble when their contents values are different (scenarios 2,3,4,5). This is the purpose of the offset.
The scheme for maintaining offset values through meiosis is specified in the wiki.
Meiosis has to operate on input cTags of equal breadth. By default this is the lower breadth of the two parents (X or Y). Meiosis accepts an optional parameter that specifies the child's breadth should be something other than the default.
The logic for determining the Z child's breadth depends on the value of cTag breadth of parents X and Y and on the parameter specified (if any). The Z child's breadth also depends on whether X and Y are meiotic or synthetic.
In the decision table below min breadth <= a <= b <= c <= max breadth.
When the required breadth of the Z child is less than the breadth of a parent, that parent is first contracted.
When the required breadth of the Z child is greater than the breadth of a synthetic parent, that parent is first expanded.
Meiosis can retain description and URI data from either or neither of the two parents through to the next generation. The child cannot inherit both parents' organelle data.
Meiosis takes two optional parameters.
The Transaction parameter
This is typically set by the user for a single instance of meiosis. If set, the Transaction parameter takes precedence over all others. There is no default for this parameter.
The Application parameter
This is typically set by the user for all instances of meiosis. The Application parameter has valid values "NULL", "X", "X|Y", "Y", "Y|X". If not set, the default value for this parameter is "Y|X".
The decision table below shows whence organelle data in the Z child is derived:
By default, the depth of the Z child created by meiosis is the depth of the parent (X or Y) which has the lowest depth, incremented by one. However, a lower depth can be specified in a parameter to meiosis.
In the decision table below 0 <= a <= b < c = maxDepth (where maxDepth is the maximum depth allowed by the version of Akinity specified as the version required for the child).
The assignments of new contents values made in this part of the algorithm represent the section of meiosis that maximises similarity over many generations. It works like two captains alternately picking their preferred football team from a pool of 20 other players. In this case, the player pool comprises all 'ambivalent' pos, i.e. pos which were not effectively the same and therefore were not automatically accepted in an earlier step of meiosis. ('Effectively' refers to contents values after reversal, if applicable.)
The two captains are the parents. Each parent's objective is to pick so that the Z child preserves maximum similarity to that parent's own ancestry.
The selection process goes in rounds; each round consists of one pick for each parent. There are as many rounds as there are ambivalent scenarios. In every round, each captain picks the member that they prefer from all pos with ambivalent scenarios that have not yet been picked by either parent. There is a slight advantage to the captain who gets to pick first in each round, which is the X-Parent.
Picking rounds continue until all ambivalent pos have been exhausted.
To rank their selections, the parents each use available offset values from both parents at each pos. They follow the same procedures, but each according to its own interests first. There are three prioritised sort criteria.
- Offset values in 'own' cTag. i.e. X-Parent ranks according to the offset in the X cTag. Highest first. (self-interest)
- Reverse offset values in 'other' cTag. i.e. X-Parent ranks according to the offset in the Y cTag. Lowest first. (be nice)
- Pos number. Lowest first. (arbitrary, finally deterministic)
The second criterion is used as a tie-breaker for the first. The third as a tie-breaker for the second.
When a pos is picked by either parent the Z child acquires the picking parent's boolean value in contents for that pos.
Since each parent ranks all ambivalent pos according to their own preferences there is no incentive for a process to attempt to cheat by, say, artificially increasing offset values in a cTag.
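The picking procedure described above can be sketched as follows. The sketch assumes contents and offsets are supplied as equal-length lists indexed by pos, that contents are already in their effective polarity (after any reversal), and that offsets are plain integers; the function name and the data layout are hypothetical, and the scheme for maintaining offsets is not covered.

```python
def pick_ambivalent(x_bits, y_bits, x_offsets, y_offsets):
    """Return {pos: value} for every pos where the parents differ."""
    ambivalent = [pos for pos in range(len(x_bits)) if x_bits[pos] != y_bits[pos]]

    # Sort criteria: own offset highest first, other parent's offset
    # lowest first, then pos number lowest first.
    x_order = sorted(ambivalent, key=lambda pos: (-x_offsets[pos], y_offsets[pos], pos))
    y_order = sorted(ambivalent, key=lambda pos: (-y_offsets[pos], x_offsets[pos], pos))

    child = {}
    remaining = set(ambivalent)
    x_turn = True  # the X-Parent picks first in each round
    while remaining:
        order, bits = (x_order, x_bits) if x_turn else (y_order, y_bits)
        pos = next(p for p in order if p in remaining)
        child[pos] = bits[pos]  # the child takes the picking parent's value
        remaining.remove(pos)
        x_turn = not x_turn
    return child


picked = pick_ambivalent(
    x_bits=[1, 1, 0, 1],
    y_bits=[0, 1, 1, 0],
    x_offsets=[5, 0, 5, 1],
    y_offsets=[2, 0, 1, 9],
)
print(picked)
```

In this example X ranks pos 2 ahead of pos 0 because the tie on its own offset is broken by Y's lower offset at pos 2.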
Like the other key functions in Akinity, akin is a deterministic function. For any given inputs, akin's output is always the same in any implementation, notwithstanding rounding due to variable precision.
distance = akin (source, target, breadth)
The two input cTags are known as source and target. Nevertheless, because Akinity has the property of symmetry, the order in which the inputs are presented to akin SHOULD NOT affect the function's result.
The offset values of the input cTags do not affect the result of akin.
The output of akin is a measure of normalised entropy between the two input cTags. The lower and upper bounds of the scale are respectively zero and one.
Akinity specifies a minimum requirement for precision of akin's output. Output must be accurate to at least eight decimal places, with no upper bound on precision.
Trailing zeros imply precision to the end of the sequence.
For instance, if the result of akin to 15 decimal places is 0.998414426937447 then the following are examples of correct output:
0.998414426937447; 0.99841442693745; 0.9984144269374; 0.998414426937; 0.99841442694; 0.9984144269; 0.998414427; 0.99841443
and the following are examples of incorrect output:
0.9984144; 0.998414426937450; 0.9984144269375
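The precision rule can be checked mechanically. The sketch below compares a reported value with a more precise reference value using exact decimal arithmetic. It assumes round-half-up when shortening the reference, which the specification does not state (the examples above contain no ties), and the function name is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

MIN_DECIMAL_PLACES = 8


def is_correct_output(reported, reference):
    """Check a reported akin result (as text) against a reference result."""
    reported_value = Decimal(reported)
    reference_value = Decimal(reference)
    places = -reported_value.as_tuple().exponent
    reference_places = -reference_value.as_tuple().exponent
    # Too few decimal places, or more places than the reference can vouch for.
    if places < MIN_DECIMAL_PLACES or places > reference_places:
        return False
    quantum = Decimal(1).scaleb(-places)
    rounded = reference_value.quantize(quantum, rounding=ROUND_HALF_UP)
    return rounded == reported_value


reference = "0.998414426937447"
for candidate in ("0.99841443", "0.99841442694", "0.9984144", "0.998414426937450"):
    print(candidate, is_correct_output(candidate, reference))
```

Passing the values as text preserves trailing zeros, which matter because they imply precision to the end of the sequence.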
A user may optionally specify one parameter to the akin function:
This parameter restricts the range of pos used to calculate the result of akin. Its purpose is to limit the cost of processing in cases where broadly accurate results are sufficient.
The calculation's range is from pos=0 to the number of pos implied by the breadth parameter.
e.g. breadth parameter = 9 implies the pos range to calculate the result from is 0 to 511 (511 = 2^9 - 1).
Breadth parameter value MUST be an integer between (inclusively) the higher minimum breadth of both input cTag versions and
(if both inputs are meiotic) the lower breadth of both cTags.
(if both inputs are synthetic) the lower maximum breadth of both cTag versions
Default value is the lower breadth of both input cTags.
Akin calculates distance using the standard formula for calculating entropy.
Because we are dealing with the binomial distribution, entropy calculations in Akinity use log base 2.
p = score / length
distance = -(
/* polarity0 */ p * (Math.log(p) / Math.LN2) +
/* polarity1 */ (1 - p) * (Math.log(1 - p) / Math.LN2))
distance is the result of the calculation; the measure of entropy in the observed similarity.
score is the number of bits of similarity observed between the contents of the two cTags.
length is the number of pos from which score was derived.
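Putting the definitions of score, length and distance together, a minimal Python sketch of the calculation might look like this. It models cTag contents as lists of 0 and 1 whose length is a power of two, which is an assumption about representation. It also treats a score of 0 or of length as zero entropy, following the usual convention that 0 * log2(0) = 0, because the formula is otherwise undefined at those points; that handling is an assumption, not something the specification states.

```python
import math


def akin(source, target, breadth=None):
    """Normalised entropy between two cTag contents, in the range 0 to 1."""
    if breadth is None:
        # Default: the lower breadth of the two inputs.
        breadth = min(len(source).bit_length() - 1, len(target).bit_length() - 1)
    length = 2 ** breadth  # number of pos examined, from pos 0 upwards
    score = sum(1 for s, t in zip(source[:length], target[:length]) if s == t)
    p = score / length
    if p == 0.0 or p == 1.0:
        return 0.0  # assumed limit value where the logarithm is undefined
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


source = [0, 0, 0, 0]
target = [0, 0, 1, 1]
print(akin(source, target))             # half the pos match
print(akin(source, target, breadth=1))  # only pos 0 and 1 are examined
```

The result does not depend on which input is called source and which target, since the score counts matching pos in either direction.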
The documents linked below are part of this specification:
The documents linked below support, but are not part of, the specification: