Category Archives: Computation

On the complexity of PI (π)

Introduction

There is no doubt that since the origins of geometry humans have been seduced by the number π. Thus, one of its fundamental characteristics is that it determines the relationship between the length of a circumference and its radius. But this does not stop here, since this constant appears systematically in mathematical and scientific models that describe the behavior of nature. In fact, it is so popular that it is the only number that has its own commemorative day. The great fascination around π has raised speculations about the information encoded in its figures and above all has unleashed an endless race for its determination, having calculated several tens of billions of figures to date.

Formally, the classification of real numbers is done according to the rules of calculus. In this way, Cantor showed that real numbers can be classified as countable infinities and uncountable infinities, what are commonly called rational and irrational. Rational numbers are those that can be expressed as a quotient of two whole numbers. While irrational numbers cannot be expressed this way. These in turn are classified as algebraic numbers and transcendent numbers. The former correspond to the non-rational roots of the algebraic equations, that is, the roots of polynomials. On the contrary, transcendent numbers are solutions of transcendent equations, that is, non-polynomial, such as exponential and trigonometric functions.

Georg Cantor. Co-creator of Set Theory

Without going into greater detail, what should catch our attention is that this classification of numbers is based on positional rules, in which each figure has a hierarchical value. But what happens if the numbers are treated as ordered sequences of bits, in which the position is not a value attribute.  In this case, the Algorithmic Information Theory (AIT) allows to establish a measure of the information contained in a finite sequence of bits, and in general of any mathematical object, and that therefore is defined in the domain of natural numbers.

What does the AIT tell us?

This measure is based on the concept of Kolmogorov complexity (KC). So that, the Kolmogorov complexity K(x) of a finite object x is defined as the length of the shortest effective binary description of x. Where the term “effective description” connects the Kolmogorov complexity with the Theory of Computation, so that K(x) would correspond to the length of the shortest program that prints x and enters the halt state. To be precise, the formal definition of K(x) is:

K(x) = minp,i{K(i) + l(p):Ti (p) = x } + O(1)

Where Ti(p) is the Turing machine (TM) i that executes p and prints x, l(p) is the length of p, and K(i) is the complexity of Ti. Therefore, object p is a compressed representation of object x, relative to Ti, since x can be retrieved from p by the decoding process defined by Ti, so it is defined as meaningful information. The rest is considered as meaningless, redundant, accidental or noise (meaningless information). The term O(1) indicates that K(i) is a recursive function and in general it is non-computable, although by definition it is machine independent, and whose result has the same order of magnitude in each one of the implementations. In this sense, Gödel’s incompleteness theorems, Turing machine and Kolmogorov complexity lead to the same conclusion about undecidability, revealing the existence of non-computable functions.

KC shows that information can be compressed, but does not establish any general procedure for its implementation, which is only possible for certain sequences. In effect, from the definition of KC it is demonstrated that this is an intrinsic property of bitstreams, in such a way that there are sequences that cannot be compressed. Thus, the number of n-bit sequences that can be encoded by m bits is less than 2m, so the fraction of n-bit sequences with K(x) ≥ n-k is less than 2-k. If the n-bit possible sequences are considered, each one of them will have a probability of occurrence equal to 2-n, so the probability that the complexity of a sequence is K(x) ≥ n-k is equal to or greater than (1-2-k). In short, most bit sequences cannot be compressed beyond their own size, showing a high complexity as they do not present any type of pattern. Applied to the field of physics, this behavior justifies the ergodic hypothesis. As a consequence, this means that most of the problems cannot be solved analytically, since they can only be represented by themselves and as a consequence they cannot be described in a compact way by means of formal rules.

It could be thought that the complexity of a sequence can be reduced at will, by applying a coding criterion that modifies the sequence into a less complex sequence. In general, this only increases the complexity, since in the calculation of K(x) we would have to add the complexity of the coding algorithm that makes it grow as n2. Finally, add that the KC is applicable to any mathematical object, integers, sets, functions, and it is demonstrated that, as the complexity of the mathematical object grows, K(x) is equivalent to the entropy H defined in the context of Information Theory. The advantage of AIT is that it performs a semantic treatment of information, being an axiomatic process, so it does not require having a priori any type of alphabet to perform the measurement of information.

What can be said about the complexity of π?

According to its definition, KC cannot be applied to irrational numbers, since in this case the Turing machine does not reach the halt state, and as we know these numbers have an infinite number of digits. In other words, and to be formally correct, the Turing machine is only defined in the field of natural numbers (it must be noted that their cardinality is the same as that of the rationals), while irrational numbers have a cardinality greater than that of rational numbers. This means that KC and the equivalent entropy H of irrational numbers are undecidable and therefore non-computable.

To overcome this difficulty we can consider an irrational number X as the concatenation of a sequence of bits composed of a rational number x and a residue δx, so that in numerical terms X=x+δx, but in terms of information X={x,δx}. As a consequence, δx is an irrational number δx→0, and therefore δx is a sequence of bits with an undecidable KC and hence non-computable. In this way, it can be expressed:

K(X) = K(x)+K(δx)

The complexity of X can be assimilated to the complexity of x. A priori this approach may seem surprising and inadmissible, since the term K(δx) is neglected when in fact it has an undecidable complexity. But this is similar to the approximation made in the calculation of the entropy of a continuous variable or to the renormalization process used in physics, in order to circumvent the complexity of the underlying processes that remain hidden from observable reality.

Consequently, the sequence p, which runs the Turing machine i to get x, will be composed of the concatenation of:

  • The sequence of bits that encode the rules of calculus in the Turing machine i.
  • The bitstream that encodes the compressed expression of x, for example a given numerical series of x.
  • The length of the sequence x that is to be decoded and that determines when the Turing machine should reach the halt state, for example a googol (10100).

In short, it can be concluded that the complexity K(x) of known irrational numbers, e.g. √2, π, e,…, is limited. For this reason, the challenge must be to obtain the optimum expression of K(x) and not the figures that encode these numbers, since according to what has been said, their uncompressed expression, or the development of their figures, has a high degree of redundancy (meaningless information).

What in theory is a surprising and questionable fact is in practice an irrefutable fact, since the complexity of δx will always remain hidden, since it is undecidable and therefore non-computable.

Another important conclusion is that it provides a criterion for classifying irrational numbers into two groups: representable and non-representable. The former correspond to irrational numbers that can be represented by mathematical expressions, which would be the compressed expression of these numbers. While non-representable numbers would correspond to irrational numbers that could only be expressed by themselves and are therefore undecidable. In short, the cardinality of representable irrational numbers is that of natural numbers. It should be noted that the previous classification criterion is applicable to any mathematical object.

On the other hand, it is evident that mathematics, and calculus in particular, de facto accepts the criteria established to define the complexity K(x). This may go unnoticed because, traditionally in this context, numbers are analyzed from the perspective of positional coding, in such a way that the non-representable residue is filtered out through the concept of limit, in such a way that δx→0. However, when it comes to evaluating the informative complexity of a mathematical object, it may be required to apply a renormalization procedure.

Reality as an irreducible layered structure

Note: This post is the first in a series in which macroscopic objects will be analyzed from a quantum and classical point of view, as well as the nature of the observation. Finally, all of them will be integrated into a single article.

Introduction

Quantum theory establishes the fundamentals of the behavior of particles and their interaction with each other. In general, these fundamentals apply to microscopic systems formed by a very limited number of particles. However, nothing indicates that the application of quantum theory cannot be applied to macroscopic objects, since the emerging properties of such objects must be based on the underlying quantum reality. Obviously, there is a practical limitation established by the increase in complexity, which grows exponentially as the number of elementary particles increases. 

The initial reference to this approach was made by Schrödinger [1], indicating that the quantum superposition of states did not represent any contradiction at the macroscopic level. To do this, he used what is known as Schrödinger’s cat paradox in which the cat could be in a superposition of states, one in which the cat was alive and another in which the cat was dead. Schrödinger’s original motivation was to raise a discussion about the EPR paradox [2], which revealed the incompleteness of quantum theory. This has finally been solved by Bell’s theorem [3] and its experimental verification by Aspect [4], making it clear that the entanglement of quantum particles is a reality on which quantum computation is based [5]. A summary of the aspects related to the realization of a quantum system that emulates Schrödinger cat has been made by Auletta [6], although these are restricted to non-macroscopic quantum systems.

But the question that remains is whether quantum theory can be used to describe macroscopic objects and whether the concept of quantum entanglement applies to these objects as well. Contrary to Schrödinger’s position, Wigner argued, through the friend paradox, that quantum mechanics could not have unlimited validity [7]. Recently, Frauchiger and Renner [8] have proposed a virtual experiment (Gedankenexperiment) that shows that quantum mechanics is not consistent when applied to complex objects. 

The Schrödinger cat paradigm will be used to analyze these results from two points of view, with no loss of generality, one as a quantum object and the other as a macroscopic object (in a next post). This will allow their consistency and functional relationship to be determined, leading to the establishment of an irreducible functional structure. As a consequence of this, it will also be necessary to analyze the nature of the observer within this functional structure (also in a later posts). 

Schrödinger’s cat as a quantum reality

In the Schrödinger cat experiment there are several entities [1], the radioactive particle, the radiation monitor, the poison flask and the cat. For simplicity, the experiment can be reduced to two quantum variables: the cat, which we will identify as CAT, and the system formed by the radioactive particle, the radiation monitor and the poison flask, which we will define as the poison system PS. 


Schrödinger Cat. (Source: Doug Hatfield https://commons.wikimedia.org/wiki/File:Schrodingers_cat.svg)

These quantum variables can be expressed as [9]: 

            |CAT⟩ = α1|DC⟩ + β1|LC⟩. Quantum state of the cat: dead cat |DC⟩, live cat |LC⟩.

            |PS⟩ = α2|PD⟩ + β2|PA⟩. Quantum state of the poison system: poison deactivated |PD⟩, poison activated |PA⟩.

The quantum state of the Schrödinger cat experiment SCE as a whole can be expressed as: 

               |SCE⟩ = |CAT⟩⊗|PS⟩= α1α2|DC⟩|PD⟩+α1β2|DC⟩|PA⟩+β1α2|LC⟩|PD⟩+β1β2|LC⟩|PA⟩.

Since for a classical observer the final result of the experiment requires that the states |DC⟩|PD⟩ and |LC⟩|PA⟩ are not compatible with observations,  the experiment must be prepared in such a way that the quantum states |CAT⟩ and |PS⟩ are entangled [10] [11], so that the wave function of the experiment must be: 

               |SCE⟩ = α|DC⟩|PA⟩ + β|LC⟩|PD⟩. 

As a consequence, the observation of the experiment [12] will result in a state:

            |SCE⟩ = |DC⟩|PA⟩, with probability α2, (poison activated, dead cat). 

or:

            |SCE⟩ =|LC⟩|PD⟩, with probability β2, (poison deactivated, live cat). 

Although from the formal point of view of quantum theory the approach of the experiment is correct, for a classical observer the experiment presents several objections. One of these is related to the fact that the experiment requires establishing “a priori” the requirement that the PS and CAT systems are entangled. Something contradictory, since from the point of view of the preparation of the quantum experiment there is no restriction, being able to exist results with quantum states |DC⟩|PD⟩, or |LC⟩|PA⟩, something totally impossible for a classical observer, assuming in any case that the poison is effective, that it is taken for granted in the experiment. Therefore, the SCE experiment is inconsistent, so it is necessary to analyze the root of the incongruence between the SCE quantum system and the result of the observation. 

Another objection, which may seem trivial, is that for the SCE experiment to collapse in one of its states the OBS observer must be entangled with the experiment, since the experiment must interact with it. Otherwise, the operation performed by the observer would have no consequence on the experiment. For this reason, this aspect will require more detailed analysis. 

Returning to the first objection, from the perspective of quantum theory it may seem possible to prepare the PS and CAT systems in an entangled superposition of states. However, it should be noted that both systems are composed of a huge number of non-entangled quantum subsystems Ssubject to continuous decoherence [13] [14]. It should be noted that the Si subsystems will internally have an entangled structure. Thus, the CAT and PS systems can be expressed as: 

            |CAT⟩ = |SC1⟩ ⊗ |SC2⟩ ⊗…⊗ |SCi⟩ ⊗…⊗ |SCk⟩,

            |PS⟩= |SP1⟩⊗|SP2⟩⊗…⊗|SPi⟩⊗…⊗|SPl⟩, 

in such a way that the observation of a certain subsystem causes its state to collapse, producing no influence on the rest of the subsystems, which will develop an independent quantum dynamics. This makes it unfeasible that the states |LC⟩ and |DC⟩ can be simultaneous and as a consequence the CAT system cannot be in a superposition of these states. An analogous reasoning can be made of the PS system, although it imay seem obvious that functionally it is much simpler. 

In short, from a theoretical point of view it is possible to have a quantum system equivalent to the SCE, for which all the subsystems must be fully entangled with each other, and in addition the system will require an “a priori” preparation of its state. However, the emerging reality differs radically from this scenario, so that the experiment seems to be unfeasible in practice. But the most striking fact is that, if the SCE experiment is generalized, the observable reality would be radically different from the observed reality. 

To better understand the consequences of the quantum state of the ECS system having to be prepared “a priori”, imagine that the supplier of the poison has changed its contents to a harmless liquid. As a result of this, the experiment will be able to kill the cat without cause. 

From these conclusions the question can be raised as to whether quantum theory can explain in a general and consistent way the observable reality at the macroscopic level. But perhaps the question is also whether the assumptions on which the SCE experiment has been conducted are correct. Thus, for example: Is it correct to use the concepts of live cat or dead cat in the domain of quantum physics? Which in turn raises other kinds of questions, such as: Is it generally correct to establish a strong link between observable reality and the underlying quantum reality? 

The conclusion that can be drawn from the contradictions of the SCE experiment is that the scenario of a complex quantum system cannot be treated in the same terms as a simple system. In terms of quantum computation these correspond, respectively, to systems made up of an enormous number and a limited number of qubits [5]. As a consequence of this, classical reality will be an irreducible fact, which based on quantum reality ends up being disconnected from it. This leads to defining reality in two independent and irreducible functional layers, a quantum reality layer and a classical reality layer. This would justify the criterion established by the Copenhagen interpretation [15] and its statistical nature as a means of functionally disconnecting both realities. Thus, quantum theory would be nothing more than a description of the information that can emerge from an underlying reality, but not a description of that reality. At this point, it is important to emphasize that statistical behavior is the means by which the functional correlation between processes can be reduced or eliminated [16] and that it would be the cause of irreducibility

References

[1] E. Schrödinger, «Die gegenwärtige Situation in der Quantenmechanik,» Naturwissenschaften, vol. 23, pp. 844-849, 1935.
[2] A. Einstein, B. Podolsky and N. Rose, “Can Quantum-Mechanical description of Physical Reality be Considered Complete?,” Physical Review, vol. 47, pp. 777-780, 1935.
[3] J. S. Bell, «On the Einstein Podolsky Rosen Paradox,» Physics,vol. 1, nº 3, pp. 195-290, 1964.
[4] A. Aspect, P. Grangier and G. Roger, “Experimental Tests of Realistic Local Theories via Bell’s Theorem,” Phys. Rev. Lett., vol. 47, pp. 460-463, 1981.
[5] M. A. Nielsen and I. L. Chuang, Quantum computation and Quantum Information, Cambridge University Press, 2011.
[6] G. Auletta, Foundations and Interpretation of Quantum Mechanics, World Scientific, 2001.
[7] E. P. Wigner, «Remarks on the mind–body question,» in Symmetries and Reflections, Indiana University Press, 1967, pp. 171-184.
[8] D. Frauchiger and R. Renner, “Quantum Theory Cannot Consistently Describe the Use of Itself,” Nature Commun., vol. 9, no. 3711, 2018.
[9] P. Dirac, The Principles of Quantum Mechanics, Oxford University Press, 1958.
[10] E. Schrödinger, «Discussion of Probability Relations between Separated Systems,» Mathematical Proceedings of the Cambridge Philosophical Society, vol. 31, nº 4, pp. 555-563, 1935.
[11] E. Schrödinger, «Probability Relations between Separated Systems,» Mathematical Proceedings of the Cambridge Philosophical Society, vol. 32, nº 3, pp. 446­-452, 1936.
[12] M. Born, «On the quantum mechanics of collision processes.,» Zeit. Phys.(D. H. Delphenich translation), vol. 37, pp. 863-867, 1926.
[13] H. D. Zeh, «On the Interpretation of Measurement in Quantum Theory,» Found. Phys., vol. 1, nº 1, pp. 69-76, 1970.
[14] W. H. Zurek, «Decoherence, einselection, and the quantum origins of the classical,» Rev. Mod. Phys., vol. 75, nº 3, pp. 715-775, 2003.
[15] W. Heisenberg, Physics and Philosophy. The revolution in Modern Science, Harper, 1958.
[16] E. W. Weisstein, «MathWorld,» [En línea]. Available http://mathworld.wolfram.com/Covariance.html.

Information and knowledge

What is information? 

If we stick to its definition, which can be found in dictionaries, we can see that it always refers to a set of data and often adds the fact that these are sorted and processed. But we are going to see that these definitions are imprecise and even erroneous in assimilating it to the concept of knowledge.

One of the things that information theory has taught us is that any object (news, profile, image, etc.) can be expressed precisely by a set of bits. Therefore, the formal definition of information is the ordered set of symbols that represent the object and that in their basic form constitute an ordered set of bits. However, information theory itself surprisingly reveals that information has no meaning, which is technically known as “information without meaning”.

This seems to be totally contradictory, especially if we take into account the conventional idea of what is considered as information. However, this is easy to understand. Let us imagine that we find a book in which symbols appear written that are totally unknown to us. We will immediately assume that it is a text written in a language unknown to us, since, in our culture, book-shaped objects are what they usually contain. Thus, we begin to investigate and conclude that it is an unknown language without reference or Rosetta stone with any known language. Therefore, we have information but we do not know its message and as a result, the knowledge contained in the text. We can even classify the symbols that appear in the text and assign them a binary code, as we do in the digitization processes, converting the text into an ordered set of bits.

However, to know the content of the message we must analyze the information through a process that must include the keys that allow extracting the content of the message. It is exactly the same as if the message were encrypted, so the message will remain hidden if the decryption key is not available, as shown by the one-time pad encryption technique.

Ray Solomonoff, co-founder of Algorithmic Information Theory together with Andrey Kolmogorov. 

What is knowledge?

This clearly shows the difference between information and knowledge. In such a way that information is the set of data (bits) that describe an object and knowledge is the result of a process applied to this information and that is materialized in reality. In fact, reality is always subject to this scheme.

For example, suppose we are told a certain story. From the sound pressure applied to our eardrums we will end up extracting the content of the news and also we will be able to experience subjective sensations, such as pleasure or sadness. There is no doubt that the original stimulus can be represented as a set of bits, considering that audio information can be a digital content, e.g. MP3.

But for knowledge to emerge, information needs to be processed. In fact, in the previous case it is necessary to involve several different processes, among which we must highlight:

  • Biological processes responsible for the transduction of information into nerve stimuli.
  • Extraction processes of linguistic information, established by the rules of language in our brain by learning.
  • Extraction processes of subjective information, established by cultural rules in our brain by learning.

In short, knowledge is established by means of information processing. And here the debate may arise as a consequence of the diversity of processes, of their structuring, but above all because of the nature of the ultimate source from which they emerge. Countless examples can be given. But, since doubts can surely arise that this is the way reality emerges, we can try to look for a single counterexample!

A fundamental question is: Can we measure knowledge? The answer is yes and is provided by the algorithmic information theory (AIT) which, based on information theory and computer theory, allows us to establish the complexity of an object, by means of the Kolmogorov complexity K(x), which is defined as follows:

For a finite object x, K(x) is defined as the length of the shortest effective binary description of x.

Without going into complex theoretical details, it is important to mention that K(x) is an intrinsic property of the object and not a property of the evaluation process. But don’t panic! Since, in practice, we are familiar with this idea.

Let’s imagine audio, video, or general bitstream content. We know that these can be compressed, which significantly reduces their size. This means that the complexity of these objects is not determined by the number of bits of the original sequence, but by the result of the compression since through an inverse decompression process we can recover the original content. But be careful! The effective description of the object must include the result of the compression process and the description of the decompression process, needed to retrieve the message.

Complexity of digital content, equivalent to a compression process

A similar scenario is the modeling of reality, where physical processes stand out. Thus, a model is a compact definition of a reality. For example, Newton’s universal gravitation model is the most compact definition of the behavior of a gravitational system in a non-relativistic context. In this way, the model, together with the rules of calculus and the information that defines the physical scenario, will be the most compact description of the system and constitutes what we call algorithm. It is interesting to note that this is the formal definition of algorithm and that until these mathematical concepts were developed in the first half of the 20th century by Klein, Chruch and Turing, this concept was not fully established.

Alan Turing, one of the fathers of computing

It must be considered that the physical machine that supports the process is also part of the description of the object, providing the basic functions. These are axiomatically defined and in the case of the Turing machine correspond to an extremely small number of axiomatic rules.

Structure of the models, equivalent to a decompression process

In summary, we can say that knowledge is the result of information processing. Therefore, information processing is the source of reality. But this raises the question: Since there are non-computable problems, to what depth is it possible to explore reality? 

Biology as an axiomatic process

The replication mechanisms of living beings can be compared with the self-replication of automatons in the context of computability theory. In particular, DNA replication, analyzed from the perspective of the recursion theorem, indicates that its replication structure goes beyond biology and the quantum mechanisms that support it, as it is analyzed in the article Biology as an Axiomatic Process.

Physical chemistry establishes the principles by which atoms interact with each other to form molecules. In the inorganic world the resulting molecules are relatively simple, not allowing establishing a complex functional structure. On the other hand, in the organic world, molecules can be made up of thousands or even millions of atoms and have complex functionality. It highlights what is known as molecular recognition, through which the molecules interact with each other selectively and which is the basis of biology.

Molecular recognition plays a fundamental role in the structure of DNA, in the translation of the genetic code of DNA into proteins and in the biochemical interaction of proteins, which ultimately form the basis on which living beings are based.

The detailed study of these molecular interactions makes it possible to describe the functionality of the processes, in such a way that it is possible to establish formal models, to such an extent that they can be used as a computing technology, as is the case of DNA-based computing.

From this perspective, this allows us to ask if the process of information is something deeper and if in reality it is the foundation of biology itself, according to what is established by the principle of reality.

For this purpose, this section aims to analyze the basic processes on which biology is based, in order to establish a link with axiomatic processing and thus investigate the nature of biological processes. For this, it is not necessary to describe in detail the biological mechanisms described in the literature. We will simply describe its functionality, so that they can be identified with the theoretical foundations of information processing. To this end, we will explain the mechanisms on which DNA replication and protein synthesis are based.

DNA and RNA molecules are polymers formed from the ribose and deoxyribose nucleotides, respectively, bound by phosphates. On the basis of this nucleotide chain, one of the four possible nucleic acids can be linked. There are five different nucleic acids, adenine (A), guanine (G), cytosine (C), thymine (T) and uracil (U). In the case of DNA, nucleic acids that can be coupled by covalent bonds to nucleotides are A, G, C and T, whereas in the case of RNA they are A, G, C and U. As a consequence, molecules are structured in a helix shape, fitting the nucleic acids in a precise and compact way, due to the shape of their electronic clouds.

The helix structure allows the nucleic acids of two different strands to be bound together by hydrogen bonds, forming pairs A-T and G-C in the case of DNA, and A-U and G-C in the case of RNA, as shown in the following figure.

Base-pairing of nucleic acids in DNA

As a result, the DNA molecule is formed by a double helix, in which two chains of nucleotides polymers wind one on top of the other, remaining together by means of hydrogen bonds of nucleic acids. Thus, each strand of the DNA molecule contains the same genetic code, one of which can be considered the negative of the other.

Double helix structure of DNA molecule

The genetic information of an organism, called a genome, is not contained in a single DNA molecule, but is organized into chromosomes. These are made up of DNA strands bound together by proteins. Thus, in the case of humans, the genome is formed by 46 chromosomes, and so, the number of bases in the DNA molecules that compose it being about 3×109. Since each base can be encoded by means of 2 bits, the human genome, considered as an object of information, is equivalent to 6×109 bits.

The information contained in the genes is the basis for the synthesis of proteins, which are responsible for executing and controlling the biochemistry of living beings. The proteins are formed by the bonding of amino acids, through covalent bonds, which is done from the sequences of the bases contained in the DNA. The number of existing amino acids is 20 and since each base codes 2 bits, 3 bases (6 bits, 64 combinations) are necessary to be able to code each one of the amino acids. This means that there is some redundancy in the assignment of base sequences to amino acids, in addition to control codes for the synthesis process (Stop), as shown in the following table.

Translation of nucleic acids (Codons) to amino acids

However, protein synthesis is not done directly from DNA, since it requires the intermediation of RNA. This is called the translation process and involves two types of different RNA molecules, the messenger ARM (mRNA) and the transfer RNA (tRNA). The first step is the synthesis of mRNA from DNA. This process is called transcription, in such a way that the information corresponding to a gene is copied into the mRNA molecule, which is done through a process of recognition between the molecules of the nucleic acids, carried out by the hydrogen bonds, such as shows the following figure.

DNA transcription

Once the mRNA molecule is synthesized, the tRNA molecule is responsible for mediating between mRNA and amino acids to synthesize proteins, for which it has two specific molecular mechanisms. On the one hand, tRNA has a chain of three amino acids called anticodon at one end. On the opposite side, tRNA binds to a specific amino acid, according to the translation table of nucleic acid sequences into amino acids. In this way, tRNA is able to translate mRNA into a protein, as shown in the figure below. 

Protein synthesis (mRNA translation)

But perhaps the most complex process is undoubtedly DNA replication, so that each molecule produces two identical replicas. Replication is performed by unwinding each strand of the molecule and inserting the nucleic acid molecules on each of the strands, in a similar way to that shown in the mRNA synthesis. DNA replication is controlled by enzymatic processes supported by proteins. Without going into detail and in order to show its complexity, the table below shows the proteins involved in the replication process and their role.

The role of proteins in the DNA replication process

The processes described above are defined as the central dogma of molecular biology and are usually schematically represented schematically as shown in the following figure. It also depicts the reverse transcription that occurs in retroviruses, which synthesizes a DNA molecule from RNA.

Central dogma of molecular biology

The biological process from the perspective of computability theory

Molecular processes supported by DNA, RNA and proteins can be considered from an abstract point of view as information processes. As a result, input statements corresponding to a language are processed resulting in new output statements. Thus, the following languages can be identified:

  • DNA molecule. Sentence consisting of a sequence of characters corresponding to a 4-symbol alphabet.
  • RNA molecule – protein synthesis. Sentence consisting of a sequence of characters belonging to a 21-symbol alphabet.
  • RNA molecule-reverse transcription. Sentence composed of a sequence of characters belonging to a 4-symbol alphabet.
  • Protein molecule. Sentence composed of a sequence of characters belonging to a 20-symbol alphabet.

This information is processed by the machinery established by the physicochemical properties of control molecules. To better understand this functional structure, it is advisable to modify the scheme corresponding to the central dogma of biology. To do this, we must represent the processes involved and the information that flows between them, as shown in the following block diagram.

Functional structure of DNA replication

This structure highlights the flow of information between processes, such as DNA and RNA sentences, where the functional blocks of information processing are the following:

  • PDNA. Replication process. The functionality of this process is determined by the proteins involved in DNA synthesis, producing two replicas of DNA from a single molecule.
  • PRNA. Transcription process. It synthesizes a RNA molecule from a gene encoded in DNA.
  • PProt. Translation process. It synthesizes a protein from an RNA molecule.

This structure clearly shows how information emerges from biological processes, something that seems to be ubiquitous in all natural models and allows the implementation of computer systems. In all cases this capacity is finally supported by quantum physics. In the case of biology in particular, this is produced from the physicochemical properties of molecules, which are determined by quantum physics. Therefore, the information process is something that emerges from an underlying reality and ultimately from quantum physics. This is true as far as knowledge goes.

This means that, although there is a strong link between reality and information, information is simply an emerging product of reality. But biology provides a clue to the intimate relationship between reality and information, which are ultimately indistinguishable concepts. If we look at the DNA replication process, we see that DNA is produced in several stages of processing:

DNA → RNA → Proteins → DNA.

We could consider this to be a specific feature of the biological process. However, computability theory indicates that the replication process is subject to deeper logical rules than the physical processes themselves that support replication. In computability theory, the recursion theorem determines that replication of information requires at least the intervention of two independent processes.

This shows that DNA replication is subject to abstract rules that must be satisfied not only by biology, but by every natural process. Therefore, the physical foundations that support biological processes must verify this requirement. Consequently, this shows that the information processing is essential in what we understand by reality.

Natural language: A paradigm of axiomatic processing

The Theory of Computation (TC) aims to establish computational models and determine the limits of what is computable and the complexity of a problem when it is computable. The formal models established by TC are based on abstract systems ranging from simple models, such as automatons, to the general computer model established by the Turing Machine (TM).

Formally, the concept of algorithm is based on TM, so that each of the possible implementations will perform a specific function that we call algorithm. The TC demonstrates that it is possible to build an idealized machine, called Universal Turing Machine (UTM), capable of executing all possible computable algorithms. In the case of commercial computers, these are equivalent to UTM, with the difference that their memory and runtime are limited. On the contrary, in the UTM these resources are unlimited.

But the question we can ask is: What does this have to do with language? The answer is simple. In TC, an L(TM) language is defined as the set of bit sequences that “accepts” a given TM. In which the term “accept” means that the TM analyzes the input sequence and reaches the Halt state. Consequently, a language is the set of mathematical objects accepted by a given TM.

Without going into details that can be consulted in the specialized literature, the TC classifies the languages into two basic types, as shown in the figure. Thus, a language is Turing-decidable (DEC) when the TM accepts the sequences belonging to the language and rejects the rest, reaching the Halt state in both cases. On the contrary, a language is Turing-recognizable or RE if it is the language of a TM. This means that, for the set of languages belonging to RE but not belonging to DEC, TM does not reach the Halt state when the input sequence does not correspond to the language.

It is necessary to emphasize that there are sequences that are not recognized by any TM. Therefore, if the formal definition of language is taken into account, they should not be considered as such, although in general they are defined as non-RE languages. It is important to note that the latter concept is equivalent to Gödel’s incompleteness theorem. As a consequence, they are the set of undecidable or unsolvable problems, that is, they have a cardinality superior to the one of the natural numbers.

Within DEC languages, two types, regular ​​and context-free (CFL) can be identified. Regular languages ​​are those composed of a set of sequences on which the TM can decide individually, so they do not have a grammatical structure. Examples of these are the languages ​​of the automatons we handle every day, such as elevators, device controls, etc. CFLs are those that have a formal structure (grammar) in which language elements can be nested recursively. In general, we can consider CFLs to programming languages, such as JAVA, C ++. This is not strictly true, but it will facilitate the exposure of certain concepts.

But the question is: What does this have to do with natural language? The answer is easy again. Natural language is, in principle, a Turing-decidable language. The proof of this is trivial. Maybe a few decades ago this was not so, but nowadays information technology shows it us clearly, without the need for theoretical knowledge. On the one hand, natural language is a sequence of bits, since both spoken and written language are coded as bit sequences in audio and text files, respectively. On the other hand, humans do not loop when we get a message, at least permanently ;-).

However, it can be argued that we did not reach the Halt state either. But in this context, this does not mean that we literally end our existence, although there are messages that kill! This means that information processing concludes and that, as a result, we can make a decision and tackle a new task.

Therefore, from an operational or practical point of view, natural language is Turing-decidable. But we can find arguments that can be in conflict with this and that materialize in the form of contradictions. Although it may seem surprising, this also happens with programming languages, since their grammar may be context sensitive (CSG). But for now, we are going to leave aside this aspect, in order to make the reasoning easier.

What can intuitively be seen is a clear parallel between the TM model and the human communication model, as shown in the figure. This can be extended to other communication models, such as body language, physicochemical language between molecules, etc.

In the case of TC, the input and output objects to the TM are language elements, which is very suitable because the practical objective is human-to-machine or machine-to-machine communication. But this terminology varies with the context. Thus, from an abstract point of view, objects have a purely mathematical nature. However, in other contexts such as physics, we talk about concepts such as space-time, energy, momentum, etc.

What seems to be clear, from the observable models, is that a model of reality is equivalent to bit sequences processed by a TM. In short, a model of reality is equivalent to an axiomatic processing of information, where the axioms are embedded in the TM. It should be clear that an axiom is not self-evident, and therefore does not need proof. On the contrary, an axiom is a proposition assumed within a theoretical body. Possibly, this misunderstanding is originated by the apparent simplicity of some axiomatic systems, produced by our perception of reality. This is obvious, for example, in Euclidean geometry based on five postulates or axioms, in which such postulates seem to us evident, due to our perception of space. On this, we will continue to insist since the axiomatic processing is surely one of the great mysteries that nature encloses.

Returning to natural language, it should be possible to establish a parallelism between it and the axiomatic processing determined by TM, as suggested in the figure. As with programming languages, the structure of natural language is defined by a grammar, which establishes a set of axiomatic rules that determine the categories (verb, predicate) of the elements of language (lexicon) and how they are combined to form expressions (sentences). Both the elements of language and the resulting expressions have a meaning, which is known as semantics of language. The pertinent question is: What is the axiomatic structure of a natural language?

To answer, let’s reorient the question: How is the semantics of natural language defined? To do this, we can begin by analyzing the definition of the lexicon of a language, collected in the dictionary. In it we can find the definition of the meaning of each word in different contexts. But we soon find a formal problem, since the definitions are based on one another in a circular fashion. What is the same, the defined is part of the definition, so it is not possible to establish the semantics of language from the linguistic information.

For example, according to the Oxford dictionary:

  • Word: A single distinct meaningful element of speech or writing, used with others (or sometimes alone) to form a sentence and typically shown with a space on either side when written or printed.
  • Write: Mark (letters, words, or other symbols) on a surface, typically paper, with a pen, pencil, or similar implement. 
  • Sentence: A set of words that is complete in itself, typically containing a subject and predicate, conveying a statement, question, exclamation, or command, and consisting of a main clause and sometimes one or more subordinate clauses. 
  • Statement: A definite or clear expression of something in speech or writing
  • Expression: A word or phrase, especially an idiomatic one, used to convey an idea. 
  • Phrase: A small group of words standing together as a conceptual unit, typically forming a component of a clause

Therefore:

  • Word: A single distinct … or marks (letters, words, or other symbols) on … to form a set of words that … conveying a definite or clear word or a small group of words standing together … or marking (letters, words, …. ) …

In this way, we could continue recursively replacing the meaning of each component within the definition, arriving at the conclusion that natural language as an isolated entity has no meaning. So it is necessary to establish an axiomatic basis external to the language itself. By the way: What will happen if we continue to replace each component of the sentence?

Consequently, we can rise what will be the result of an experiment in which an entity of artificial intelligence disconnected from all reality, except from the information on which the written language is based, analyzes the information. That is, the entity will have access to grammar, dictionary, written works, etc. What will be the result of the experiment? To what conclusions will the entity arrive?

If we mentally perform this experiment, we will see that the entity can come to understand the reality of language, and all the stories based on it, provided that it has an axiomatic basis. Otherwise, the entity will experience what in information theory is known as “information without meaning”. This explains the impossibility of deciphering archaic scripts without having cross-references to other languages ​​or other forms of expression. In the case of humans, the axiomatic basis is acquired from cognitive experiences external to the language itself.

To clarify the idea of what the axiomatic processing means, we can use simple examples related to programming languages. So, let’s analyze the semantics of the “if… then” statement. If we consult the programming manual we can determine its semantics, since in our brain we have implemented rules or axioms to decipher the written message. This is equivalent to what happens in the execution of program sentences, in which it is the TM that executes those expressions axiomatically. In the case of both the brain and TM, axioms are defined in the fields of biochemistry and physics, respectively, and therefore outside the realm of language.

This shows once again how reality is structured in functional layers, which can be seen as independent entities by means of the axiomatic processing, as has been analyzed in:

But this issue, as well as the analysis of the existence of linguistic contradictions, will be addressed in later posts.