
  Beyond Shannon's Channel 

A.G.Booth     Original 15 March 1994     Copyright A.G.Booth, London 1994-2002 All rights reserved
First Web published 7 July 2002
Document ident: Last updated 9 December 2005 Beyond Shannon's Channel. A.G.Booth
Keys: cybernetic physical observation observer emergence engineering attractor information theory

Musings following reading and dialogue on the newsgroup in 1994.
With thanks to the many contributors, especially John Collier, Koichiro Matsuno, Bob Ulanowicz,
and above all Tom Schneider for running the newsgroup.



**    The Limits to Shannon's Channel Concept

Shannon's work on information theory is based upon the idea of a channel connecting a sender to a receiver operating under a certain type of independently defined overall purpose. This contains a fixed sense of objectivity, and in particular presumes the ability of the observer to consider specific states of the receiver in an objective way. The measures of information resulting from this theory are limited by this restriction to a single global point of view of mechanism and its purpose and cannot accommodate changes in this point of view, especially generalised ones. It is only under this sort of constraint that the theory is made to conform to a universal notion of value and hence ceases to concern itself with value in general.

As far as it goes the Shannon theory does a wonderful job, but the reality of complex systems (they are in some senses by definition unknowable) cannot be expressed sufficiently well without first expanding the scope of the system model. The need for a minimum of two stages of expansion of this scope can be identified in:

  • a)   The explicit identification of a framework of state interpretation in (at least) the receiver.
  • b)   The accommodation of a sense of self definition of structure in (at least) the receiver.

Shannon's approach starts with an assumption that analysis is freely available somewhere so that an a priori probability has some validity as a basic value underlying the model. In the important cases of either those processes which operate in their own right or the processes which constitute our own mental mechanism this assumption rests on nothing. It is only valuable when the thing being viewed has purpose in terms of the consciousness of the speaker (analyst, theoretician, agent). In particular it was valid for the Shannon scenario of gaining maximum economic performance from artificial signalling channels with fixed artificial structures. For us to handle the emergent properties of autonomous and even self adapting processes then the state of the receiver in both structure and content must be treated as the essential a priori factor, not the state of the mind of the analyst or theorist who is delivering the theory. Furthermore, this total state of the receiver is something which is always subject to errors of estimation and descriptive expression. We are more concerned to embark upon an operational task of synthesis than to face an objective problem of analysis.


    **    Natural Channels

    If we look at the sorts of signalling channels found in nature there are some which are far from the ideal format suggested by the Shannon theory. In one form Shannon's channel capacity limit theorem states:

    C = W log2(1 + S/N)

        where   C   is channel information capacity in bits/second,
                W   is available bandwidth on the channel in Hertz,
                S   is received signal power,
                N   is received noise power.

    This suggests that the best sorts of channels should make use of most of their bandwidth, that is they should transmit signals covering the entire spectrum and coded in certain ways, because that way they could achieve good economy.
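    As a quick numerical illustration of the capacity limit, the following sketch evaluates the formula above. The figures used (a telephone-grade channel of roughly 3 kHz bandwidth and a 30 dB signal-to-noise ratio) are my own illustrative assumptions, not taken from the text.

```python
import math

def shannon_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Shannon's limit C = W * log2(1 + S/N), in bits per second."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)

# Illustrative figures: ~3 kHz of bandwidth, 30 dB signal-to-noise ratio.
snr = 10.0 ** (30.0 / 10.0)                 # 30 dB -> S/N = 1000
c = shannon_capacity(3000.0, snr)
print(f"capacity ~ {c / 1000.0:.1f} kbit/s")  # ~ 29.9 kbit/s
```

    Note that the capacity grows linearly with bandwidth but only logarithmically with signal power, which is why the theorem rewards wideband coding over simply shouting louder.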

    In nature we observe many types of signalling which appear not even to attempt this optimum. Practically all animal calls, and human spoken conversation too, deviate far from the ideal in this sense. Why might this be?

    Much seems to rest on the different needs between on the one hand establishing or maintaining contact, and on the other hand carrying out communications under the assumption of that contact. Shannon's theory is directed only at situations which can be characterised by statistical models with constant parameters whereas the need to make contact involves change. This usually leads to some form of idiosyncratic methods of using available channels, and these methods are almost as far as possible from the optimum for continuing communication. Indeed, wherever there is change occurring in the circumstances of the parties to a communication there tends to be some need for revision or maintenance of the channel(s), and this leads to a need for methods of use which would be called redundant by the standards of Shannon's theorem. But are they really redundant? If you want to experience the broad nature of communication using such so called "redundantly coded" channels then consider your own reactions to the various voices which can be heard in any cocktail party situation, and how the overall scene relies for its decoding upon various relativities of timbre and loudness of voice before verbal decoding becomes practically useful.

    Perhaps our use of the word redundant at this point often just shows how little we understand about what the purpose of the communication really is. We would perhaps do better to talk of "Shannon redundancy" in the same sense that a heat engine can be quoted as having a "Carnot efficiency". That way we would at least reduce the spurious sense of self proclaimed universality in our expression.

    In complex systems and in living things there is an essential sense of change occurring. The assumptions of closed systems, of equilibrium models and even of the second law of thermodynamics are by this token distanced from the domain of real value in living systems. They can apply only to what has to some degree already happened. If we are either as biologists to understand or as engineers to serve living things then a restriction to those types of communication which are characteristic of such fixed systems does not seem suitable! Where life is burgeoning then change must be occurring, and there our models and methods must still be able to operate.


    **    Transcending the Limits of Common Rationalistic Views

    Subjectivity is not the opposite of objectivity in this game; it is more a sort of mirror image of it reflected in some actual framework of interpretation (study of this is called hermeneutics). The person who speaks of either subjectivity or objectivity without qualifying it is implying the assumption of a universal model, and if that model were not available then neither term would have value. In the face of this difficulty the best we may be able to do is to base a local version of objectivity upon explicit agreement of a model rather than on its "universal truth". Unfortunately the sense of universal truth is so appealing that it is often accepted as the basis for work when the more laborious and painful path of establishing agreement of a model would be the honest and ultimately more valuable course of action, and sometimes would even lead to the simpler decisions for action. Also the path involving the use of strong local model creation demands a higher level of basic respect between parties to the communication than does any sort of externally obtained and therefore, in the given local sense, universal model. Nevertheless, in any handling of complexity the judgmental compromise between assumption of absolute given models and the creation of new ones is critically important.

    In biology and more recently in engineering too we wish to interact with the basis of identification of purpose in our models of processes. Because of this the generalisation which is needed to formulate the requisite new models involves a denial of the simple objectivity of Shannon's fixed purpose channel. The old scientific style of study of knowledge (epistemology) from a presumed universal frame of view needs to expand to include some degree of study of models of our self or our point of view (ontology).

    The work of the biologist Humberto Maturana (plus Varela and others around 1978) is reported to deal with these issues though not necessarily in these precise terms. I have not read the original work (it is a shame that it is no longer in print), but have encountered their biological view transformed into a sort of modern extension of artificial intelligence by Winograd and Flores in their book "Understanding Computers and Cognition" Ablex 1986. Central to these ideas is a concept which Maturana apparently termed "autopoiesis", which roughly speaking means the ability for a system to define its own boundaries or structure. Thus the Winograd and Flores work is ironically an engineering re-appraisal of what was originally biological systems work, so biologists and engineers are in this together!


    **    How shall we Model

    This is probably not the only possible basis for the expression of a model, but we might start in the old fashioned way with the classical model of a state idealised as an infinitesimal region (a point) located somewhere in a continuously occupiable space (such as Euclidean n-space). If we do this then we must pay for the extravagance of this idealism by foregoing certainty as to the location of that point. In other words we can admit as reality nothing but probability distributions over the state space (and even that still leaves a further problem needing attention regarding the limited certainty of the definition of the form of the state space or system structure itself, but leave that out for now). The need to curtail the model in this way arises as a consequence of the extravagant use of the infinitesimal concept of point state to describe a system for which we could never afford infinite means of description. If we introduce redundant form to any model we must take care not to impute real meaning to the consequent excessively rich structure in our dealings with the model. I would prefer a model which started out by avoiding the redundancy, but our academic tradition is so fixated on using real numbers as our basis that we have to carry their unwanted apparatus for defining a sense of infinite limit on our backs even whilst we travel in lands where no such limits confront us.

    Both biologists and engineers have difficulties with moving beyond expressions based on fixed models, though I believe for slightly different reasons in their respective histories (biologists inherit morphological study habits, and engineers, modernist industrial habits). The need to posit meta-models in the form of spaces in which the real models may be variously conceived challenges the constant and hence independent basis of value. This prospect of the loss of a safe and reliable position of universal view often tends to evoke an antagonistic reaction. Today that is an unwarranted form of intellectual laziness in both of these disciplines. If universality of view is so important then meta-models of model spaces are a necessary minimum tool. Speaking for myself as a working engineer I am less finicky; much good work, and possibly even the better work, can be done with less pure forms of expression as is typical of pragmatic critical tradition in many of the arts. The growing use of "fuzzy logic" may be a reaction to this need; I have not yet been able to reach a conclusion in this respect about it.


    **    Thermal Noise

    As mentioned above, we can choose to conceive a practical system as having a state which can be treated as a vector point defined in a continuous space. We may often also need to consider it as existing at a practical temperature which is finite and not zero. In this model we can also accommodate many pseudo thermal situations and other forms of endemic noise where we choose to regard any disturbing influence of high dimensionality as having no predictable structure beyond some characterising distribution. As an immediate consequence of this model it is apparent that there is no bound to the energy and hence also the negative entropy required to be received from a channel in order to define with indefinite surety even a single binary state, because it would necessitate establishing a probability distribution over the state space in the form of an impulse (Dirac) function.

    Thus practical transmission, expression or observation of discrete data must involve some degree of loss of certainty of its value. Some conventional entropic measures of the p.log(p) sort from Shannon theory can be established to characterise these uncertainties of expression very nicely because thermal noise can be characterised well by rather low order probabilistic models. However this does not concern the actual patterns or values existing in such systems but only the costs and uncertainties of expression. So things like Kolmogorov complexity measures relating to discrete valued patterns of data are in a different domain from the measures of their observational certainties, and in real systems we are typically concerned with both domains.
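    The separation of these two domains can be made concrete with a small sketch. Here two sequences have identical symbol frequencies, and hence identical p.log(p) entropy, yet very different internal patterns; a general-purpose compressor (used here only as a crude stand-in for Kolmogorov complexity, an assumption of mine rather than anything in the text) tells them apart where the entropy measure cannot.

```python
import math
import random
import zlib
from collections import Counter

def shannon_entropy_bits(data: bytes) -> float:
    """Empirical per-symbol entropy, -sum p*log2(p), from observed frequencies."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

regular = b"ab" * 500                  # strictly alternating pattern
mixed = bytearray(regular)
random.Random(0).shuffle(mixed)        # same symbols, scrambled order
mixed = bytes(mixed)

# Identical frequency statistics, so identical Shannon entropy:
print(shannon_entropy_bits(regular))   # 1.0 bit per symbol
print(shannon_entropy_bits(mixed))     # 1.0 bit per symbol

# The compressor distinguishes the patterns that the entropy cannot:
print(len(zlib.compress(regular)) < len(zlib.compress(mixed)))  # True
```

    The frequency-based measure characterises the cost and uncertainty of expressing the symbols; it says nothing about whether the pattern they form is trivially regular or incompressibly complex.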

    As a curious counterpoint to this sad necessity of loss of certainty in observation we can cite the possibility of certain types of transmission channels wherein the structural definition of the channel is achieved using media at some practical temperature, but for which the coupling between the transmitted energy and the thermal ambient of the transmission structure is at least in principle indefinitely low. I suppose we might consider these as "tightly coupled channels" in that they are indefinitely insensitive to their surroundings. However, to put this possibility into perspective, transmission losses in a channel militate against this form of low noise coupling according to a simple relationship involving the loss ratio and the ambient temperature of the transmission structure. On this basis we can distinguish between channels at the tight coupling end of the range from those at the loose coupling end. Some molecular couplings can apparently approach remarkably close to the tight coupling ideal, and the uncertainty can then presumably also approach close to the lower limit set by the Heisenberg quantum mechanical uncertainty induced by observation of results of state changes through the discriminative nature (i.e. the value paradigm) of the structure.


    **    A Basis of Interpretation

    In practical reception of discrete data the receiver must reconstitute discrete states given only noisy signals. Such a system must contain some sort of state dynamics which will enhance or select for discrete values of the state vector. For this to be possible there has to be a built in preference for some states as against others, and further, the dynamics must be energy lossy so that disturbances of state from the ideal preferred values (presumably at energy wells) shall sooner or later become dissipated.

    In such a system the framework of interpretation of data is usually at least to some degree implied by the structure of this dynamic state mechanism in the receiver, and also the sense in which occupancy of particular energy wells by the state point is to be regarded as the desirable result. If for instance the detailed trajectory of state as a whole were all that was required then an analysis of informational value based upon energy well occupancy would be spurious. This demonstrates the way in which an autopoietic system will interfere with any objective model which we might choose to use as the basis for Shannon information assessments. It will merely make nonsense of our measures and metrics if we disregard its implicit scheme of structure.
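    A minimal sketch of such a lossy, state-selecting dynamics can be given by assuming a one-dimensional double-well potential as the receiver mechanism. The function and its parameters below are illustrative inventions of mine, not from the text: the two wells at x = +1 and x = -1 stand for the receiver's preferred discrete states, the dissipative dynamics soak up disturbances, and a weak noisy input biases which well finally captures the state point.

```python
import random

def settle(x0: float, signal: float, steps: int = 2000, dt: float = 0.01,
           noise: float = 0.05, seed: int = 1) -> float:
    """Relax a state point under the double-well dynamics dx/dt = x - x**3.

    The dynamics are dissipative: disturbances decay, and the state is
    drawn into one of two energy wells, at x = +1 or x = -1, which play
    the role of the receiver's preferred discrete states.  A weak input
    `signal` biases which well captures the state.
    """
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        drift = x - x ** 3 + signal        # gradient of the double well, plus input
        x += dt * drift + noise * rng.gauss(0.0, 1.0) * dt ** 0.5
    return x

print(settle(0.0, +0.2) > 0.8)   # True: captured by the +1 well
print(settle(0.0, -0.2) < -0.8)  # True: captured by the -1 well
```

    Notice that the "information received" here is only meaningful relative to the well structure built into the receiver: the same trajectory read against a different partitioning of the state space would yield a different answer.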

    To deal with this latter problem we must first set up a meta-model, recognising the receiver structure itself as having a state, and then consider the value of information in the context of this usually hypothetical structural state (c.f. synaptic state of a neural net). An immediate consequence of this approach is that any case where the structure imparted to a signal at its transmitter is not sympathetic to the current structure of the receiver dynamics can grossly attenuate the value of information transferred, even if noise and other such considerations approach perfect conditions. In this sense the information which can be received is strictly relative to the framework of its interpretation in the receiver.

    Worse still, there is no fundamental (i.e. generally objective) sense in which this information can be measured because as observers we might postulate different interpretations of value corresponding to state in the receiver. In practice this problem must be handled by means of some additional source of definition of what is good value. We might consider the framework based upon state space attractor points all having say an equal value, but that is frequently clearly not a practical basis (systems often carry lots of nonsense regions of state which are hardly ever entered, and which mean almost nothing in value). We might consider the entropic framework based upon the probability density distribution of occupancy of the various attractor regions. This is more interesting, but it defies objective definition in practical cases because of its self referential nature in the absence of a structural definition of the intent of the transmitter agent.


    **    Conclusion

    It is from this intellectually somewhat insecure point that we must set out upon the modern (should I say post-modern?) pioneering mission to formulate and discuss models for complex systems, not just using rigid single model objectivity, even when it is as elegant as Shannon's theory.

