February 2014

‘Compressiveness’, back by request:

Jürgen Schmidhuber’s Theory

Jürgen Schmidhuber, an AI theorist and theoretical computer scientist, has proposed a computational account of aesthetic judgments. On his view, a stimulus is judged to be beautiful or attractive by a subject T to the extent that the stimulus is compressible for T. Schmidhuber’s notion of compressibility is taken from algorithmic information theory, but concerns actual rather than ideal compression: it refers to the actual # of bits in T’s mental representation of the stimulus, bounded and fallible as T may be. Beholden to the limitations of T’s computational resources, two kinds of stimuli should be the most compressible: stimuli with evident internal structure (e.g. fractals or a chessboard), and stimuli with noticeable similarities to stimuli already stored in T’s history1 (e.g. English words, or the sight of a friend’s face). Experimental psychology supports both a preference for stimuli with internal patterns and a preference for stimuli with a similarity to past stimuli.
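As a rough illustration of the two sources of compressibility, here is a minimal sketch that uses zlib as a stand-in for T's bounded compressor. This is an assumption made for illustration, not anything Schmidhuber proposes. The helper `code_length` and the sample strings are hypothetical; `zdict` is zlib's preset-dictionary option, used here to play the role of T's stored history.

```python
import zlib

def code_length(stimulus: bytes, history: bytes = b"") -> int:
    """Bytes zlib needs to encode `stimulus`, optionally primed with `history`.

    zlib is only a crude proxy for a subject's actual, fallible compressor.
    """
    if history:
        # A preset dictionary lets the encoder point back into stored material.
        comp = zlib.compressobj(level=9, zdict=history)
    else:
        comp = zlib.compressobj(level=9)
    return len(comp.compress(stimulus) + comp.flush())

# Hypothetical "history" of past stimuli for the subject T.
history = b"the sight of a friend's face; the word 'chessboard'; a fern's outline"

familiar = b"the sight of a friend's face"   # resembles something in the history
patterned = b"black white " * 20              # evident internal structure

print("familiar, no history:  ", code_length(familiar))
print("familiar, with history:", code_length(familiar, history))
print("patterned raw length:  ", len(patterned))
print("patterned compressed:  ", code_length(patterned))
```

The familiar stimulus gets cheaper once the history is available, and the patterned stimulus is cheap even without any history, which mirrors the two cases described above.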

There are obvious problems with this account if we take it as a full account of beauty. A chessboard, while very simple, would rarely be called beautiful. Moreover, it seems that the most profound aesthetic experiences often come from complex stimuli: the city of Rome, the philosophy of Plato or Wittgenstein, art by Picasso, Joyce or Stravinsky. Schmidhuber argues for explaining beauty as compressibility, but it may be better to identify compressibility with attractiveness, pleasantness, or niceness. Schmidhuber explains the lack of strong preference for very simple stimuli by their not being interesting, and gives interestingness a simple formal analysis in terms of compressibility. Whereas beauty (or attractiveness, etc.) is the subjective compressibility of a stimulus, interestingness is the rate at which the subjective compressibility changes over time as T processes the stimulus:

Beauty (etc.) of stimulus S for subject T = -(# of bits in T’s mental representation of S)

Interestingness of S for T = rate at which the # of bits used by T to represent S decreases over time = d(Beauty) / d(Time).
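The two definitions can be made concrete with a small sketch. It assumes we somehow have a record of how many bits T uses for S at successive moments; the function names and the sample numbers are hypothetical. Interestingness is then a finite-difference estimate of d(Beauty) / d(Time).

```python
from typing import List, Sequence

def beauty(bits: float) -> float:
    """Beauty of S for T: the negated size of T's representation of S."""
    return -bits

def interestingness(bits_over_time: Sequence[float], dt: float = 1.0) -> List[float]:
    """Finite-difference estimate of d(Beauty)/d(Time).

    `bits_over_time[i]` is the (assumed) code length of S at time step i.
    Positive values mean T's representation of S is shrinking.
    """
    b = [beauty(x) for x in bits_over_time]
    return [(b[i + 1] - b[i]) / dt for i in range(len(b) - 1)]

# Hypothetical trace: T finds regularities quickly, then runs out of them.
bits = [1000, 800, 650, 640, 640]
print(interestingness(bits))  # [200.0, 150.0, 10.0, 0.0]
```

On this reading a chessboard is compressed almost at once, so its interestingness drops to zero immediately, even though its beauty score stays high.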

Extending Schmidhuber: art, compressibility, and ‘compressiveness’ 

I think that ‘interestingness’ puts Schmidhuber on the right track, but that considering a stronger property relating a stimulus to compression progress can further contribute to our understanding of aesthetic objects in art, math, philosophy and music. Recall that (if you accept Schmidhuber’s basic approach) when a subject T encounters a novel stimulus S, T searches for ways to encode S on the basis of the extant objects in her history. (We say that an object x can be used as a basis for encoding S if the length of the shortest code for x+S is smaller than the sum of the lengths of the shortest code for x and the shortest code for S. This relation doesn’t guarantee that every history containing x can be used to effectively compress S, but this further fact should hold whenever specifying a pointer to x in the history is sufficiently cheap.) Could this search have implications beyond determining the length of T’s representation of S? I believe it could: if T’s search reveals multiple different effective ways to compress S, each using a different object in her history, this may give T an indication that she can use S to improve the compression of her extant history2. In the strongest case, it may indicate that T should use S to encode the various objects that could each have been used to encode S. More modestly, it may indicate that there is an unexploited compressive relationship between the various objects that could each have been used as a basis for encoding S.

We can define a property called ‘compressiveness’ to formalize the above idea3. Vitányi et al. present a metric called the normalized information distance (‘NID’) between two strings, defined by NID(x, y) = max{K(x|y), K(y|x)} / max{K(x), K(y)}, where K(x) is the Kolmogorov complexity of x and K(x|y) is the Kolmogorov complexity of x given y. Let us define SK(x) as the subjective complexity of a string x, such that SK(x) is the length of the subject’s actual program for generating string x. Let us then define a subjective normalized information distance, SNID, by replicating NID in terms of SK. We can now use SNID to define compressiveness:

A stimulus z is compressive for a subject T if it violates the triangle inequality SNID (x, y) ≤ SNID (x, z) + SNID (z, y) for some objects x, y in the subject’s history.

The idea is the following: Because the triangle inequality holds for the objective NID, up to a small additive error term (Vitányi et al.), if T detects that her SNID violates the triangle inequality for z, T learns that there are unexploited patterns in her history, and that z is the ‘key’ to these patterns. Compression gains reducing SNID(x, y) to SNID(x, z) + SNID(z, y) or below follow under plausible assumptions. (Prima facie, compressiveness is a stronger property than Schmidhuber’s ‘interestingness’: exposure to compressive stimuli constitutes a net reduction in the absolute # of bits used to represent one’s history.) Informally, a novel stimulus S is ‘compressive’ for T if S is tractably (for T) related to multiple other objects whose relation to one another was not independently tractable (for T). We might think of compressive stimuli as previously undiscovered ‘prototypes’ for objects in T’s history of stimuli, allowing the construction of new prototype-based concepts that cluster previously disparate objects together. (Compare with Poincaré: “The [beautiful]4 mathematical facts are those which, by their analogy with other facts, are capable of leading us to the knowledge of a mathematical law… They are those which reveal to us unsuspected kinship between other facts, long known, but wrongly believed to be strangers to one another. … Among chosen combinations the most fertile will often be those formed of elements drawn from domains which are far apart.”)
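The triangle-inequality test can be sketched in code. Since SK is not something we can compute, this sketch swaps in the normalized compression distance (NCD), the standard compressor-based approximation of NID, with zlib as the compressor. That substitution, the function names, and the toy distances at the end are all assumptions made for illustration.

```python
import zlib
from itertools import combinations
from typing import Callable, List, Sequence, Tuple

def c(data: bytes) -> int:
    """Compressed length under zlib, standing in for subjective complexity SK."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance, a computable stand-in for SNID."""
    cx, cy = c(x), c(y)
    return (c(x + y) - min(cx, cy)) / max(cx, cy)

def compressive_pairs(
    z: bytes,
    history: Sequence[bytes],
    dist: Callable[[bytes, bytes], float] = ncd,
    slack: float = 0.0,
) -> List[Tuple[bytes, bytes]]:
    """Pairs (x, y) from the history for which z breaks the triangle inequality.

    z counts as 'compressive' if the returned list is non-empty. `slack`
    is a hypothetical tolerance for noise in the distance estimate.
    """
    return [
        (x, y)
        for x, y in combinations(history, 2)
        if dist(x, y) > dist(x, z) + dist(z, y) + slack
    ]

# Toy subjective distances: T sees a and b as unrelated, but each is close to z.
toy = {
    frozenset({b"a", b"b"}): 0.9,
    frozenset({b"a", b"z"}): 0.2,
    frozenset({b"b", b"z"}): 0.3,
}

def toy_dist(x: bytes, y: bytes) -> float:
    return toy[frozenset({x, y})]

print(compressive_pairs(b"z", [b"a", b"b"], dist=toy_dist))  # [(b'a', b'b')]
```

Passing the distance in as a parameter keeps the test itself separate from the choice of compressor. Whether a real compressor such as zlib produces violations on a given corpus is an empirical question that this sketch does not settle.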

I suggest that compressive stimuli have a key role in aesthetics. A major part of modern aesthetic discourse concerns stimuli that ‘resonate’ with many previously disconnected things one has encountered, felt or thought, and in so resonating reveal new affinities between these previously disconnected things. One often praises an artwork for being ‘uncanny’ or ‘strange yet familiar’, or for being ‘richly evocative’, or for ‘concentrating a very great number of experiences’ (cf. Eliot) or ‘revealing the before unapprehended relations of things’ (cf. Shelley). These aesthetic merits are often understood to be closely related to the capacity of art to define new concepts via prototype (cf. Shelley, Coleridge, Carnap, Dilthey): when the novel affinities revealed by an artwork are sufficiently strong, one talks about an artwork ‘articulating’ a general phenomenon or pattern that is otherwise hard to pin down, or about an artwork serving as the prototype that defines a category that is otherwise hard to define. (E.g. ‘Kafkaesque experience,’ ‘Pinteresque conversation,’ ‘Orwellian society.’) The idea of compressiveness thus seems strongly implicit in certain aspects of modern aesthetic discourse, both in aesthetic theory and in the practice of literary criticism. One sees a particularly strong connection to the role of indeterminacy, hybridity and abstraction in Modernist art and literature: the ideal stimuli for ‘breaking’ the triangle inequality between two objects are the minimal (i.e. the most compressible) exemplars of a structure common to both objects (or of a structure closely resembling the respective structure of each object). For example, it is often stated that Kafka’s short stories capture a structure of experience, the ‘Kafkaesque’, that one finds in a range of disparate experiences (or conceptions of experiences), making a Kafka story equally evocative of e.g. the experience of going to the bank, the experience of being broken up with, the experience of waking up in a daze, the experience of being lost in a foreign city, or the experience of a police interrogation. It’s plausible that hybrid elements (mixtures of elements from different scenarios) work together with the structural resonance of the text to draw out these respective scenarios as candidate bases of reference, whose fitness to function as bases of reference a compression-test then confirms. The story thus functions as a nearly-minimal concrete model of the abstract structure shared by the disparate experiences that fall under the predicate ‘Kafkaesque,’ allowing one to group together experiences that embody this structure at whatever level of abstraction.

___________________________________________________________________

1 We can define a basic ‘programming language’ L for T and then define the compressibility of a stimulus S as T’s shortest known efficient program (in L) for generating S given use of T’s stored history of stimuli, or we can regard T as constantly optimizing her ‘programming language’ to match observed probabilities and define the compressibility of S directly as T’s shortest known efficient program for generating S.

2 Joint algorithmic complexity is symmetric up to an additive constant, so K(x,S) doesn’t determine K(S,x) but does put a bound on it. 

3 Of course, this only applies to imperfect compressors like humans. An optimal compressor would simply come up with S a priori.

4 Notice that a drive to discover compressive stimuli is already implied by Schmidhuber’s ‘compression progress drive,’ which we accept without modification. We differ from Schmidhuber in differently relating the compression progress drive to aesthetic judgements of stimuli, by distinguishing compressive stimuli from merely interesting stimuli as the class of stimuli whose relation to compression progress corresponds to aesthetic beauty.

5 Poincaré originally writes this of ‘the mathematical facts worthy of being studied’. Later, Poincaré writes: ‘The only mathematical facts worthy of fixing our attention and capable of being useful are those which can teach us a mathematical law. So that we reach the following conclusion: The useful combinations are precisely the most beautiful.’

(Parts of the text are taken from shared work with Owain Evans, MIT.)

It’s barely represented in TV and film that couples spend most of their private time talking weird gibberish. 

Ten years ago my taste – like, only liking things that have this alienated German-Romantic frigidness to them, in a Schlegel irony sense but more importantly a Schiller pathos sense where frigidness necessitates an emotional earthquake to constitute itself as a sublation of  –  was considered *so* masculinist, if arguably queer. Now it’s kind of the defining feminine aesthetic of the moment, with Gaga and Lana Del Rey and Lorde. (Previously seen in 2003-2006 hip hop, between when Jay-Z finished teaching everyone how to do it and when Jay-Z forgot how to do it.) 

An Integral Over The New Inquiry: ‘Facebook is affective labour for the NSA.’