Gongshi—spirit stones—along with the stone formations of Joshua Tree have been great inspirations for me when studying emergence. Glad to see someone else appreciating this connection.
Is there a proof that ∆CP will always be non-negative on the partition lattice?
Hey Thomas! So ∆CP can be negative or zero. A coarse-graining might make it worse, for instance. It's only a small subset of scales for which it's positive, and that's what gets included in the emergent hierarchy.
Ok, that makes sense (too many years of thinking about PID has clearly infected my brain). So when ∆CP is negative, that means that this particular coarse-graining has lower causal effectiveness?
Since CP (effectiveness) is non-monotonic on the lattice, I don't think any sensible ∆CP would be nonnegative. It's also *not* the Möbius inversion of CP, so it's different from PID in many ways. (Though there are other ways to create a nonnegative decomposition of causality: https://arxiv.org/abs/2501.11447)
Degeneracy is trickier to understand but, in principle, quite similar. Degeneracy would be maximal (= 1) if all the states deterministically led to one and the same state (i.e., all causes always had the same effect in the system). If, instead, every cause led deterministically to its own unique effect, then degeneracy would be 0.
I have a hard time understanding how degeneracy can be either 1 or 0 for the same situation? Did I miss something?
Ah, yes, the phrasing there is confusing. Technically correct but, on a re-read, confusing (I'll edit to add a bit of context).
So all causes -> [one effect] with p = 1 would mean that degeneracy = 1. Think of it like a die roll but all the sides are painted with the same number.
If all causes -> [a different effect for each] with p = 1, then degeneracy = 0. So think of it like there are as many effects as there are causes, and each cause "owns" a unique effect.
The inverse of degeneracy is called "specificity," and is more intuitive, in that a high specificity means that each cause has a specific effect.
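If it helps to see those two extremes concretely, here's a minimal Python sketch of the determinism and degeneracy coefficients as they're normalized in the earlier effective-information literature (the exact causal primitives in the new paper may be defined somewhat differently, so treat this as illustrative):

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits, with 0 log 0 treated as 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def determinism(tpm):
    """1 minus the average per-cause effect entropy, normalized to [0, 1]."""
    n = tpm.shape[0]
    return 1 - np.mean([entropy(row) for row in tpm]) / np.log2(n)

def degeneracy(tpm):
    """How concentrated the average effect distribution is:
    1 if all causes collapse onto one effect, 0 if every cause
    'owns' a unique effect."""
    n = tpm.shape[0]
    return (np.log2(n) - entropy(tpm.mean(axis=0))) / np.log2(n)

# All causes -> one effect: the die whose sides all show the same number
same = np.array([[1.0, 0.0, 0.0]] * 3)
print(determinism(same), degeneracy(same))      # 1.0 1.0

# Each cause -> its own unique effect: a permutation
unique = np.eye(3)
print(determinism(unique), degeneracy(unique))  # 1.0 0.0
```

Both matrices are fully deterministic; they differ only in how much the effects overlap, which is exactly what degeneracy measures.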
Thanks for the clarification, and I <3 you!!! You made my day and then some.
This is fascinating, thanks. I'm interested in understanding emergence and social networks and things for my own research interests (ethics and social epistemology, mainly). Beyond the readings you've mentioned, could you recommend any (additional) primers on information theory, network science, complexity science?
Here you go: https://fernandonogueiracosta.wordpress.com/wp-content/uploads/2015/08/yaneer-bar-yam-dynamics-of-complex-systems.pdf
Bravo! Well done, and beautifully explained. But I confess to some minor annoyance at the end that you defer applying this theory to the question of free will. I hope you can rest from your labors for a bit, then gird your loins and come back to help clarify this controversial issue, which has been muddled in just about every possible way, from the microscale to the macroscale....
Thank you. And fair on the annoyance - I just figured that after 5,000 words I'm pushing the limits of readability.
Basically, I think free will is just when your consciousness (or its neural correlates, if we are to be metaphysically neutral) has a strong causal contribution in the hierarchy of scales. Since presumably it must be relatively macroscale, this means that free will requires a "top-heavy" macroscale (or at least, a somewhat top-heavy one). Predictions would be that, e.g., the brain is much more top-heavy in its operations than other parts of the body.
Is this sufficient for free will? No, I think not. But I think it is necessary for it. But at some limit of necessary conditions, I think you get a sufficient condition. So for instance, if a system were: (a) top-heavy in this causally-emergent fashion in a way that corresponded to its neural correlates of consciousness/mind, and (b) had source irreducibility (it cannot be predicted except by its own actions), and (c) had the ability to consider things like counterfactuals... THEN I think this is probably sufficient for free will.
To be clear, I was gently teasing--I'm fully aware that the topic of free will would require considerably more space (and energy) to handle. I agree with your take above, but I think it would be interesting and valuable for your readership to engage the question in a more thorough-going way. Especially since Robert Sapolsky's "Determined" has become rather overwhelmingly popular ...
Bravo! I wonder about this applied to connectome studies.
Cool! I hope you win the Turing prize for this.
1. Practically, you can compute ∆CP for a current-level node in a bottom-up fashion, and only need to look at significant-∆CP lower-level nodes, not all of them at the lower level (and certainly not 2 or more levels below), right?
2. Is there any "physical" isomorphism to the brain, in going from a network of neurons' adjacency matrix to a TPM? Is this part of how the brain does classification?
3. Is there any physical isomorphism to the brain, in going from a lower level of coarse-graining to a higher level and picking out the highest-∆CP nodes?
4. Is there any physical meaning to the edges between highest-∆CP nodes between levels?
Kind words, ty!
1. Yes, you can just look at all the paths to the current node.
2. I'm not sure exactly what an isomorphism would imply here, but I would be very unsurprised if there are strong relationships to things like classification and causal emergence.
3. I would guess maybe concept formation, for the same reasons as (2) (or maybe it could be worked out with Neural Darwinism).
4. There's "physical meaning" to every edge in that it represents some refinement, i.e., it represents the spatiotemporal structure becoming coarser (as you go "up" the edge) or finer (as you go "down.") But I don't know of any specific physical meaning other than that.
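On (2), for readers who want something concrete: the usual move in the network causal emergence literature is to row-normalize a weighted adjacency matrix so that each node's out-weights become its effect distribution. A minimal sketch, with illustrative names (the sink-node handling here is one common convention, not the paper's prescription):

```python
import numpy as np

def adjacency_to_tpm(adj):
    """Row-normalize a weighted adjacency matrix into a transition
    probability matrix: each node's out-weights become its effect
    distribution over the other nodes."""
    adj = np.asarray(adj, dtype=float)
    row_sums = adj.sum(axis=1, keepdims=True)
    tpm = np.divide(adj, row_sums, out=np.zeros_like(adj), where=row_sums > 0)
    # One convention for sink nodes (no out-edges): give them a self-loop
    # so every row remains a valid probability distribution.
    sinks = row_sums.squeeze() == 0
    tpm[sinks] = np.eye(adj.shape[0])[sinks]
    return tpm

A = np.array([[0, 2, 1],
              [1, 0, 1],
              [0, 0, 0]])
print(adjacency_to_tpm(A))
# rows become: [0, 2/3, 1/3], [1/2, 0, 1/2], [0, 0, 1]
```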
Really fascinating work, Erik! It's a shame this wasn't published a few months earlier; I wrote an essay for the Berggruen Institute competition (the one you were tapped for and shared information about here on your blog) that draws on your Causal Emergence 1.0 paper with effective information. The basic premise argues for using a philosophical framework rooted in dialectical materialism to describe consciousness as an emergent process, and it relates the scientific work in causal emergence you've done to that. It's not published, obviously, as I know there are restrictions for the competition, but if you were interested in any philosophical work that draws on causal emergence to support its framework, I'd be happy to share the material with you! I think what your work here argues for still supports what I wrote about, and it may even have the potential to enhance my argument further.
This is very interesting as always, but I have a hard time understanding how irreducibility is deduced:
- The definitions of the causal primitives (determinism & degeneracy) lead to a score, ∆CP, for each scale's causal contribution.
- If this measure goes up at a particular scale, we then say that scale has emergent causality that is not reducible.
But on an intuitive level, we can still get the intermediate scale from the microscale.
So how do you interpret the gain in ∆CP at that scale? Is it that it’s simply a better description of the whole system at that scale or that there is really something being explained at that intermediate level that is simply impossible at the microscale?
I think something is being "added" or "explained" - the error-correction itself.
In terms of interpretation, I would say causal emergence is a kind of "non-mysterious irreducibility."
You might very sensibly question if this is a real category!
I think it is. Let's take a completely different scenario, but one that I think inarguably demonstrates "non-mysterious irreducibility": the joint causation of an XOR gate.
An XOR gate's output cannot be reduced to just one of its inputs. In this, it is different in kind from an AND gate, e.g., where you can get some information about the output from a single input. If one AND input is 0, then the AND gate must be outputting 0. But if just a single XOR input is known, we cannot, in principle, say anything about the output. The output is totally irreducible to the lone input. But this irreducibility is "non-mysterious" in that, once you understand how an XOR functions, you can trace exactly how it happens.
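To make that asymmetry concrete, here's a small Python sketch (illustrative, not from the post) computing the mutual information between a single input and the output under uniform independent inputs: it comes out to exactly 0 bits for XOR and about 0.311 bits for AND.

```python
import itertools
from math import log2

def single_input_mi(gate):
    """I(output ; input A) in bits, for uniform independent bits A and B."""
    joint = {}  # joint distribution over (A, output); each (A, B) pair has p = 1/4
    for a, b in itertools.product([0, 1], repeat=2):
        key = (a, gate(a, b))
        joint[key] = joint.get(key, 0.0) + 0.25
    p_a = {a: sum(v for (x, _), v in joint.items() if x == a) for a in (0, 1)}
    p_o = {o: sum(v for (_, y), v in joint.items() if y == o) for o in (0, 1)}
    return sum(v * log2(v / (p_a[a] * p_o[o])) for (a, o), v in joint.items())

print(single_input_mi(lambda a, b: a ^ b))  # XOR: 0.0 bits
print(single_input_mi(lambda a, b: a & b))  # AND: ~0.311 bits
```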
Analogously (not identically) I think causal emergence is of this same kind of "non-mysterious irreducibility." The macroscale causal relationships really are [stronger / more powerful / offer better explanations / etc.] but they are so via error-correction. You can trace exactly how that happens, how the noise gets minimized, how the specificity increases, etc., but this doesn't shift the gain down to the microscale, it is just the explanation for how the gain happened.
I guess I would have to think about it more. The first thing that comes to mind (pushing the analogy) is that all propositional connectives are reducible to NOT (~) and OR.
XOR is then: (A AND ~B) OR (B AND ~A), with AND itself defined from OR and NOT.
The gain at the intermediate scale here would be that you can build a combination of connectives from basic ones where some property that was present initially is no longer valid.
Meaning that initially (with OR and NOT), the input always gives some info about the output, but when we combine them, this is no longer the case. I guess we can call that emergent, but I am not sure about causal.
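For what it's worth, the construction above does check out mechanically; a tiny sketch verifying the OR/NOT build of XOR on all four input pairs:

```python
import itertools

# AND built from OR and NOT via De Morgan: x AND y == NOT(NOT x OR NOT y)
def and_from_or_not(x, y):
    return not ((not x) or (not y))

def xor_from_or_not(a, b):
    return and_from_or_not(a, not b) or and_from_or_not(b, not a)

for a, b in itertools.product([False, True], repeat=2):
    assert xor_from_or_not(a, b) == (a != b), (a, b)
print("XOR = (A AND ~B) OR (B AND ~A) verified on all inputs")
```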
Erik, you've sent me into a deep thought spiral with this one, trying to figure out which layers of biology are more causally 'weighty'.
I feel the molecular-bio paradigm biases us to think it's all epiphenomenal, just 'too complex to model,' whereas here you're formalising how different layers can be irreducible to the layers below. It would explain a lot, and it suggests avenues for controlling biology that aren't simply bottom-up.
Isn't your concept of multiscale descriptions just fractality? People usually conflate fractality with its most famous form, self-similar fractality.
So there's the broad multiscale description (the "un-hewn" set of partitions/scales) and then there's the emergent hierarchy (the "hewn" set of remaining partitions/scales that do causally contribute). Which do you mean here?
I think the closest thing to self-similar fractality would be the case of "literal scale-freeness."
I mean the first one. But I foolishly wrote this comment before reading your whole entry. It's great! Sorry and thanks.
Good instincts though!
Reading this was interesting, and I gave it a like, but it reminded me of my general unease with the entire STEM endeavor, where there's always this rush to move from "how can we understand nature" to "how can we manipulate it" with little regard for how small a place we have in this larger system that we have this manic impulse to tinker with. That desire to control always bites us in the ass.
Real
I see what you mean, and part of what motivated me to work on this is a similar worry: given that we are already intervening on so many complex systems, can we create a way to make sure certain parts preserve their agency (like an ecosystem retaining causal power, vs all causal power residing with humans). That might require being able to engineer emergence to enforce stability of causal power at the desired level.
I had the same thought.
Could this approach be applied to find a good way of dividing a phenomenon with gradients into discrete parts? ("discrete parts", which normies call "things" btw)
For instance, we can easily separate the world map into ocean and land, since the coastline forms a relatively clear boundary. But if we want to divide it further into ocean, plains, and mountains, could this method help us build a meaningful model?
I’m thinking of this example in relation to Turchin’s work on the historical significance of the horse and the Eurasian steppe, as well as Tomás Pueyo’s hypothesis that mountainous regions were where tropical societies developed most successfully prior to the industrial revolution.
Also, can this be used to test your theory that there is a phase change around the Dunbar number, separating small tribes, where members know each other personally, from large-scale societies that need institutions?
What if you can find and demonstrate that many third world nations have lots of government corruption because the most important scale is the Dunbar scale instead of the national scale?
There is a generalized Dunbar number for each coarse-grained level. In fact, the original Dunbar number is probably a product of the family DN and the tribe DN (the DN for a family is about 5 humans, and the DN for a tribe is about 30-50 families). And yes, there is a total analogy between free-rider management by a level (without it you get corruption or other problems of misalignment between parts and the higher level) and error correction in a brain or ANN.
Fantastic! So it would be possible to study actually existing societies and adjust constitutions, laws, and policies accordingly? Or to study whether Colin Woodard's 11 North American nations actually exist?
I think there is much to be done before we tackle the problems you described. Let's not get ahead of what we know.