The world has changed quite a bit since I last wrote a blog post.
Less than a year ago, it seemed like crypto was the future and everyone was scrambling for a piece of the next gold rush. Now everyone is telling each other that crypto is dead, that it was another bubble without any intrinsic value, and that ChatGPT and AI are the future; AGI is coming for our jobs, and so on.
For the past couple of months, in the midst of one of the world's largest crypto collapses, I took my time trying to correctly assess this renewed hype cycle. Interestingly enough, ChatGPT launched last November, exactly when venture capital money was desperately looking for an exit from the Terra and FTX drama – perfect timing to keep the market hype going.
The underlying technologies for ChatGPT (and other generative models) have been around for years; more specifically, the Transformer architecture behind GPT – short for "Generative Pre-trained Transformer" – was released by the Google Brain team back in 2017. Side note: the Transformer was proposed as an attention-only model architecture to mitigate the problems of squeezing an entire input sequence into a fixed-length context vector and of vanishing gradients – basically allowing the model to "pay attention" to the parts of the input that are weighted higher as needed, but without the overhead of bolting attention on top of existing architectures such as LSTMs or Seq2Seq models. I won't go too deep into implementation details here as that is outside the scope of this post.
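For reference, the core operation is short enough to sketch in a few lines. This is a stripped-down, single-head version with no masking and no learned projection matrices, written in plain numpy purely for illustration:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Single-head attention: every query position attends over all key
    positions and returns a weighted sum of the value vectors."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query/key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                               # weighted sum of values

# Toy self-attention over 3 tokens with 4-dimensional embeddings
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(x, x, x).shape)   # (3, 4)
```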
OpenAI chose this exact moment to launch an interactive chat interface built on GPT-3.5, their large language model (LLM), triggering a Cambrian explosion of products and new research developments in generative AI. Y Combinator applications were filled with ChatGPT wrappers that more or less build on the same API and, in theory, are capable of exactly the same thing.
Setting aside this renewed hype cycle, however, let's take a step back: now that the forbidden genie is out of the bottle, what could actually happen? Is this really another iPhone moment? Did we pull out the term "Web 3.0" too early, as some investors might say?
What even is thought and intelligence, anyways?
Since the inception of human civilization, we as a species have developed a unique tool no other creature on this planet has ever acquired: thought, otherwise known as intelligence.
This event is so important that it is even cited by the Bible as the cause of humanity's intrinsic sin and evil. But we usually don't give a second thought to why and how we were given the ability to think, as it is so prevalent and such a defining feature of us as humans. No alien species we know of has ever acquired intelligence either.
At least, until now.
The reason ChatGPT gathered so much attention in such a short period of time is that it's the first publicly available service that comes even remotely close to being considered to have human-like intelligence. Well, sort of. AI researchers will always tell you that large language models do not have intelligence like we do; rather, they are only making a probabilistic guess, learned from their training data, about which word (or token) is the most likely to come after a given set of inputs.
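To make that "probabilistic guess" concrete, here is a toy illustration; the vocabulary and scores are entirely made up, not taken from any real model:

```python
import numpy as np

# A made-up vocabulary and made-up scores (logits) a model might assign to
# candidate continuations of the prompt "The cat sat on the".
vocab  = ["mat", "moon", "dog", "roof"]
logits = np.array([3.1, 0.2, 0.7, 1.9])   # hypothetical values

probs = np.exp(logits - logits.max())
probs /= probs.sum()                      # softmax: scores -> probabilities

for token, p in zip(vocab, probs):
    print(f"{token}: {p:.2f}")

# A model picks (or samples) the next token from this distribution, appends it
# to the input, and repeats – that is the entire generation loop.
print("next token:", vocab[int(np.argmax(probs))])   # "mat"
```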
But for all intents and purposes, ChatGPT (and its derivatives, like Bing Chat) seems like it has some sort of intelligence – hence the name artificial intelligence. If a language model can write university essays or pass bar exams better than humans do, what does it even mean for a language model to technically not have intelligence – when, in practice, it can definitely be used to replace human intelligence in some capacity?
To properly answer this, we need to bring back some axioms buried deep, deep down in ancient philosophy – from a time when science still wasn't properly separated from philosophical research and practice.
Axiom 1: thought and intelligence are an invention and a tool; they are not something we are born with.
Few people ever confront this, but thought itself is a human invention. In other words, we were taught to think the way we think, talk the way we talk, and act the way we act. We were taught to make logical conclusions. We were taught to write and draw. We were taught how to hang out with other people. We were even taught how to use the toilet properly.
A great example that demonstrates this is feral children. There are very few cases of children being raised by animals completely isolated from humans, but when it happens, they act more like the animals that raised them and not like other humans. This means the concept of thought and intelligence is in itself an invention – or a discovery, depending on how one sees it – like the first human-controlled fire.
So before humanity acquired the ability to think, who taught the first human how to think and draw logical conclusions?
Bing AI gave me the following answer:
> The question of who taught the first humans how to think is a complex one. According to a source, early humans could express thoughts and feelings by means of speech or by signs or gestures. They could signal with fire and smoke, drums, or whistles.
>
> One possibility is that humans learned to think through social learning via technological innovations across generations - known as cumulative cultural evolution. Another possibility is that humans evolved the ability to think as they developed more complex tools. However, there is no clear answer as to who taught the first humans how to think.
Clearly, recent research says intelligence came about as part of evolution, but it does not know where intelligence came from – i.e. it is impossible to pinpoint who acquired intelligence first. But that is enough to demonstrate my point here: thought in itself, regardless of how it came to be, is the result of evolutionary algorithms optimizing for given environments. Generations of humans needed to invent and teach thought to later generations, and iterate on it over time.
Once we understand that intelligence is not something we were simply given, and is a form of invention arising from incremental, evolutionary iterations across generations, it becomes easier to understand the following arguments.
Axiom 2: thought and intelligence may have different forms, but they build on two common foundations: intuition and logic.
This is trivial, but extremely important to properly assess before moving on.
Like intelligence, logic and reasoning are also a form of invention, built on top of intelligence. They are tools to understand, analyze, and propose solutions to a given problem by defining a minimal set of deterministic and universal truths, and explaining everything as a combination of these universal truths.
While it is also impossible to pinpoint exactly where logical reasoning started, we know why it is required: human minds are extremely subjective and limited in single-session processing (i.e. no human can keep all the information required for logical reasoning in memory), so we need to start from a set of information that can be universally accepted by any educated human. We record this information in another human invention we call writing and language, such that other humans can pick up where others left off to solve problems no single human could ever achieve on their own.
However, what we inherited from our animal ancestors – intuition – is also an important part of our intelligence. We cannot explain art, music, and our own emotions without it. We have learned to weave these two foundations together to create an extensive collection of knowledge, culture and creations – something we thought machines would never be able to replicate.
Not for long.
Thinking is an optimization problem, and humans suck at it
Quite literally, thinking is an optimization problem.
Any problem can be denoted as an optimization problem. Humans developed logical thinking as a tool to efficiently tackle this; with this incredibly powerful tool, our species has been able to conquer diseases, build entire cities and civilizations, and even set foot on the moon.
However, fundamentally, any solution to any problem is inferior to solutions that tackle the optimization problem directly.
Argument 1: all forms of intelligence are tools for solving an optimization problem, and different forms of intelligence solve for different optimization problems.
This is where things get incredibly interesting.
We are used to accepting as universal truths only what can be explained (i.e. composed from existing logic that is proven to be true) with our own logic and knowledge, as we know our brains often make incorrect, biased conclusions if we do not. Any other form of argument or statement, regardless of how convincing it sounds, is not accepted as a universal truth until it can be proven with logical thinking – whether through scientific (experimental) or mathematical (logical) methods.
Unfortunately, what acted as protection against human cognitive bias for thousands of years is no longer relevant once we have enough computation power.
Why? Well, think about it this way:
The answer to any problem we want to solve or optimize for can be denoted as a mystery vector. We don't know what dimension it lives in, how long it is, or in what units it is expressed.
What matters is that the ground truth for anything can be described as a probability cloud with multiple perspectives, and therefore, as long as we have enough scalar values to represent our mystery vector such that it is close enough to the ground truth, that is enough for our purposes.
Now, the way any deep learning algorithm works is through iteration. You have the ground truth encoded as some vector defined internally (done with encoder and decoder layers), and you run a computer program that calculates how much loss a single guess has against the ground truth. The program shifts its vector values in the $n$-dimensional space to make that loss value smaller, and iterates this over and over again. Often the algorithm incorporates test values not included in the training dataset to make sure it is not learning undesirable noise traits (known as "overfitting") and is only learning what is actually common across the desired dataset, i.e. the ground truth.
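Here is a minimal sketch of that loop, using plain numpy, a made-up linear ground truth, and a held-out slice of data to check for overfitting (all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic "ground truth": y = 2x + 1, plus a little noise.
X = rng.uniform(-1, 1, size=200)
y = 2 * X + 1 + rng.normal(scale=0.1, size=200)

# Hold out part of the data to check we aren't just memorizing noise (overfitting).
X_train, y_train = X[:150], y[:150]
X_test,  y_test  = X[150:], y[150:]

w, b = 0.0, 0.0   # the model's guess, nudged toward the ground truth each step
lr = 0.1          # learning rate: how far to shift the guess per iteration

for step in range(500):
    err = (w * X_train + b) - y_train
    loss = (err ** 2).mean()              # how far the guess is from the ground truth
    grad_w = 2 * (err * X_train).mean()   # direction that reduces the loss
    grad_b = 2 * err.mean()
    w -= lr * grad_w
    b -= lr * grad_b

test_loss = (((w * X_test + b) - y_test) ** 2).mean()
print(f"learned w={w:.2f}, b={b:.2f}, held-out loss={test_loss:.4f}")
```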
At first glance, this seems like a fairly stupid way to learn something. Why iterate over something very trivial when you can just, well, memorize or understand how things work? The magic of this approach can be summarized with two properties:
- The cost of iterating a simple, generalizable computation an enormous number of times becomes exponentially cheaper thanks to economies of scale.
- We don't need to come up with logic to "explain" or "prove" how things work to others.
Let's look at property 1 first. As previously explained with Axiom 2, the reason logical thinking is so powerful for humans is that it can explain everything in terms of smaller, common elements that are universally acceptable to any human. There is no room for error, bias or prejudice here. Everything is perfectly replicable.
The problem with our ways of logical thinking is that they still aren't broken down into even smaller steps, and therefore have fundamental flaws that cannot be solved from within (we will look at this further with Argument 2). This is because, at human scale, any composable block of logic cannot be too simple and repetitive, simply because human brains are subject to fatigue – especially for repetitive tasks. Our logical systems are built around these human properties, which makes them ineffective for solving problems directly at their core – probability optimization.
Computers are not prone to this limitation. In fact, when we break operations down into even smaller ones, they become even more effective and cheaper to compute, especially with parallel computation processors like GPUs. And thanks to economies of scale, mass production of these chips gets cheaper as we produce more of them at a time. For any generalizable problem whose ground truth is denotable as a probability vector, computers will always perform better than humans.
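As a rough illustration of why breaking work into many tiny, uniform operations pays off on modern hardware, here is a toy comparison of the same sum of squares computed element by element in interpreted Python versus handed to vectorized numpy (this runs on a CPU rather than a GPU, and timings will vary by machine, but the principle carries over):

```python
import time
import numpy as np

x = np.random.default_rng(0).normal(size=5_000_000)

# The same arithmetic, expressed one element at a time in interpreted Python:
t0 = time.perf_counter()
total = 0.0
for v in x:
    total += v * v
t_loop = time.perf_counter() - t0

# ...and expressed as one bulk operation that vectorized, parallel-friendly
# hardware can chew through:
t0 = time.perf_counter()
total_vec = float(x @ x)
t_vec = time.perf_counter() - t0

print(f"loop: {t_loop:.2f}s, vectorized: {t_vec:.4f}s, "
      f"same result: {np.isclose(total, total_vec)}")
```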
Property 2 is derived from the argument that our logical ways of thinking cannot be broken down into simpler steps; every block of logic must remain within a particular size constraint. And because human logic itself was built around these constraints, for any problem domain where the answer can be denoted as a probability vector, human reasoning will always fall behind.
How do we verify whether an answer is correct, or at least, close to the ground truth, when the answer isn't explainable with blocks of logic that we can understand? That brings us directly to Argument 2:
Argument 2: it is perfectly acceptable for intelligence not to be self-explanatory. In other words, intelligence may not be able to properly explain, with its own logic, how and why a given decision is the best one – and that could still be the best decision possible (even if it cannot be explained within its own systems).
This means not only that solving problems does not require humans' logical understanding, but also that it may be fundamentally impossible to explain, in terms of human logic, why a given answer is correct.
Again, logic is merely a mental tool for humans to understand the world and solve problems. We developed this tool because (i) we need a way to solve problems in a universally acceptable way such that anyone can continue where someone left off to build civilizations, and (ii) we as a species are ... simply curious; we want to understand how the universe works.
As demonstrated with Argument 1, denoting any problem as one of iterable probability vector optimization makes solving it exponentially cheaper, and infinitely better than humans can manage. But there is another issue: our logical systems are fundamentally flawed.
The two theorems that prove logic is incomplete within the boundaries of our logic itself are otherwise known as Gödel's incompleteness theorems.
The first incompleteness theorem says "any consistent formal system $F$ within which a certain amount of elementary arithmetic can be carried out is incomplete; i.e., there are statements of the language of $F$ which can neither be proved nor disproved in $F$." In other words, any logical system with arithmetic has statements that can neither be proven nor disproven within that system itself. We don't know how many of them exist, nor how big this logical hole is; all we know is that any logical system we know of is incomplete, and there will always be facts that we have to accept as facts, but will never know why.
The second incompleteness theorem says "For any consistent system $F$ within which a certain amount of elementary arithmetic can be carried out, the consistency of $F$ cannot be proved in $F$ itself." Not only are there statements that can never be proven or disproven, we don't even know whether our logical systems make sense at all.
This tells us there is a huge hole in the logical systems we use to explain pretty much everything, and there will always be statements that we will have to simply accept as the truth without a way to verify them ourselves. How do we even know whether a problem like this that could potentially matter for us exists in the first place? We don't know either.
But if we skip logic entirely and get directly to solving the problem itself using simple iteration, in theory we can compute the answer, or the ground truth, to any problem space we have – we just don't have a way to understand why that is true. This sucks, but is much better than not being able to solve something at all.
What if we really want to understand why a computed answer is true? If the answer falls within the logical hole described by Gödel's incompleteness theorems (which I assume is significantly large), unfortunately there will be no way for us to logically understand why. But if the answer falls within the logical spectrum we can understand with our own mental tools, we might be able to ask a large language model, given enough context, to try to come up with an explanation. There is a significant chance the answer given by the LLM will be absolute nonsense – and trial-and-error proofs by humans might work out better for that particular problem – but it also has a fair chance of working.
In fact, assessments of split-brain patients show this is exactly what is going on in our biological brains as well.
The speech cortex is located in the left hemisphere. Therefore, with split-brain patients, when someone asks the right hemisphere to choose an object it likes with the left hand, instructs it to pass the object to the right hand (controlled by the left hemisphere), and then asks the left hemisphere why it chose that object, the speech cortex will try to come up with a compelling reason to justify the action – regardless of why the right hemisphere actually chose the object (which we might never know). In other words, whatever reason we give for our actions might be a justification our brains are making up regardless of the actual reason – in fact, we may not have had a reasonably explainable reason for a particular action at all, but the speech cortex makes something up just to justify itself. As a reminder: this is very similar to what ChatGPT already does when asked to justify something totally absurd. It will happily come up with a very compelling argument for something that makes zero sense in practice.
So – ending this with a question: do logic and spoken reasoning even matter?
Correctly defining the problem domain and solving optimization problems is everything for humans post-singularity. Everything else might not even be relevant.
Argument 3: intelligence built on traditional logic and reasoning is no longer the best solution to a problem.
If our logic is indeed fundamentally flawed, iterative compute is better at solving problems than human intelligence, and the only reason for logical explanations to exist is for verification and intellectual joy, then this also means any form of intelligence built on our existing forms of logic is no longer the best solution for any problem scope.
This isn't even limited to logical ways of thinking; forms of intelligence built primarily on intuition, like art and music, can in theory be outdone as well, assuming the correct problem scope is targeted. If the problem scope is defined as creating something that humans consider "new" and "refreshing", and that scope can be correctly vectorized, it's only a matter of time before optimization models do better than humans within primarily intuitive scopes too.
What's even stopping them from mimicking emotions as long as the correct data vectors are given? And what would be the difference between a model that acts like it has emotions versus an actual human with actual tears and laughter? This is not a technical question – it's more of a philosophical one; if a machine is capable of copying human characteristics, is it human or not?
What we know for sure is that this boundary between what's real and fake will only fade over time.
Argument 4: domain-specific intelligence will become less useful than multi-domain intelligence, assuming a given problem draws its input vectors from a nondeterministic space.
One of the key takeaways of the arguments illustrated above is that domains defined by humans also won't matter anymore. In other words, all forms of classification also become meaningless. What matters is correct vectorization.
Let's say I am trying to build a good restaurant recommendation model. Usually people would try to collect as much domain-specific data as possible: a list of all of the restaurants I have been to, for example.
However, domain-specific data typically doesn't tell the whole story. Which restaurants I like is closely correlated with who I am fundamentally, as a person. I may like a restaurant because of the music it usually plays; that has to do with my taste in music. Or I may like a restaurant because it is perfect for chatting with my best friends who live nearby; that has to do with my friendships.
Eventually, to become a good restaurant recommendation model, the goalposts move from domain-specific toward much more generic – it should have a fundamental understanding of who I am as a person, just like a good old friend or my partner. This makes sense – someone who knows all of the good restaurants in an area is less likely to make a good recommendation than my partner or a friend who has known me as a person for years.
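As a toy sketch of what that "generic" model might look like – every embedding here is a random stand-in; a real system would learn these vectors from actual listening history, social graphs, and so on:

```python
import numpy as np

def cosine(a, b):
    """Similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(7)

# Hypothetical per-domain embeddings of one person (values are random stand-ins):
music_taste    = rng.normal(size=8)   # would be learned from listening history
friendships    = rng.normal(size=8)   # would be learned from the social graph
dining_history = rng.normal(size=8)   # the purely "domain-specific" signal

# A generic "who I am" vector: every domain concatenated, not just dining.
me = np.concatenate([music_taste, friendships, dining_history])

# Hypothetical restaurant embeddings living in the same concatenated space.
restaurants = {name: rng.normal(size=me.size)
               for name in ["noodle bar", "jazz bistro", "quiet cafe"]}

# Rank restaurants against the whole person rather than the dining slice alone.
ranked = sorted(restaurants, key=lambda name: cosine(me, restaurants[name]),
                reverse=True)
print(ranked)
```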
This means everyone's personality traits, embedded into a pattern, will eventually become one of the most sensitive data formats anyone can possibly access. It's like my fingerprint or iris, except someone can use it to gain access to my personal property or impersonate me online. This is also already kind of happening: companies that primarily rely on ad revenue, like Google and Meta, need to personalize ads as much as possible, and thus are already trying to collect as much information as they can on who I am across the Web – although not yet to a level where impersonation is possible. How do we avoid this? I don't believe we have reached a point where we need to start worrying about this problem, but it will eventually happen, and it is definitely something to start thinking about.
Conclusion: fundamentally shifting our way of thinking
That was a lot to swallow. What I wanted to argue here is this: accepting what is not logically proven but is statistically proven should be the way human thinking moves forward.
Solving problems with statistics is infinitely better than logical thinking: it covers cases that logic cannot prove, can be automated at exponentially cheaper cost, and can be deployed across multiple instances to improve itself internally. Computers also don't need rest or weekends; in theory they can work 24/7.
The only issue here: recall why logical thinking was developed in the first place. Logic can be proven independently by anyone, because it is built on common building blocks that any human being with enough knowledge can understand. In other words, logic was designed to be universally trusted, while statistics and compute fundamentally cannot be. This is also one of the most common arguments for explainable AI (XAI), even though it is highly unlikely to be fully realized in the near future: what's the point of powerful compute if we can't independently verify something is true without a huge datacenter?
Therefore, we need a new way to prove statistical compute without trusting a single claim from a computation provider, just like what humanity did with universal proofs and logic. But the thing with statistics is that it is expensive to recompute everything from scratch just to prove that a claim or a piece of data is correct. Human logic is relatively cheap for anyone with the right knowledge to recompute; iterative computation is not.
But we already have something that, hopefully, will be able to solve this. It's called zero-knowledge proofs. The point of zk is to use the power of math and cryptography to succinctly prove that some statement is correct without redoing all of the computation. As long as we have definite proof that the computation done by an arbitrary datacenter is indeed correct, we no longer have to rely on human logic to re-verify a claim made by software that is literally a black box. All we have to verify is whether the given claim is actually close to a given ground truth statistically, and zk can do this without repeating all the iterative computation steps.
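Proper zero-knowledge proof systems are far too involved to sketch here, but the underlying intuition – that a claimed result can be checked much more cheaply than it can be recomputed – shows up in much simpler places too. Freivalds' algorithm, for example, verifies a claimed matrix product in roughly $O(n^2)$ work per round instead of redoing the $O(n^3)$ multiplication (to be clear, this is not a zk proof, just a toy illustration of cheap probabilistic verification):

```python
import numpy as np

def freivalds_check(A, B, C, rounds=20):
    """Probabilistically verify the claim C == A @ B without recomputing A @ B.
    Each round costs a few matrix-vector products (~O(n^2)); a wrong claim
    survives a single round with probability at most 1/2."""
    rng = np.random.default_rng()
    for _ in range(rounds):
        r = rng.integers(0, 2, size=C.shape[1])    # random 0/1 vector
        if not np.allclose(A @ (B @ r), C @ r):    # compare A(Br) with Cr
            return False                           # claim is definitely wrong
    return True                                    # correct with high probability

n = 300
A, B = np.random.rand(n, n), np.random.rand(n, n)
C = A @ B                          # the "datacenter's" claimed result
print(freivalds_check(A, B, C))    # True

C[0, 0] += 1.0                     # tamper with the claim
print(freivalds_check(A, B, C))    # almost certainly False
```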
Eventually everything shall make sense
That's not all – we also need a new way to define identity, not inferred from biology, but from the set of characteristics and patterns a single entity exhibits – as intelligence won't be limited to human beings. This is the moment where flesh and bone become (almost) indistinguishable from software – and, therefore, for the first time, it makes sense for both humans and computers to share the same interface to interact with others.
The common denominator here would of course be software: it abstracts away everything else, including where you were born, which passport you hold, and where you are physically located. All that matters is that you and I can properly communicate over some common interface and channel. I mean – how are humans supposed to compete with intelligence accessible anywhere over the Internet when we are geographically tied to a single country?
Software should be indistinguishable from humans, and this is why we need to build a new foundation of identity to start collaborating with others – regardless of who you actually are, whether made of metal or of flesh and bone. I have a theory on how identity should work in this world, even without AI, that I will discuss in a separate post – but the gist is that actions (encoded as data) form identity, and not the other way around.
Everything shall make sense. Hype cycles will come and go. Startups will grow and fail. What actually matters is the future we are able to build together, with every one of our technologies combined in one place. How cool is that? $\blacksquare$