
Wednesday, April 27, 2005

Review of "On Intelligence"

[Audio Version]

I recently finished reading a brief but refreshing book on intelligence titled "On Intelligence", by Jeff Hawkins with Sandra Blakeslee. I found the ideas espoused concise, yet penetrating; bold, if perhaps a little hasty. I want to recommend that anyone with any serious interest in the future of AI read this book. It may well have as much impact on AI in the coming decade as things like classifier systems, neural networks, and genetic algorithms have had to date.

I hope that Sandra Blakeslee, coauthor and surely a bright light in her own regard, will not take too much offense if I attribute the synthesized concept and work of this book primarily to Jeff Hawkins. I apologize if it is unwarranted, but it's certainly simpler for the purpose of writing a review.

First, it's worth pointing out that Jeff Hawkins has had a successful career developing handheld computer technologies, beginning with the venerable Grid computers and continuing as founder and chief technology officer (CTO) of Palm Computing - makers of the PalmPilot - and of Handspring, makers of the best known knockoff of the PalmPilot. He's obviously got a respectable pedigree as a computer engineer. He's also dabbled in aspects of artificial intelligence in his work, as when he provided an elegant solution to the handwriting recognition problem by inventing the Graffiti mechanism used in the PalmPilot, Microsoft PocketPC, and some other handheld computers. As you might imagine, that doesn't give him any special insight into intelligence, per se. If one can be a good programmer without being able to write a clean sentence in English, why should one expect a good programmer to have any keen insights into intelligence?

Hawkins is also one of those sorts of people who's had a long-held passion for understanding intelligence. "I am crazy about brains," he writes in the prologue of On Intelligence. Unlike many of us techies in the AI field, he doesn't describe his interest in how our brains work as a simple source of inspiration for solving certain problems. He seeks to understand how the human brain - particularly the part that he considers the essential seat of human intelligence - works as an end unto itself. Yet he states unambiguously, "I want to build truly intelligent machines." One can say that Hawkins is among those of us who truly believe in the basic feasibility of achieving the longstanding goal of the field of artificial intelligence. What is most impressive is that he doesn't simply believe that we can understand the brain. He actually believes it's not all that complicated, whether viewed bottom-up or top-down. That sort of optimism and focus is essential, I believe, to keeping the faith and making real progress toward AI's goal.

Whatever you might think of it, Hawkins would not like to associate himself with the AI crowd. In fact, he divides On Intelligence up into three basic sections to serve three basic goals: debunking AI, explaining human brain essentials, and waxing on the implications of his theory. The first two of eight chapters are devoted to the first of these goals. Perhaps it's the understandable awkwardness of associating the word "artificial" with the engineering challenge of making intelligent machines. No doubt the bad taste left in one's mouth by the basic failures of many AI efforts fuels the animosity. Perhaps he was bothered by the fact that early in his career, he found, as I did, that few business people - even among those on the "cutting edge" of technology - have any stomach for researching or applying AI in business. Perhaps being shunned by MIT, an ivory tower of AI research, because he wanted to actually study human brains, too, had its impact. Whatever the reasons, Jeff Hawkins clearly doesn't want to have anything to do with AI.

Although I believe Hawkins' hand-waving dismissal of traditional AI is unduly summary and harsh in many ways, I think it's fair to say he has basically valid criticisms of what I would describe not as "AI", writ large, but of many of the concepts and experiments that have dotted the popular history of AI. The most sweeping criticism is that many AI researchers have eschewed studying the brain. It's probably fair to say we mostly tend to think in terms of solving practical engineering problems with crafty tricks that seem intelligent, rather than seeking to create genuine intelligence. Only bozos like me who do this in our free time are, in a way, free to experiment willy-nilly, while professionals generally sweat over showing tangible progress month to month and sometimes week to week.

Hawkins readily acknowledges that lots of cool things have been done using things like neural networks and swarm theory, but also pointedly argues, as I've noticed, that many of the same things can and have been done more efficiently with less auspicious, more traditional engineering methods. As such, he rightly calls into question whether those solutions have been essentially over-engineered for the sake of AI branding and career advancement instead of being driven by sound business practices.

Whereas I would say that AI research is still stumbling because AI researchers still lack a basic framework of conceptualization, Hawkins takes it a step further. He states that "computers and brains are built on completely different principles." I understand why he says this, but the nit-picker in me can't really agree. Although Hawkins completely eschews any notion of what he calls a "special sauce" in how a brain works, his aversion to AI does lead him to make some claims I believe are unsubstantiated about how computers work and why they can't be made to be intelligent.

Hawkins points to the famous Turing Test - which essentially postulates that if a computer can fool a human into thinking he's talking to another human, the computer is intelligent - as the archetype of what's wrong with AI. By now, most of us in the AI field are familiar with the humiliating results of computer programs which, exploiting simple tricks and basic human gullibility, have been able to pass the Turing Test with flying colors without requiring even a whiff of what any of us would call intelligence. Hawkins doesn't just quip about the inadequacy of the formulation of the Turing Test, though. He stabs at its most basic premise: that behavior defines intelligence. Of Turing's now generally accepted definition of artificial intelligence, Hawkins says, "Its central dogma: the brain is just another kind of computer. It doesn't matter how you design an artificially intelligent system, it just has to produce humanlike behavior."

It's obvious we have no means to measure intelligence without reference to behavior, and Hawkins readily acknowledges this. But Hawkins points out that we don't have to do anything to prove that we understand something. Behavior may be a necessary component of proving intelligence, but intelligence does not require behavior, per se. "A human doesn't need to 'do' anything to understand a story. I can read a story quietly, and although I have no overt behavior my understanding and comprehension are clear, at least to me." I think few could seriously deny this basic and intuitive claim, lest we deny our own obvious, daily experience of quiet contemplation of the things that happen in our lives.

Hawkins rightly waves aside the quip by some AI advocates that we could potentially model a human brain in software. While he admits it could potentially be done, he recognizes that that assertion bears almost no resemblance to the way AI has actually been proceeding to date. To his thinking, we're still seeking new ways to fool people using great parlor tricks like bipedal walking and anthropomorphic facial expressions, not trying to actually model the brain.

You might think Hawkins would have kind things to say about neural networks. After all, for years, the public has been fed market-speak about how neural nets work like the human brain. While Hawkins gives a nod to neural nets as an interesting step up from AI, he reserves harsh criticism for the fundamental approaches taken to date by connectionists, citing their basic lack of progress and even of potential. "On the surface, neural networks seemed to be a great fit with my own interests. I quickly became disillusioned with the field."

The first thing Hawkins notes seems missing from traditional neural networks is an appreciation of time. "Real brains process rapidly changing streams of information. There is nothing static about the flow of information into and out of the brain." I share this perspective. Were you to take long walks with me and discuss AI, you'd hear me babble for hours about the importance of and mechanisms for representing and tracking events in time. But when it comes time to put fingers to keyboard, it's hard not to get lost in other details and leave time aside. I guess I'm just as guilty as other AI researchers of thinking of time as something to be dealt with once there's more computing power and the other basics are taken care of. Take vision. My recent "bubble vision" experiments deal exclusively with static images. I imagined bubbles morphing with each passing image in a video stream, but never got around to doing something like that. Not essential. I know it's naive and wrong. Guilty as charged, but I swear it's on my to-do list.

Hawkins' second criticism of neural nets is that they don't really have feedback mechanisms. In my own recent neophyte exploration into neural net simulation, I had a pure feed-forward learning and processing model. More sophisticated models use "back propagation", which looks a little like feedback but isn't really: error signals flow backward only during training, not as part of ongoing perception. Contrasting this with the brain, Hawkins points out, "for every fiber feeding information forward into the neocortex, there are ten fibers feeding information back toward the senses. Feedback dominates most connections throughout the neocortex as well." Connectionists - neural network researchers - might argue that they aren't out to literally mimic the brain, but simply to do something useful with their creations. But one of Hawkins' points in this criticism is that it's not enough to have a piece of brain switching between learning and "doing" modes like classical neural nets do. Learning and thinking are part of the same basic neural algorithm in the human brain.
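For the record, here's what "strictly feed-forward" looks like - a toy Python sketch of my own with made-up weights, not anyone's production model. The point is simply that activations flow one way, from input to output, with nothing flowing back toward the "senses":

```python
import math

def feed_forward(x, weights):
    """One strictly feed-forward pass: activations flow input -> output,
    and nothing flows back toward the 'senses' at inference time."""
    for layer in weights:  # each layer: a list of neuron weight vectors
        x = [math.tanh(sum(w * i for w, i in zip(neuron, x))) for neuron in layer]
    return x

# tiny hypothetical 2-3-1 network with made-up weights
weights = [
    [[0.5, -0.2], [0.1, 0.9], [-0.7, 0.3]],  # hidden layer
    [[0.4, -0.6, 0.8]],                      # output layer
]
out = feed_forward([1.0, -1.0], weights)
print(len(out))  # 1 - a single output, with no feedback path anywhere
```

Note there's no mechanism here by which the output could influence what the input layer "perceives" next - which is exactly the architectural gap Hawkins is pointing at.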

Tying neural networks back to AI, Hawkins levels his basic criticism. "In my opinion, the most fundamental problem with most neural networks is a trait they share with AI programs. Both are fatally burdened by their focus on behavior. Whether they are calling these behaviors 'answers', 'patterns', or 'outputs', intelligence lies in the behavior that a program or a neural network produces after processing a given input. The most important attribute of a computer program or a neural network is whether it gives the correct or desired output. As inspired by Alan Turing, intelligence equals behavior."

Before leaving the seemingly dead-end alley of neural networks behind, Hawkins does go on to mention a backwater of connectionism dealing with what are called "auto-associative" neural nets. Without going into much detail, the basic point of auto-associativity is that a network which would normally recognize the whole of some pattern - perhaps a human face - can do so even when part of the pattern is missing - perhaps because something else in the image partly covers the face - by feeding the missing parts back into itself, as though "imagining" that the missing parts are actually there. Hawkins introduces these as a prelude to his eventual explanation of his model of how the brain works.
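The completion trick is easy to demo with a tiny Hopfield-style network. This is my own minimal sketch of the auto-associative idea - a Hebbian rule over ±1 patterns - not the specific nets Hawkins describes:

```python
def train_hopfield(patterns):
    """Hebbian outer-product rule over lists of +1/-1 values; zero diagonal."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / len(patterns)
    return w

def recall(w, probe, steps=5):
    """Synchronously update states until they (hopefully) settle on a stored pattern."""
    s = list(probe)
    for _ in range(steps):
        s = [1 if sum(w[i][j] * s[j] for j in range(len(s))) >= 0 else -1
             for i in range(len(s))]
    return s

stored = [1, -1, 1, 1, -1, -1, 1, -1]
w = train_hopfield([stored])
noisy = list(stored)
noisy[0] = -noisy[0]               # "occlude" one element of the pattern
print(recall(w, noisy) == stored)  # True: the net fills in the missing piece
```

Feed it a corrupted version of what it has memorized and it settles back onto the complete pattern - the "imagining the missing parts" behavior in miniature.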

Hawkins ultimately believes that traditional AI approaches are doomed to fail because they fail to account for an actual model of intelligence. Although I don't believe that the way the human brain works is the only way an intelligent machine can work, I think it's at least fair to entertain his belief that we won't find human-like intelligence in the trash heap of traditional AI. "To succeed, we will need to crib heavily from nature's engine of intelligence, the neocortex. We have to extract intelligence from within the brain. No other road will get us there." For me, the best reason to at least entertain this anthropomorphic view is that Hawkins is the first person to describe the essentials of human intelligence in a compact, algorithmic sort of way to my satisfaction. (Leonard Peikoff and Ayn Rand, especially in their 1979 tome, Introduction to Objectivist Epistemology, come in a close second. Regrettably, they had little interest or belief in the idea of endowing machines with intelligence, which is a shame, as their insights have much to offer AI researchers.)

The stage set, Hawkins moves on to summarily describe the human brain. What first struck me as a surprise was that he has chosen to focus his attention on just the neocortex (AKA, the "cortex"), the topmost roughly 2mm of the wrinkled surface of the brain we're all familiar with. Honestly, I found this explanation a little confusing, as I thought the neocortex was the entire wrinkly bit that sits on top of the "reptile" and cerebellum portions of the brain. Perhaps it's just that I'm not a neuroanatomist; maybe we're talking about the same thing. Hawkins makes clear that he doesn't believe that all of human thinking arises from the cortex, to the exclusion of the rest of the brain. He has simply chosen to focus on the cortex because in his view this is where the key to intelligence lies. As he says, "I am not interested in building humans. I want to understand intelligence and build intelligent machines. Being human and being intelligent are separate matters." For example, Hawkins wholly rejects the view, now becoming popularized in AI, that a machine must have emotions to be able to function. I must admit that my own view of the importance of value judgments borders on this emotionalistic view, but I have to agree with Hawkins that emotions should not be essential for intelligence, per se.

Still, I think I have to side with the people Hawkins indicates would disagree with his assertion that the cortex is sufficient to explain intelligence. As Hawkins later explains, the cortex is virtually homogeneous in its basic structure, meaning that all parts of the cortex perform the same song and dance, just using different information. He probably knows much better than I do, but I have trouble believing that there isn't some important gadgetry in the brain that's evolved specifically to deal with vision or audition, for example; or with locomotion, for another. It seems hard to imagine the neocortex beginning, in Ayn Rand's words, "tabula rasa" (clean slate), with no built-in mechanisms as tools upon which to build an understanding of the world.

Still, my quibble is no reason not to suspend judgment long enough to understand what alternative Hawkins proposes to this staple idea of psychology that some parts of human intelligence are just there from birth. Hawkins states simply that the cortex is not a computing machine in a sense that traditional computer scientists would be familiar with. Instead, it is a memory machine whose primary purpose is to make predictions. (Of course, I could quip that the vast majority of hardware in the processing machinery of most modern computers is devoted to memory, too, and that most attention in computer science goes to the information that goes in that memory, but these are small points. I just get tired of the revolutionary, "this is better than a computer" rhetoric sometimes; even from people I admire.)

To help explain how such a wide variety of senses as we have basically work the same way, Hawkins explains that they all boil down to two kinds of information: spatial and temporal. Vision illustrates both quite well, but each of the senses appears to have more or less of each. I found his explanation of touch in spatial-temporal terms particularly fascinating. Using a thought experiment, he asks you to imagine waking up in the dark with a small pile of gravel stones having been placed on your hand while you slept. You would probably not be able to recognize it as gravel until you started wiggling your hand, feeling the tell-tale protrusions, vibrations, abrasion, and so forth that come as a time-varying stream of signals to your cortex from the different, spatially represented parts of your hand. This is the sort of thinking I've been entertaining for years, but expressed in a far more elegant way than I've been able to manage. It may not surprise you that Hawkins believes not only that all the senses work the same way in the cortex, but also that motor control uses the exact same mechanisms.

Hawkins gives credit for much of his basic view of the homogeneity of the cortex and the cortical algorithm to neuroscientist Vernon Mountcastle. "When I first read Mountcastle's paper I nearly fell out of my chair. Here was the Rosetta stone of neuroscience ... In one step it exposed the fallacy of all previous attempts to understand and engineer human behavior as diverse capabilities."

The homogeneity of the cortex and the consequent reliance on a one-size-fits-all algorithm sounds interesting, but not very helpful at first. It begs for an explanation of the algorithm that accounts for most of what we consider intelligent behavior. Since he claims that prediction is the essential function of the cortex as a memory system, Hawkins identifies two essential ingredients of prediction that operate in conjunction: hierarchy and pattern invariance. The first, hierarchy, seems straightforward. It seems natural to expect that as information bubbles up from the lowest levels of the senses, it should take on more and more abstract forms.

The concept of pattern invariance is a little harder to grasp, but very important. My experience with the pattern matching capabilities of the neural network simulation I made some time back confirmed what I already suspected: that a neural net isn't actually very good at recognizing letters. With enough neurons in a "middle layer" and sufficient training, it can be cajoled into recognizing characters in a few different fonts and making best guesses as to what it sees. But it pales by comparison to our own ability to pick out characters, even against the most appallingly noisy backgrounds. Try rotating a letter 45 degrees and the neural net chokes. Yes, you could train the net on an entire copy of its character set rotated 45 degrees. And you could even, in theory, go so far as to do so for, say, every 10-degree increment. But surely you never trained your eyes to do this, yet you would probably have no problem identifying any single character at any arbitrary angle and in any basically legible font. Why, then, should we think this is a good solution to an AI problem?
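You don't even need a neural net to see the underlying brittleness. Here's a toy sketch of my own: a naive matcher that demands exact cell-for-cell overlap, standing in for any representation tied to a literal pixel arrangement. Rotate the pattern and it fails:

```python
# A crude "T" as a set of occupied (row, col) cells.
T = {(0, 0), (0, 1), (0, 2), (1, 1), (2, 1)}

def rotate90(pattern):
    """Rotate a pattern's coordinates 90 degrees."""
    return {(c, -r) for r, c in pattern}

def matches(a, b):
    """Normalize away translation, then demand exact overlap - the kind of
    literal pattern equality that has no built-in invariance to rotation."""
    def norm(p):
        mr = min(r for r, _ in p)
        mc = min(c for _, c in p)
        return {(r - mr, c - mc) for r, c in p}
    return norm(a) == norm(b)

print(matches(T, T))            # True
print(matches(T, rotate90(T)))  # False - same letter, unrecognized
```

A representation with pattern invariance would report "T" in both cases; anything pinned to raw cell positions needs a separately learned copy for every orientation, which is exactly the combinatorial trap described above.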

Hawkins contends that each node in the cortical hierarchy is responsible for coming up with a way of recognizing and representing patterns that define things like printed characters in such a way as to be able to abstract away the lower level details. A printed letter "T" will be represented higher up the hierarchy as the same thing, no matter its orientation.

Hawkins points out something I didn't know previously: that the mammalian cortex has largely taken over the motor functions previously managed by evolutionarily older portions of the brain. My understanding is that evolution tends to layer the new on top of the old and only rarely bypasses existing solutions to technical problems in favor of new approaches. So the idea that something done for a very long time by older portions of our brains - and done quite well - would be outmoded by the nerdier upgrades seemed like heresy to me. Hawkins provides a very important incentive for nature to do this, though. The very machinery responsible for our ability to imagine taking action is the same machinery that actually implements action. To his thinking, we can suppress the motor control "output" and ponder the consequences of our actions or simply allow it to happen. When you imagine taking a walk around your house, for example, you are making predictions about what you will see or otherwise experience, and those predictions are being fed back into other parts of your cortex that would actually do the perceiving of a real walk.

It's particularly interesting that, as Hawkins explains, it's not like the horse must go before the cart in this predict-action versus take-action scheme. Predicting action, if allowed, will cause action to be taken. But on the other side, taking action causes prediction of action, too. Actually, it causes prediction of effects, not just action. With each step as you walk, your brain is busy predicting when your foot will hit the ground, how hard it will hit, what parts of your foot will hit first, and myriad other details about what's expected to happen.

Here's where Hawkins' memory-prediction concept really kicks into being astonishing and useful. To his thinking, we don't have or need super senses in order to have a seemingly supernatural awareness of the world. Just think of how well you can navigate your own house in complete darkness. Think it's because you have good senses? Try doing it in a house you're not familiar with. Do you even know where the bathroom is? You wake up in your house at night with a need and your brain is already busy before you're awake making predictions about everything about your short trip. The floor is 2 feet down from the bed and has a carpeted covering. In a few steps, you'll hear a creak from a loose floorboard. You'll be taking a left turn there. When you stick your right hand out, you should soon feel the door jamb. You'll turn right and take about six steps. And so on.

In fact, you don't even need to fully wake up to do this. It's not that you have a spare brain that takes care of nightly urges. It's that the lower levels of the hierarchy are busy making predictions about what you should experience going forward and taking action to fulfill those predictions. The upper levels only get alerted when the world does something unexpected; something that doesn't fit your predictions. The part of the carpet you just stepped on isn't soft and dry like you expected. That part of your brain expecting the soft, dry feel of carpet reports its confusion to the next higher level. Maybe it knows what to make of it. That higher area remembers that you have a dog and entertains different scenarios - predictions, really - that might result in a non-dry carpet.
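As a toy illustration of my reading of this idea (my own invention, not Hawkins' actual model, and the level names are made up), imagine each level of the hierarchy holding learned predictions, with a sensation escalating upward only when no lower level expected it:

```python
class Level:
    """One level of the hierarchy, holding learned context -> expectation pairs."""
    def __init__(self, name, predictions):
        self.name = name
        self.predictions = predictions

    def handles(self, context, sensation):
        return self.predictions.get(context) == sensation

def perceive(hierarchy, context, sensation):
    """Return the name of the lowest level whose prediction matches,
    or 'hippocampus' if nothing in the hierarchy expected the input."""
    for level in hierarchy:  # ordered low (sensory) to high (abstract)
        if level.handles(context, sensation):
            return level.name
    return "hippocampus"  # novel pattern: dump the raw data at the top

hierarchy = [
    Level("low-sensory", {"step": "soft dry carpet"}),
    Level("higher",      {"step": "wet carpet"}),  # "the dog again?"
]
print(perceive(hierarchy, "step", "soft dry carpet"))  # low-sensory
print(perceive(hierarchy, "step", "wet carpet"))       # higher
print(perceive(hierarchy, "step", "gravel?!"))         # hippocampus
```

The expected carpet never rises above the lowest level; the wet carpet climbs one level to a node that can explain it; something truly novel bubbles all the way to the top.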

The mechanist in me loves the elegant simplicity and seeming completeness of this concept of memory-based prediction. The skeptic in me, though, has a little trouble with the suggestion that we're able to predict everything. Surely we learn along the way. To his credit, Hawkins does clearly state that learning happens and even gives some thoughts on how learning occurs. My sense, though, is that the majority of his focus in On Intelligence is on prediction. I think Hawkins would agree that the learning part remains a bit more of an unknown. Still, this doesn't seem to detract from the value of the core of the concept of a memory-prediction framework.

Actually, one of the more interesting ideas Hawkins puts forth is an explanation of the hippocampus. Many people with some familiarity with brain anatomy 101 will recognize this from its popular association with memory formation. A significantly damaged hippocampus will leave a person unable to form new memories and thus eternally caught in a moment in time before the damage, unable to function independently. As with other parts of the brain, such as the cerebellum and medulla, Hawkins says he tried to ignore the hippocampus for a long time as nonessential to intelligence. He didn't like the thought that the cortex, which is quite capable of learning and is where knowledge ends up anyway, should pass information to the hippocampus, only to have it come back to the cortex again. It seemed a pointless journey. Hawkins puts forth an idea he came across that the hippocampus is not incidental to intelligence, but is actually the top level of the cortical hierarchy. The basic idea is that any stimuli that don't fit the predictions made by the various levels of the hierarchy bubble their way up to ever higher levels until the raw data is simply dumped in the lap of the hippocampus. This structure is capable of quickly forming memories, one assumes of the raw data. These memories don't last long, though. If the brain ruminates on these new patterns long enough, they will eventually be imprinted on the lower levels, which take longer to learn but retain memory much longer.

In fact, Hawkins indicates that he believes this same downward push of patterns happens at all levels of the hierarchy. It's as though once a higher level is able to understand - make predictions about - an idea, it attempts to delegate responsibility for memorizing and making predictions about the information to lower levels. To this, I think I would add my view that repetition is probably key here. Learning to ride a bike, for example, is repeated often enough that the raw data - which starts out streaming as if from a fire hose into the hippocampus - eventually becomes something the higher levels of your consciousness can make sense of and use to tell you, in some crude manner, how to deal with it. It takes your full, high-level focus at first. But with time, lower levels figure out how to predict the details, freeing your "highest level" of consciousness to ponder other unexpected information. Stop riding a bike there and the knowledge may just stop there. But keep riding and still lower levels will take over more of the details, freeing those slightly higher levels from having to remember much about the finer details and increasing your ability to quickly and automatically react to the many circumstances that may come along.

The hippocampus is very much a key player in learning, in this model. Based on it, I would suggest that, if this relatively small part of your brain is where short term memories are stored for a few seconds or sometimes minutes, then the hippocampus may be thought of as the seat of the "awareness" aspect of consciousness we are most directly cognizant of. It's surely not like a theater, though, where every aspect of the things you imagine is projected. It seems more sensible to assume that the entire cortex-hippocampus hierarchy is the "theater". The hippocampus should only be aware of the parts of the current situation or thought that don't fit the predictions of the millions of lower level nodes. And it's not as though the hippocampus "gets it". We don't want to introduce a classic "brain within the brain" argument. In this model, the hippocampus just records it and expects lower levels to "get it", meaning juggle the information around until the patterns match or otherwise learn to make predictions based on the new patterns.

I should point out that Hawkins makes a sincere attempt to explain how the cortex works at a lower level. He uses graphic abstractions meant to appeal to computer geeks like me, along with the terminology familiar to neuroscientists for naming components and explaining the concepts. Honestly, though, if I read nothing but chapter 6, "How the Cortex Works", I would probably not be left with any sense that I had learned much, simply because I got somewhat lost and so the material got dry. In fairness, though, the earlier part of this chapter was a bit easier for me to digest and helped to more graphically explain things like the hierarchical and pattern-invariant nature of the cortex. For the person unfamiliar with notions like the fovea and eye saccades, this chapter holds nice summary introductions. Perhaps the most enlightening part for me was the introduction of the concept of a "column" of layers of neurons. Each column has its own hierarchy of a sort and is heavily interconnected internally. A column would tend to deal with the same snippets of raw information coming from lower levels, but would hold differing levels of abstraction of that information. The columns then form the nodes of the overall cortical hierarchy. In the context of traditional neural networks, I found the concept of a column similar in some ways to a complete artificial neural network with its various layers. In my experimentation, I assumed that a better design would take these in turn as building blocks in some sort of loose network or more rigid hierarchy of such "clusters" of nets, so Hawkins' explanation was familiar ground.

After presenting his memory-prediction framework for intelligent thinking and indicating how the neocortex is the progenitor of this capability in humans, Hawkins moves on to talk more about the implications. He addresses quite a few common questions about intelligence. He describes creativity in terms of memory-prediction, for example. For the question "what is consciousness?", in addition to pointing out how icky a subject this seems to be for neuroscientists and posing some thought experiments as a sort of common-sense philosophical background, Hawkins states, "I believe consciousness is simply what it feels like to have a neocortex." I know it's a different context, but my own view is that consciousness is defined by a basic awareness of the world, so even simple bacteria can be thought of as being conscious on some level - and clearly there are different degrees of awareness, with humans having the most of all the organisms we know of. Still, I don't begrudge Hawkins for narrowly focusing on mammals because we have neocortices. Many people would not consider creatures that can't ponder the future or recognize themselves to be conscious, and that fits Hawkins' formulation.

To summarize, On Intelligence is a book by Jeff Hawkins, with Sandra Blakeslee, that presents a new view of what the authors consider the key element of human intelligence: the neocortex. Hawkins claims that the neocortex is composed primarily of millions of repeating structural units, all responsible almost exclusively for remembering patterns and for making and testing predictions based on them. These repeating "columns" are organized into a hierarchy, with the lower levels representing the most concrete sensations and motor activity and the highest levels representing the most abstract and "stable" concepts. Hawkins uses the term "memory-prediction framework" to identify this concept and mechanism.

While I do believe Hawkins sells traditional AI short, I give him credit for properly exposing the core weaknesses and limitations of many AI technologies. We should take this chastening as a warning not to be too bold in making claims about how the kernels of cool ideas we come up with are the holy grails we've all been searching for. I think Hawkins should take this message to heart regarding his own good idea, too. I believe there's a lot of value in the memory-prediction framework, but it simply doesn't explain enough about intelligence to satisfy my own curiosity. For instance, the concept of pattern invariance is brilliant and crucial, but On Intelligence has virtually nothing to say about how it works in our brains.

In the end, I would predict that Hawkins will be recognized for his solid contribution of a very valuable conceptual tool to the AI community. The terms he has coined or brought together from other sources will be used by me and surely others for their expressive power. And the attention he brings to largely ignored features like the temporal nature of information, making and testing predictions, and pattern invariance will become the means by which we measure the value of our own concepts and experiments. And, lest I be rude, I should point out that I'm sure his ideas will serve a similar purpose for the field of neuroscience. Perhaps his work will help to bring the two fields closer together, offering new opportunities for us to share and relate the things we know and learn, to the benefit of both fields.

As a side note, I chose this time to buy this book in electronic form instead of as a hard- or soft-cover printed book. This is the first e-book I've bought. I wanted it in part because I like the fact that I can do text searches to find information. That's especially valuable for reviews like this. But since I have a PocketPC (sorry, Jeff) that I carry with me everywhere - I call it "my brain" - I thought I'd get more chances to read the book if it were always in my pocket. Besides, it was cheaper than the soft-cover, and the local bookstore will only be selling the more expensive hard-cover version for these first few months after publication. Since the Microsoft Reader program the electronic copy works with allows one to share copies of a book between handheld and desktop PC, I was able to go back and forth without paying for two copies. And the small screen wasn't too much of an issue, as MS Reader allows you to zoom in and pan on illustrations. Man, this was a worthwhile experiment. If you have a handheld computer with a good screen, I'd strongly recommend you start buying books that are mostly text as e-books. There are probably other good readers out there, but I was quite happy that MS Reader is pretty well crafted. Besides, Microsoft will probably win the e-book standards war soon, so it seems a good choice of standard for now.

Friday, April 15, 2005

Bubble Vision

[Audio Version]

I just completed a small project in general-purpose machine vision. The essential concept is to grow "bubbles" within regions of an image that have the same color or smooth gradients that shift gradually from one color to another.

The method is a bit like a traditional flood-fill algorithm, but uses a continuous loop of nodes that move and multiply to push the loop ever outward until they hit individual obstacles. The growth is controlled primarily by cellular-automata-style rules. There's also an algorithm for dealing with cases where the bubble wraps around "islands" of obstacles. Rather than leave a seam behind as the bubble grows, it engulfs these islands by connecting the touching parts of the loop and discarding the parts of the loop left inside.
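For flavor, here's the flood-fill side of that family resemblance - a plain region-grower of my own, not the actual bubble code. It admits neighbors whose values change gradually from the current frontier pixel and stops at hard edges; the moving node loop, cellular automata rules, and island engulfing are exactly the parts this crude analogue leaves out:

```python
from collections import deque

def grow_region(image, seed, tol=10):
    """Grow a region outward from `seed`, admitting 4-connected neighbors
    whose value stays within `tol` of the current frontier pixel - a crude
    stand-in for the 'smooth gradient of color' rule."""
    h, w = len(image), len(image[0])
    region = {seed}
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < h and 0 <= nc < w and (nr, nc) not in region
                    and abs(image[nr][nc] - image[r][c]) <= tol):
                region.add((nr, nc))
                frontier.append((nr, nc))
    return region

# a gentle gradient (left three columns) next to a hard edge (right column)
img = [[10, 15, 20, 200],
       [12, 17, 22, 200],
       [14, 19, 24, 200]]
print(len(grow_region(img, (0, 0))))  # 9 - the gradient cells, not the edge
```

Because each step compares against the frontier pixel rather than the seed, the region can drift across a smooth gradient (10 through 24 here) yet still refuses the abrupt jump to 200 - the "obstacle" a bubble's edge would stop at.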

I also critique the shortcomings of the bubble concept and indicate opportunities to build on its successes.

I've included the source code for download and provided extensive explanation of how the algorithm works. I invite you to check out the project site.