Friday, December 12, 2008

Holiday Bash

I'll be holding a little party at my terminal for the holidays. Please join me by running the following command in your shell.

((while true; do echo -en "\e[31m." >&2; echo; done) | (while read line; do echo -en "\e[32m."; done))
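For the curious (and the cautious: it loops forever until you interrupt it), here is the same party spread out with comments. The behavior is identical: red dots arrive via stderr, green dots via stdout, interleaved on your terminal.

    (
      (while true; do
        echo -en "\e[31m." >&2   # a red dot, straight to the terminal via stderr
        echo                     # a bare newline down the pipe
      done) |
      (while read line; do
        echo -en "\e[32m."       # a green dot for every line that arrives
      done)
    )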

Thursday, November 27, 2008

On Faith

I grew up in two religions—I promise to tell you which ones later. Contemplating the agreements and arguments between these two faiths helped me realize something very important.

I am wrong.

I'm wrong for a variety of reasons. But, by a twist of logic that's hard to accept until you've been alive for a few weeks, the best reasons for me to believe that I'm wrong are the ones I don't know yet.

  • If I change my mind about a proposition, I either am wrong, or was wrong.
  • I have changed my mind many times.
  • I have either been wrong, or become wrong many times.
  • I will probably change my mind many more times.
  • I'm probably wrong.

On several occasions, I've realized that I was so wrong that I had to become a completely different person to become right. While I was in school, this occurred so frequently and dramatically that by the time I was in junior high school, I had become the kind of person that I would have despised when I was in early elementary school. And now, I pity the boy I was in junior high for his bitterness. I'm furthermore haunted by persistent memories of the people I failed or hurt along the way, and by the recognition of those failures that followed.

The reason I was bitter in elementary school was that no-one wanted to be my friend. In hindsight, I've realized that I had alienated my peers and thus brought this ill fate upon myself. I was, however, kind. I was not condescending, like similar kids in my situation. I did not pick fights, nor did I take what I did not fairly deserve. If anything, I retained few pleasures for myself for fear of taking them from others. For example, to this day, I have never ridden a shiny-silver tricycle like the ones that were so popular in my kindergarten playground, and I always waited for everyone to lose interest before I tried the swings.

My fault was more subtle than any of the unkind things that children can inflict upon each other. I refused to play their game, a game that you and I play together every day. I refused to play a role. It was my opinion that every person should always wear their true feelings on their faces, and say only what they believed. I believed that it was dishonest to do anything else and could only lead to deferred hurt. I couldn't bear to stand around with a bunch of people and have trite conversations about immaterial matters or play fake dramas among friends. It would have violated my sense of honesty.

This is why I was a wholly uninteresting, boring, stick-in-the-mud person for the rest of that decade. I did eventually learn how to play a role, to play power-games, and to act out dramas about how I certainly "must" feel for the amusement of the situation. I got over myself, a little bit, and now I enjoy the more important truth. As I came to accept wholly the moment Dr. Janice Daurio uttered it over coffee in my final year of junior college, life is about building relationships with other people.

So, yes. I was wrong. I'm likely still wrong about many things. But, more importantly, even if I do find the ultimate truth, the general theory of everything, I will have no reason to ever be completely certain about it. No matter how small the universe is, it's larger than me, and it's larger than I can perceive. It is finer than the least perceptible granularity. There is no vantage in the universe from which I can see its entirety. Furthermore, since I am part of this universe, the last impediment to my fully understanding the universe is my ability to fully understand myself. I take it to be a paradox for a box to contain itself completely, and no more do I believe it possible for my mind to contain a complete and detailed model of itself.

I used to think that the word "abstraction" meant to find a better, more general, and reusable theory. When I was in Clark Turner's class on professional responsibilities for software engineers, he called out the real definition of abstraction: to omit detail. The price of a bigger idea is less detail. That's the limit of our mind, and it's the limit of our understanding of the universe. We can only claw at the shapes around us with our minds, bending them to better fit the model, but never fully containing it.

For these reasons, I believe that if there is a sin, it is certainty.


However, apart from never being sure about anything, there are many things worth believing. There are three good reasons to believe a proposition: correspondence, coherence, and pragmatism.

  • Correspondence is the theory that there is a universe. A thing is true because it corresponds to that universe.
  • Coherence is the theory that all truths must be logically consistent. There are bodies and combinations of premises that are consistent among themselves. If a premise contradicts an accepted truth, it may not be a member of that body of accepted truths. It, however, may be the basis for a new body of truths.
  • Pragmatism is the harsh realization that neither of the other two theories gives you a place to start believing things. You need a basis, a first truth, and you need to assume that it's inherent to the universe. The problem with coherence is that there are many bodies of coherent truths, not all of which necessarily correspond to the universe. To be pragmatic, or practical, we have to make a guess. With regard to correspondence, the question is, "Which universe?", and the pragmatic answer is, "This one!".

Based on these three mechanisms, consider this minimal guide to finding the truth:

  • Guess.
  • Always remember that the guess might not correspond to the universe. Test it through observation.
  • Always remember that the coherent body of truths based on the guess might be wrong too. Test it with logic.
  • With gathered knowledge and experience, always try to make better guesses such that over time you can asymptotically approach certainty.

Correspondence, coherence, and pragmatism come out of the discipline of philosophy. The ideas are innate, but it's in philosophy classes and textbooks that I first heard them so succinctly articulated. The guide to finding the truth is the empirical method: science. Oddly enough, both ultimately boil down to faith and uncertainty, traditionally the purview of religion and agnosticism respectively. The discord between religion and science, under this light, looks more like a popular myth.

Becoming more right requires a lot of observation and thought, but there are some pretty good tools for contemplating truth. For one, coherence is probably the most solid theory of the three. If you focus on culling incoherent premises, or at least organizing them into piles of propositions to be evaluated as a whole, you might not find the truth, but you can weed out a lot of probable falsehoods. Coherence is math, a concept that's free of the messy details of the physical world, a metaphysical archetype. As such, its employment is the most satisfying of the three reasons to believe.

The next best behavior is observation. Keep your senses open. Gather and mull premises in hope that between perception and abstraction, some mote of truth about the universe to which those observations correspond leaks into your mental model of the universe.

That brings us to pragmatism. Of the three reasons to believe a thing, pragmatism is the most tenuous. Pragmatism is about faith, specifically making leaps of faith. In math and logic, these leaps are called postulates, certain suppositions that you make so that you have some basis for evaluating the coherence of other ideas. The trouble with a postulate is that, by necessity, it must be the foundation for any certainty that follows. An article of faith is further beyond question or doubt than any conclusion arrived at through the trial of observation and coherence. For that reason, it's best to use pragmatism as seldom as possible.

In mathematics, we presume that there's something called incrementation. There's no proof that incrementation exists, and if we found one such proof, behind it would be terms that would have to be assumed instead. We accept incrementation, that there is a void called zero and a unit away from it in some direction called one. From there, we have all the material we need to converse about all the various kinds of arithmetic and algebra that are possible in the metaphysical verse in which there's incrementation.
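To make that leap concrete, here is a minimal sketch of incrementation as a postulate, in Python; the nested-tuple encoding and the names are mine, purely for illustration.

    # Incrementation as a postulate: a void called zero, and a unit
    # step away from it. Everything else is derived; nothing is proven.
    ZERO = ()

    def succ(n):
        return (n,)  # one step away from n

    def add(a, b):
        # Addition falls out of incrementation alone.
        return a if b == ZERO else add(succ(a), b[0])

    one = succ(ZERO)
    two = succ(one)
    assert add(two, two) == succ(succ(two))  # 2 + 2 = 4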

Likewise, to engage in meaningful discourse about our universe, we have to make certain leaps of faith.

  • The universe exists (otherwise, correspondence theory would be out).
  • The universe is consistent (otherwise, coherence theory would be a wash too).
  • You exist (cogito ergo sum, except that there's a paradox inherent to the assumption that if you think you must be. Let's just stick to the assumption, "I am.").
  • You are actually you (this one really gives me the shivers once in a while).
  • The universe is significant.

That's a sample of good things to believe in the absence of proof or even its lesser cousin, evidence. In particular, if you're going to talk about the universe with anyone, it's pretty safe to assume that these are your common ground. If they're not, either one of you is yanking the other's chain, or the conversation is pointless. Either way, it's time to move on.


The trouble with leaps of faith is that, the more of them that you make, the more likely they are to be inconsistent. Coherence and correspondence are much more reliable allies. For that reason, it's best to minimize leaps of faith. It takes especial care to frame explanations for observations without making careless leaps of faith to which you haven't already committed. They tend to sneak in for less wholesome reasons than the search for truth, and tend to get in the way when you try to reconcile your theory with the next observation.

Occam's Razor does not mean that the truth is simple. The quest for a General Theory of the Universe, wherein all fields (electric, magnetic, gravitational) are explained in the terms of a simple formula like Einstein's relation between energy and matter, is predicated on the faith that the universe does have a simple mathematical root. However, this is not an application of Occam's Razor. Occam's Razor states that if you have a simple explanation for a phenomenon and an elaborate explanation for the same phenomenon, it's best to assume that the former is true. That is, choose the explanation that requires the least faith.

My father is fond of recounting a story from my childhood wherein, to my chagrin, I am not the protagonist. I learned how to write my name with crayons roughly when I was four years old. About a day later, my parents discovered my name written on every surface that could retain a crayon's impression, including my sister's back. Producing Kathy's marked verso as evidence, my parents accused me of the crime. In my defense I claimed that she had done it herself. Bear in mind, Kathy was half my age.


Apart from unlikely stories for easily explicable phenomena, a person who believes nothing that they do not perceive, induce, or deduce would be a sad and empty shell.

One problem for atheists, agnostics, and skeptics is that there is no natural source of Hope. In fact, you don't have to learn much physics to get to the part where the universe appears to be spiraling down toward a low-energy drain, the "Heat Death of the Universe". At some point, you really do have to make a leap of faith. So, it's important to add another precept to the catalog of postulates.

  • There is hope.

In Yoda's words, "Fear leads to anger, anger leads to hate, hate leads to suffering.". While fear often comes from rational distrust and can drive us to excel, it can also mire us in depression. Hope is the cure for fear. Hope is our reason to try to make the future better than the past, and without it we are lost. Since there is no definitive evidence that the future will be better than the past, and the case is piled high that it will ultimately be worse or end in a zero sum, we must postulate that hope exists, that we can craft a better future.

Sometimes, when I am almost completely broken, I lie awake and let that truth fill me to the brim. I let the warmth flow from my core to my pores and to the bottoms of my feet. Hope, the critical endogenous morphine, eventually fills my dreams with blurry but bright visions of solace. I banish my doubts with a crucial lie and my strength returns.

Consider that power though. Can't blind belief drive us to do wicked things? Perhaps it's better to let hope stand alone, without attaching it to a litany of prerequisites or consequences. Everything you attach to a blind supposition, increasing its complexity, risks motivating you to do cruel things without good reasons. It's better to let hope stand alone, a single pure island of beauty on the horizon, or otherwise risk contaminating that vision with zeal for something less perfect.

There are, after all, many reasons to believe a proposition apart from need, logic, and observation. These include wont, fear, and pressure to conform with peers. Despite my disdain for the vapid writings of Terry Goodkind in the Sword of Truth novels, I find myself quoting the Wizard's First Rule, "People…will believe a lie because they want to believe it's true, or because they are afraid it might be true.". Terry wrote this in a spirit of condescension and bitterness, but it serves a greater purpose as an admonition. There are certain conditions where it's important to elevate skepticism:

  • you want to believe
  • someone else wants you to believe
  • you're afraid not to believe

One of the greatest quotes said in service of hope and truth was that all we had to fear was fear itself, and one of the worst offenses against those same ideals was the creation of a rainbow of codes for how afraid we must be. It takes a powerful will to see through coercion by fear, particularly communal fear, and those are precisely the conditions in which a healthy skepticism, and faith in hope against what another would have you believe, what you are afraid might be true, and what you most desperately want to believe, might serve you best.

There are many good reasons to build communities, but getting together to publicly profess your agreement with one another, or to be fed the truth without argument, smothers that mote of latent uncertainty that helps people listen to each other's ruminations, doubts, and observations. It repels people who disagree, the people who need your fellowship most. Those who grow up in such communities grow up with either a hindered analytical sense, or a buried revulsion and a fear of rejection or guilt. If you gather to bolster your faith through the comfort of your peers' agreement, your community submerges the relevance and welcomeness of pragmatism, coherence, and correspondence. A community should feel free to express their actual beliefs, argue, and converge through trial and support.

An article on Slate called, Does Religion Make You Nice?, explores the possibility that religion and atheism are orthogonal to how friendly a person grows to be. American atheists tend to be less nice than European atheists. The determining factor in how well adjusted a person is in a society is acceptance in a community. We need to have communities that welcome people without regard to the details of their faith. Using tenets of faith to build walls around your church ultimately alienates people with the potential to be good.

If communities gather to ponder difficult choices on the fringes of desire and righteousness, they might discover that they actually hold very different beliefs. It's unhealthy to delude yourself into believing that every member of a community subscribes to the same faith. Our differences in experience, values, and chemistry, from birth, bring us to very different conclusions. We need to value each other's experience more than our collective synthesis.


Regarding the philosophy of behavior, ethics, there is an ancient test for whether to do or not do a thing. You first generalize the act to a rule. For example, if I'm pondering whether to wear a silly hat in public, I would generalize that thought into a rule, like, "People should be welcome to wear silly hats in public.". Then you ask yourself whether society would function if everyone were to follow that rule. What would happen if everyone were welcome to wear silly hats in public?

Immanuel Kant called it the Categorical Imperative, but with subtle variations and derivations, it's a timeless strategy embodied by the Golden Rule. "Do unto others as you would have them do unto you.". Or, via my father the Cowboy Western fanatic, "Before judging a person, walk a mile in their moccasins.". Or, the corollary, "Do for others as you would have them do for you.".

"Categorical imperative" literally means, "the most important rule.". It's the recipe for peace and stability. However, like all mortal truths, it is far from perfect. Often to find a balance between reason and love, you must consider why every situation is special and carefully weigh how specific your situation is and how far away from your path you would, should, or can go for another soul. It's often best to err on the side of altruism. That is, defy your mortal laziness and do everything in your power to help others, and expect nothing from them.

That leads us to a flaw in the imperative. The categorical imperative also tends to imply that you can expect others to treat you with the same degree of cordiality with which you would treat them.

To address that flaw, consider a special rule. In order for the categorical imperative to bring about stability in a community, we have to create a bias, a pair of poles within which the imperfect universe can safely vacillate and gradually approach peace. The categorical imperative implies that your behavior must exactly match the expectations of all people in society. That cannot be the case. So we consider two rules.

  • In so far as you can carry your mortal husk, be strict about your own behavior.
  • In so far as you can defy your mortal indignation, be tolerant of the behavior of others. For those acts you cannot bring yourself to tolerate, try to bring yourself to forgive.

The beauty of these rules is that they fit within the mechanics of the categorical imperative. If every person in the world were to follow these two rules perfectly, we could eventually have peace. It also establishes the reason and necessity of forgiveness, and a solid basis for discussing the lossy abstraction of "human rights".

The last remaining flaw of these rules is that they can promote self-neglect. As I learned from the Star Trek episode, Evolution, self-neglect serves no-one, so there's a third important rule from Guinan: always take for yourself what you can fairly assume for your own health and happiness.


I've used the term, "universe", extensively. I've expounded upon the roles of uncertainty, empiricism, and hope in the quest for truth and peace. Throughout, it's important to have a solid definition of "The Universe". Many unfortunate arguments can be rendered moot by consensus on the meaning of "universe" and a thorough reevaluation of dependent ideas in terms of that meaning.

Let's consider an inductive definition of the universe.

  • basis: I am in the universe.
  • recursive step: anything that has impact on a part of the universe is also in the universe.

Apart from discussions about causal-domain shear, this definition is concise, simple, and readily applicable. We could consider much more detailed and complicated definitions with intricacies to suit our particular belief systems or definitions of other words we know only descriptively. For the sake of argument, let us take this definition as a postulate and let the definitions of other words and ideas move freely around it.
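Being inductive, the definition is even small enough to execute. Here's a toy rendering in Python, where the impacts relation is a hypothetical set of (cause, effect) pairs invented for the example:

    def universe(me, impacts):
        # basis: I am in the universe
        known = {me}
        # recursive step: anything that has impact on a part of the
        # universe is also in the universe; iterate to a fixed point.
        changed = True
        while changed:
            changed = False
            for cause, effect in impacts:
                if effect in known and cause not in known:
                    known.add(cause)
                    changed = True
        return known

    members = universe("me", {("sun", "me"), ("core", "sun"),
                              ("elsewhere", "nothing")})
    assert members == {"me", "sun", "core"}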


Imagine for a moment that you have achieved a vantage from which you can see the entire universe. This vantage, by necessity, is outside the universe and your perception pierces its occlusive barriers. You truly see the universe in its entirety.

Zoom out.

The universe is surrounded by void: a conceptual region in which the universe does not exist. No matter how many transcendences you make to get to the vantage of the universe, there will still be an eternal nothing beyond it, at least one transcendently infinite scope larger than the universe itself.

Zoom out. Zoom out until you have taken in enough of the void into your perspective that the entire universe's size divided into the size of the infinite void is merely a point. There is only void.

From this vantage, there is an infinity of nothing filled with infinitesimal points, one of which is our entire universe. This does not diminish the significance of our universe, but it frees you to consider all of those other points. Each one is a universe, apart from our own, some perhaps sharing attributes but each distinct.

Mathematical concepts are special. There are uncountably many mathematical concepts. Each one is its own metaphysical realm whose complexity emerges from simple rules. This meshes well with the assertions that we all make, whether consciously or not:

  • A point is true because it corresponds to the universe.
  • The universe is perfectly coherent.

One nice thing about this model of the universe is that it clears up that existential question about creation. Almost all explanations about how our universe came to be rely on intervention outside our universe. Consider this sequence of propositions:

  • A exists. B does not exist.
  • A creates B.
  • B does exist.

The trouble is that A clearly has an impact on B and thus is part of the universe, B. That would require A and B both to exist initially, which is a contradiction. What if A and B simply exist? Mathematical concepts do not require creation events to be internally consistent. This liberates us from the contradiction of creation, but also leaves room for the possibility that A and B existed initially, and for A to exist before B within the confines of the real universe, C. This opens the possibilities of towers of tortoises, big explosions, or sub-universal progenitors.

From a programmer's perspective, fractals, programming languages, and Conway's Game of Life provide compelling evidence that extraordinarily complex patterns can emerge from simple rules. The recipe, sketched in code after this list:

  • Define a collection of simple rules.
  • Apply those rules with mathematical vigor.
  • Observe that complexity emerges.
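Here is that recipe as a few lines of Python, a toy Game of Life step. The rules are Conway's; the shape of the code is just a sketch.

    from itertools import product

    def neighbors(cell):
        x, y = cell
        return {(x + dx, y + dy)
                for dx, dy in product((-1, 0, 1), repeat=2)
                if (dx, dy) != (0, 0)}

    def step(alive):
        # Simple rules: a cell lives with exactly 3 live neighbors,
        # or with 2 if it was already alive.
        candidates = alive | {n for c in alive for n in neighbors(c)}
        return {c for c in candidates
                if len(neighbors(c) & alive) == 3
                or (c in alive and len(neighbors(c) & alive) == 2)}

    # Apply with mathematical vigor: a five-cell glider propels itself
    # across the grid, generation after generation.
    glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
    for generation in range(4):
        glider = step(glider)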

Thus, it's possible to become comfortable with the idea that from a simple set of rules, a universe of great complexity can emerge. It exists no more than it does not exist. Since we're in it, though, its significance to us is greater than that of any other metaphysical construct my small mind can begin to fathom. Furthermore, a universe arrived at by math is unassailably perfect.


Let's define "nature" as the set of rules inherent to the "universe". Consider the notion of "super-natural" things or events. If we refer back to our definition of the universe, the only things that can be apart from the universe are those that have no impact on the universe. From there, let's consider the notion of "miracles", events in the universe that defy our conventional understanding of how the universe works, or "nature". One confusion that I think we can avoid is the bundling of the ideas "super-natural", "para-normal", and "miracle". By the definition of the universe, for a miracle to actually be a part of the universe, it must abide by "nature". Thus, for a miracle to exist, it must be "para-normal", not "super-natural".


Various tomes at various times talk about "God". Particularly, I've heard the following attributed to God:

  • perfection
  • anthropomorphism
  • creation of the universe
  • omniscience
  • omnipotence
  • super-natural
  • performance of miracles

Let's take a step back. God can only be one of the following:

  • beyond and outside the universe,
  • within the universe, or
  • be the universe itself.

If we take the universe to be perfect in whole or part, God could be any one of these things, but the term perfect suffers the diminution of its meaning. Let's say that perfection cannot be broken, thus a whole may be perfect and any part of it falls from perfection insomuch as it is not complete. If we adhere to the notion that the universe is perfect, by this reasoning, God would have to be either the universe or beyond it.

Omniscience denotes the ability to see the entire universe. That would imply that God is beyond the universe, or is the universe as well.

Omnipotence denotes the ability to do anything. If the universe is defined by natural rules, to break those rules could only occur in a universe with different rules. I would contend that, for God to be omnipotent, omnipotence would have to mean "capable of performing miracles", rather than "capable of anything, and thus capable of super-natural acts". There's quite a bit of room for the unexpected within the realm of possibility. I would prefer to pick a different word than omnipotence rather than conjure a new meaning from a word that clearly means "all-capable". How about "super-powerful", or "ultra-potent"?

Referring to the universe as a mathematical construct based on rules, the universe needs no creator any more than a story needs a story-teller in order to exist. Creation means to make something from nothing. I would contend that this definition is unsuitable for any real act. For example, creating a building is really construction from baser materials. If we substitute the term "construction" for "creation", the term could easily apply to the so-called "creations" of people, and also to the construction of a region of the universe by some super-powerful being. In order for this to be the case, God would have to be a part of the universe, and while not the creator of the universe, the constructor of a world.


The meaning of the word "death" is "end of life". To rationalize the notion of life after death, it is necessary to distinguish corporeal life from ethereal life. In order for either to be meaningful, they both most be real, that is to say, part of the universe. Ethereal life, therefore is either real, or part of another universe. Either possibility is plausible. For example, ethereal life could be in a distinct universe, where it can have no impact on our universe. However, there are infinite verses apart from our universe abiding infinitely diverse rules, none of which are relevant to ours. This notion of ethereal life does not have any meaning in the context of our corporeal lives. For example, such a life would not be bounded by consequences of our mortal lives, but rather bounded by the infinite possibilities of patterns of thought. The other possibility is that ethereal life exists within our universe, is bounded by the same rules, and thus the actions of our mortal lives could have an impact on them. If this is the case, ethereal life is a matter of a mysterious science, not math. In either case, ethereal life is probably not even similar to anything we would want to believe of it.

There are a lot of reasons to want to believe in life after death. For some people, it counters their fear of death. For others, it fills the gap in justice that the universe does not appear to serve during life. Another reason to believe in life after death is to lend an additional purpose to the immortal soul: if one's soul is separate from their body, living an ethereal life instead of a corporeal life, it's easy to reconcile determinism and free choice.

If we take the universe to be a mathematical construct, bounded by rules, the universe is deterministic. That is, if one had complete knowledge of one state of the universe, perfect knowledge of the rules of the universe, and infinite computational power, one could extrapolate the entire universe. Naturally, this is beyond our means as mortal components of the universe, but the notion permits us to use inductive reasoning, the notion that, for a statistically representative sample of the universe, if a rule appears to be in action, it can be expected to occur in other samples. Inductive reasoning is imperfect in that our sample, and our perceptions of the rules are imperfect, but the idea is predicated on a predictable universe, the kind we can reason about. It's only rational to make this assumption.

So, the argument occasionally comes up that either the universe is deterministic or individuals have the ability to make choices, but not both. The notion is that, for a choice to be freely made, it must not be constrained by the rules of the universe. If the universe "made our choices for us", we would not be accountable for our actions. It's further complicated because some people believe that God broke the world so that we could make choices, thus giving our choices meaning.

One way to reconcile determinism and free choice is to assert that a part of the mind, the ethereal soul, is not bound by the rules of the universe. This would contradict the definition of the universe, wherein all things that impact another thing in the universe are part of the universe.

There's another potential resolution. There are wheels within wheels in the universe, and our mind is one of them. In our minds, we are capable of formulating abstractions of the universe, metaphysical verses wherein we make many different choices. These metaphysical verses are in the realm of mathematical possibility because they are imperfect and incomplete renditions of the universe. These are not choices. They are options, and they are no more real than their perceived consequences. Also, the simple fact that we only ever choose one option reinforces the notion that the universe is deterministic. Making a choice is bounded by physical laws, sure, but that does not mean that we are any less responsible for our choices. It's a necessary assumption that all people are responsible for their actions, and that gives our choices meaning.

Neither does this diminish the truth underlying the story of how God broke the world to give us choice. Consider a perfect universe, where we define perfection in this case to be one where good things happen by the rules of the verse. In this realm, it is not choices that render actions good or bad, but the law of the universe. For choices to be meaningful, good and evil must be independent of the laws of the verse. This explains why bad things can happen in a perfect universe. The story does not, however, address the premise that the rules of the universe are independent of an act of creation.

There's also consolation in relegating one's soul to oblivion after mortal death. If anything, mortality heightens the importance of life, this one chance to make good choices. We are custodians of the universe and for each other. We will have a legacy forever, and from a vantage outside our universe, our impact appears to be timeless.


I mentioned that I was raised in two religions. I'm an unconfirmed, non-practicing Roman Catholic. I also watched Star Trek. Laugh, but episode after episode, it's a series of complex metaphysical and moral thought experiments: a recipe for premature awakening for a child.

When I was in elementary school, I attended a Catholic program called CCD, or Confraternity of Christian Doctrine. At that age in public school, teachers and administrators appealed to our sense of justice and fairness. It was presumed that people my age only understood quid pro quo, a philosophy of balance and revenge, not forgiveness and stability. In CCD, teachers and administrators appealed to our desire to be loved. But, having watched more than one morality play on Star Trek, I was already an idealist, both of justice and reason. I wanted altruism in school, and I wanted philosophy and logic in church.

I never managed to escape school. As soon as I learned modesty, I deduced that my mother could not take me to school if I refused to get dressed in the morning.

I was wrong.

My attendance was impeccable for the rest of my academic career.

However, after a couple years of coloring in pastoral scenes with Jesus and being told that I should believe a boatload of facts mostly relating to how I should behave, because he loved me so completely, I told my mother that I no longer wanted to go to CCD. My mother is a well-intentioned person with a heart that fills every part of her. She does not fathom why a person would question the doctrine of the church. She also had no idea how to convince me that I needed to go to CCD. So, she telephoned the director of religious education, an intimidating pear-shaped woman with a smoker's voice, and put me on the line. She asked, "Why do you not want to come to CCD?" I was mortified. I really had no idea how to articulate my thoughts. I feared offending her. I feared the loss of love. I had nothing to say. I also never went to CCD again. It wasn't until junior college that I met Catholics who believed that there were children for whom CCD was not appropriate.


Reading through this article, you have probably noted points that you agreed with or disagreed with. In the past, I've accidentally alienated people with a small number of my thoughts, perhaps not because they disliked me, but because they did not want to be exposed to my point of view and mar their conviction to their beliefs. It's a hard thing to have your faith questioned. It's my hope and care while writing this article to keep you all, my friends and other readers, in my fellowship, my community. It is inevitable that any two of us will disagree on any number of points, but if we hold in our hearts a mote of uncertainty and a well of hope, we can asymptotically approach the truth through discourse together.

Monday, November 24, 2008

Google Eyes 2

Ryan Paul finally found a use for my Google Eyes graphic in his article, Security Expert: Google not anonymous enough, which was posted a couple months ago.

Saturday, November 8, 2008

Blogosphere Links

My apologies once again: the previous post (which I again hope you haven't read yet), was scraped by Google before I could fix various links that broke in the transition from my Subversion staging area to GData. I wrote a migration script for my old blog that had a hack to fix various Cixar-relative URLs. Unfortunately, when I started fully qualifying all of my links, these hacks started over-correcting my blog posts. I've now removed these hacks and fixed the broken references, but again, please view the updated version on Ask a Wizard if you need to follow links. Sincerely, The Management.

Blogosphere

When I moved my blog over to askawizard.blogspot.com, the quickest and easiest way to give my page a unique look was to make my own banner graphic. I decided to make a depiction of the "blogosphere" as I envisioned it based on the memetically virulent xkcd 239. Randall Munroe stakes claim to a literal interpretation of the popular term "blogosphere": a layer of the atmosphere where free speech enthusiasts convene aboard various flying platforms and hawk their ideas in a sort of aerial or ethereal bazaar.

According to xkcd, Cory Doctorow from the EFF is the only blogger who actually wears a cape and goggles. I caught a certain implication that he's the only person crazy enough to dress up when they passionately jotcast. This is not true. I put on my robe and wizard hat. Don't take that the wrong way.

So, I drew a depiction of the blogosphere. Over the last few weeks, I've taken some time to characterize more of my web neighborhood.

  • Ryan and I are in the foreground in the guises we took for our web comic, Punnished.
  • Cory Doctorow occupies his ironclad balloon on the opposite side of the scene. (?:T)?ron Paul lurks behind in his blimp.
  • Like the wizards and witches of Harry Potter, in this metaphysical representation of the web, microblogging is facilitated by sending flocks of Twitter bird messengers from balloon to balloon. I called out Josh Lewis, a prolific blogger in my network, by putting him and his family in a balloon the same initial color as his old blog, emanating a halo of tweety birds.
  • Ars Technica is heralded as one of the largest "blogs" on the Internet. My long-time MUD project partner and friend, Ryan Paul edits the Ars Open Ended journal. While the blogosphere is not quite high enough to qualify as "orbital" in the sense of Ars's purported "Orbital HQ", I decided to put Ryan in a nearby balloon emblazoned with a likeness of the Ars logo. Ryan is wearing a cloak of Roman imperial purple.
  • One of the earliest bloggers, Peter-Paul Koch flies in the dark-blue Quirksmode balloon, from which he provides a steady hail of browser compatibility tables.
  • Zeppelin enthusiast, Simon Willison, flies a Django themed zeppelin. He is trailed by the memetically unstoppable Django Pony.
  • A man I believe saved JavaScript from horrible doom, Doug Crockford occupies a balloon in his blog's colors.
  • As a nod to all the fabulous people who participated in #wotw2, a Martian Unpowered Inter-planetary Attack Cannister falls through the scene on the left.

I've posted an annotated version of the current banner as a larger image, including a version in the original Scalable Vector Graphics format, wherein the annotations are a hidden layer.

I plan to add to the scene every once in a while.

iTunes Playlist Images

Google's feed scraper is a bit overzealous about my blog these days. When I uploaded my previous entry (which I hope you haven't read yet), the image URLs were incorrect. They have been rectified. Please consider following the link to the original article if you find broken images. Thanks, The Management.

My iTunes Playlist

I have almost 9,000 songs, comprising a month of continuous play and 60GB of storage. I've actually listened to maybe half and rated about 20% of the library. Random shuffle doesn't work for me anymore.

When I was in college, I noticed that trying to do homework with lyrical music hindered my productivity, presumably because it engaged my otherwise occupied language brain-matter. I'm also a musician, so I've definitely collected a trove of anecdotal evidence supporting the assertion that repeated exposure to particular pieces (and broadly but with weaker correlation, certain genres) expands appreciation. Also, overexposure diminishes the effect, presumably because the amygdala begins tuning out the pattern. Randomly traversing my library usually leads to irritation. Also, I thirst for new, rare, serendipitous musical experiences, at the expense of occasionally hearing a piece I'm not ready for.

So, on account of what I presume to be common psychological phenomena, my ideal musical experience would consist of:

  • mostly music I know and like
  • some music I don't know occasionally
  • no music that I definitely don't like
  • some constant minimum interval between playback of a given piece.
  • intervals between hearing a given song inversely proportional to how well I like the piece
  • over time, my playlist should improve as I provide feedback to the system

A few months ago, I found a way to accomplish this with iTunes smart playlists. I construct "pools" of songs. Each pool contains only songs that have a particular rating and haven't been played recently. If there are a lot of songs with a particular rating, I generally require a longer interval between plays. So, I created a pool for each rating, and set a maximum number of songs in each pool to tune the probability of a song being chosen from each category. Then, I created a master smart playlist that incorporates all of the rating pools.
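If it helps to see the mechanics outside of iTunes, here is the same scheme sketched in Python. The dictionary shape, function names, and exact thresholds (taken from the sections below) are illustrative, not an iTunes API:

    import random
    from datetime import datetime, timedelta

    def pool(library, rating, limit, days_between_plays):
        cutoff = datetime.now() - timedelta(days=days_between_plays)
        eligible = [s for s in library
                    if s["rating"] == rating and s["last_played"] < cutoff]
        random.shuffle(eligible)
        return eligible[:limit]  # the cap tunes this rating's share

    def master_mix(library):
        # One pool per rating; caps and intervals set the blend.
        return (pool(library, 5, limit=76, days_between_plays=5)
              + pool(library, 4, limit=200, days_between_plays=20)
              + pool(library, 3, limit=100, days_between_plays=7)
              + pool(library, 2, limit=20, days_between_plays=120)
              + pool(library, 0, limit=30, days_between_plays=180))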

Playlists

5 stars

I have 76 tunes in this category. Almost all of them can make it to the master playlist. Since they play frequently, the dominant factor is when they were last played. The number of songs that haven't been played or skipped in the last 5 days hovers around 20. As a special factor for these, I broke the songs that are longer than 10 minutes into their own category so that only two of them contribute to the master mix at any time. I like Mahler's second symphony a WHOLE LOT, but half an hour is a long commitment and I prefer to save it for maybe once every other week. I also have about ten different versions of Beethoven's Symphony 7 movement 2, so I might have to make a new category for it so only one of them comes up at a time.

5 Star Mix Predicates

4 stars

200 tunes total. 200 tune limit. 20 days between playing. Defer 10 days if skipped. Stable around 20 tunes contributed to the master playlist, so minimum interval is the dominant factor.

3 stars

800 tunes total. 100 tune limit, sampled from least recently played. Defer 1 week if skipped. Stable at 100 tunes contributed to the master playlist, so the maximum sample size is the dominant factor.

3 Star Mix Predicates

2 stars

800 tunes total. 20 tune limit, sampled randomly. Defer 4 months when played or skipped. Stable at 20 tunes contributed to the master playlist, so the maximum sample size is the dominant factor.

1 star

I reserve this rating for songs I don't want to hear in a random shuffle, but don't want to delete either. This includes Christmas tunes, apart from the soundtrack to The Nightmare Before Christmas. There are about 400 of these. I should probably use this rating for songs I would like to very rarely hear and use the checkbox to exorcise a song from my random shuffle.

0 stars

These are songs I've not rated. 6500 tunes. 30 tune limit, sampled randomly. Playing or skipping defers the next chance to play for six months. There are 30 tunes in this playlist, so the dominant factor is the max tune limit.

Mix

Then I create my main mix playlist from these rating pool lists. The trick is to create an "any" predicate and include all of the smaller playlists with playlist predicate rules.

Master Mix Predicates


So, believe it or not, this minimizes my need to fiddle with iTunes and keeps me focused on work. If you've got a ridiculous music library too, I highly recommend this technique.

Also, if you work on iTunes, I highly recommend providing a smart playlist abstraction with equalizer knobs for each rating to automate similar processes; not everyone's a programmer and I bet this problem is just beginning to surface for most folks. It would be nice to see how probable a particular song is to be played based on the size of its pool, the effective sample size of its pool, and the total of all sample sizes. I also want to be able to sort and filter and bulk set "checked" and "unchecked".

Saturday, November 1, 2008

War of the Worlds 2.0 - The Post Mortem

Stuff's just beginning to trickle in reflecting on our Halloween reenactment of The War of the Worlds on Twitter. Here are some links to what we've found so far.

Each participant averaged 2.6 tweets total. The most active participants posted 60 to 100 tweets. There were around 600 people following the invasion progress. That's about 1500 tweets for all participants over the course of the event. I think it's safe to assume that about 10,000 people were touched by these apocalyptic tidings.

Hopefully, we'll have a participation histogram (total tweet count for each participant) soon. I'd like to see a timeline integrating all of the media we produced.

That was a lot of fun. We'll have to do something similar again some time.

Sunday, October 19, 2008

What is Tale Anyway?

So, I've been working on this project that's currently called Tale for nearly a decade now. The notion is to build an interesting, fun, humorous, immersive, narrative virtual reality. If you're into such things, it's more succinctly called a distributed, web-based MUD. I don't intend to go into detail about how Tale is like a MUD and unlike a MUD. This article is intended to explore some of the novel aspects of Tale.

Gameplay

Tale's landing page, tale.im, is the game. You click play and the game begins. There won't be a lengthy sign-up and character creation process; everyone starts as a ghost in Limbo or gets straight back to playing from wherever they left off. To start playing, you just click or type "start" and your consciousness is installed in a standard-issue resident of Dya, the Tale world. From there you can start doing whatever it is you like doing most: socializing, exploring, fighting, building, or getting into character.

Playing Tale is like many things you're probably already familiar with. It's like chatting with friends in an Instant Message client, or IRC, or opening a Terminal and issuing commands. MUDs in general combine the social aspects of chatting with friends with a virtual world simulator and its own narrative and commands. Some MUDs lean in favor of socialization and others on gameplay. Tale favors neither, allowing you to gracefully shift between mostly chatting and mostly playing. Since the game is in a web browser instead of a specialized chat application or old-fashioned command terminal, Tale can alternate among coherent interactive modes: entering quotes to say aloud, like "hello, world!"; entering action commands, like "go north"; typing quick single-key commands for mini-games, like "p" for "parry", "r" for "riposte", or "f" for "feint" if you're fencing; or browsing through menus to discover commands visually.

Already in Instant Message and IRC clients, you might be familiar with commands like "/me laughs" (that you can use in AIM) or "/nick cowbert" (that you can use to choose a name in IRC). If you use Vim, you might already be familiar with colon commands like ":w". If you use Firefox's incremental search feature, you know that you can use single-quote and slash to start searching for text as you type, switching from single-key command mode into a full-line command mode.

Tale's input field will accept similar commands depending on whether you're mostly chatting, mostly playing, or performing quick, expert actions. If you're mostly chatting, you'll be able to issue commands with the slash "/" prefix like /go north, and when you're mostly playing you'll be able to chat with a quote prefix like "hi. If you're in an expert mode where you're, say, navigating a yacht on the high seas with discrete arrow-key and letter-key commands, you can start entering a chat or command with the same keys.
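A toy dispatch in Python shows the idea; the mode names and the returned action tuples are invented for the example, not Tale's actual interfaces:

    def dispatch(mode, line):
        if mode == "chat":
            # mostly chatting: bare text is speech, "/" prefixes a command
            return ("command", line[1:]) if line.startswith("/") else ("say", line)
        if mode == "play":
            # mostly playing: bare text is a command, '"' prefixes speech
            return ("say", line[1:]) if line.startswith('"') else ("command", line)
        if mode == "expert":
            # quick single-key commands, like "p" for parry
            return ("key", line)

    assert dispatch("chat", "/go north") == ("command", "go north")
    assert dispatch("play", '"hi') == ("say", "hi")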

Narration

Each player in the Tale world has a narrator. This is the charming fellow who listens to your commands and tells you how the story unfolds in response. The narrator is a program that keeps track of what you already know and what you're likely to be interested in. It interprets the events occurring around your character, like "Joe says 'hi'.", into just that kind of textual prose, and it takes your commands and plugs them into the simulator as the selfsame event objects. The ideas of "events" and "things" are central to how Tale works behind the scenes, but it suffices to say that if you are "Elbereth" and you're talking to "Glorfiniel", the narrator would translate events like:

Elbereth Say "Hello" to Gorlfiniel
Glorfiniel Say "Hi" to Glorfiniel

…to story-like narrative accounting for various linguistic and cognitive assumptions that you regularly make — "You say, 'Hello', to Glorfiniel and she says, 'Hi', back.". There is a lot of room for making the narrator more and more interesting as Tale development continues; the code already in Tale creates reasonable paragraphs from a stream of events, accounting for pronouns and such, though it still sounds fairly robotic.
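As a taste of what that translation involves, here is a crude narrator in Python; the event shape, the hard-coded pronoun, and the joining logic are all stand-ins for much richer machinery:

    def narrate(events, you):
        mentioned = None  # the most recently mentioned third party
        clauses = []
        for subject, quote, target in events:
            subj = "you" if subject == you else (
                "she" if subject == mentioned else subject)
            obj = "you" if target == you else (
                "her" if target == mentioned else target)
            verb = "say" if subj == "you" else "says"
            clauses.append('%s %s, "%s", to %s' % (subj, verb, quote, obj))
            mentioned = target if subject == you else subject
        text = " and ".join(clauses) + "."
        return text[0].upper() + text[1:]

    print(narrate([("Elbereth", "Hello", "Glorfiniel"),
                   ("Glorfiniel", "Hi", "Elbereth")], you="Elbereth"))
    # You say, "Hello", to Glorfiniel and she says, "Hi", to you.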

Having a complex narrator serves many purposes in Tale. First, it makes the game immersive. It also relieves the burden of creating lots of static content, one of the limiting factors that ultimately leave most games feeling rather shallow. A complex narrator can also abstract away the tired MUD practice of BATTLE SPAM, where lines upon lines of discrete hits and jabs rapidly scroll up the chat buffer.

The most subtle benefit of the narrator is that it can introduce a significant burden for programmers seeking to automate their characters, a practice often called "botting" that tends to diminish the quality of gameplay for non-programmer players. With the narrator in our arsenal, we can engage in an arms race with "botters", continuously evolving the narrator to deter botting without banning the practice outright. This incentive structure could be a lot of fun for everyone.

Beyond the Narrative

The narrative is the centerpiece of Tale gameplay. However, unlike a Telnet or Teletype MUD, Tale is not strictly text-based. Since the game takes place in a web browser, we intend to take advantage of the opportunity to add icons, maps, inventory visualizations, sounds, and atmosphere and lighting changes to punctuate the narrative.

Among other things, we're planning to use "parametric" graphics to underscore the narrative. For example, if you're in a sailing or flying vessel, or riding a rabbit around the Tale world, your ride will have a parametric icon. We're using SVG graphic manipulation on the server-side to render permutations of layers from "macro" graphics to produce customized versions of each ride and vessel. The parameters include things like the genetic domain of your steed (Is it descended from an elephant? Did the Narwhal bless it with a tusk?), and the installed rigging on your Zeppelin. The two Ride macro graphics have around 100 layers each.

Inspired by Josh Lewis's old blog site, we intend to mix up the style sheet colors on the page slowly and subtly to reflect the time of day and lighting of your surroundings as you explore the world. Each face of the world has its own color scheme and different parts of the world are lit by the 20-sided Sun and various celestial dice, at times filtered through dappled shade, or illuminated by magma and mana. Using alpha-translucent PNG renderings of SVG graphics, and subtly tuning the foreground and background colors of the entire page, we'll gradually modify the mood of the narration.
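The color shifting itself is simple math. A sketch, assuming a palette per face of the world; the palettes and the time rule here are made up:

    def blend(a, b, t):
        # Linearly blend two RGB triples; t runs from 0 to 1.
        return tuple(round(x + (y - x) * t) for x, y in zip(a, b))

    noon = (250, 244, 227)   # warm parchment
    dusk = (40, 36, 64)      # deep violet
    hour = 18.5
    t = abs(hour - 12) / 12  # 0 at noon, 1 at midnight
    page_background = blend(noon, dusk, t)  # feed into the style sheet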

And there's Music! The musical selection you experience as you roam the world will vary based on what face of Dya you're travelling on and the tension level of the narrative: safety, adventure, battle, and peril. Chris Pasillas has composed six brilliant themes and variations for Tale and is currently rendering them. The game uses a variation of Scott Schiller's SoundManager library that's been ported to operate as a Chiron JavaScript module. (The largest distinctions are that the Flash bridge code has been factored into its own module, and the API has been simplified. It's a complete rewrite using SM2 as an MTASC reference.)

Engine

The Tale engine is a real-time, clock-based, event-driven simulation. Every event can trace its cause to a tick of the world clock, which for the time being is imagined to be once per second. The difference between a player character and a non-player character (NPC or MOB in the MUD parlance) is that a non-player character generates a progression of commands on demand, using a Python generator that yields Event or Verb object trees in response to observations about its environment and its current state, while a player character has a queue of the same kind of Event or Verb trees. This means that invoking a command in Tale does not render an immediate physical response from the world's engine. Rather, a command is translated by the narrator into an Event and enqueued to be invoked when the world clock ticks and visits everything in the whole world that's not busy or waiting. This levels the playing field between people with nearly instantaneous Internet connections to the Tale server and those who have up to one clock tick's worth of lag. It also, in a sense, regulates a certain "conservation of energy" law throughout the world, permitting game mechanics to be tuned more accurately and fairly.
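A stripped-down sketch of that loop, with hypothetical Character and brain names rather than Tale's actual classes:

    import collections

    class Character:
        def __init__(self, brain=None):
            self.brain = brain                # NPC: a generator of commands
            self.queue = collections.deque()  # player: narrator-fed queue

        def next_command(self):
            if self.brain is not None:
                return next(self.brain)       # generate a command on demand
            return self.queue.popleft() if self.queue else None

    def tick(world):
        # One tick of the world clock: visit everyone who isn't busy.
        for character in world:
            command = character.next_command()
            if command is not None:
                print("invoke", command)

    def wanderer():
        while True:                           # an NPC "brain"
            yield ("go", "north")

    player = Character()
    player.queue.append(("say", "hello"))     # enqueued by the narrator
    tick([player, Character(brain=wanderer())])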

World

Feature Number One in Tale is the data structure we've chosen to model the world. This data structure underpins its potential consistency, algorithmic subtlety, performance, scalability, and distributability. It conveniently models a world of arbitrary size, scope, and detail, and provides the scaffolding for minimizing engine effort in event propagation, abstraction, and synthesis.

So, what is it? Well, nothing new really: it's a tree. Depending on the scope and particular region of application, each node has 0, 4, 6, or 9 children. Most of the world is simply a quadtree. Each node has four children, one for each quadrant. The top of the quadtree has a bounding box that encompasses exactly its contained square region and, for purposes of Tale, conceptually encompasses all of the air-space above that region. A room is a leaf of the quadtree: a node with unit width and height through which players travel. Each node is an abstraction of all of its contained rooms, containing the sum of its rooms' contents. Roaming the quadtree, each player can divine its surroundings by analyzing the contents and events occurring in its own room, and then the events and contents of each of the more abstract rooms that contain it, all the way up to the root of the world tree. Each room is directly connected to its parent and children, and traversing linearly across the plane of the world uses tree traversal algorithms where you zoom in and out of the quadtree to find your destination.
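Here is a bare-bones quadtree of rooms in Python to fix the picture; the class and method names are mine, not Tale's:

    class Node:
        def __init__(self, x, y, size, parent=None):
            self.x, self.y, self.size, self.parent = x, y, size, parent
            half = size // 2
            # A room (size 1) is a leaf; every other node has four quadrants.
            self.children = [] if size == 1 else [
                Node(x + dx * half, y + dy * half, half, self)
                for dy in (0, 1) for dx in (0, 1)]

        def room(self, x, y):
            # Zoom in toward the unit room containing (x, y).
            if self.size == 1:
                return self
            half = self.size // 2
            index = (y >= self.y + half) * 2 + (x >= self.x + half)
            return self.children[index].room(x, y)

        def scope_chain(self):
            # The room and the transitive closure of its parents.
            node = self
            while node is not None:
                yield node
                node = node.parent

    world = Node(0, 0, 16)  # a 16x16 world
    assert len(list(world.room(2, 2).scope_chain())) == 5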

Among the algorithmic possibilities that the quadtree provides a scaffold for, we can provide multi-scale gameplay, the same inspiration behind the recently released Spore. The most mundane way this might be applied would be to have larger-scale creatures roam the world in abstract bounding boxes, like a Fire Drake that would move from 16x16 room to 16x16 room. Room abstractions could also be used for flight or sea navigation, since one travels higher and faster in those mediums, somewhat negating the mundane obstacles of the world floor. As an eagle, one might move about the world by zooming in and out instead of traversing in cardinal directions. Also, more abstract forms of gameplay might take place entirely on room abstractions. For example, melee would occur in 1x1 rooms, ranged attack (missiles and magic) in 8x8, tactics at 32x32 with 4x4 tiles, strategy at 256x256 with region-sized tiles, and politics at larger scales yet. Taking a moment to step out of a first-person player experience, garnering increasingly abstract political support from larger social organizations, gameplay could scale from thief to Queen.

Another algorithmic possibility is maintenance of the world's physical invariants, like global distribution and availability of various resources. Using abstractions to redistribute resources absolutely throughout the world would make it possible to keep the weight of gold throughout the world constant. This would prevent the most common physical inflation problems that most MUDs suffer.

The other major implication is that the quadtree permits entire branches of the world to run on independent servers with minimal cross-talk. During my last year at Cal Poly, Ryan Witt engaged his distributed computing team at Caltech to implement a Distributed Hash Table (Chord) for Tale and we extensively discussed the implications of the quadtree for purposes of distribution for scale and redundancy. Ryan and I are now roommates working at FastSoft. The topic of distributed computing comes up from time to time, even though we still need to get a vertical gameplay stack working on a single system before we begin implementing distribution.

Events

One problem Shawn Tice (who needs his own home page, because he's too awesome not to have one at this juncture) and I struggled with when we were coding up the second Python rendition of the quadtree was event management. The general notion has always been that events would escalate up the quadtree to a scope that contains all objects that can notice the event, then each object would have an opportunity to respond to the event by visiting each node in its room's "scope chain". The borrowed notion of a scope chain, in this case, means the containing room and the transitive closure† of its parents. Events would each have an "impact" attribute and each object would have a "sense" attribute. So, each time the world ticks, a bottom-up visitation of the world would calculate the minimum impact that an event would need to have to be noticed by the one thing in that branch of the world with the highest sensitivity. This would be performed for all of the senses in parallel: visual, aural, olfactory, thermal, thaumaturgical, social, and such. All objects that events can affect would have senses. For example, a wall can hear a sound so that it can echo it. Most events would also be directed, so they can only propagate in one of the cardinal directions: north, south, east, or west.
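A toy of that bottom-up pass in Python; the threshold rule (an observer of sense s notices any event of impact at least 10 - s) and the dict shape are assumptions for illustration only:

    def annotate(node):
        # Cache the faintest impact anyone in this branch would notice,
        # so event propagation can prune whole branches.
        node["min_impact"] = min([10 - node["sense"]] +
                                 [annotate(c) for c in node["children"]])
        return node["min_impact"]

    def deliver(node, impact):
        if impact < node["min_impact"]:
            return 0  # nothing down this branch would notice
        noticed = 1 if impact >= 10 - node["sense"] else 0
        return noticed + sum(deliver(c, impact) for c in node["children"])

    wall = {"sense": 2, "children": []}
    cat = {"sense": 9, "children": []}
    room = {"sense": 0, "children": [wall, cat]}
    annotate(room)
    assert deliver(room, impact=3) == 1  # only the cat hears the whisper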

However, as you may already have suspected, this quadtree-based event propagation model isn't actually realistic, and adding realism would nullify the performance and scalability benefits of the system. For example, with the described event propagation system, if you were standing in a room off-center in a quadtree, say at (2, 2) of a 4x4 quadtree, a sound that is loud enough to carry to room (3, 3) might not be loud enough to carry to room (1, 1). The sound would have to be loud enough to carry to (0, 0) before (1, 1) would hear it at all. So, to guarantee that events radiate the same distance in every direction, you need to propagate events laterally, not merely along the quadtree abstraction scope. However, radiating events laterally defeats the performance benefit of restricting perceptible events to those along the scope chain.

So, there was a trade-off between performance and realism. Then we remembered we were writing a game. Thinking back to the original Zelda, we observed that each tile of the screen was analogous to a room in Tale, and that every screen in Zelda was analogous to one of the nodes in its abstract node scope chain. That is, there was some parent room that contained about 30 of the rooms Link walked around in. When he got to the edge of the screen, he was no less privy to what was going on in the neighboring room than he was on the opposite side of the screen. So, we decided that the same abstraction would be sufficient for Tale, except that in Tale the opportunity for zooming in and out around your character is much more profound: we could provide zoom perspectives for every binary order of magnitude.

Frameworks

The current underlying technologies include Python, Twisted, AJAX, Comet, and Lighttpd. In the past, I dabbled with C++ and Boost, including an iostream system that decoupled input and output, had reasonable variable names, and supported VT100 stream functors, but that code branch has been abandoned since 2002. In addition, the project has inspired the creation of frameworks that I'm calling Chiron (a JavaScript module system), and Python on Planes (a system for persistent HTTP route and responder trees).

Parting Words

So, Tale is a pretty massive undertaking. When I started this project, I had no idea that good ideas took so much more time to implement than paltry ones, but that lesson's learned. To meet at least one of my New Year's resolutions (that is, one of the milestones for Tale), I'm cutting back scope in the short term in order to get the project up and running by January. A lot of this design, if not all of its powerful implications, I plan to realize by then. This weekend, I've been working on sewing the quadtree, the Dyan world design, and the Narrator into a single piece. As always, I'm looking for people who are willing to volunteer to set themselves up with commit access to Tale so that, in the absence of time to spend on the project (which I know none of us have), a sufficient body of people can randomly wake up in a cold sweat with a bad idea and have nothing between them and implementing it for Tale!


† The excessively precise term transitive closure, in this case, means the room's parent, grandparent, great-grandparent, ad nauseam. Ad nauseam, in this case, means, "until you retch from nausea or meet Adam and Eve: whichever comes first."

Sunday, October 5, 2008

War of the Worlds 2.0 Update

It turns out I write a lot of words and say very little. Here's what you need to know about the upcoming alien invasion:

  • Follow @wotw2
  • On Halloween, tweet in earnest about the unfolding alien invasion. Make your own story. Keep in touch with your friends like you would in any other disaster situation. Try to imagine what you would tweet as the aliens land, emerge, and lay waste to every metropolis around the world.
  • Email wotw2@cixar.com if you want to help plan the story or provide technical assistance.

War of the Worlds 2.0

Last week, Josh Lewis (friend, former Apple coworker, and Lex Luthor hairstyle enthusiast) got me thinking about fiction on Twitter. Then, Ryan Paul pointed out that he had already written an article about Twitter fiction. Here's the idea. Let's (and by "us", I mean everyone) reenact The War of the Worlds, on the Internet, for Halloween!

CBS ran the Orson Welles War of the Worlds radio program in the tense times leading up to the Second World War. The aforementioned Wikipedia article claims that Adolf Hitler denounced the program as a sign of democratic decadence. Without commercial breaks and interpolated among real news, the program was broadcast in earnest and public panic ensued. War of the Worlds was the Ultimate Prank, never to be reproduced. As an homage, let's take Halloween to tweet, share bookmarks and status messages, and blog in earnest about the alien invasion in progress. Everyone, tell your story.

The War of the Worlds has several phases that I believe have strong analogs in the tense times leading to World Web II (point oh).

  1. A noteworthy astronomer speculates on the possibility of an alien invasion. This would be a good time to talk about the Drake equation on your blog, especially if astronomy is your hobby. Send an email to wotw2@cixar.com with your blog's address so we can proliferate it on the @wotw2 Twitter account.
  2. The alien invasion occurs. Follow @wotw2 to keep in sync with the progress of the invasion. This Twitter feed will automatically update, in general terms, the unfolding of the alien invasion like clockwork throughout the world. Coordinate with Tweeters in your area to tell local stories.
    1. Cylinders fall from the sky. Tweet about where you are. Ask your friends where they are. Form posses. Skip town or take a closer look.
    2. Tripods emerge. Flee, get stuck in traffic, or take refuge and tell us what you see.
    3. Martians begin obliterating every Terran metropolitan area with heat rays. Don't call them heat-rays; that would be a dead giveaway. Describe what they do and come up with your own name! Do you work in a public service like hospitals or fire? What's your job and what do you do? Do you organize your coworkers and flee? Do you head for the hills with your go-bag?
    4. Military, local militia, and national guard units get organized and attack the alien invaders. Do you serve in the military? This is your last chance to tell us where you're headed. Do you have family in a militia? Try to keep in touch and let us know how and where they valiantly fought and lost.
    5. The invasion spreads from cities to countryside.
    6. Tripods begin to shut down and malfunction. Are you near one? Do you take a closer look?
  3. After the threat dies down, people begin to blog and speculate about what happened, and every topic near and dear to them.
  4. The curtain rises. Blog, link, and tweet about the experience.

The "War of the Worlds 2.0" event will be synchronized, on Halloween Friday, with the http://twitter.com/wotw2 account and we're writing automation with Ryan Paul's Gwibber tool to automatically post Tweets to various services on that day. If you would like to make additional accounts to synchronize local events, please send an email to wotw2@cixar.com with a Twitter account name and password and we'll hook you up with edit privileges for the tweet plan on Google Docs. Tweets (rows) will be deployed for each column (account).

Let's tell a story!

JSON Basics

JSON is a strict subset of JavaScript, particularly a subset of its object-literal notation, discovered and specified by Doug Crockford. The subset is sufficient to make the creation of parsers and formatters in various languages nearly trivial, while completely trivializing the process of parsing the notation in a web browser.

Object-literals in JavaScript are very similar to their analogs in many other languages, like Perl, Python, and even AppleScript. The notation provides text strings, numbers, booleans, and the literal null for all elemental (scalar) data. Then the notation provides Arrays for ordered values and Objects for unordered mappings of unique string keys to arbitrary values. With these types, you can express most hierarchical data easily.

10
3.1415
"a"
true
false
null
[1, 2, 3]
{"a": 10, "b": 20}
[{"a": 10}, {"a": 20}]

JSON makes a couple of simplifications to JavaScript's object-literal grammar. These make it even easier to write parsers and formatters.

  1. a JSON expression needs no new-line characters, and literal new-lines never appear inside strings (they're escaped as \n).
  2. all strings, including keys in objects, are enquoted between double quotes.

JSON joins the ranks of many other data languages that we can use to easily transfer data among web servers, but its primary function is as a common data interchange language between web-browser clients and web servers written in the numerous languages of the web. JSON is well adapted to this space because it can take advantage, through various channels, of every web browser's fast, built-in JavaScript interpreter.
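
For example, on a Python web server, formatting and parsing are one call each with simplejson (bundled as the standard json module as of Python 2.6):

import simplejson

text = simplejson.dumps({"a": 10, "b": [1, 2, 3]})  # format to a JSON string
data = simplejson.loads(text)                       # parse back to Python objects
assert data["b"][2] == 3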

There are two flavors of JSON services with subtle differences for both clients and servers. One is JSON proper, which I will call XHR JSON because it uses an XML HTTP Request. The other is called JSONP or "JSON with Padding". You would use one, the other, or neither based on security and performance concerns.

XHR JSON

XHR JSON uses a feature that exists in one form or another in all modern web browsers, called an XML HTTP Request, and a JavaScript function that lets you evaluate arbitrary text strings as JavaScript programs, called eval. With an XML HTTP Request, the client can make an HTTP request to any URL on the same domain as the hosting page. This constraint is called the "Same Origin Policy". Unfortunately, the policy is in many cases not sufficient to isolate vulnerability to paths of a particular domain, nor to prevent cross-site scripts from using the data (more on that later). Whether the security mechanism is effective is a longer discussion for later; the point is that this mechanism is the crux of your choice between XHR JSON and JSONP.

With an XHR, you request plain text from a URL on the same domain as your page, then you evaluate it as a JavaScript program. Performing a cross-browser compatible XML HTTP Request isn't trivial in itself, so let's assume that I'm using a library that provides an xhr function that blocks until an HTTP request is complete and returns the content of the HTTP response as a String. There are other variations on xhr for asynchronous requests and for grabbing XML.

var text = xhr("/some.json");
var json = eval(text);

In this case, "/some.json" contains unadulterated JSON. However, because your JSON is likely to be an Object like {"a": 10}, we need to make a special arrangement. A JavaScript program that is merely an Object literal would be mis-parsed as a code block, so we must force the JavaScript parser into expression context instead of statement context. To do that, we either wrap the JSON string in parentheses or assign it to a variable.

var text = xhr("/some.json");
var json = eval("(" + text + ")");

I prefer parentheses because I haven't, in my fallible memory, encountered a browser that supported XHR but wasn't standards compliant enough for the eval function to return the value of the last evaluated expression.

When you use an XHR to grab JSON, you are vulnerable to the host, depending on it not to give you an alternate JavaScript program that insidiously evaluates to the same data. For that reason, a lot of JSON libraries engage in the slower and dubious attempt to validate the JSON with a regular expression before evaluating it. The bottom line is that you're vulnerable to your server when you use XHR and eval.

That being said, it's better to get an exception early than an untraceable error later. For that reason, attempting to validate JSON is a good idea. There's another one: variable laundering. The eval function inherits the calling context's scope chain, which means that the server can read and alter any variables on your scope chain. I use an evalGlobal routine to launder my scope chain. This doesn't give security all by itself, but it could help lead to a secure system down the road, and it can turn a bunch of silent name-resolution errors or data leaks into thrown ReferenceErrors.

(function (evalGlobal) {
	var text = xhr("/some.json");
	validateJson(text);
	// evaluate the JSON in a laundered scope, away from our local variables
	var json = evalGlobal("(" + text + ")");
	...
})(function () {
	// defined outside the closure above, so the evaluated program can see
	// only the global scope, not the consumer's variables
	return eval(arguments[0]);
})

I'll give a detailed explanation of evalGlobal in another article. For now, suffice it to say that this is much more elegant than using a variable to capture the JSON value, although that is possible:

var text = xhr("/some.json");
var json;
eval("json = " + text);

JSONP

JSONP is different from XHR JSON in both the way the server hosts the data and the way the client consumes it. On the client side, the user adds a <script> tag to their own page. The source of the script is the URL of a server-side program, with the name of a callback function, which the client places in global scope, sent as a query argument.

<script src="http://other.com/some.jsonp?callback=foo"></script>

The client script arranges to receive parsed and evaluated JSON data asynchronously by adding a foo method to the global scope, the window object.

window.foo = function (json) {
	...
	delete window.foo;
};
var script = document.createElement('script');
script.src = "http://other.com/some.jsonp?callback=foo";
document.getElementsByTagName('head')[0].appendChild(script);

When you add a script tag to your document, browsers are kind enough to notice and automatically send an HTTP request to fetch the requested JavaScript program. In this case, your page trusts the script on other.com to return a JavaScript program that will call your "foo" function and pass it some JSON data.

foo({"a": 10});

With JSONP, the client is vulnerable to the remote server because it can opt to write an arbitrary JavaScript program before calling foo, if at all. So, you should only use JSONP to request data from domains that you trust.

Also, JSONP is slower and less responsive to errors than XHR JSON can be. The only signal you receive about the progress or status of a JSONP request is whether your callback gets called in a timely fashion. XHR JSON is preferable in all situations where it is possible, and it may even make sense to create a proxy to the other domain on your same-domain server. Using a proxy also gives you an opportunity to validate the JSON on the server side, where computation time is cheap.
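
For instance, a same-domain proxy can be as small as this sketch (a hypothetical WSGI application; the foreign URL stands in for whatever service you would otherwise reach with JSONP):

import urllib2
import simplejson

def some_json_proxy(environ, start_response):
    # fetch the foreign JSON on the server, where we can also validate it
    # cheaply; the browser then sees it as same-origin data
    text = urllib2.urlopen("http://other.com/some.json").read()
    simplejson.loads(text)  # raises ValueError on anything but valid JSON
    start_response("200 OK", [("Content-Type", "application/json")])
    return [text]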

Same Origin XHR JSON

There's another trick with XHR JSON, and I'm not sure what scenarios require it. Some people can use the JSONP technique to fetch JSON from a foreign domain. These folks can't send HTTP headers, and they never get an opportunity to alter the text of the HTTP response before it's evaluated as JavaScript in their browser. I would think it would be sufficient, to prevent an other-domain client from intercepting sensitive data, for the server to require an authentication token in the HTTP request, as can be provided by an XML HTTP Request but not a JSONP request. However, some crackers can attempt to get data from your service by using JSON data in a cross-site script, much like JSONP. Instead of using a callback, these crackers arrange to override the Array or Object constructors, and can thus monitor and transmit snippets of JSON constructed by simply evaluating JavaScript object-literal notation. This technique can be prevented by padding your XHR JSON service with an infinite loop.

while (1);
{"a": 10}

In this case, your client conspires with the server, and since it does get a chance to intercept and modify the text of the JSON, it strips the known number of characters for the while loop before evaluating it.

var text = xhr("/some.json");
var json = evalGlobal("(" + text.substring("while (1);".length) + ")");

What baffles me is that, unless you've secured your JSON with an authentication token, any server capable of making HTTP requests and parsing JSON could do the same thing. Malicious clients are not required to use web browsers. The only situation where this might occur would be if the cracker had compromised your client already, had your authentication token, and needed your browser to make a cross-domain JSONP request on its behalf. That could not be the case, since the server would have no reason to give an XHR JSON authentication token to a client that could only fetch data with JSONP. And that point is moot anyway since, going back, the cracker has already compromised your client and might as well use XHR JSON with all the same rights as you, even on the same origin.

That being said, I noticed that GMail used this technique, so I assume there's substance to it. My best guess is that cookies are the missing piece: the browser attaches your cookies to a cross-site script request, so a cookie-authenticated JSON service would still be exposed without the padding. Perhaps someone will clarify in the comments.

Friday, October 3, 2008

Designing Django's Object-Relational-Model - The Python Saga - Part 6

Django is a web application framework in the Python language. One of the advantages that Django has over other libraries is that it was written and designed by Python experts. That is to say, they knew about variadic arguments, properties, and metaclasses. Furthermore, they knew how to cleverly use these ideas to sweep a lot of complexity under the hood so that common developers, or uncommonly good developers who want to think about other things most of the time, can gracefully suspend disbelief that anything complicated is going on when they design their database in pure Python. This article will illustrate how Django uses metaclasses and properties to present an abstraction layer where you can specify a database schema with Python classes. For simplicity, the "database" backend will be plain Python primitive objects—tables will be dictionaries of dictionaries.

In the end, we want to be able to write code that looks a whole lot like it's using Django's Object-Relational-Model:

class Cow(Model):
    id = PrimaryKey()
    name = ModelProperty()

cow = Cow(id = 0, name = 'Moolius')
cow.save()

cow = Cow.objects.get(0)

The easiest part (for the purpose of this exercise) is Django's concept of an "object manager". In Django, every model has an object manager that provides a query API and, depending on the backend, might cache instances of Model objects. Conveniently, a very narrow subset of the object manager API is almost exactly the same as a dictionary. Conceptually, the object manager boils down to a dictionary proxy for the database where you can use the get function to retrieve records from the database. For simplicity, our ObjectManager is just going to be a dictionary.

class ObjectManager(dict):
    pass

Beyond the scope of this article, the ObjectManager should be handy for grabbing lots of objects from the database at once. Django provides a very thorough and relatively well-optimized lazy query system with its object managers. Its object managers have get and filter methods which, instead of simply accepting the primary key, accept keyword arguments that translate to predicate-logic rules. In particular, the filter function is lazy, so you can chain filter commands to construct complex queries and Django only goes to the database once.

While it would be super-cool to model all of this with native Python, it actually is a lot of code, so that's a topic for maybe later; a sketch of the laziness follows just for flavor. For this exercise, we'll use the built-in dict.get.
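
Here is that sketch: a hypothetical QuerySet that chains predicate functions (Django's real one chains keyword lookups) and defers all work until iteration.

class QuerySet(object):
    def __init__(self, records, predicates=()):
        self.records = records
        self.predicates = predicates
    def filter(self, predicate):
        # chaining merely accumulates predicates; no work happens yet
        return QuerySet(self.records, self.predicates + (predicate,))
    def __iter__(self):
        # the query is evaluated only when results are finally demanded
        for record in self.records:
            if all(predicate(record) for predicate in self.predicates):
                yield record

records = [{"a": 10}, {"a": 20}]
query = QuerySet(records).filter(lambda record: record["a"] > 15)  # still lazy
print list(query)  # the work happens here: [{'a': 20}]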

We'll also need all of the code from Part 5 since models will be another application of the ordered property pattern. This is how Django creates SQL tables with fields in the same order as the Python properties.

from ordered_class import \
    OrderedMetaclass,\
    OrderedClass,\
    OrderedProperty

We use the OrderedMetaclass to make a ModelMetaclass. The model metaclass will have all the same responsibilities as our StructMetaclass, including "dubbing" the properties so that they know their own names. The model metaclass will also create an ObjectManager for the class. This isn't the complete ModelMetaclass; we'll come back to it.

class ModelMetaclass(OrderedMetaclass):
    def __init__(self, name, bases, attys):
        super(ModelMetaclass, self).__init__(name, bases, attys)
        if '_abstract' not in attys:
            self.objects = ObjectManager()
            for name, property in self._ordered_properties:
                property.dub(name)

The next step is to create a ModelProperty base class. This class will be an OrderedProperty so it's sortable. It will also need to implement the dub method so it can figure out its name. Other than that, it'll be just like the StructProperty from the previous section: it will get and set its corresponding item in the given object.

class ModelProperty(OrderedProperty):
    def __get__(self, objekt, klass):
        return objekt[self.item_name]
    def __set__(self, objekt, value):
        objekt[self.item_name] = value
    def dub(self, name):
        self.item_name = name
        return self

There is a distinction in the refinement of ModelProperty from StructProperty: ModelProperty objects will eventually need to distinguish the value stored in the dictionary from the value returned when you access an attribute. In the primitive case, they're the same, but for ForeignKey objects, down the road, you'll store the primary key of the foreign model instead of the actual object. This mirrors the behavior of an underlying database backend.

class ModelProperty(OrderedProperty):
    def __get__(self, objekt, klass):
        return objekt[self.item_name]
    def __set__(self, objekt, value):
        objekt[self.item_name] = value
    def dub(self, name):
        self.attr_name = name
        self.item_name = name
        return self

Let's consider a PrimaryKey ModelProperty. The purpose of a PrimaryKey is to designate a property of a model that will be used as the index in its object manager dictionary. In Django, this can be an implicit id field at the beginning of the table. For simplicity in this exercise, we'll require every model to explicitly declare a PrimaryKey. The ModelMetaclass will identify which of its ordered properties is the primary key by observing its type. Other than its type, a primary key's behavior is the same as a normal ModelProperty, so it's a really easy declaration:

class PrimaryKey(ModelProperty):
    pass

Now we can go back to our ModelMetaclass and add the code we need for every class to know the name of its primary key. I create a list of the names of the PrimaryKey properties in _ordered_properties and pop off the last one, leaving error checking as an exercise for a more rigorous implementation. There should be exactly one primary key.

class ModelMetaclass(OrderedMetaclass):
    def __init__(self, name, bases, attys):
        super(ModelMetaclass, self).__init__(name, bases, attys)
        if '_abstract' not in attys:
            self.objects = ObjectManager()
            for name, property in self._ordered_properties:
                property.dub(name)
            self._pk_name = [
                name
                for name, property in self._ordered_properties
                if isinstance(property, PrimaryKey)
            ].pop()

Now all we need is a Model base class. The model base class will just be a dictionary with the model metaclass and a note that it's abstract: that is, it does not have properties so the metaclass better not treat it as a normal model.

class Model(OrderedClass, dict):
    __metaclass__ = ModelMetaclass
    _abstract = True

The model will also have a special pk attribute for accessing the primary key and a save method for committing a model to the ObjectManager.

class Model(OrderedClass, dict):
    __metaclass__ = ModelMetaclass
    _abstract = True

    def save(self):
        self.objects[self.pk] = self

    @property
    def pk(self):
        return getattr(self, self._pk_name)

Now we have all the pieces we need to begin using the API. Let's look at that cow model.

class Cow(Model):
    id = PrimaryKey()
    name = ModelProperty()

cow = Cow(id = 0, name = 'Moolius')
cow.save()

cow = Cow.objects.get(0)

All of this works now. You make a cow model; that invokes the model metaclass that sets up Cow._pk_name to be "id" and tacks on a Cow.objects object manager. Then we make a cow and put it in Cow.objects with the save method. This is analogous to committing it to the database backend. From that point, we can use the object manager to retrieve it again.

We can refine the Model base class to take advantage of the fact that it's not just a dictionary anymore: it's an ordered dictionary. We create a better __init__ method that will let us assign the attributes of our Cow either positionally or with keywords. That makes our cow more like a hybrid of a list and a dictionary. Also, since our model instances aren't merely dictionaries, we create a new __repr__ method that will note that cows are cows and moose are moooose. The new __repr__ method also takes the liberty of writing the items in the order in which their properties were declared.

class Model(OrderedClass, dict):

    …

    def __init__(self, *values, **kws):
        super(Model, self).__init__()
        found = set()
        for (name, property), value in zip(
            self._ordered_properties,
            values,
        ):
            setattr(self, name, value)
            found.add(name)
        for name, value in kws.items():
            if name in found:
                raise TypeError("Multiple values for argument %s." % repr(name))
            setattr(self, name, value)

    …

    def __repr__(self):
        return '<%s %s>' % (
            self.__class__.__name__,
            " ".join(
                "%s:%s" % (
                    property.item_name,
                    repr(self[property.item_name])
                )
                for name, property in self._ordered_properties
            )
        )

Now we can make a cow model with positional and keyword arguments, and print it out nice and fancy-like:

>>> Cow(0, name = 'Moolius')
<Cow id:0 name:'Moolius'>

The next step is to introduce ForeignKey model properties. These are properties that refer, via a relation on a primary key, to an object in another model. So, the ForeignKey class will accept a Model for the foreign model. Its dub method will override the item_name (preserving the attr_name) provided by its super-class's dub method. The new item_name will be the attr_name and the name of the primary key from the foreign table, delimited by an underbar. This lets the foreign key property hide the fact that it does not contain a direct reference to the foreign object; it just keeps the foreign object's primary key. However, if you access the foreign key property on a model instance, it will go off and diligently fetch the corresponding model instance. If you assign to the foreign key property, it'll tolerate either a primary key or an actual instance.

class ForeignKey(ModelProperty):
    def __init__(self, model, *args, **kws):
        super(ForeignKey, self).__init__(*args, **kws)
        self.foreign_model = model
    def __get__(self, objekt, klass):
        return self.foreign_model.objects.get(objekt[self.item_name])
    def __set__(self, objekt, value):
        if isinstance(value, self.foreign_model):
            objekt[self.item_name] = value.pk
        else:
            objekt[self.item_name] = value
    def dub(self, name):
        super(ForeignKey, self).dub(name)
        self.item_name = '%s_%s' % (
            name,
            self.foreign_model._pk_name,
        )

Now we can write code with more than one model using relationships. Let's give our cow a bell.

class Bell(Model):
    id = PrimaryKey()

class Cow(Model):
    id = PrimaryKey()
    name = ModelProperty()
    bell = ForeignKey(Bell)

bell = Bell(0)
bell.save()
cow = Cow(0, 'Moolius', bell)
cow.save()

Note that you must save the bell so that, when you access cow.bell, it can be fetched from Bell.objects.
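
To make the item-name/attribute-name split concrete, here is the cow from both sides:

>>> cow.bell        # attribute access fetches the Bell from Bell.objects
<Bell id:0>
>>> cow['bell_id']  # but the dictionary itself only stores the primary key
0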

There's more to Django's ORM, of course. This article doesn't cover parsing and validation, which are both assisted by the ORM. Nor does it cover queries, query sets, the related_name for ForeignKey properties on foreign models, Django's ability to use strings for forward references to models that have not yet been declared, or many of the other really neat features.

What this article does cover, though, is that you can create a powerful abstraction of a proxied database in pure Python in less than 200 lines of code. This means that you could create a light-weight proxy over HTTP to a Django database that exposes itself with a REST API. You could also create an abstraction layer that would allow you to pump Django ORM duck-types back into Django, using pure Python objects in addition to or instead of a database backend.

But, if this article does nothing else, I hope it communicates that Django is cool. I have read a whole lot of code from every dark corner of the web and I have liked very little of it; people I've worked with will testify that I've regularly "hated on" every library or framework I've ever seen. I've never met Adrian Holovaty, Malcolm Tredinnick, Simon Willison, or the growing developer community around Django. However, I've read their code, and now I can tell you, over the course of several articles, that they're really smart and you should use their code.