In 1971 Edward de Bono published Practical Thinking; he has revised it multiple times since, most recently in 1992.

It’s a charming little book, largely because – despite making some false statements – his advice is excellent, practical, and should improve thinking for almost anyone who reads (and applies) it.

The most interesting part of the book, to me – given I have just completed a major in epistemology, or the study of knowledge (sort-of “how to think”) – was the advice about certainty. It’s well-known (now) that the feeling of certainty people sometimes have is bonkers. de Bono breaks down why it’s bonkers, but also provides ways of avoiding the issue.

I’m not going to re-hash his book, in part because he provides an excellent summary at the back of the book you can reference (and, really, it’s $4).

But the most important takeaway for managers and other “practical thinkers” is de Bono’s discussion of the tyranny of the YES/NO system. It’s a simple insight: If you keep saying “No” to new ideas, the idea you end up with will be the first idea whose answer isn’t “No.” That is, it will not be the best idea; it will be the first mediocre idea. Abandoning the “YES/NO” system of brainstorming is really rather important.

If you’ve studied, oh, logic, Quine, cognitive psychology, and the philosophy of science, all this stuff will be old hat (and some of it wrong). If not, I highly recommend it.

Today, I read The Science of Fear (2008) by Daniel Gardner. It’s a remarkably well-done book for what it is – namely, a journalist’s (informed) overview of some of the psychological components of fear, and a large number of examples of how people exploit that tendency to fear.

It’s a nice book because he relies on rather well-accepted psychological research, while going into great depth on examples. It helps people to understand that these principles actually mean something.

On the other hand, for those people who were looking for an explanation of how fear works inside the brain – such as myself – it’s a bit lacking. And the reliance on solid psychological principles means I didn’t learn any new psychology. Regardless, I enjoyed the read because it was well-written and interesting.

In lieu of a review, allow me to review some of his points. In the course of the book, Mr. Gardner outlined three ways the brain screws up, leading people to irrational fears.

1. The Availability Heuristic

The availability heuristic is a pretty good rule. It’s a general cognitive bias – pretty robust across all humans – and works out to people predicting the probability of events in proportion to how many instances of it they can recall (are available).

If you do something a lot – work on computers, go hunting, etc. – then over time you establish a battery of experiences. If someone asked you how probable something was, you could reach back into your experience and get a feel for how many times you’ve seen it – and give a pretty good estimate.

The big advantage is that it’s computationally very fast. If you need to make a split-second decision, you want it to be fast.

It’s also extensible: that is, people don’t differentiate between their own experiences (memories) and other people’s (stories). This works out really well if you talk to people who do the same thing as you do – say, a bunch of hunters sitting around a fire swapping stories. That way, you can tap into the knowledge of your entire community (if you haven’t yet experienced it, you don’t know how common it is – hearing stories of experiences can both ameliorate your ignorance and give you ideas of what to do to deal with it).

But therein lies the rub. The media specialize in providing stories – really compelling anecdotes – about things that happen. The brain doesn’t differentiate based on sources, so the availability heuristic can be skewed in the wrong direction. People vastly overestimate the risk of terrorism, kidnappings, and murder; but vastly underestimate the risk of car accidents, drowning, diabetes, etc.

2. Confirmation Bias

Confirmation bias is an old favorite of psychologists, simply because it explains so much.

It’s pretty simple, actually. Once you believe something, your brain tends to look for other instances of it – confirming instances. It does not, however, look for falsifying instances for your belief. Sometimes, your brain will even change its recollection of the facts to conform to your current belief (one example is the “rose-colored glasses” effect; you believe the past was better, so you unconsciously modify your memories of the past to make it match your belief).

But it also means once you believe something about, say, terrorism, you’ll focus on the positive (that is, supporting) instances – and ignore the others. A stretch with no terrorist attacks does not lower your belief about the danger of terrorism, even though a terrorist attack raises it – which is illogical. It’s a binary outcome, therefore one outcome value should be just as good a predictor as the other. The brain doesn’t think so.

A rather insidious effect of confirmation bias concerns the use of statistics. If you believe something, and you come across a statistic (or a story) you disagree with, you’re going to scrutinize it very closely. If, however, you come across a statistic which supports your belief – then, hey, no need to question the source or the methodology, it’s obviously correct. People apply different levels of evaluation to information that conforms with their existing beliefs than to information that violates them.

3. The Urge to Conform

Conformity has been studied a great deal, and the results are pretty consistent. When people are in a group and a task is difficult, you see more conformity. That is, lower confidence in the result for any one individual means that people are more willing to accept a group consensus. Funnily enough, though, each individual’s belief in the accuracy and reliability of the group consensus goes way up – even though the confidence of any individual’s conclusion is low.

Mr. Gardner makes the important point that conformity actually serves a good purpose. If you’re on the African plains, and everyone around you begins to get worried about a lion in the grass – well, even if you can’t see the lion yourself, there’s a pretty good reason to take precautions. More formally, it allows all members of a group to take advantage of the knowledge from all members of the group, and not rely on their own knowledge all the time.

The problem is that once a belief has taken hold in the general population, it’s bloody hard to get rid of. The combination of conformity – people fall into line – and the confirmation bias means that as a group, people don’t deal with falsifying evidence well at all. Mr. Gardner goes through a hilarious number of examples showing that (i) people say they believe something because of the evidence, (ii) you prove the evidence is wrong, (iii) people still believe it despite accepting that the evidence is wrong.

A Passing Note

In addition to those three psychological features, Mr. Gardner notes a few other issues. Here’s one I found striking.

It has to do with pointing out how badly people deal with numbers. People have no innate ability to deal with numerical data; though they do have a pretty good ability to deal with proportions. Unfortunately, this isn’t a good thing.

Mr. Gardner gives a great example. Take two groups of people: in both, tell them they are reviewing how much money to devote to improving airport safety. Tell the first group that implementing the precautions will save 150 lives; tell the second that it will save 98% of 150 lives (that is, 147 lives). Consistently, people rate saving 98% of 150 lives higher than saving 150 lives (that is, the second group would devote more money to the project than the first group, even though they were saving objectively fewer people).

And don’t get started on how bad people are with probability – it doesn’t bear thinking about.

A Brief Conclusion

The Science of Fear rests on some good psychology, and goes into a large number of examples as to how human reason fails us when it comes to knowing what to fear.

The real effect of the book is to persuade people to be less afraid; it reduces fear. Mr. Gardner systematically goes through most hot-button political issues, and shows how the data doesn’t back up the fear-mongering. Not only is he persuasive, but he writes in such a fashion that you’ll pick up an innate skepticism of the media (if you didn’t already have it) and a deeper skepticism for anecdotes (if you have no statistical background).

It’s certainly worth the time just for that.

The halo effect has graduated from inflating stock prices to making companies godlike. Thus, they can do anything – mere mortals can just speculate. The truth, however, is frequently mundane.

Taylor Buley, writing on the Velocity blog at Forbes, has the provocative title of “Google Isn’t Just Reading Your Links, It’s Now Running Your Code.” Mr. Buley goes on to explain that “for years it’s been unclear whether or not the Googlebot actually understood what it was looking at or whether it was merely doing ‘dumb’ searches for well-understood data structured like hyperlinks.” In other words, Google has built a Javascript interpreter!

The source for this headline comes directly from Google:

On Friday, a Google spokesperson confirmed to Forbes that Google does indeed go beyond mere "parsing" of JavaScript. "Google can parse and understand some JavaScript," said the spokesperson.

So it’s confirmed, then.

Mr. Buley spends most of his article explaining that building a Javascript parser is really fucking hard. In fact, a quote from one of his experts isolates the key problem – how long the code will run – and says that “The halting problem is undecidable.” There is no algorithm that can solve it. Well, OK, I suppose, but couldn’t you process a lot and cut it off at an arbitrary point? Sure, you’d miss some stuff, but surely you’d get enough?
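The “cut it off at an arbitrary point” idea can be sketched in a few lines. This is a toy model only, not a description of how the Googlebot works: scripts are stood in for by Python generators that yield once per step, and run_with_budget, short_script and endless_script are hypothetical names invented for the illustration.

```python
from typing import Callable, Iterator, Tuple


def run_with_budget(program: Callable[[], Iterator[None]], max_steps: int) -> Tuple[bool, int]:
    """Run `program` step by step, giving up once `max_steps` steps have run.

    Returns (finished, steps_taken). We never decide whether the program
    *would* halt; we only report whether it halted within the budget.
    """
    steps = program()
    for taken in range(max_steps):
        try:
            next(steps)  # execute one step of the script
        except StopIteration:
            return True, taken  # the script ended on its own
    return False, max_steps  # budget exhausted: cut it off here


def short_script() -> Iterator[None]:
    # Hypothetical script that does three steps of work and stops.
    for _ in range(3):
        yield


def endless_script() -> Iterator[None]:
    # Hypothetical script that never terminates.
    while True:
        yield


print(run_with_budget(short_script, 1000))
print(run_with_budget(endless_script, 1000))
```

The budget sidesteps the halting problem rather than solving it: a script that needs more steps than the budget allows is simply abandoned, which is the “you’d miss some stuff” trade-off.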

Actually, that’s what another expert says:

"It’s hard to analyze a program using another program," the person says. "Executing [JavaScript code] is pretty much that’s the only way they can do it."

Mr. Buley believes this is a great accomplishment, and quite unknown.

He’s right on one count.

In a previous post, I cited a paper “Data Management Projects at Google” and talked about Edward Chang. Well, the paper is actually about three projects, and one of those is “Indexing the Deep Web,” spearheaded by Jayant Madhavan. In that 2008 paper, Dr. Madhavan had this to say about Javascript:

While our surfacing approach has generated considerable traffic, there remains a large number of forms that continue to present a significant challenge to automatic analysis. For example, many forms invoke Javascript events in onselect and onsubmit tags that enable the execution of arbitrary Javascript code, a stumbling block to automatic analysis. Further, many forms involve inter-related inputs and accessing the sites involve correctly (and automatically) identifying their underlying dependencies. Addressing these and other such challenges efficiently on the scale of millions is part of our continuing effort to make the contents of the Deep Web more accessible to search engine users.

It would seem they solved this problem! (This is a big accomplishment). When did they solve it? Recently?

Well, sort of. The answer is in a 2009 paper called “Harnessing the Deep Web: Past, Present, and Future.” In it, they say this:

We note that the canonical example of correlated inputs, namely, a pair of inputs that specify the make and model of cars (where the make restricts the possible models) is typically handled in a form by Javascript. Hence, by adding a Javascript emulator to the analysis of forms, one can identify such correlations easily.

So let’s back up.

What is Google doing? They’re accessing structured data hidden behind form submissions. Now, we say the information is “hidden” behind form submissions because you have to submit the form to get the data. One approach – the “dumb” approach – is to generate all possible result URLs and then crawl all of them.

But. Those clever folks at Google noticed this might be a problem:

For example, the search form on cars.com has 5 inputs and a Cartesian product will yield over 200 million URLs, even though cars.com has only 650,000 cars on sale.
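To see why the “dumb” approach blows up, here is a minimal sketch of enumerating every combination of form inputs. The field names and option lists are made up for illustration and are not taken from cars.com; count_combinations and all_query_strings are hypothetical helpers.

```python
from itertools import product
from math import prod

# Hypothetical search form: three inputs with a handful of options each.
form_inputs = {
    "make": ["honda", "ford", "toyota"],
    "model": ["civic", "accord", "focus", "f150", "corolla", "camry"],
    "zip": ["10001", "94103"],
}


def count_combinations(inputs):
    """Number of distinct submissions: the product of the option counts."""
    return prod(len(options) for options in inputs.values())


def all_query_strings(inputs):
    """Yield one query string per element of the Cartesian product."""
    names = list(inputs)
    for values in product(*(inputs[name] for name in names)):
        yield "&".join(f"{name}={value}" for name, value in zip(names, values))


print(count_combinations(form_inputs))
print(next(all_query_strings(form_inputs)))
```

The count multiplies with every input added, so it grows far faster than the inventory behind the form. Five inputs with a few dozen options each already reach the order of magnitude quoted above (46 to the fifth power is roughly 206 million), and most of those combinations, such as a Honda F150, describe nothing that exists.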

The challenge, then, is making fewer URLs. Thus, they intelligently developed an algorithm with this property:

We have found that the number of URLs our algorithms generate is proportional to the size of the underlying database, rather than the number of possible queries.

How do they do this? Well, one big challenge is (as noted above) the inputs in one field can depend on the inputs in another field. Google has taken to constructing databases of “interrelated data” (like manufacturer and car model) so they can automatically detect the data the form wants and limit their indexing accordingly.

But to detect when some fields on a form are interrelated, you… need to have more than the HTML. In fact, almost all input-dependent forms rely on Javascript to change the values around after a selection.

Well, the clever researchers at Google knew they needed to determine which fields in a form were interrelated. They also figured that they only needed to determine this once, because once they knew which fields were related, they could automatically generate their URLs using their generation algorithms.

As you can imagine, if you only need to do it once (for each form), then it becomes practical to emulate. You emulate one form, and get 650,000 URLs to index with solid data. It’s cheap – so cheap, it’s almost worth getting a human to do it. (Except no Googler would think of that!).
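Here is a rough sketch of what knowing the field dependencies buys you. The dependency table is invented for the example and stands in for whatever emulating the form would reveal; naive_pairs and dependent_pairs are hypothetical names, and this is not Google’s actual generation algorithm.

```python
from itertools import product

# Hypothetical result of analysing one form: which models go with which make.
models_by_make = {
    "honda": ["civic", "accord"],
    "ford": ["focus", "f150"],
    "toyota": ["corolla", "camry"],
}


def naive_pairs(table):
    """Treat make and model as independent inputs: every make with every model."""
    makes = list(table)
    models = [model for options in table.values() for model in options]
    return list(product(makes, models))


def dependent_pairs(table):
    """Use the dependency: only pair a make with its own models."""
    return [(make, model) for make, options in table.items() for model in options]


print(len(naive_pairs(models_by_make)))
print(len(dependent_pairs(models_by_make)))
```

With the dependency known, the number of generated pairs tracks the number of real make/model combinations rather than the number of possible queries, and the table only has to be worked out once per form.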

But – and here’s the thing – to emulate the behavior of a form driven by Javascript you have to have the Javascript files. You need to download them, and then execute them.

In other words, the second expert Mr. Buley consulted is spot-on. Google is executing the Javascript code to find out something very specific (which fields on a form are interrelated, and presumably anything done in an onsubmit event that would alter the indexing URL).

This is not news. It’s publicly available information – very easily found through Google Scholar, and even more easily if you’re following Google’s main researchers – and there is no reason to resort to speculation to answer the question. They’ve been accessing the Deep Web – the web hidden behind forms – for years; Javascript is an obvious stumbling block; Google researchers have papers published on it (frequently presented at conferences!).

It is galling to see a reporter say that something is “unclear” when it would be very difficult to make it any clearer. In 2008, Jayant Madhavan wrote on the Google Webmaster Central blog about crawling through forms to get to the Deep Web – this stuff isn’t restricted to academic papers easily accessible through Google Scholar and surfaced in regular Google results. No, it’s even in the blogosphere.

I think I’ve gone a bit too far, so I’ll stop now.