Natural Language Understanding
Language is not magical. There is a common belief that computers can't think the way humans can. Even if we could parse a sentence into its grammatical structure, how could a computer make use of and understand the parse tree?
There is an area of natural language processing called semantics, which takes the grammatical parse tree of a sentence and converts it into an internal semantic representation for reasoning. The representation is usually some form of predicate calculus, and one technique used for this conversion is called lambda abstraction.
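To make the idea concrete, here is a minimal sketch of lambda abstraction in Python. The tiny grammar, the lexicon, and the Eat/Mary/beef symbols are illustrative assumptions, not any standard library's API: each word's meaning is a lambda term, and composing meanings up the parse tree is just function application.

```python
# Each word's meaning is a lambda term; composition is function application.
# Lexicon and grammar are toy assumptions for illustration.
lexicon = {
    "Mary": "Mary",                                         # proper noun -> constant
    "beef": "beef",
    "eats": lambda obj: lambda subj: f"Eat({subj},{obj})",  # transitive verb
}

def interpret(tree):
    """Walk a parse tree (nested tuples) bottom-up, applying meanings."""
    if isinstance(tree, str):                  # leaf: look up the word
        return lexicon[tree]
    label, *children = tree
    meanings = [interpret(c) for c in children]
    if label == "VP":                          # VP -> V NP: apply verb to object
        verb, obj = meanings
        return verb(obj)
    if label == "S":                           # S -> NP VP: apply VP to subject
        subj, vp = meanings
        return vp(subj)
    raise ValueError(f"unknown node {label}")

# "Mary eats beef" parsed as (S (NP Mary) (VP (V eats) (NP beef)))
tree = ("S", "Mary", ("VP", "eats", "beef"))
print(interpret(tree))  # -> Eat(Mary,beef)
```

The verb "eats" is the interesting entry: it abstracts first over its object and then over its subject, so the grammar rules never need to know anything about eating, only how to apply one meaning to another.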
Predicate calculus uses logical operators (and/or/not), quantifiers (all/exists), predicates (Eat(Mary, beef)), functions (MotherOf(Mary)), constants, and variables. The method by which reasoning occurs is similar to theorem proving. There are more advanced forms, called intensional predicate calculus, that can deal with beliefs and changes over time.
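For a flavor of how these pieces fit together, here is a toy evaluator for predicate calculus over a finite model. The domain, the Eat predicate, and the MotherOf function are illustrative assumptions; over a finite domain, the quantifiers simply reduce to iteration.

```python
# A toy model: a finite domain plus interpretations for one predicate
# and one function. All of these facts are made up for illustration.
domain = {"Mary", "Sue", "beef", "tofu"}
eats = {("Mary", "beef"), ("Sue", "tofu")}   # the Eat predicate, as a set of pairs
mother_of = {"Mary": "Sue"}                  # the MotherOf function

def Eat(x, y):
    return (x, y) in eats

def MotherOf(x):
    return mother_of[x]

# Over a finite domain, the quantifiers are just loops.
def forall(pred):
    return all(pred(d) for d in domain)

def exists(pred):
    return any(pred(d) for d in domain)

# "Someone eats beef": exists x. Eat(x, beef)
print(exists(lambda x: Eat(x, "beef")))      # True
# "Everyone eats beef": forall x. Eat(x, beef)
print(forall(lambda x: Eat(x, "beef")))      # False
# "Mary's mother eats tofu": Eat(MotherOf(Mary), tofu)
print(Eat(MotherOf("Mary"), "tofu"))         # True
```

A real reasoner would prove such statements from axioms rather than enumerate a model, but the example shows how predicates, functions, constants, and quantifiers each play a distinct role.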
You may be wondering how a computer could possibly understand what a "cat" is without ever encountering one. Such a belief falls under a theory called representational semantics, in which the computer must have an internal representation of each word. Another theory, denotational semantics, which turns out to be more expressive yet simpler, states that we only need to be concerned with the relationships between words. The referent is not actually important for intelligent reasoning. Indeed, a human does not have to have seen a cat to use that word.
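This relation-only view can be sketched in a few lines. The is-a facts below are illustrative assumptions; the point is that the program draws a correct inference about cats while storing nothing about what a cat looks like, only how the word relates to other words.

```python
# Reasoning purely over relations between words, with no internal
# representation of what a "cat" actually is. Facts are illustrative.
is_a = {
    "cat": "mammal",
    "mammal": "animal",
    "animal": "thing",
}

def isa(word, category):
    """Follow the is-a chain transitively."""
    while word in is_a:
        word = is_a[word]
        if word == category:
            return True
    return False

print(isa("cat", "animal"))   # True: cat -> mammal -> animal
print(isa("cat", "plant"))    # False
```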
There is an annual pseudo-Turing Test contest, in which a group of judges is asked to determine, by conversing with each contestant through a terminal, who is a computer and who is a human. While some computer programs have been judged human (and one human has been judged a computer), it's pretty clear that computer programs have not been up to the challenge. Unfortunately, the programs that have competed have usually been more advanced versions of the infamous Eliza program, parroting back sentences or linking keywords to canned text from a database. I don't think any program has actually parsed and analyzed the judges' input at the semantic level. Another problem is external indicators, such as typos and typing speed, that clue in judges as they observe text being entered in the terminal window.
I was on a cruise vacation recently with my parents and siblings. I brought with me a graduate-level text, "Meaning and Grammar: An Introduction to Semantics" (a good book, by the way). My father, who is a doctor and non-technical, wondered why I was reading such an esoteric book: was it intellectual curiosity, or was it related to my software product? It was the latter. I explained to him that the book lays out a mathematical system for dealing with words and languages, just as we have arithmetic, algebra, and calculus for operating on numbers, variables, and functions. His interest shifted from indifference to wonderment that language can actually be formally manipulated. My brother, who currently works at Merrill Lynch, was also intrigued by this notion and proceeded to write down the title and author of the book, even though I doubt he has the necessary preparation to tolerate its dry, mathematical discourse.
When we see computers as dumb calculators, it is because software publishers have neither realized nor pursued the potential of adding natural language capabilities to their products.
Part of the problem is that natural language support is a time-consuming and specialized endeavor, especially if multiple languages are to be supported. It really belongs in the operating system, and most developers will have to wait for Microsoft to make that happen. Microsoft is getting there, but, in the Longhorn timeframe, it looks like Windows will have just spelling and part-of-speech tagging support.