Course Notes
Stuart C. Shapiro
Department of Computer Science
State University of New York at
Buffalo
Artificial Intelligence (AI) is a field of computer science and engineering concerned with the computational understanding of what is commonly called intelligent behavior, and with the creation of artifacts that exhibit such behavior. [Shapiro, EofAI 2nd Ed.,1992, p. 54.]
See the story "An Approach to Serenity", and its accompanying questions for some motivation.
Brian C. Smith's knowledge representation hypothesis:
Any mechanically embodied intelligent process will be comprised of structural ingredients that a) we as external observers naturally take to represent a propositional account of the knowledge that the overall process exhibits, and b) independent of such external semantical attribution, play a formal but causal and essential role in engendering the behaviour that manifests that knowledge. [Smith in Brachman & Levesque, 1985, p. 33.]
Knowledge representation is the area of AI concerned with the formal symbolic languages used to represent the knowledge (data) used by intelligent systems, and the data structures used to implement those formal languages. However, one cannot study static representation formalisms and know anything about how useful they are. Instead, one must study how they are helpful for their intended use. In most cases, the intended use is to use explicitly stored knowledge to produce additional explicit knowledge. This is what reasoning is. Together, knowledge representation and reasoning can be seen to be both necessary and sufficient for producing general intelligence, [that is, KR&R is an] AI-complete area. Although they are bound up with each other, knowledge representation and reasoning can be teased apart, according to whether the particular study is more about the representation language/data structure, or about the active process of drawing conclusions. [Shapiro, EofAI 2nd Ed.,1992, p. 56.]
Discuss:
Introduction, Part 2
Knowledge as "justified true belief" (see text pp. 7-8)
vs. Belief.
Does a "Knowledge base" contain knowledge of the world? Is that what
a KR represents?
Procedural/Declarative controversy: Discuss CS notion of procedural
vs. declarative representation (program vs. data?). Better
distinction: What can the entity say?
Know that, e.g., "I know that Buffalo is west of Rochester."
The Tell/Ask interface: A KR&R system as a utility
vs. constructing a computational cognitive agent.
See the partial FISI Course
Notes, (1994) Chapters 1, 2.1, 2.2, 3.1, 3.2, and 3.4
Ray Reiter: The Closed World Assumption (CWA): "Anything you
don't know is true is false."
Example (DBMS): "Is Mary Smith the manager of the Data
Processing Division?" If the Database doesn't have that she is, then she isn't.
CWA justifies "negation by failure", such as in Prolog.
In our everyday lives, we are always learning new things. So
CWA doesn't hold---the "Open World Assumption" (OWA).
If a KB system is allowed to mix Tell with Ask, it must make
the OWA.
The OWA entails "Unknown" as a third possible "truth value".
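The difference between the two assumptions can be sketched with a toy Tell/Ask interface. This is only an illustration: the class `KB`, its methods, and the stored manager "John Doe" are hypothetical names introduced here, and the knowledge base is assumed to hold only ground atoms.

```python
from enum import Enum


class Truth(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"


class KB:
    """A toy knowledge base of ground atoms (hypothetical, for illustration)."""

    def __init__(self):
        self.positive = set()  # atoms told to be true
        self.negative = set()  # atoms told to be false

    def tell(self, atom, value=True):
        (self.positive if value else self.negative).add(atom)

    def ask_cwa(self, atom):
        # Closed world: anything not known to be true is taken to be false.
        return Truth.TRUE if atom in self.positive else Truth.FALSE

    def ask_owa(self, atom):
        # Open world: missing information yields a third answer, "unknown".
        if atom in self.positive:
            return Truth.TRUE
        if atom in self.negative:
            return Truth.FALSE
        return Truth.UNKNOWN


kb = KB()
kb.tell(("manager", "John Doe", "Data Processing"))  # hypothetical stored fact
query = ("manager", "Mary Smith", "Data Processing")

cwa_answer = kb.ask_cwa(query)      # negation by failure
owa_answer = kb.ask_owa(query)      # nothing is known either way
kb.tell(query)                      # mixing Tell with Ask
owa_after_tell = kb.ask_owa(query)
```

Under the CWA the first answer would have had to be withdrawn once the new fact was told, which is why a system that mixes Tell with Ask is better served by the OWA.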
What if new information contradicts old conclusions?
(Non-monotonicity)
What if new information contradicts old information? (Belief
Revision)
Classical Logic is Monotonic: If A |- C then A ∪ {B} |- C.
What if we then learn that penguins don't fly?
What does "Birds fly" mean, if you know that penguins are
birds, but don't fly?
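A minimal sketch of why a "Birds fly" default is not monotonic, assuming a knowledge base of ground facts and a single hand-coded default rule. The function `conclusions` and the individual "Tweety" are hypothetical names used only for illustration.

```python
def conclusions(kb):
    """Default rule: conclude that a bird flies unless it is known to be a penguin."""
    return {("flies", x) for (pred, x) in kb
            if pred == "bird" and ("penguin", x) not in kb}


kb = {("bird", "Tweety")}
before = conclusions(kb)

# Adding a premise removes a conclusion, which classical logic never allows.
kb_extended = kb | {("penguin", "Tweety")}
after = conclusions(kb_extended)

is_monotonic_here = before <= after
```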
Which is appropriate for a "Knowledge-based system"?
Knowledge vs. attribution of knowledge.
Belief vs. attribution of belief.
vs. Know how, e.g., "I know how to type."
vs. Know who, e.g., "I know Bill Rapaport", "I know who Bill Clinton is".
and/or
The IJCAI'95 tutorial Presentation, (1995)
Chapters 1, 2.1, 2.2, 3.1, and 3.2.
and/or
the paper, Propositional,
First-Order And Higher-Order Logics: Basic Definitions, Rules of
Inference, Examples (1999).
Default rule:
Default Logic, Chapter 4.2, and Circumscription, Chapter 4.3.1, do not give much advice to builders of computer reasoning systems, so we will not spend much time discussing them, just a few comments.
Default Logic: If we call the parts of a default rule
    preconditions : justifications
    ------------------------------
              consequent
Circumscription is an operation performed on the set of non-logical axioms to introduce additional non-logical axioms in order to formalize the CWA in the sense of being able to formally show that only the mentioned objects exist and/or only the objects that provably have some property have that property. The problem with trying to automate circumscription is that a choice must be made of which axioms or predicates are to be "circumscribed," and this choice remains an art form.
[Additional reference: E. Davis, Representations of Commonsense Knowledge, San Mateo, CA: Morgan Kaufmann, 1990.]
Modal logics add sentential operators to the syntax and semantics of propositional and predicate logic.
In these notes, I'll use L and M as the two modal operators.
Syntactic Extensions:
Propositional Logic: If P is a well-formed proposition, so are L(P) and M(P).
Predicate Logic: If P is a well-formed formula, so are L(P) and M(P).
I will assume we are discussing modal predicate logic, but note that not too much will hang on that.
M(P) is often taken as an abbreviation of ¬L(¬P). Otherwise, the equivalence of these two will be incorporated in the logic some other way.
Common intensional semantics of L(P) and M(P) (choose one):
L(P) | M(P) |
---|---|
Necessarily [P]. | Possibly [P]. |
I know that [P]. | I believe that [P] might be true. |
I believe that [P]. | I believe that [P] might be true. |
[P] will always be true. | [P] will be true at some time. |
Extensional semantics of modal logics requires the notion of possible worlds connected by an "accessibility" relation.
L(P) is true in world w if and only if P is true in every world accessible from w.
M(P) is true in world w if and only if there is some world accessible from w in which P is true.
The Necessitation rule of inference of Modal Logic (Martins p. 58) is: if |- P, then |- L(P).
Different modal axioms are associated with different properties of the accessibility relation.
L(P) => P is valid if accessibility is reflexive. This is reasonable if L is knowledge, but not if L is belief.
L(P) => L(L(P)) is valid if accessibility is transitive. This is reasonable if L is temporal, but what if knowledge? belief?
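The possible-worlds truth conditions can be sketched as a small evaluator. This is an illustration for the propositional case only; the tuple encoding of formulas, the function `holds`, and the two-world models are assumptions made here, not part of any particular system.

```python
def holds(formula, world, access, valuation):
    """Evaluate a modal formula at a world.

    Formulas are nested tuples: ("atom", name), ("not", f),
    ("implies", f, g), ("L", f), ("M", f).
    """
    op = formula[0]
    if op == "atom":
        return formula[1] in valuation[world]
    if op == "not":
        return not holds(formula[1], world, access, valuation)
    if op == "implies":
        return (not holds(formula[1], world, access, valuation)
                or holds(formula[2], world, access, valuation))
    if op == "L":  # true in every accessible world
        return all(holds(formula[1], v, access, valuation)
                   for v in access[world])
    if op == "M":  # true in some accessible world
        return any(holds(formula[1], v, access, valuation)
                   for v in access[world])
    raise ValueError(f"unknown operator: {op}")


P = ("atom", "P")
axiom_t = ("implies", ("L", P), P)
valuation = {"w1": set(), "w2": {"P"}}

# Non-reflexive frame: w1 sees only w2, so L(P) holds at w1 although P does not.
non_reflexive = {"w1": {"w2"}, "w2": {"w2"}}
t_at_w1_non_reflexive = holds(axiom_t, "w1", non_reflexive, valuation)

# Reflexive frame: every world sees itself, and L(P) => P holds everywhere.
reflexive = {"w1": {"w1", "w2"}, "w2": {"w2"}}
t_everywhere_reflexive = all(holds(axiom_t, w, reflexive, valuation)
                             for w in reflexive)

# M(P) agrees with not-L(not-P) at w1 in the reflexive frame.
m_p = holds(("M", P), "w1", reflexive, valuation)
not_l_not_p = holds(("not", ("L", ("not", P))), "w1", reflexive, valuation)
```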
Whenever a rule of inference is used to infer C from P and Q, record that P and Q are the "justifications" of C, so that later one can find P and Q from C, or find C from P and/or Q.
If one ever has a contradiction, P and ~P, one can then trace back through the justifications of P and ~P to find the original premises, and delete one of them.
Similarly, if belief in some proposition P is removed, one can trace forward through the propositions P justifies and remove belief in all dependent consequents, at least those which do not have some other justification.
JTMSs are often separate facilities from the problem solvers they support, and often only record justifications among atomic propositions.
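The bookkeeping described above can be sketched as follows. This is a naive illustration under simplifying assumptions (atomic propositions as strings, no contradiction handling), not a full JTMS; the class and method names are hypothetical.

```python
class JTMS:
    """Toy justification recorder (hypothetical sketch)."""

    def __init__(self):
        self.believed = set()
        self.premises = set()
        self.justs = {}  # consequent -> list of tuples of antecedents

    def add_premise(self, p):
        self.premises.add(p)
        self.believed.add(p)

    def infer(self, consequent, *antecedents):
        """Record that `antecedents` justify `consequent`."""
        if all(a in self.believed for a in antecedents):
            self.justs.setdefault(consequent, []).append(antecedents)
            self.believed.add(consequent)

    def premises_of(self, p, seen=None):
        """Trace back through justifications to the original premises."""
        seen = set() if seen is None else seen
        if p in seen:
            return set()
        seen.add(p)
        result = {p} if p in self.premises else set()
        for antecedents in self.justs.get(p, []):
            for a in antecedents:
                result |= self.premises_of(a, seen)
        return result

    def retract(self, p):
        """Remove belief in p, then in consequents left without support."""
        self.premises.discard(p)
        self.believed.discard(p)
        changed = True
        while changed:
            changed = False
            for c in list(self.believed - self.premises):
                supported = any(all(a in self.believed for a in ants)
                                for ants in self.justs.get(c, []))
                if not supported:
                    self.believed.discard(c)
                    changed = True


tms = JTMS()
tms.add_premise("P")
tms.add_premise("Q")
tms.infer("C", "P", "Q")
tms.infer("D", "C")
tms.infer("E", "P")
tms.infer("E", "Q")      # E has a second, independent justification

sources_of_d = tms.premises_of("D")
tms.retract("P")
remaining = set(tms.believed)
```

After `P` is retracted, `C` and `D` lose their only support, while `E` survives because of its other justification.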
A Negative: The paths of justifications sometimes form loops that prevent belief revision from being performed.
Example of a loop from [E. Charniak, C. Riesbeck, & D. McDermott, Artificial Intelligence Programming, Hillsdale, NJ: Lawrence Erlbaum, 1980, p 197.]
- KB:
- all(x)(Man(x) => Person(x))
- all(x)(Person(x) => Human(x))
- all(x)(Human(x) => Person(x))
- Man(Fred)
- Dependency Network:
    Man(Fred) ---->o<---- all(x)(Man(x) => Person(x))
                   |
                   |
                   v
             Person(Fred) --->o<---- all(x)(Person(x) => Human(x))
                ^             |
                |             |
                |             v
                o<----Human(Fred)
                ^
                |
                |
       all(x)(Human(x) => Person(x))
Now if Man(Fred) is retracted, one justification of Person(Fred) goes away, but it has another justification!
A Positive: Justifications are useful for explanation.
ATMSs record with every consequent the original premises (assumptions) on which it depends. This eliminates the possibility of loops and the necessity of tracing through justifications to find the premises. Unfortunately, assumptions are not as useful as justifications for explanation.
A technique for keeping track of assumptions was borrowed from a technique used for Fitch-Style proofs in the Logic of Relevant Implication (see Chapter 3.2).
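The idea of carrying the underlying assumptions along with each consequent can be sketched on the Fred example. This is an assumption-laden illustration: the helper names `assume`, `derive`, and `believed` are hypothetical, and a real ATMS manages sets of alternative environments rather than a single origin set per derivation.

```python
def assume(name):
    """An assumption is supported by itself alone."""
    return {"prop": name, "origin": frozenset([name])}


def derive(prop, *parents):
    """A consequent's origin set is the union of its parents' origin sets."""
    origin = frozenset().union(*(p["origin"] for p in parents))
    return {"prop": prop, "origin": origin}


def believed(node, assumptions):
    """A node is believed when every assumption it rests on is still held."""
    return node["origin"] <= assumptions


man = assume("Man(Fred)")
r1 = assume("all(x)(Man(x) => Person(x))")
r2 = assume("all(x)(Person(x) => Human(x))")
r3 = assume("all(x)(Human(x) => Person(x))")

person = derive("Person(Fred)", man, r1)
human = derive("Human(Fred)", person, r2)
person_again = derive("Person(Fred)", human, r3)  # the circular route

held = {man["prop"], r1["prop"], r2["prop"], r3["prop"]}
held_after_retraction = held - {"Man(Fred)"}
```

Because the circular derivation of Person(Fred) still lists Man(Fred) in its origin set, retracting Man(Fred) removes it without any tracing through justifications.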
Raphael's SIR
Reference: Bertram Raphael, SIR: Semantic Information Retrieval. In Marvin Minsky, Ed. Semantic Information Processing, MIT Press, Cambridge, MA, 1968, 33-145. (Reprint (partial?) of 1964 Ph.D. dissertation.)
Raphael did not present a graphical representation of "the SIR model" [p. 54], but used Lisp property lists, and spoke about "type-1", "type-2", and "type-3" links [pp. 57-58]:
A noteworthy "special feature" of SIR was the "exception principle":
"General information about `all the elements' of a set is considered to apply to particular elements only in the absence of more specific information about those elements. Thus it is not necessarily contradictory to learn that `mammals are land animals' and yet `a whale is a mammal which always lives in water.' In the program, this idea is implemented by always referring for desired information to the property-list of the individual concerned before looking at the descriptions of sets to which the individual belongs.
The justification for this departure from the no-exception principles of Aristotelian logic is that this precedence of specific facts over background knowledge seems to be the way people operate, and I wish the computer to communicate with people as naturally as possible.
The present program does not experience the uncomfortable feeling people frequently get when they must face facts like [these]. However, minor programming additions to the present system could require it to identify those instances in which specific information and general information differ; the program could then express its amusement at such paradoxes." [Raphael, 1968, p. 85, italics in original]
This is the first appearance I know of the "exception principle" in AI. It led to default reasoning; concern with the logical principles underlying its procedural semantics led to nonmonotonic logics; and it is an ancestor of similar operations in object-oriented programming.
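The lookup order Raphael describes (the individual's own property list first, then the sets it belongs to) can be sketched as below. This is not SIR's code: the dictionaries stand in for Lisp property lists, membership and inclusion are lumped into one `member_of` table, and the attribute name "habitat" and the entry for "dog" are hypothetical.

```python
def lookup(entity, attribute, plists, member_of):
    """Return the most specific value of `attribute` for `entity`, or None."""
    own = plists.get(entity, {})
    if attribute in own:
        return own[attribute]          # specific information wins
    for containing_set in member_of.get(entity, []):
        value = lookup(containing_set, attribute, plists, member_of)
        if value is not None:
            return value               # fall back on general information
    return None


plists = {
    "mammal": {"habitat": "land"},
    "whale": {"habitat": "water"},
}
member_of = {"whale": ["mammal"], "dog": ["mammal"]}

whale_habitat = lookup("whale", "habitat", plists, member_of)
dog_habitat = lookup("dog", "habitat", plists, member_of)
```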
This work introduced several important notions:
Firmly established notion of inheritance hierarchies and their psychological validity.
Even though it had a problem with "fast negatives".
What do semantic networks represent: the meaning of a sentence or the mind of a language user?
Need for syntax and semantics of networks.
Structural vs. Assertional information.
The following brief description of CD is taken from Stuart C. Shapiro and William J. Rapaport, Models and minds: knowledge representation for natural-language competence. In R. Cummins & J. Pollock, Eds. Philosophy and AI: Essays at the Interface. MIT Press, Cambridge, MA, 1991, 215--259.
Conceptual Dependency theory (Schank and Rieger 1974, Schank 1975, Schank and Riesbeck 1981; cf. Hardt 1987) uses a knowledge-representation formalism consisting of sentences, called ``conceptualizations'', which assert the occurrence of events or states, and six types of terms (the glosses of these types of terms are quoted from Schank & Rieger 1974: 378-379):
- PPs---``real-world objects'',
- ACTs---``real-world actions'',
- PAs---``attributes of objects'',
- AAs---``attributes of actions'',
- Ts---``times'',
- and LOCs---``locations''.
The set of ACTs is closed and consists of the well-known primitive ACTs PTRANS (transfer of physical location), ATRANS (transfer of an abstract relationship), etc.
The syntax of an event conceptualization is a structure with six slots (or arguments), some of which are optional: actor, action, object, source, destination, and instrument.
A stative conceptualization is a structure with an object, a state, and a value.
Only certain types of terms can fill certain slots. For example, only a PP can be an actor, and only an ACT can be an action. Interestingly, conceptualizations, themselves, can be terms, although they are not one of the six official terms. For example, only a conceptualization can fill the instrument slot, and a conceptualization can fill the object slot if MLOC (mental location) fills the act slot.
A ``causation'' is another kind of conceptualization, consisting only of two slots, one containing a causing conceptualization and the other containing a caused conceptualization.
Although, from the glosses of PP and ACT, it would seem that the intended domain of interpretation is the real world, the domain also must contain theoretically postulated objects such as: the ``conscious processor'' of people, in which conceptualizations are located; conditional events; and even negated events, which haven't happened.
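The slot structure and the slot-type restrictions just described can be sketched with data classes. This is a hedged illustration, not Schank's notation: the class names, the field name `obj`, and the example sentence about John, Mary, and a book are assumptions made here, and only the restrictions stated above are checked.

```python
from dataclasses import dataclass
from typing import Any, Optional, Union


@dataclass(frozen=True)
class PP:
    name: str


@dataclass(frozen=True)
class ACT:
    name: str


@dataclass
class Event:
    """Event conceptualization: six slots, some optional."""
    actor: PP
    action: ACT
    obj: Optional[Union[PP, "Event"]] = None
    source: Any = None        # filler type not constrained in this sketch
    destination: Any = None   # filler type not constrained in this sketch
    instrument: Optional["Event"] = None

    def __post_init__(self):
        if not isinstance(self.actor, PP):
            raise TypeError("only a PP can be an actor")
        if not isinstance(self.action, ACT):
            raise TypeError("only an ACT can be an action")
        if self.instrument is not None and not isinstance(self.instrument, Event):
            raise TypeError("only a conceptualization can fill the instrument slot")
        if isinstance(self.obj, Event) and self.action.name != "MLOC":
            raise TypeError("a conceptualization can be the object only with MLOC")


@dataclass
class State:
    """Stative conceptualization: an object, a state, and a value."""
    obj: PP
    state: str
    value: Any


@dataclass
class Causation:
    """Two slots: a causing and a caused conceptualization."""
    causing: Event
    caused: Union[Event, State]


# Hypothetical example: "John gave Mary a book."
give = Event(actor=PP("John"), action=ACT("ATRANS"), obj=PP("book"),
             source=PP("John"), destination=PP("Mary"))

try:
    Event(actor=ACT("PTRANS"), action=ACT("PTRANS"))
    bad_actor_rejected = False
except TypeError:
    bad_actor_rejected = True
```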
Proposed by Marvin Minsky
Motivated by vision problems
Answer to a challenge by Hubert Dreyfus (intentionally?)
Along with scripts, first structured KR theory
Based, at least in part, on Simula 67 classes---part of the development of OOP
Basic ideas:
Slots and fillers
Hierarchy
Inheritance
Defaults
Procedural Attachments---IfNeeded & IfAdded
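These basic ideas can be sketched in a few lines. The following is an illustrative toy with single inheritance only, not the design of any particular frame system; the class `Frame`, its method names, and the example slots are hypothetical.

```python
class Frame:
    """Toy frame: slots and fillers, a parent link, and attached procedures."""

    def __init__(self, name, parent=None, slots=None,
                 if_needed=None, if_added=None):
        self.name = name
        self.parent = parent
        self.slots = dict(slots or {})          # fillers, including defaults
        self.if_needed = dict(if_needed or {})  # slot -> procedure(frame)
        self.if_added = dict(if_added or {})    # slot -> procedure(frame, value)

    def get(self, slot):
        """Look locally first, then inherit up the hierarchy."""
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            if slot in frame.if_needed:
                return frame.if_needed[slot](self)
            frame = frame.parent
        raise KeyError(slot)

    def put(self, slot, value):
        """Store a filler and run the nearest IfAdded procedure, if any."""
        self.slots[slot] = value
        frame = self
        while frame is not None:
            if slot in frame.if_added:
                frame.if_added[slot](self, value)
                break
            frame = frame.parent


log = []
elephant = Frame(
    "Elephant",
    slots={"color": "grey"},  # default filler
    if_needed={"description": lambda f: f.name + " is " + f.get("color")},
    if_added={"color": lambda f, v: log.append((f.name, v))},
)
clyde = Frame("Clyde", parent=elephant)

default_color = clyde.get("color")
description = clyde.get("description")
clyde.put("color", "white")            # a local filler overrides the default
overridden_color = clyde.get("color")
```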
Multiple Inheritance
Possibilities: E.g. Dogs are Domestic Mammals vs. Wild Birds
Problems: What to do when there's a conflict, especially if negated links are allowed.
Nixon Diamond:
          Pacifist
             /\
            /  \
           /    \NOT
          /      \
         /        \
    Quaker      Republican
         \        /
          \      /
           \    /
            \  /
             \/
           Nixon
Frame systems usually do not allow negative links. See Brachman comment that elephants are grey by default, but albino elephants are allowed, so why not Clyde, the non-elephant elephant?
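A minimal sketch of how the Nixon Diamond conflict shows up when negated links are allowed, assuming a hypothetical encoding in which `isa` holds the hierarchy and `asserted` maps a (node, property) pair to True or False. The sketch only detects the conflict; it does not choose between the answers.

```python
def polarities(node, prop, isa, asserted):
    """Collect every truth value that `node` inherits for `prop`."""
    found = set()
    stack = [node]
    visited = set()
    while stack:
        current = stack.pop()
        if current in visited:
            continue
        visited.add(current)
        if (current, prop) in asserted:
            found.add(asserted[(current, prop)])
            continue  # more specific information shadows ancestors on this path
        stack.extend(isa.get(current, []))
    return found


isa = {"Nixon": ["Quaker", "Republican"]}
asserted = {
    ("Quaker", "Pacifist"): True,
    ("Republican", "Pacifist"): False,  # the NOT link
}

nixon = polarities("Nixon", "Pacifist", isa, asserted)
conflict = len(nixon) > 1
```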
What is a hierarchy? If we have classes and individuals and inheritance, we must distinguish class properties from individual properties. An abstraction hierarchy doesn't have classes, but abstract individuals.
See text for examples of KEE.
TBox: Definitional, structural information
ABox: Assertional
TBox
Hierarchy of Concepts and Relations
Generic Concepts
Defined Concepts contain Necessary and Sufficient conditions
Primitive Concepts contain only Necessary conditions
Necessary: All x [C(x) -> P(x)]
Sufficient: All x [P(x) -> C(x)]
Individual Concepts are classes with a single element. Note: another way to have an inheritance hierarchy with only one relation.
A Generic Concept is defined as a subconcept of others with certain roles.
See examples in text.
ABox
Concepts provide Unary Predicates
Roles provide Binary Relations
Defining an arch (text p. 202)
    (cdef Arch
      (and (atleast 1 lintel)
           (atmost 1 lintel)
           (all lintel Block)
           (atleast 2 upright)   ; note typo in text
           (atmost 2 upright)    ; note typo in text
           (all upright Block)
           (sd NotTouch (upright objects))
           (sd Support (lintel supported) (upright supporter))))
Assume an Arch A1 with lintel B1 and uprights B2 and B3.
    R[(sd s (pc1 ps1) ... (pcn psn))] =
      {x in D: Ey[y in R[s] &
        Az1, ..., zn[(<x, z1> in R[pc1] <-> <y, z1> in R[ps1])
                     & ... & (<x, zn> in R[pcn] <-> <y, zn> in R[psn])]]}
Instantiating to the Arch, we get
    R[(sd Support (lintel supported) (upright supporter))] =
      {x in D: Ey[y in R[Support] &
        Az1, z2[(<x, z1> in R[lintel] <-> <y, z1> in R[supported])
                & (<x, z2> in R[upright] <-> <y, z2> in R[supporter])]]}
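The semantics of the structural description can be checked on a small finite model. This sketch reads the definition as saying that, for some y in R[s], the fillers of each role pci at x coincide with the fillers of psi at y (which is what the universally quantified biconditionals amount to over a non-empty domain). The Support instance "S1" and the function names are hypothetical.

```python
def fillers(x, role, R):
    """All z such that <x, z> is in R[role]."""
    return {z for (a, z) in R[role] if a == x}


def satisfies_sd(x, s, pairs, R):
    """x is in R[(sd s (pc1 ps1) ... (pcn psn))] under interpretation R."""
    return any(
        all(fillers(x, pc, R) == fillers(y, ps, R) for pc, ps in pairs)
        for y in R[s]
    )


# Arch A1 with lintel B1 and uprights B2 and B3, plus a Support instance S1.
R = {
    "Support": {"S1"},
    "lintel": {("A1", "B1")},
    "upright": {("A1", "B2"), ("A1", "B3")},
    "supported": {("S1", "B1")},
    "supporter": {("S1", "B2"), ("S1", "B3")},
}
pairs = [("lintel", "supported"), ("upright", "supporter")]
a1_ok = satisfies_sd("A1", "Support", pairs, R)

# If S1 has only one supporter, the upright fillers no longer match.
R_broken = dict(R, supporter={("S1", "B2")})
a1_broken = satisfies_sd("A1", "Support", pairs, R_broken)
```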
Reference: James F. Allen, Maintaining Knowledge about Temporal Intervals, Communications of the ACM 26, 11 (1983), 832-843. Reprinted in R. J. Brachman and H. J. Levesque, eds. Readings in Knowledge Representation, Morgan Kaufmann, Los Altos, CA, 1985, 509-521.
For all the details, see the KIF Home Page.
"Knowledge Interchange Format (KIF) is a computer-oriented language for the interchange of knowledge among disparate programs. It has declarative semantics (i.e. the meaning of expressions in the representation can be understood without appeal to an interpreter for manipulating those expressions); it is logically comprehensive (i.e. it provides for the expression of arbitrary sentences in the first-order predicate calculus); it provides for the representation of knowledge about the representation of knowledge; it provides for the representation of nonmonotonic reasoning rules; and it provides for the definition of objects, functions, and relations." [Abstract of the Reference Manual]
The Introduction to the Reference Manual says what KIF is and isn't.
A KIF knowledge base is a finite set (not sequence) of forms, each of which is either a sentence, a definition, or a rule.
KIF sentences are very similar to FOPC wffs.
KIF definitions can be either complete, giving necessary and sufficient conditions, or partial, giving only necessary conditions. Each defined constant gets a defining axiom, which is an analytic truth. Discuss this.
KIF rules may be nonmonotonic, but need not be. Rules are not sentences. See the discussion of this in the Reference Manual.
See the interesting discussion of metaknowledge.
You can browse the ontologies stored under the Stanford Knowledge Systems Laboratory Network Services.