CS676: Knowledge Representation Fall, 1997

Course Notes
Stuart C. Shapiro
Department of Computer Science
State University of New York at Buffalo


Introduction

These notes will comment on and supplement the text. They will serve as an outline for the class meetings, and will be quite informal.

Artificial Intelligence (AI) is a field of computer science and engineering concerned with the computational understanding of what is commonly called intelligent behavior, and with the creation of artifacts that exhibit such behavior. [Shapiro, EofAI 2nd Ed.,1992, p. 54.]

Brian C. Smith's knowledge representation hypothesis:

Any mechanically embodied intelligent process will be comprised of structural ingredients that a) we as external observers naturally take to represent a propositional account of the knowledge that the overall process exhibits, and b) independent of such external semantical attribution, play a formal but causal and essential role in engendering the behaviour that manifests that knowledge. [Smith in Brachman & Levesque, 1985, p. 33.]

Knowledge representation is the area of AI concerned with the formal symbolic languages used to represent the knowledge (data) used by intelligent systems, and the data structures used to implement those formal languages. However, one cannot study static representation formalisms and know anything about how useful they are. Instead, one must study how they are helpful for their intended use. In most cases, the intended use is to use explicitly stored knowledge to produce additional explicit knowledge. This is what reasoning is. Together, knowledge representation and reasoning can be seen to be both necessary and sufficient for producing general intelligence, [that is, KR&R is an] AI-complete area. Although they are bound up with each other, knowledge representation and reasoning can be teased apart, according to whether the particular study is more about the representation language/data structure, or about the active process of drawing conclusions. [Shapiro, EofAI 2nd Ed.,1992, p. 56.]

Discuss:

Introduction, Part 2

Knowledge as "justified true belief" (see text pp. 7-8) vs. Belief.
Which is appropriate for a "Knowledge-based system"?
Knowledge vs. attribution of knowledge.
Belief vs. attribution of belief.

Does a "Knowledge base" contain knowledge of the world? Is that what a KR represents?

Procedural/Declarative controversy: Discuss CS notion of procedural vs. declarative representation (program vs. data?). Better distinction: What can the entity say?

Know that, e.g., "I know that Buffalo is west of Rochester."
vs. Know how, e.g., "I know how to type."
vs. Know who, e.g., "I know Bill Rapaport", "I know who Bill Clinton is".

The Tell/Ask interface: A KR&R system as a utility vs. constructing a computational cognitive agent.

Classical Logic

See the partial FISI Course Notes, Chapters 1, 2.1, 2.2, 3.1, 3.2, and 3.4
and/or
The IJCAI'95 tutorial Presentation, Chapters 1, 2.1, 2.2, 3.1, and 3.2.

Facing the Open World

Ray Reiter: The Closed World Assumption (CWA): "Anything you don't know is true is false."

Example (DBMS): "Is Mary Smith the manager of the Data Processing Division?" If the Database doesn't have that she is, then she isn't.

CWA justifies "negation by failure", such as in Prolog.

In our everyday lives, we are always learning new things. So CWA doesn't hold---the "Open World Assumption" (OWA).

If a KB system is allowed to mix Tell with Ask, it must make the OWA.

The OWA entails "Unknown" as a third possible "truth value".
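
The difference between the two assumptions can be sketched in a few lines of Python. This is only an illustration: the class `TinyKB`, its methods, and the stored manager "John Doe" are hypothetical names invented here, and `None` stands in for the third value "Unknown".

```python
from typing import Optional, Set, Tuple

Fact = Tuple[str, ...]


class TinyKB:
    """Hypothetical toy KB of ground facts and explicitly negated facts."""

    def __init__(self) -> None:
        self.true_facts: Set[Fact] = set()
        self.false_facts: Set[Fact] = set()

    def tell(self, fact: Fact, value: bool = True) -> None:
        # Record the fact as known true or known false.
        (self.true_facts if value else self.false_facts).add(fact)

    def ask_cwa(self, fact: Fact) -> bool:
        # Closed world: failing to find the fact counts as falsity
        # (negation by failure).
        return fact in self.true_facts

    def ask_owa(self, fact: Fact) -> Optional[bool]:
        # Open world: True, False, or None (standing in for "Unknown").
        if fact in self.true_facts:
            return True
        if fact in self.false_facts:
            return False
        return None


kb = TinyKB()
kb.tell(("manager", "John Doe", "Data Processing"))  # invented example fact
query = ("manager", "Mary Smith", "Data Processing")
cwa_answer = kb.ask_cwa(query)   # the DB does not say she is, so "no"
owa_answer = kb.ask_owa(query)   # neither told true nor told false
print(cwa_answer, owa_answer)
```

Under the CWA the query is answered "false"; under the OWA the same query is "Unknown" until a later Tell settles it.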

What if new information contradicts old conclusions? (Non-monotonicity)

What if new information contradicts old information? (Belief Revision)

Classical Logic is Monotonic: If A ⊢ C then A ∪ {B} ⊢ C.

Example:
  • Birds fly.
  • Canaries are birds.
  • Penguins are birds.
  • Tweety is a canary.
  • Opus is a penguin.
  • Does Tweety fly?
  • Does Opus fly?

What if we then learn that penguins don't fly?

What does "Birds fly" mean, if you know that penguins are birds, but don't fly?

Default rule:
Bird(x) : Flies(x)
------------------
     Flies(x)

Non-Monotonic Logics (Chapter 4)

Default Logic, Chapter 4.2, and Circumscription, Chapter 4.3.1, do not give much advice to builders of computer reasoning systems, so we will not spend much time discussing them, just a few comments.

Default Logic: If we call the parts of a default rule

preconditions : justifications
------------------------------
          consequent
then, to fire the rule, we must derive the preconditions and try but fail to derive the negations of the justifications. Remember, trying to derive a non-derivable formula may be non-terminating if we have a logical system that is only semi-decidable. If, nevertheless, we succeed in firing the rule, we must use some sort of TMS (see below) to prepare for the possibility that we might have to retract the conclusion.
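
As a rough sketch of what "firing" a default involves, the following Python works on ground literals only, so derivation always terminates (the semi-decidability problem does not arise here). The function names and the string encoding of literals are assumptions made for this illustration. It is not a full implementation of Default Logic: in particular, it never re-checks an earlier default after later conclusions are added.

```python
from typing import List, Set, Tuple


def negate(lit: str) -> str:
    """Negation of a ground literal written as 'P(a)' or '~P(a)'."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def close(facts: Set[str], strict_rules: List[Tuple[str, str]]) -> Set[str]:
    """Forward-chain ground rules (premise, conclusion) to a fixed point."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in strict_rules:
            if premise in derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


def fire_defaults(facts, strict_rules, defaults):
    """Each default is (precondition, justification, consequent)."""
    beliefs = close(facts, strict_rules)
    changed = True
    while changed:
        changed = False
        for pre, just, cons in defaults:
            # Fire only if the precondition is derived and the negation
            # of the justification is NOT derived.
            if pre in beliefs and negate(just) not in beliefs and cons not in beliefs:
                beliefs = close(beliefs | {cons}, strict_rules)
                changed = True
    return beliefs


facts = {"Canary(Tweety)", "Penguin(Opus)"}
rules = [("Canary(Tweety)", "Bird(Tweety)"), ("Penguin(Opus)", "Bird(Opus)")]
defaults = [("Bird(Tweety)", "Flies(Tweety)", "Flies(Tweety)"),
            ("Bird(Opus)", "Flies(Opus)", "Flies(Opus)")]

before = fire_defaults(facts, rules, defaults)
# New information: penguins don't fly.
after = fire_defaults(facts, rules + [("Penguin(Opus)", "~Flies(Opus)")], defaults)
print("Flies(Opus)" in before, "Flies(Opus)" in after)
```

Adding the penguin rule removes a conclusion that was previously drawn, which is exactly the non-monotonic behaviour. A running system that had already asserted Flies(Opus) would need a TMS to withdraw it.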

Circumscription is an operation performed on the set of non-logical axioms to introduce additional non-logical axioms in order to formalize the CWA in the sense of being able to formally show that only the mentioned objects exist and/or only the objects that provably have some property have that property. The problem with trying to automate circumscription is that a choice must be made of which axioms or predicates are to be "circumscribed," and this choice remains an art form.

Justification-Based TMSs

Whenever a rule of inference is used to infer C from P and Q, record that P and Q are the "justifications" of C, so that later one can find P and Q from C, or find C from P and/or Q.

If one ever has a contradiction, P and ~P, one can then trace back through the justifications of P and ~P to find the original premises, and delete one of them.

Similarly, if belief in some proposition P is removed, one can trace forward through the propositions P justifies and remove belief in all dependent consequents, at least those which do not have some other justification.

JTMSs are often separate facilities from the problem solvers they support, and often only record justifications among atomic propositions.

A Negative: The paths of justifications sometimes form loops that prevent belief revision from being performed.

Example of a loop from [E. Charniak, C. Riesbeck, & D. McDermott, Artificial Intelligence Programming, Hillsdale, NJ: Lawrence Erlbaum, 1980, p 197.]
KB:
  • all(x)(Man(x) => Person(x))
  • all(x)(Person(x) => Human(x))
  • all(x)(Human(x) => Person(x))
  • Man(Fred)

Dependency Network:
Man(Fred) ---->o<---- all(x)(Man(x) => Person(x))
               |
               |
               v
        Person(Fred) --->o<---- all(x)(Person(x) => Human(x))
            ^            |
            |            |
            |            v
            o<----Human(Fred)
            ^
            |
            |
all(x)(Human(x) => Person(x)) 

Now if Man(Fred) is retracted, one justification of Person(Fred) goes away, but it has another justification!
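
The loop can be reproduced with a very small justification recorder. The class `NaiveJTMS` and the labels R1, R2, R3 (standing for the three universally quantified rules) are hypothetical names for this sketch; it is not the design of any particular JTMS. The `recompute` method shows one possible remedy, rebuilding beliefs from the surviving premises, and is offered only as a contrast.

```python
class NaiveJTMS:
    """Toy justification recorder over atomic proposition names."""

    def __init__(self):
        self.premises = set()
        self.believed = set()
        self.justifications = {}  # consequent -> list of antecedent sets

    def add_premise(self, p):
        self.premises.add(p)
        self.believed.add(p)

    def justify(self, consequent, antecedents):
        self.justifications.setdefault(consequent, []).append(frozenset(antecedents))
        if set(antecedents) <= self.believed:
            self.believed.add(consequent)

    def retract_naive(self, p):
        """Drop p, then drop any consequent with no justification whose
        antecedents are all still believed."""
        self.premises.discard(p)
        self.believed.discard(p)
        changed = True
        while changed:
            changed = False
            for node in sorted(self.believed - self.premises):
                justs = self.justifications.get(node, [])
                if not any(j <= self.believed for j in justs):
                    self.believed.discard(node)
                    changed = True

    def recompute(self):
        """Rebuild beliefs from the current premises only (well-founded)."""
        believed = set(self.premises)
        changed = True
        while changed:
            changed = False
            for node, justs in self.justifications.items():
                if node not in believed and any(j <= believed for j in justs):
                    believed.add(node)
                    changed = True
        return believed


tms = NaiveJTMS()
for premise in ["Man(Fred)", "R1", "R2", "R3"]:
    tms.add_premise(premise)
tms.justify("Person(Fred)", ["Man(Fred)", "R1"])
tms.justify("Human(Fred)", ["Person(Fred)", "R2"])
tms.justify("Person(Fred)", ["Human(Fred)", "R3"])

tms.retract_naive("Man(Fred)")
still_believed = {"Person(Fred)", "Human(Fred)"} & tms.believed
well_founded = tms.recompute()
print(sorted(still_believed), sorted(well_founded))
```

After the naive retraction, Person(Fred) and Human(Fred) keep each other believed through the loop, although nothing grounds them any longer. Recomputing from the premises leaves only R1, R2, and R3.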

A Positive: Justifications are useful for explanation.

Assumption-Based TMSs

ATMSs record with every consequent the original premises (assumptions) on which it depends. This eliminates the possibility of loops and the necessity of tracing through justifications to find the premises. Unfortunately, assumptions are not as useful as justifications for explanation.
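
A minimal sketch of the idea follows, assuming each proposition carries a label that is a set of "environments" (sets of assumptions). The class `TinyATMS` and the rule names R1, R2, R3 are hypothetical. Unlike a real ATMS, this sketch does not propagate label changes to existing consequents and does not handle contradictory environments.

```python
from itertools import product


class TinyATMS:
    """Toy assumption recorder: proposition -> set of environments."""

    def __init__(self):
        self.labels = {}

    def assume(self, p):
        # An assumption depends only on itself.
        self.labels.setdefault(p, set()).add(frozenset([p]))

    def justify(self, consequent, antecedents):
        # Combine one environment from each antecedent's label.
        label_sets = [self.labels.get(a, set()) for a in antecedents]
        new_envs = {frozenset().union(*combo) for combo in product(*label_sets)}
        envs = self.labels.setdefault(consequent, set())
        envs.update(new_envs)
        # Keep only minimal environments (drop proper supersets).
        self.labels[consequent] = {e for e in envs if not any(o < e for o in envs)}

    def holds(self, p, current_assumptions):
        # p is believed if some environment is fully assumed.
        return any(env <= set(current_assumptions) for env in self.labels.get(p, set()))


atms = TinyATMS()
for a in ["Man(Fred)", "R1", "R2", "R3"]:
    atms.assume(a)
atms.justify("Person(Fred)", ["Man(Fred)", "R1"])
atms.justify("Human(Fred)", ["Person(Fred)", "R2"])
atms.justify("Person(Fred)", ["Human(Fred)", "R3"])  # the looping justification

person_label = atms.labels["Person(Fred)"]
with_man = atms.holds("Person(Fred)", {"Man(Fred)", "R1", "R2", "R3"})
without_man = atms.holds("Person(Fred)", {"R1", "R2", "R3"})
print(person_label, with_man, without_man)
```

Because every environment for Person(Fred) contains Man(Fred), dropping that assumption removes the belief directly, with no tracing through the loop. The label says nothing, however, about which rule applications produced the conclusion, which is why it is less useful for explanation.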

A technique for keeping track of assumptions was borrowed from a technique used for Fitch-Style proofs in the Logic of Relevant Implication (see Chapter 3.2).

Modal Logic

[Additional reference: E. Davis, Representations of Commonsense Knowledge, San Mateo, CA: Morgan Kaufmann, 1990.]

Modal logics add sentential operators to the syntax and semantics of propositional and predicate logic.

In these notes, I'll use L and M as the two modal operators.

Syntactic Extensions:

Propositional Logic: If P is a well-formed proposition, so are L(P) and M(P).

Predicate Logic: If P is a well-formed formula, so are L(P) and M(P).

I will assume we are discussing modal predicate logic, but note that not too much will hang on that.

M(P) is often taken as an abbreviation of ¬L(¬P). Otherwise, the equivalence of these two will be incorporated in the logic some other way.

Common intensional semantics of L(P) and M(P) (choose one):

L(P)                       M(P)
Necessarily [P].           Possibly [P].
I know that [P].           I believe that [P] might be true.
I believe that [P].        I believe that [P] might be true.
[P] will always be true.   [P] will be true at some time.

Modal logic is often used when:
  • Operators do not commute with quantifiers,
    e.g. L(Ex Spy(x)) vs. Ex L(Spy(x)).
  • Operators are referentially opaque,
    e.g. ¬L(Scott = Author-of(Waverley)) vs. ¬L(Scott = Scott).
  • There is no need to quantify over sentences
    (as would be needed for, e.g., Ax(Says(Bill, x) => L(x))).
(If the first two don't hold, something simpler might be used. If the third doesn't hold, something more complicated might be needed.)

Extensional semantics of modal logics requires the notion of possible worlds connected by an "accessibility" relation.

L(P) is true in world w if and only if P is true in every world accessible from w.

M(P) is true in world w if and only if there is some world accessible from w in which P is true.

Different modal axioms are associated with different properties of the accessibility relation.

L(P) => P is valid if accessibility is reflexive. This is reasonable if L is knowledge, but not if L is belief.

L(P) => L(L(P)) is valid if accessibility is transitive. This is reasonable if L is temporal, but what if knowledge? belief?
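
The possible-worlds truth conditions can be checked mechanically on a small finite model. The sketch below covers the propositional case only. The tuple encoding of formulas, the function `holds`, and the two-world model are all assumptions chosen for illustration.

```python
def holds(f, w, access, val):
    """Truth of formula f at world w.

    f is a nested tuple such as ("L", ("atom", "P"));
    access maps each world to the set of worlds accessible from it;
    val maps each world to the set of atoms true there.
    """
    op = f[0]
    if op == "atom":
        return f[1] in val[w]
    if op == "not":
        return not holds(f[1], w, access, val)
    if op == "and":
        return holds(f[1], w, access, val) and holds(f[2], w, access, val)
    if op == "implies":
        return (not holds(f[1], w, access, val)) or holds(f[2], w, access, val)
    if op == "L":  # true in every accessible world
        return all(holds(f[1], v, access, val) for v in access[w])
    if op == "M":  # true in some accessible world
        return any(holds(f[1], v, access, val) for v in access[w])
    raise ValueError(f"unknown operator: {op!r}")


P = ("atom", "P")
val = {"w1": set(), "w2": {"P"}}
axiom_T = ("implies", ("L", P), P)  # L(P) => P

# w1 does not access itself, so accessibility is not reflexive.
access = {"w1": {"w2"}, "w2": {"w2"}}
t_at_w1 = holds(axiom_T, "w1", access, val)

# Same worlds and valuation, but with a reflexive accessibility relation.
reflexive = {"w1": {"w1", "w2"}, "w2": {"w2"}}
t_reflexive = all(holds(axiom_T, w, reflexive, val) for w in reflexive)

# M(P) agrees with not-L(not-P) at every world of the first model.
dual_ok = all(
    holds(("M", P), w, access, val) == holds(("not", ("L", ("not", P))), w, access, val)
    for w in access
)
print(t_at_w1, t_reflexive, dual_ok)
```

In the first model L(P) is true at w1 while P is false there, so L(P) => P fails. With the reflexive relation it holds at both worlds. A single model cannot establish validity, of course; it can only supply a counterexample.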

Exercises:
  • (L(A) & L(A => B)) => L(B) is valid
  • (L(A) & (A => B)) => L(B) is not valid

SNePS

Some SNePS References

Conceptual Dependency

The following brief description of CD is taken from Stuart C. Shapiro and William J. Rapaport, Models and minds: knowledge representation for natural-language competence. In R. Cummins & J. Pollock, Eds. Philosophy and AI: Essays at the Interface. MIT Press, Cambridge, MA, 1991, 215--259.

Conceptual Dependency theory (Schank and Rieger 1974, Schank 1975, Schank and Riesbeck 1981; cf. Hardt 1987) uses a knowledge-representation formalism consisting of sentences, called "conceptualizations", which assert the occurrence of events or states, and six types of terms: PPs---"real-world objects", ACTs---"real-world actions", PAs---"attributes of objects", AAs---"attributes of actions", Ts---"times", and LOCs---"locations". (The glosses of these types of terms are quoted from Schank & Rieger 1974: 378-379.) The set of ACTs is closed and consists of the well-known primitive ACTs PTRANS (transfer of physical location), ATRANS (transfer of an abstract relationship), etc. The syntax of an event conceptualization is a structure with six slots (or arguments), some of which are optional: actor, action, object, source, destination, and instrument. A stative conceptualization is a structure with an object, a state, and a value. Only certain types of terms can fill certain slots. For example, only a PP can be an actor, and only an ACT can be an action. Interestingly, conceptualizations, themselves, can be terms, although they are not one of the six official terms. For example, only a conceptualization can fill the instrument slot, and a conceptualization can fill the object slot if MLOC (mental location) fills the act slot. A "causation" is another kind of conceptualization, consisting only of two slots, one containing a causing conceptualization and the other containing a caused conceptualization. Although, from the glosses of PP and ACT, it would seem that the intended domain of interpretation is the real world, the domain also must contain theoretically postulated objects such as: the "conscious processor" of people, in which conceptualizations are located; conditional events; and even negated events, which haven't happened.

Frames

Proposed by Marvin Minsky

Motivated by vision problems

Answer to a challenge by Hubert Dreyfus (intentionally?)

Along with scripts, first structured KR theory

Based, at least in part, on Simula 67 classes---part of the development of OOP

Basic ideas:
Slots and fillers
Hierarchy
Inheritance
Defaults
Procedural Attachments---IfNeeded & IfAdded
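
The basic ideas can be illustrated with a toy frame class. Everything here is a hypothetical sketch, not the design of any actual frame system: the `Frame` class, the slot names, and the individual "Snowy" are invented for the example.

```python
_MISSING = object()  # sentinel, so that None can be a legitimate filler


class Frame:
    """Toy frame: slots with fillers, one parent, and attached procedures."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.slots = {}      # slot -> filler (a filler on a class acts as a default)
        self.if_needed = {}  # slot -> procedure(frame) run when no filler is found
        self.if_added = {}   # slot -> procedure(frame, value) run after a filler is stored

    def _find(self, table_name, slot):
        # Walk up the hierarchy: this is the inheritance step.
        frame = self
        while frame is not None:
            table = getattr(frame, table_name)
            if slot in table:
                return table[slot]
            frame = frame.parent
        return _MISSING

    def get(self, slot):
        value = self._find("slots", slot)
        if value is not _MISSING:
            return value
        daemon = self._find("if_needed", slot)
        if daemon is not _MISSING:
            return daemon(self)
        raise KeyError(slot)

    def put(self, slot, value):
        self.slots[slot] = value
        daemon = self._find("if_added", slot)
        if daemon is not _MISSING:
            daemon(self, value)


log = []
elephant = Frame("Elephant")
elephant.slots["colour"] = "grey"  # default filler for all elephants
elephant.if_needed["description"] = lambda f: f"a {f.get('colour')} elephant named {f.name}"
elephant.if_added["colour"] = lambda f, v: log.append((f.name, v))

clyde = Frame("Clyde", parent=elephant)
snowy = Frame("Snowy", parent=elephant)  # hypothetical albino individual
snowy.put("colour", "white")             # overrides the default, triggers IfAdded

print(clyde.get("colour"), snowy.get("colour"), clyde.get("description"), log)
```

Clyde inherits the default colour, Snowy overrides it locally, and the description is computed only when asked for.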

Multiple Inheritance

Possibilities: E.g. Dogs are Domestic Mammals vs. Wild Birds

Problems: What to do when there's a conflict, especially if negated links are allowed.

Nixon Diamond:

                  Pacifist
                     /\
                    /  \
                   /    \NOT
                  /      \
                 /        \
              Quaker  Republican
                 \        /
                  \      /
                   \    /
                    \  /
                     \/
                    Nixon
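
A small sketch shows how the conflict arises when an inheritance reasoner follows every path. The link encoding and the function names are assumptions made for this illustration, and reporting "conflict" is only one of several policies a system might adopt.

```python
POS, NEG = True, False

# node -> list of (parent, polarity); NEG encodes an "is NOT a" link.
links = {
    "Nixon": [("Quaker", POS), ("Republican", POS)],
    "Quaker": [("Pacifist", POS)],
    "Republican": [("Pacifist", NEG)],
}


def conclusions(node, target, links):
    """Polarities with which target is reached from node.

    Positive links are followed transitively; a link into target
    contributes its own polarity.
    """
    found = set()
    stack, seen = [node], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        for parent, polarity in links.get(current, []):
            if parent == target:
                found.add(polarity)
            elif polarity is POS:
                stack.append(parent)
    return found


def is_a(node, target, links):
    found = conclusions(node, target, links)
    if found == {POS}:
        return "yes"
    if found == {NEG}:
        return "no"
    if not found:
        return "unknown"
    return "conflict"


print(is_a("Nixon", "Pacifist", links))
print(is_a("Nixon", "Quaker", links))
```

The query about Nixon and Pacifist reaches both a positive and a negative conclusion, and nothing in the network itself says which path should win.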

Frame systems usually do not allow negative links. See Brachman's comment that elephants are grey by default, but albino elephants are allowed, so why not Clyde, the non-elephant elephant?

What is a hierarchy? If we have classes and individuals and inheritance, we must distinguish class properties from individual properties. An abstraction hierarchy doesn't have classes, but abstract individuals.

See text for examples of KEE.

The KL-ONE Family

TBox: Definitional, structural information

ABox: Assertional

TBox
Hierarchy of Concepts and Relations
Generic Concepts
Defined Concepts contain Necessary and Sufficient conditions
Primitive Concepts contain only Necessary conditions
Necessary: All x [C(x) -> P(x)]
Sufficient: All x [P(x) -> C(x)]

Individual Concepts are classes with a single element. Note: another way to have an inheritance hierarchy with only one relation.

A Generic Concept is defined as a subconcept of others with certain roles.

See examples in text.

ABox
Concepts provide Unary Predicates
Roles provide Binary Relations

Defining an arch (text p. 275)

(cdef Arch
      (and (atleast 1 lintel)
           (atmost 1 lintel)
           (all lintel Block)
           (atleast 2 upright)
           (atmost 2 upright)
           (all upright Block)
           (sd NotTouch (upright objects))
           (sd Support (lintel supported) (upright supporter))))
Assume an Arch A1 with lintel B1 and uprights B2 and B3.
From semantics, p. 278 of text:
R[(sd s (pc1 ps1) ... (pcn psn))] =
   {x in D: Ey[y in R[s] &
               Az1, ..., zn[(<x, z1> in R[pc1] <-> <y, z1> in R[ps1])
                            & ... &
                            (<x, zn> in R[pcn] <-> <y, zn> in R[psn])]]}
Instantiating to the Arch, we get
R[(sd Support (lintel supported) (upright supporter))] =
   {x in D: Ey[y in R[Support] &
               Az1, z2[(<x, z1> in R[lintel] <-> <y, z1> in R[supported])
                            &
                       (<x, z2> in R[upright] <-> <y, z2> in R[supporter])]]}
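
The instantiated definition can be evaluated over a small finite model. In this sketch the interpretation R is a Python dictionary from role names to sets of pairs, and S1 is a hypothetical individual standing for the Support relationship among the blocks of A1. These names and the function `sd_extension` are assumptions for illustration only.

```python
def sd_extension(domain, s_extension, pairs, R):
    """Extension of (sd s (pc1 ps1) ... (pcn psn)) in a finite model.

    domain:      all individuals
    s_extension: R[s], the individuals in the concept s
    pairs:       list of (pc, ps) role-name pairs
    R:           role name -> set of (individual, filler) pairs
    """
    result = set()
    for x in domain:
        for y in s_extension:
            # For every pair, x's pc-fillers must be exactly y's ps-fillers.
            if all(((x, z) in R[pc]) == ((y, z) in R[ps])
                   for pc, ps in pairs
                   for z in domain):
                result.add(x)
                break
    return result


domain = {"A1", "B1", "B2", "B3", "S1"}
R = {
    "lintel": {("A1", "B1")},
    "upright": {("A1", "B2"), ("A1", "B3")},
    "supported": {("S1", "B1")},
    "supporter": {("S1", "B2"), ("S1", "B3")},
}
pairs = [("lintel", "supported"), ("upright", "supporter")]

arches = sd_extension(domain, {"S1"}, pairs, R)

# If B3 no longer supports anything, A1 no longer satisfies the description.
R_broken = dict(R, supporter={("S1", "B2")})
arches_broken = sd_extension(domain, {"S1"}, pairs, R_broken)
print(arches, arches_broken)
```

A1 satisfies the structural description because the lintel of A1 is exactly what S1 has as supported, and the uprights of A1 are exactly what S1 has as supporters.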

KIF: Knowledge Interchange Format

For all the details, see the KIF Home Page.

"Knowledge Interchange Format (KIF) is a computer-oriented language for the interchange of knowledge among disparate programs. It has declarative semantics (i.e. the meaning of expressions in the representation can be understood without appeal to an interpreter for manipulating those expressions); it is logically comprehensive (i.e. it provides for the expression of arbitrary sentences in the first-order predicate calculus); it provides for the representation of knowledge about the representation of knowledge; it provides for the representation of nonmonotonic reasoning rules; and it provides for the definition of objects, functions, and relations." [Abstract of the Reference Manual]

The Introduction to the Reference Manual says what KIF is and isn't.

A KIF knowledge base is a finite set (not sequence) of forms, each of which is either a sentence, a definition, or a rule.

KIF sentences are very similar to FOPC wffs.

KIF definitions can be either complete, giving necessary and sufficient conditions, or partial, giving only necessary conditions. Each defined constant gets a defining axiom, which is an analytic truth. Discuss this.

KIF rules may be nonmonotonic, but need not be. Rules are not sentences. See the discussion of this in the Reference Manual.

See the interesting discussion of metaknowledge.

Ontologies

You can browse the ontologies stored under the Stanford Knowledge Systems Laboratory Network Services.


Stuart C. Shapiro <shapiro@cs.buffalo.edu>