The Department of Computer Science & Engineering | STUART C. SHAPIRO: CSE 305
The standard definition of data type is (see Sebesta, p. 248):
"The word object is often associated with the value of a variable and the space it occupies. In this book, however, we reserve object exclusively for instances of user-defined abstract data types, rather than also using it for the values of variables of predefined types. In object-oriented languages, every instance of every class, whether predefined or user-defined, is called an object." [Sebesta, p. 249 (italics in the original)]
"Ruby is a completely object-oriented language. Every value is an object, even simple numeric literals" [Flanagan & Matsumoto, The Ruby Programming Language, 2008, p. 2 (italics in the original)]. This is immediately followed by an example of numeric literals having methods:
irb(main):001:0> 1.class => Fixnum
The major steps in the evolution of data types were:
The rest of this chapter is a survey of data types and their design issues.
Various coding schemes are possible. Most languages now use binary numbers for positive integers, and two's complement for negative integers.
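As a small illustration of two's complement, here is a Python sketch that prints the bit pattern of an integer in a fixed width. The function name twos_complement and the default width of 8 bits are choices made for this example only; they are not from Sebesta or from any particular machine.

```python
def twos_complement(value: int, bits: int = 8) -> str:
    """Return the two's-complement bit pattern of value in the given width."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError("value does not fit in the given width")
    # Masking maps a negative value n to 2**bits + n, which is its
    # two's-complement encoding; non-negative values are unchanged.
    return format(value & ((1 << bits) - 1), "0{}b".format(bits))


print(twos_complement(5))    # 00000101
print(twos_complement(-5))   # 11111011
```

Note that the pattern for -5 is the pattern for 5 with every bit flipped, plus one.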
Bignums are integers with unlimited length. For example, in Ruby,
irb(main):001:0> def fact(n); if n<=1 then 1 else n*fact(n-1) end; end
nil
irb(main):002:0> fact(4)
24
irb(main):003:0> fact(100)
93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000
irb(main):004:0> fact(100).class
Bignum
Common Lisp also has bignums.
Represented by binary coded decimal (BCD). Each digit is represented by its binary equivalent. For example, 35 in BCD is 0011 0101.
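The BCD encoding can be sketched in a few lines of Python. The helper name to_bcd is hypothetical, and the sketch handles only non-negative integers; real decimal types also store a sign and a scale.

```python
def to_bcd(n: int) -> str:
    """Encode a non-negative integer as BCD, one 4-bit group per decimal digit."""
    if n < 0:
        raise ValueError("this sketch handles only non-negative integers")
    return " ".join(format(int(digit), "04b") for digit in str(n))


print(to_bcd(35))   # 0011 0101
```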
Usually represented using IEEE Floating-Point Standard: sign bit, exponent, fraction ("mantissa"). For more details of number representations, see my CSE115 notes on Java arithmetic.
Usually several types, differing on precision (number of bits used for fractional part).
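To make the sign/exponent/fraction layout concrete, here is a Python sketch that pulls the three fields out of a 64-bit IEEE double using the standard struct module. The function name float_fields is made up for this example, and the field widths (1, 11, and 52 bits) are those of the double-precision format only.

```python
import struct


def float_fields(x: float):
    """Split a 64-bit IEEE 754 double into (sign, biased exponent, fraction)."""
    # Reinterpret the 8 bytes of the double as one unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF       # 11 exponent bits, biased by 1023
    fraction = bits & ((1 << 52) - 1)     # 52 fraction bits
    return sign, exponent, fraction


print(float_fields(1.0))    # (0, 1023, 0)
print(float_fields(-2.0))   # (1, 1024, 0)
```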
cl-user(12): (/ 36 10)
18/5
cl-user(13): (type-of (/ 36 10))
ratio
cl-user(14): (+ 18/5 2)
28/5
cl-user(15): (sqrt 4)
2.0
cl-user(16): (sqrt -1)
#C(0.0 1.0)
cl-user(17): (type-of (sqrt -1))
complex
Operations on numbers will be discussed in Chapter 7.
Only some programming languages (Java and perhaps Haskell) have an actual Boolean type with two special values, True and False. Some languages have Boolean values, but allow other values to count as them. C uses the int 0 for False, and any other int for True. Lisp uses nil for False and any other value for True.
One test for a Boolean type:
define lessThan(x,y) return x<y;
but what are the possible values of a logical expression?
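Applying that test to Python, as one more data point: a comparison does return a value of a distinct bool type, but bool is a subclass of int, and other values still count as true or false in conditions. The function name less_than simply mirrors the pseudocode above.

```python
def less_than(x, y):
    return x < y


result = less_than(3, 5)
print(type(result).__name__)    # bool
print(isinstance(True, int))    # True: bool is a subclass of int
# Other values are allowed to count as Booleans when tested:
print(bool(0), bool(""), bool([]), bool(7))   # False False False True
```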
Often represented in ASCII, a 7-bit code (usually stored one character per 8-bit byte) that can encode 128 different characters.
There is a move, started by Java, to use Unicode, which originally used 16 bits per character and can represent characters from most of the languages in the world.
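The following Python sketch shows how much space a few characters take in two common Unicode encodings. The helper name encoded_sizes and the sample characters are chosen just for illustration; the last one lies outside the range that fits in a single 16-bit unit.

```python
def encoded_sizes(ch: str):
    """Return (code point, bytes in UTF-8, bytes in UTF-16) for one character."""
    return ord(ch), len(ch.encode("utf-8")), len(ch.encode("utf-16-be"))


# "A", the euro sign, and a character beyond U+FFFF
for ch in ("A", "\u20ac", "\U0001F600"):
    code, utf8_len, utf16_len = encoded_sizes(ch)
    print(hex(code), utf8_len, utf16_len)
# 0x41 1 2
# 0x20ac 3 2
# 0x1f600 4 4
```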
Many languages have a data type named something like string; others use arrays of characters. However, strings are usually implemented as arrays of characters.
The length of a string may be stored with the value or the variable, or may be indicated by a sentinel. For example, C and C++ terminate strings with the null character, '\0'.
String concatenation is such a common operation that several languages include an operator for it, such as Java's overloaded +. Java uses concatenation to construct output lines. Other languages use format strings with interpolated control characters.
Some other common operations are: string length; substring extraction; character at position; string comparison; and substring search.
A major issue is whether string operations are destructive (change the argument string) or non-destructive (return a string like the argument string, except...). In Java, Strings are immutable (have no destructive operations), whereas StringBuffers are like Strings, but are mutable:
bsh % str1 = "This is a string.";
bsh % str2 = str1.replace('i', 'y');
bsh % print(str2);
Thys ys a stryng.
bsh % print(str1);
This is a string.
bsh % str3 = new StringBuffer("This is a string.");
bsh % print(str3);
This is a string.
bsh % str4 = str3.replace(8,9,"another");
bsh % print(str4);
This is another string.
bsh % print(str3);
This is another string.
Common Lisp has only mutable strings, but both destructive and non-destructive operations:
cl-user(1): (setf str1 "This is a string.")
"This is a string."
cl-user(2): (setf str2 (substitute #\y #\i str1))
"Thys ys a stryng."
cl-user(3): str2
"Thys ys a stryng."
cl-user(4): str1
"This is a string."
cl-user(5): (setf str2 (nsubstitute #\y #\i str1))
"Thys ys a stryng."
cl-user(6): str2
"Thys ys a stryng."
cl-user(7): str1
"Thys ys a stryng."
A string's length may be static, as is Java's String, dynamic, as is Java's StringBuffer, or limited dynamic, as Sebesta says C's are [p. 257]. However, the program
#include <stdio.h>
#include <string.h>
#define true 1
int main() {
  char str[10];
  int i = 0;
  /* Deliberately keeps writing past the end of str:
     C does no range checking on arrays. */
  while (true) {
    str[i++] = 'a';
    str[i] = '\0';
    printf("str = %s; Its length is %d; i = %d\n", str, (int)strlen(str), i);
  }
  return 0;
}
----------------------------------------------
<pollux:Test:1:27> gcc -Wall -o dstrlen dstrlen.c
<pollux:Test:1:28> ./dstrlen
str = a; Its length is 1; i = 1
str = aa; Its length is 2; i = 2
str = aaa; Its length is 3; i = 3
str = aaaa; Its length is 4; i = 4
str = aaaaa; Its length is 5; i = 5
str = aaaaaa; Its length is 6; i = 6
str = aaaaaaa; Its length is 7; i = 7
str = aaaaaaaa; Its length is 8; i = 8
str = aaaaaaaaa; Its length is 9; i = 9
str = aaaaaaaaaa; Its length is 10; i = 10
str = aaaaaaaaaaa; Its length is 11; i = 11
str = aaaaaaaaaaaa; Its length is 12; i = 12
was an infinite loop when I ran it on pollux. When I killed it, str had a length of 2,089. Of course, this is, again, a result of C not doing range checking on arrays, and it also depends on pollux's operating system. On timberlake, the string never exceeds a length of 12, and i is incremented modulo 12.
Pattern matching is a common operation on strings that is a very involved subject. A large part of Perl is devoted to pattern matching. Java has an extensive pattern matching capability in the package java.util.regex. C++ also has a pattern matching library. (X)Emacs supports regular expression pattern matching for searching and replacing strings. For example, the regular expression <[^>]*> will match HTML tags.
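The same regular expression can be tried out with Python's standard re module. The sample string and the name TAG are made up for this sketch; the pattern itself is the one given above.

```python
import re

# "<", then any run of characters other than ">", then ">"
TAG = re.compile(r"<[^>]*>")

html = "<p>Hello, <b>world</b>!</p>"
print(TAG.findall(html))   # ['<p>', '<b>', '</b>', '</p>']
print(TAG.sub("", html))   # Hello, world!
```

The second call shows searching and replacing: every matched tag is replaced by the empty string.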
#include <stdio.h>
#define MperK 0.62137
#define KperM 1.60935
typedef float kilometer;
typedef float mile;
kilometer MtoK(mile x) {
  return x * KperM;
}
mile KtoM(kilometer x) {
  return x * MperK;
}
int main() {
  mile m = 100;
  kilometer k = 100;
  printf("%3.0f miles = %5.2f kilometers.\n", m, MtoK(m));
  printf("%3.0f kph = %5.2f mph.\n", k, KtoM(k));
  return 0;
}
----------------------------------------------------------------
<timberlake:Test:1:86> gcc -Wall -o conversion conversion.c
<timberlake:Test:1:87> ./conversion
100 miles = 160.93 kilometers.
100 kph = 62.14 mph.
In C, the typedef identifier is a synonym for its parent type. However, that is not true in all languages with user-defined types. If the new type identifier is not a synonym, the question is whether name type compatibility or structure type compatibility is used.
In name type compatibility, whether two expressions have compatible types depends on the type identifier, even if the parent types are the same. In structure type compatibility, it depends on the parent types. For example, in the Ada-like type declarations
type array1type is array(1..10) of Integer;
type array2type is array(11..20) of Integer;
A: array1type;
B: array2type;
A and B do not have compatible types under name type compatibility, but do under structure type compatibility.
Some languages use name type compatibility, some use structure type compatibility, and some have facilities for both.
If a variable is declared with a type expression, such as
A: array(1..10) of Integer;
the variable is considered to have an anonymous type.
char. The integer types are also
considered ordinal types, although the signed integers also have
negatives. The important thing is that, except for the minimal value,
every value of an ordinal type is the successor of a value of its
type, and, except for the maximal value, every value of an ordinal
type is the predecessor of a value of its type. So one should be able
to use any ordinal type as an array subscript, or as a for loop
index.
#include <stdio.h>
enum months {Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec};
int monLength[12] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
char* monName[12] = {"January", "February", "March", "April",
                     "May", "June", "July", "August",
                     "September", "October", "November", "December"};
int main() {
  enum months m;
  for (m = Jan; m <= Dec; m++) {
    printf("%s has %d days.\n", monName[m], monLength[m]);
  }
  return 0;
}
---------------------------------------------------------------
<timberlake:Test:1:89> gcc -Wall -o enumtest enumtest.c
<timberlake:Test:1:90> ./enumtest
January has 31 days.
February has 28 days.
March has 31 days.
April has 30 days.
May has 31 days.
June has 30 days.
July has 31 days.
August has 31 days.
September has 30 days.
October has 31 days.
November has 30 days.
December has 31 days.
int and its values are treated like int values.
In fact, let's try to assign a days value to a months variable in C:
#include <stdio.h>
enum months {Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec};
enum days {Sun, Mon, Tue, Wed, Thur, Fri, Sat};
int monLength[12] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
char* monName[12] = {"January", "February", "March", "April",
"May", "June", "July", "August",
"September", "October", "November", "December"};
int main() {
enum months m;
enum days d = Thur;
m = d;
printf("It ran.\n");
return 0;
}
--------------------------------------------------------------------------
<timberlake:Test:1:123> gcc -Wall -o enumtest2 enumtest2.c
<timberlake:Test:1:124> ./enumtest2
It ran.
C++, though, is more careful:
#include <iostream>
#include <string>
using namespace std;
enum months {Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec};
enum days {Sun, Mon, Tue, Wed, Thur, Fri, Sat};
int main() {
enum months m;
enum days d = Thur;
m = d;
printf("It ran.\n");
return 0;
}
----------------------------------------------------------------
<timberlake:Test:1:93> g++ -Wall -o enumtest enumtest.cpp
enumtest.cpp: In function 'int main()':
enumtest.cpp:19: error: cannot convert 'days' to 'months' in assignment
Here is an example:
public class Months {
    public enum Month {January, February, March, April, May, June, July,
                       August, September, October, November, December}
    public static int[] monLength = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    public static void main(String[] args) {
        Month f = Month.February;
        System.out.println(f + " is the shortest month. A full list of months and lengths is:");
        for (Month m : Month.values()) {
            System.out.println(m + " has " + monLength[m.ordinal()] + " days.");
        }
    } // end of main()
} // Months
-------------------------------------------------
<timberlake:Test:1:96> javac Months.java
<timberlake:Test:1:97> java Months
February is the shortest month. A full list of months and lengths is:
January has 31 days.
February has 28 days.
March has 31 days.
April has 30 days.
May has 31 days.
June has 30 days.
July has 31 days.
August has 31 days.
September has 30 days.
October has 31 days.
November has 30 days.
December has 31 days.
type Days is (Mon, Tue, Wed, Thu, Fri, Sat, Sun);
subtype WeekDays is Days range Mon..Fri;
subtype WeekendDays is Days range Sat..Sun;
Day1: Days;
Day2: WeekDays;
Day3: WeekendDays;
Day1 := Day2 and Day1 := Day3 are legal.
Day2 := Day3 and Day3 := Day2 are illegal.
Day2 := Day1 or Day3 := Day1 are only legal if Day1 has a proper value at run-time.
Subrange types are particularly useful for the indexes of arrays, such as
subtype arrayIndex is Integer range 1..100;
squares: array(arrayIndex) of Integer;
for i in arrayIndex loop
   squares(i) := i*i;
end loop;
monthLength(january, 31).
monthLength(february, 28).
monthLength(march, 31).
monthLength(april, 30).
monthLength(may, 31).
monthLength(june, 30).
monthLength(july, 31).
monthLength(august, 31).
monthLength(september, 30).
monthLength(october, 31).
monthLength(november, 30).
monthLength(december, 31).
:- monthLength(M, L), format("~a has ~d days.\n", [M,L]), fail; halt.
----------------------------------------------------------
<timberlake:Test:1:98> prolog -l months.pro
% compiling /projects/shapiro/CSE305/Test/months.pro...
january has 31 days.
february has 28 days.
march has 31 days.
april has 30 days.
may has 31 days.
june has 30 days.
july has 31 days.
august has 31 days.
september has 30 days.
october has 31 days.
november has 30 days.
december has 31 days.
% compiled /projects/shapiro/CSE305/Test/months.pro in module user, 0 msec 2752 bytes
cl-user(1): (type-of '3)
fixnum
cl-user(2): (type-of '3.7)
single-float
cl-user(3): (type-of 'January)
symbol
cl-user(4):(setf monLength (make-hash-table))
#<eql hash-table with 0 entries @ #x4ec5882>
cl-user(5): (mapc #'(lambda (key value) (setf (gethash key monLength) value))
                  '(January February March April May June July August September October November December )
                  '(31 28 31 30 31 30 31 31 30 31 30 31))
(January February March April May June July August September October ...)
cl-user(6): (loop for m = (progn (format t "Enter a month or `bye': ") (read))
                  if (eq m 'bye) return 'Goodbye
                  do (format t "~A has ~D days.~%" m (gethash m monLength)))
Enter a month or `bye': March
March has 31 days.
Enter a month or `bye': June
June has 30 days.
Enter a month or `bye': bye
Goodbye
cl-user(27): (setf Fibonacci 11235)
11235
cl-user(28): (defun Fibonacci (n)
               (if (< n 3) 1
                 (+ (Fibonacci (- n 1)) (Fibonacci (- n 2)))))
Fibonacci
cl-user(29): (symbol-name 'Fibonacci)
"Fibonacci"
cl-user(30): (symbol-value 'Fibonacci)
11235
cl-user(31): Fibonacci
11235
cl-user(32): (symbol-function 'Fibonacci)
#<Interpreted Function Fibonacci>
cl-user(33): (type-of (symbol-function 'Fibonacci))
function
cl-user(34): (Fibonacci 10)
55
The ability to compute subscripts makes a subscripted array like a
variable name that can be computed. More precisely, a subscripted
array is an expression evaluated for its l-value. Compare
these two C subroutines for the Fibonacci sequence:
#include <stdio.h>
int fibonacci(int n) {
if (n<=2) return 1;
int current = 1,
oneBack = 1,
twoBack = 1,
i;
for (i=3; i<=n; i++) {
twoBack = oneBack;
oneBack = current;
current = oneBack + twoBack;
}
return current;
}
int Fibonacci (int n) {
if (n<=2) return 1;
int num[3] = {1,1},
current = 1,
i;
for (i=3; i<=n; i++) {
current = (current + 1) % 3;
num[current] = num[(current + 1) % 3] + num[(current + 2) % 3];
}
return num[current];
}
int main() {
int i;
for (i=1; i<=8; i++)
printf("fibonacci(%d) = %d\n", i, fibonacci(i));
printf("\n");
for (i=1; i<=8; i++)
printf("Fibonacci(%d) = %d\n", i, Fibonacci(i));
return 0;
}
------------------------------------------------
<timberlake:Test:1:29> gcc -Wall -o indexdemo indexdemo.c
<timberlake:Test:1:30> ./indexdemo
fibonacci(1) = 1
fibonacci(2) = 1
fibonacci(3) = 2
fibonacci(4) = 3
fibonacci(5) = 5
fibonacci(6) = 8
fibonacci(7) = 13
fibonacci(8) = 21
Fibonacci(1) = 1
Fibonacci(2) = 1
Fibonacci(3) = 2
Fibonacci(4) = 3
Fibonacci(5) = 5
Fibonacci(6) = 8
Fibonacci(7) = 13
Fibonacci(8) = 21
An array can be thought of as a mapping, or even a function. For example, the C array monLength, above, is a mapping from a month's ordinal, 0..11, to its length. This is clearer in the Java expression, above, monLength[m.ordinal()]. The Common Lisp use of monLength is more directly represented as a mapping. An array might also be thought of as a function from a month's ordinal to its length.
Most current programming languages use parentheses around the arguments of a function, e.g. f(x), and brackets around the subscripts of an array, e.g. a[i], but Fortran and Ada use parentheses for arrays also. Thinking of an array as a function justifies this, but most programmers find it confusing.
Common Lisp, as usual, uses a more functional notation:
Some programming languages, including Java and Common Lisp, do
range-checking. That is, they give a run-time error if the program
tries to use an out-of-range subscript. Others, including C, Perl,
and Fortran, do not. A programming language that does range checking
is clearly more reliable.
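Python is another language that does range checking. The sketch below shows the run-time error being raised and caught; the helper name safe_get and the sample list are invented for this example. One caveat is worth noting: Python treats negative subscripts as counting from the end, so they are in range as long as they do not run off the front.

```python
def safe_get(seq, index):
    """Return seq[index], or None if the subscript is out of range."""
    try:
        return seq[index]
    except IndexError:
        # Raised by Python's run-time range check.
        return None


squares = [0, 1, 4, 9]
print(safe_get(squares, 2))    # 4
print(safe_get(squares, 4))    # None: 4 is past the last valid subscript, 3
print(safe_get(squares, -1))   # 9: negative subscripts count from the end
```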
Some programming languages have a fixed lowest subscript: in
C-based languages, it is 0; in Fortran, it is 1. Others allow the
programmer to choose the lowest subscript.
The array subscript range might be statically bound (during
compile-time); dynamically bound (during run-time), but then fixed; or
fully dynamic (might change during run-time).
Array storage binding might be static, stack-dynamic, or
heap-dynamic.
Some languages provide a convenient way to initialize arrays, such
as the C-based languages,
Some languages provide array operations, i.e., operations on arrays
themselves. For example, in Fortran:
APL is A Programming Language specially designed to operate on
arrays.
Two-dimensional arrays may be thought of
as solid rectangles (rectangular arrays), or as arrays of arrays
(jagged arrays). Some languages insist the programmer think of
arrays one way, some the other, and some support both. Java supports only jagged arrays:
Let's try Fortran:
Jagged arrays needn't have every row have the same number of
columns.
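For instance, a Python list of lists is a jagged array. In this sketch (the name triangle and its contents are just an illustration), each row has a different length:

```python
# The first four rows of Pascal's triangle, one list per row.
triangle = [[1], [1, 1], [1, 2, 1], [1, 3, 3, 1]]

print([len(row) for row in triangle])   # [1, 2, 3, 4]
print(triangle[3][1])                   # 3: two subscripts, a[i][j] style
```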
The entire discussion of two-dimensional arrays extends to
multi-dimensional arrays.
Fortran 95, Ada, Python, and Ruby allow references to a slice of an array, that is, a more or less regular piece of an array.
It is most common for a pointer variable to be an address of a memory cell in the heap, but C and C++ also allow addresses in RAM or on the stack.
Fortran 77 (and earlier) does not have pointer types, but they can
be simulated by using one array for data and a separate array of
indices into the first array as the pointers.
How can a pointer variable contain an address in RAM or on the stack? Addresses in RAM or on the stack are allocated when variables are declared.
In statically scoped languages, the declaration of a pointer variable must include the type of variable it points to.
Here's a C program using a pointer whose value is an address in the stack:
Here is an example in Fortran 95, showing implicit dereferencing:
Pointer arithmetic is allowed in C and C++.
In C and C++, an array name is a constant pointer to the first element of the array, so subscripting is done by pointer arithmetic, and pointer expressions may replace subscripted arrays.
Anonymous variables on the heap are manipulated via pointers.
Many novice C programmers find pointers to be confusing, but "if
everything is a pointer, you don't have to think about pointers," and
that is the approach taken by Erlang, Haskell, Lisp, Java, Prolog,
Python, and Ruby. In those languages, you can think you are storing
an object (or, at worst, a reference to an object) in a variable. You
just have to remember that a change made via one reference variable
may be seen via another reference variable.
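Here is what that looks like in Python, as a minimal sketch with made-up variable names. Assignment copies the reference, not the list, so both names see the change; making an explicit copy breaks the connection.

```python
a = [1, 2, 3]
b = a              # b now refers to the same list object as a
b.append(4)        # a change made via one reference variable...
print(a)           # [1, 2, 3, 4]  ...is seen via the other
print(a is b)      # True

c = list(a)        # an independent copy
c.append(5)
print(a)           # [1, 2, 3, 4]: unaffected by the change to c
```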
The dangling pointer problem is the problem of a pointer variable,
in scope and during its lifetime, pointing to a memory cell that was
already deallocated, perhaps via another pointer variable (and
possibly even reused).
This C program shows that a pointer may be mistakenly used, even
though the space it points to has been deallocated:
The dangling pointer problem is commonly solved by removing
explicit deallocation from the programmer, and using automatic garbage
collection instead.
The problem of memory leakage is the problem of memory cells allocated on the
heap becoming unreachable (becoming garbage) when the pointer variables
referring to them end their lifetime or get reassigned to other heap memory.
This problem is also solved by automatic garbage collection.
Haskell represents its functions in the curried form:
cl-user(33): (setf a (make-array 10))
#(nil nil nil nil nil nil nil nil nil nil)
cl-user(34): (setf days #(Sun Mon Tue Wed Thu Fri Sat))
#(Sun Mon Tue Wed Thu Fri Sat)
cl-user(35): (aref days 3)
Wed
cl-user(36): (setf (aref a 2) 5)
5
cl-user(37): a
#(nil nil 5 nil nil nil nil nil nil nil)
int[] squares = {0, 1, 4, 9, 16, 25};
However, one must distinguish whether the {...} notation is a general array-valued constructor, allowed on the rhs of assignment statements, or only a special syntax for declaration statements.
      Program arrayop
      Integer A1(5), A2(5), A3(5), A4(5)
      Data A1 /1, 2, 3, 4, 5/ A2 /6, 7, 8, 9, 10/
      A3 = A1 + A2
      A4 = A1 * A2
      Print *, A1
      Print *, A2
      Print *, A3
      Print *, A4
      End
------------------------------------
<timberlake:Test:1:34> f95 -o arrayop arrayop.f
<timberlake:Test:1:35> ./arrayop
1 2 3 4 5
6 7 8 9 10
7 9 11 13 15
6 14 24 36 50
Rectangular arrays are indexed with one pair of brackets, such as a[i, j].
Jagged arrays are indexed with two pairs of brackets, such as a[i][j].
bsh % int[][] a = new int[3][4];
bsh % print(a.length);
3
bsh % print(a[1].length);
4
Note that a is a 3-element array of 4-element arrays. It is usual to also think of this as 3 rows of 4 columns each:
bsh % for (int i=0; i<3; i++) for (int j=0; j<4; j++) a[i][j] = 10*i+j;
bsh % for (int i=0; i<3; i++) {
         for (int j=0; j<4; j++) {System.out.print(a[i][j] + " ");}
         System.out.println();}
0 1 2 3
10 11 12 13
20 21 22 23
An array stored so that all the elements of the first row are stored before all the elements of the second row, etc. is referred to as stored in row major order.
We can see this clearly in C:
#include <stdio.h>
int a[3][4];
int main() {
  int i,j;
  for (i=0; i<3; i++) {
    for (j=0; j<4; j++) {
      a[i][j] = 10*i + j;
    }
  }
  /* Walk the 12 ints in the order they are laid out in memory. */
  for (i=0; i<12; i++) {
    printf("%3d", *(*a + i));}
  printf("\n");
  return 0;
}
--------------------------------------
<timberlake:Test:1:41> gcc -Wall -o arrayorder arrayorder.c
<timberlake:Test:1:42> ./arrayorder
0 1 2 3 10 11 12 13 20 21 22 23
This shows that C stores arrays in row major order.
      Program arrayorder
      Integer A(3,4), B(12)
      Equivalence ( A(1,1), B(1) )
      Do 50 i = 1, 3
         Do 50 j = 1, 4
            A(i,j) = 10*i + j
 50   Continue
      Print *, B
      End
-----------------------------------------
<timberlake:Test:1:44> f95 -o arrayorder arrayorder.f
<timberlake:Test:1:45> ./arrayorder
11 21 31 12 22 32 13 23 33 14 24 34
(Note that Equivalence is deprecated in Fortran 90 and later versions.)
Fortran stores arrays in column major order. Since Fortran and C
programs can easily call each other, this is an important difference.
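The difference between the two layouts comes down to how a pair of subscripts is turned into an offset in linear memory. The Python sketch below simulates both, using 0-based subscripts and hypothetical helper names (row_major_offset, column_major_offset); it is a model of the address arithmetic, not of what any particular compiler emits.

```python
def row_major_offset(i, j, nrows, ncols):
    """Offset of element (i, j) when whole rows are stored one after another."""
    return i * ncols + j


def column_major_offset(i, j, nrows, ncols):
    """Offset of element (i, j) when whole columns are stored one after another."""
    return j * nrows + i


nrows, ncols = 3, 4
row_major = [None] * (nrows * ncols)
col_major = [None] * (nrows * ncols)
for i in range(nrows):
    for j in range(ncols):
        value = 10 * i + j
        row_major[row_major_offset(i, j, nrows, ncols)] = value
        col_major[column_major_offset(i, j, nrows, ncols)] = value

print(row_major)   # [0, 1, 2, 3, 10, 11, 12, 13, 20, 21, 22, 23]
print(col_major)   # [0, 10, 20, 1, 11, 21, 2, 12, 22, 3, 13, 23]
```

A routine that receives the same block of memory but assumes the other layout will, in effect, see the transpose of the array.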
Here's a small Python example:
<timberlake:Test:1:46> python
Python 2.6.4 (r264:75706, Dec 21 2009, 12:37:31)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-46)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = [0,1,2,3,4,5,6,7,8,9]
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> a[3:7]
[3, 4, 5, 6]
>>> b = [13,14,15,16]
>>> a[3:7] = b
>>> a
[0, 1, 2, 13, 14, 15, 16, 7, 8, 9]
>>>
maps in C++, hash tables in Common Lisp, Maps in Java, hashes in Perl and Ruby, and dictionaries in Python are generalizations of arrays for which the "index" can be any type. The "index" is called a key, and the element stored with the key is called the value.
Here is a use of Python's dictionaries to print the length of all the months:
#! /util/bin/python
months = ("January", "February", "March", "April", "May", "June",
"July", "August", "September", "October", "November",
"December");
monLength = {"January":31, "February":28, "March":31, "April":30,
             "May":31, "June":30, "July":31, "August":31,
             "September":30, "October":31, "November":30,
             "December":31};
for month in months:
    print "%s has %d days." % (month, monLength[month])
-----------------------------------------------------
<timberlake:Test:1:167> python months.py
January has 31 days.
February has 28 days.
March has 31 days.
April has 30 days.
May has 31 days.
June has 30 days.
July has 31 days.
August has 31 days.
September has 30 days.
October has 31 days.
November has 30 days.
December has 31 days.
C and C++ call them structs. Common Lisp calls them structures.
Note that C++ and Common Lisp have true, modern, objects as well. See
Sebesta for more details.
nil, which is an explicitly invalid address. That is, the value bound to a variable whose type is a pointer type is either a memory address or nil.
If ptr is a pointer variable, we want: ptr := <expression>, but <expression> would be evaluated for its r-value. So we need something that says "evaluate this expression for its l-value." In C and C++ that operator is &, and its operand must be an expression that could be on the left-hand side of an assignment statement.
If x is a variable and ptr is a pointer variable, what is the meaning of x := ptr? If x is also a pointer variable, it's a simple assignment statement. If x is not a pointer variable, it's either an error or the compiler must know that ptr is to be dereferenced. C and C++ use * as an explicit dereferencing operator. Fortran 95 does implicit dereferencing.
#include <stdio.h>
int* ptr;
void sub1() {
  int x, y;
  x = 3;
  ptr = &x;
  y = *ptr;
  printf("x = %2d; y = %2d.\n", x, y);
}
void sub2() {
  int z = 5;
  printf("z = %2d.\n", z);
}
void sub3() {
  printf("*ptr = %2d.\n", *ptr);
}
int main() {
  sub1();
  sub2();
  sub3();
  return 0;
}
--------------------------------------------
<timberlake:Test:1:27> gcc -Wall -o pointerTest pointerTest.c
<timberlake:Test:1:28> ./pointerTest
x = 3; y = 3.
z = 5.
*ptr = 5.
Notice that ptr contains a pointer to the memory cell on the stack that was first occupied by sub1's x, but later was occupied by sub2's z.
      Program pointerTest
      Integer, Pointer :: ptr
      Integer, Target :: x
      Integer :: y
      x = 3
      ptr => x
      y = ptr
      x = 5
      Print *, "x = ", x, "y = ", y
      End
-----------------------------------------------------
<timberlake:Test:1:29> f95 -o pointerTest pointerTest.f
<timberlake:Test:1:30> ./pointerTest
x = 5 y = 3
The fact that y has the value 3 shows that ptr was dereferenced before a value was stored into y.
If ptr is of type typ *, and i is of type int, the expression ptr + i evaluates to the address i*sizeof(typ) beyond ptr.
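The address computation can be imitated from Python with the standard ctypes module, which exposes C types and raw addresses. This is only a sketch of the arithmetic: the helper name element_address and the sample array are invented here, and in C the compiler does the scaling by sizeof for you.

```python
import ctypes


def element_address(base, i, ctype):
    """Address that C's `ptr + i` would denote for a ptr of type `ctype *`."""
    return base + i * ctypes.sizeof(ctype)


arr = (ctypes.c_int * 4)(10, 20, 30, 40)   # a C array of four ints
base = ctypes.addressof(arr)               # address of arr[0]

addr = element_address(base, 2, ctypes.c_int)
# Dereference the computed address, like *(ptr + 2) in C.
print(ctypes.c_int.from_address(addr).value)   # 30
```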
The allocation operators new, in Java and C++, and malloc(size), in C, return pointers to the newly allocated heap memory.
#include <stdio.h>
#include <stdlib.h>
int* ptr;
int main() {
  ptr = malloc(sizeof(int));
  *ptr = 3;
  free(ptr);
  /* ptr now dangles; using it here is the deliberate mistake. */
  printf("*ptr = %2d\n", *ptr);
  return 0;
}
---------------------------------------------------
<pollux:Test:1:27> gcc -Wall -o danglingTest danglingTest.c
(timberlake printed *ptr = 0.)
The operation of constructing a list with some element as a head and some list as a tail is usually called cons. In Lisp:
cl-user(9): (cons 'a '(b c d e f))
(a b c d e f)
Lists are a native data type in Erlang, Haskell, Lisp, Prolog, and Python.
Lists are immutable in Erlang, Haskell, and Prolog; mutable in Lisp and Python; homogeneous in Haskell; heterogeneous in Erlang, Lisp, Prolog, and Python.
(lambda (x y) (= (mod x y) 0)).
cl-user(6): (type-of (lambda (x y) (= (mod x y) 0)))
function
A lambda expression, being a function, can be the first element of a list, with the following elements being its arguments.
cl-user(13): ((lambda (x y) (= (mod x y) 0)) 81 3)
t
Lisp's apply is a function that takes a function and a list of arguments, and applies the function to the arguments.
cl-user(14): (apply (lambda (x y) (= (mod x y) 0)) '(48 6))
t
Lisp's funcall is a function that takes a function and a sequence of arguments, and applies the function to the arguments.
cl-user(33): (funcall (lambda (x y) (= (mod x y) 0)) 48 6)
t
When you define a function in Lisp, the function is put in the symbol-function cell of the name of the function.
cl-user(15): (defun fact (n) (if (< n 2) 1 (* n (fact (1- n)))))
fact
cl-user(16): (fact 4)
24
cl-user(17): (symbol-function 'fact)
#<Interpreted Function fact>
cl-user(18): (compile 'fact)
fact
nil
nil
cl-user(19): (symbol-function 'fact)
#<Function fact>
cl-user(34): (funcall (symbol-function 'fact) 4)
24
A function can compute a function:
cl-user(35): ((lambda (x) (lambda (y) (= (mod x y) 0))) 312)
#<Interpreted Closure (:internal (:internal nil)) @ #x71c22642>
cl-user(37): (type-of ((lambda (x) (lambda (y) (= (mod x y) 0))) 312))
function
If that's a function, we can use funcall on it:
cl-user(40): (funcall ((lambda (x) (lambda (y) (= (mod x y) 0))) 312) 3)
t
Representing a function of two arguments as a function of one argument whose value is a function of one argument (and similarly for more than two arguments) is called "currying", after the logician Haskell Curry. The type of a function from type t1 to type t2 can be expressed as t1 -> t2. So t1 -> (t2 -> t3) is the type of a function from type t1 to a function of type t2 -> t3, which is the curried form of a function of two arguments, one of type t1 and one of type t2, to a result of type t3.
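Currying can be written out by hand in Python with a nested function. In this sketch the names divby and by are illustrative only; divby takes the first argument and returns a one-argument function that is still waiting for the second.

```python
def divby(x):
    """Curried divisibility test: divby(x)(y) is True when y divides x."""
    def by(y):
        return x % y == 0
    return by


print(divby(312)(3))          # True: apply to 312, then apply the result to 3
divides_4551 = divby(4551)    # a function of one argument
print(divides_4551(3))        # True
print(divby(14)(3))           # False
```

The intermediate value divides_4551 is an ordinary function object, so it can be stored, passed around, and called later.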
<timberlake:Test:1:35> ghci
GHCi, version 6.12.1: http://www.haskell.org/ghc/ :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package ffi-1.0 ... linking ... done.
Prelude> let divby x y = ((mod x y) == 0)
Prelude> divby 14 3
False
Prelude> divby 4551 3
True
Prelude> :t divby
divby :: (Integral a) => a -> a -> Bool
Prelude> ^d
Leaving GHCi.
There is a function data type in Erlang, Haskell, Lisp, and Python.
There are lambda expressions in Ruby, but their class is Proc.