PPL: Names, Variables, and Scopes


PRINCIPLES OF PROGRAMMING LANGUAGES (PPL) HEMANTH, Asst. Professor, Dept. of CSE, NRCM

UNIT-2: Names, Bindings and Scopes


Topics:
1. Introduction
2. Names
3. Variables
4. The Concept of Binding
5. Scope
6. Scope and Lifetime
7. Referencing Environments
8. Named Constants

2.1 Introduction
Imperative programming languages are, to varying degrees, abstractions of the underlying
von Neumann computer architecture. The architecture’s two primary components are its memory,
which stores both instructions and data, and its processor, which provides operations for modifying the
contents of the memory. The abstractions in a language for the memory cells of the machine are variables.
In some cases, the characteristics of the abstractions are very close to the characteristics of
the cells; an example of this is an integer variable, which is usually represented directly in one or more
bytes of memory. In other cases, the abstractions are far removed from the organization of the hardware
memory, as with a three-dimensional array, which requires a software mapping function to support the
abstraction.
Design Issues:
The following are the primary design issues for names:
• Are names case sensitive?
• Are the special words of the language reserved words or keywords?
These issues are discussed in the following two subsections, which also
include examples of several design choices.
2.2 Name Forms:
A name is a string of characters used to identify some entity in a program.
Fortran 95+ allows up to 31 characters in its names. C99 has no length limitation on its internal
names, but only the first 63 are significant. External names in C99 (those defined outside functions,
which must be handled by the linker) are restricted to 31 characters. Names in Java, C#, and Ada
have no length limit, and all characters in them are significant. C++ does not specify a length limit
on names.
Names in most programming languages have the same form: a letter followed by a string
consisting of letters, digits, and underscore characters ( _ ). Although underscore characters were widely
used to form names in the 1970s and 1980s, that practice is now far less popular. In the C-based
languages, it has to a large extent been replaced by the so-called camel notation, in which all of the words
of a multiple-word name except the first are capitalized, as in myStack. Note that the use of underscores
and mixed case in names is a programming style issue, not a language design issue.
All variable names in PHP must begin with a dollar sign. In Perl, the special character at the
beginning of a variable’s name, $, @, or %, specifies its type (although in a different sense than in other
languages). In Ruby, special characters at the beginning of a variable’s name, @ or @@, indicate that the
variable is an instance or a class variable, respectively.
In many languages, notably the C-based languages, uppercase and lowercase letters in names are
distinct; that is, names in these languages are case sensitive.
For example, the following three names are distinct in C++: rose, ROSE, and Rose.
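As a small illustration of case sensitivity, the following C++ sketch declares the three names from the example as three separate variables. The function name sum_of_roses and the values assigned are hypothetical, chosen only for this illustration.

```cpp
// Three distinct variables whose names differ only in letter case.
int sum_of_roses() {
    int rose = 1;
    int ROSE = 2;
    int Rose = 3;
    // Each name refers to its own variable, so the sum is 1 + 2 + 3.
    return rose + ROSE + Rose;
}
```

In a case-insensitive language, the second and third declarations would clash with the first.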

Special Words:
Special words in programming languages are used to make programs more readable by
naming actions to be performed. They also are used to separate the syntactic parts of statements and
programs. In most languages, special words are classified as reserved words, which means they cannot be
redefined by programmers, but in some they are only keywords, which means they can be redefined.
A keyword is a word of a programming language that is special only in certain contexts.
FORTRAN is the only remaining widely used language whose special words are keywords.
A reserved word is a special word of a programming language that cannot be used as a
name. As a language design choice, reserved words are better than keywords because the ability to redefine
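keywords can be confusing. For example, in FORTRAN, one could have the following statements:
Integer Real
Real Integer
These statements declare the program variable Real to be of Integer type and the variable Integer to be
of Real type.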
In most languages, names that are defined in other program units, such as Java packages and
C and C++ libraries, can be made visible to a program. These names are predefined, but visible only if
explicitly imported. Once imported, they cannot be redefined.

2.3 Variables:
A program variable is an abstraction of a computer memory cell or collection of cells.
Programmers often think of variable names as names for memory locations, but there is much more to a
variable than just a name.
• A variable can be characterized as a sextuple of attributes: (name, address, value, type, lifetime,
scope).
Names:
Variable names are the most common names in programs. Variables can be categorized by their storage
bindings and lifetimes as follows:
a) Static Variables:
Static variables are those that are bound to memory cells before program execution begins
and remain bound to those same memory cells until program execution terminates. Statically bound
variables have several valuable applications in programming.
• One advantage of static variables is efficiency. All addressing of static variables can be direct;
other kinds of variables often require indirect addressing, which is slower. Also, no run-time
overhead is incurred for allocation and deallocation of static variables, although this time is
often negligible.
• One disadvantage of static binding to storage is reduced flexibility; in particular, a language that
has only static variables cannot support recursive subprograms. C and C++ allow
programmers to include the static specifier on a variable definition in a function, making
the variables it defines static.
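The effect of the static specifier inside a function can be sketched as follows. The function call_count is a hypothetical example, not part of the original notes.

```cpp
// Hypothetical helper: reports how many times it has been called.
int call_count() {
    static int count = 0;  // static storage: the variable keeps its cell and value between calls
    count = count + 1;
    return count;
}
```

Without the static specifier, count would be a stack-dynamic variable, re-created with the value 0 on every call, and the function would always return 1.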
b) Stack-Dynamic Variables:
Stack-dynamic variables are those whose storage bindings are created when their declaration
statements are elaborated, but whose types are statically bound.
Elaboration of such a declaration refers to the storage allocation and binding process
indicated by the declaration, which takes place when execution reaches the code to which the
declaration is attached. Therefore, elaboration occurs during run time. As their name indicates,
stack-dynamic variables are allocated from the run-time stack.
Some languages—for example, C++ and Java—allow variable declarations to occur anywhere a
statement can appear.
• The main advantage of stack-dynamic variables is support for recursion: to be useful, at least in most cases, recursive
subprograms require some form of dynamic local storage so that each active copy of the recursive
subprogram has its own version of the local variables.
• The disadvantages of stack-dynamic variables, relative to static variables, are the run-time overhead of
allocation and deallocation and possibly slower accesses, because indirect addressing is required.
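The need for a separate copy of the locals in each activation can be seen in a minimal recursive sketch. The function factorial below is an illustrative assumption, not taken from the original notes.

```cpp
// Each activation of factorial gets its own stack-dynamic copies of n and rest.
long factorial(int n) {
    if (n <= 1) {
        return 1;
    }
    long rest = factorial(n - 1);  // local to this activation only
    return n * rest;
}
```

If n and rest were static, the inner activations would overwrite the values still needed by the outer ones.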

c) Explicit Heap-Dynamic Variables:


Explicit heap-dynamic variables are nameless (abstract) memory cells that are allocated and
deallocated by explicit run-time instructions written by the programmer.
These variables, which are allocated from and deallocated to the heap, can only be referenced through
pointer or reference variables.
The heap is a collection of storage cells whose organization is highly disorganized because of
the unpredictability of its use. The pointer or reference variable that is used to access an explicit heap-
dynamic variable is created as any other scalar variable. An explicit heap-dynamic variable is created by
either an operator (for example, in C++) or a call to a system subprogram provided for that purpose (for
example, in C).
As an example of explicit heap-dynamic variables, consider the following C++ code segment:
int *intnode; // Create a pointer
intnode = new int; // Create the heap-dynamic variable
...
delete intnode; // Deallocate the heap-dynamic variable

In this example, an explicit heap-dynamic variable of int type is created by the new
operator. This variable can then be referenced through the pointer, intnode.
Later, the variable is deallocated by the delete operator. C++ requires the explicit
deallocation operator delete, because it does not use implicit storage reclamation such as garbage collection.
d) Implicit Heap-Dynamic Variables:
Implicit heap-dynamic variables are bound to heap storage only when they are assigned values. In
fact, all their attributes are bound every time they are assigned.
For example, consider the following JavaScript assignment statement:
Marks = [74, 84, 86, 90, 71];
Regardless of whether the variable named Marks was previously used in the program or what
it was used for, it is now an array of five numeric values.
• The advantage of such variables is that they have the highest degree of flexibility, allowing
highly generic code to be written.
Scope:
The scope of a variable is the range of statements in which the variable is visible.
• A variable is visible in a statement if it can be referenced in that statement.
• A variable is local in a program unit or block if it is declared there.
• The nonlocal variables of a program unit or block are those that are visible within the program unit or
block but are not declared there.
Static Scope :
Static scoping is a method of binding names to nonlocal variables in which the scope of a variable can be
determined statically, that is, prior to execution.
• There are two categories of static-scoped languages:
1. Those in which subprograms can be nested.
2. Those in which subprograms cannot be nested.

• Ada and JavaScript allow nested subprograms, but the C-based languages do not.
• When a compiler for a static-scoped language finds a reference to a variable, the attributes
of the variable are determined by finding the statement in which it was declared.

Ex: Suppose a reference is made to a variable x in subprogram Sub1.

• The correct declaration is found by first searching the declarations of subprogram Sub1.
• If no declaration is found for the variable there, the search continues in the declarations of the subprogram
that declared subprogram Sub1, which is called its static parent.

• If a declaration of x is not found there, the search continues to the next larger enclosing unit (the unit
that declared Sub1’s parent), and so forth, until a declaration for x is found or the largest unit’s
declarations have been searched without success.
• The static parent of subprogram Sub1, and its static parent, and so forth up to and including the main
program, are called the static ancestors of Sub1.

Consider the following JavaScript function, big, in which the two functions sub1 and sub2 are nested:

function big() {
    function sub1() {
        var x = 7;
        sub2();
    }
    function sub2() {
        var y = x;
    }
    var x = 3;
    sub1();
}

Under static scoping, the reference to the variable x in sub2 is to the x declared in the function big.
This is true because the search for x begins in the function in which the reference occurs, sub2, but no
declaration for x is found there. The search continues in the static parent of sub2, big, where the declaration
of x is found. The x declared in sub1 is ignored, because it is not in the static ancestry of sub2.
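The same lookup can be tried in C++. Since C++ does not allow nested function definitions, this sketch assumes lambda expressions as stand-ins for sub1 and sub2, and it returns the value of y so that the result can be observed; both choices are assumptions of this illustration.

```cpp
// Lambdas stand in for the nested functions sub1 and sub2.
int big() {
    int x = 3;
    int y = 0;
    // The x used here is resolved from the program text: it is the x of big.
    auto sub2 = [&]() { y = x; };
    auto sub1 = [&]() {
        int x = 7;   // local to sub1; not in the static ancestry of sub2
        (void)x;     // silences unused-variable warnings
        sub2();
    };
    sub1();
    return y;        // 3, not 7
}
```

Even though sub2 is called from sub1, it sees the x of big, because only the textual nesting matters under static scoping.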
Blocks:
Blocks allow a section of code to have its own local variables whose scope is minimized. Such
variables are stack dynamic, so they have their storage allocated when the section is entered and deallocated
when the section is exited.
• Blocks provide the origin of the phrase block-structured language.
• The C-based languages allow any compound statement (a statement sequence surrounded by
matched braces) to have declarations and thereby define a new scope. Such compound
statements are called blocks.
• For example, if list were an integer array, one could write:
if (list[i] < list[j])
{
    int temp;
    temp = list[i];
    list[i] = list[j];
    list[j] = temp;
}
The scopes created by blocks, which could be nested in larger blocks, are treated exactly like
those created by subprograms. References to variables in a block that are not declared there
are connected to declarations by searching enclosing scopes (blocks or subprograms) in order
of increasing size.
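How references inside a block are resolved can be sketched in C++ as follows. The function block_demo and its variable names are hypothetical, chosen only for this illustration.

```cpp
int block_demo() {
    int total = 0;
    int i = 1;
    {
        int temp = i * 10;     // temp exists only inside this block
        total = total + temp;  // total and i are found in the enclosing scope
    }
    {
        int i = 5;             // hides the outer i inside this block only
        total = total + i;
    }
    return total + i;          // the outer i is still 1, so the result is 10 + 5 + 1
}
```

C++ permits the inner declaration of i to hide the outer one; some other C-based languages, such as Java and C#, reject such a redeclaration.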
Declaration order:
• In JavaScript, local variables can be declared anywhere in a function, but the scope of such a
variable is always the entire function. If used before its declaration in the function, such a variable
has the value undefined.
• The for statements of C++, Java, and C# allow variable definitions in their initialization
expressions. In early versions of C++, the scope of such a variable was from its definition to the
end of the smallest enclosing block. In the standard version, however, the scope is restricted to
the for construct, as is the case with Java and C#.

Global Scope:
Some languages, including C, C++, PHP, JavaScript, and Python, allow a program
structure that is a sequence of function definitions, in which variable definitions can appear outside the
functions.
• Definitions outside functions in a file create global variables, which potentially can be visible to
those functions.
• C and C++ have both declarations and definitions of global data. Declarations specify types and
other attributes but do not cause allocation of storage.
• Definitions specify attributes and cause storage allocation. For a specific global name, a C
program can have any number of compatible declarations, but only a single definition.
• Declaration order and global variables are also issues in the class and member declarations in
object-oriented languages.
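The difference between declarations and definitions of global data can be sketched in C++ as follows. The names total and add_to_total are hypothetical.

```cpp
extern int total;   // declaration: gives the type, allocates no storage
extern int total;   // a second, compatible declaration is allowed

int total = 0;      // the single definition: storage is allocated here

// Hypothetical function that uses the global variable.
void add_to_total(int n) {
    total = total + n;
}
```

Adding a second definition such as int total = 1; in the same program would be an error, whereas the repeated extern declarations are harmless.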
Dynamic Scope:
Dynamic scoping is based on the calling sequence of subprograms, not on their spatial
relationship to each other. Thus, the scope can be determined only at run time.
• The scope of variables in APL, SNOBOL4, and the early versions of LISP is dynamic.
• Perl and Common LISP also allow variables to be declared to have dynamic scope, although the
default scoping mechanism in these languages is static.
Consider the following example on dynamic scoping:
function big() {
    function sub1() {
        var x = 7;
    }
    function sub2() {
        var y = x;
        var z = 3;
    }
    var x = 3;
}
The meaning of the identifier x referenced in sub2 is dynamic: it cannot be determined at compile time.
It may reference the variable from either declaration of x, depending on the calling sequence. Suppose big
calls sub1, which in turn calls sub2: the reference is then to the x declared in sub1. If instead big calls
sub2 directly, the reference is to the x declared in big.
• Dynamic scoping makes programs much more difficult to read, because the calling sequence of
subprograms must be known to determine the meaning of references to nonlocal variables.
• Subprograms are always executed in the environment of all previously called subprograms that have
not yet completed their executions. As a result, dynamic scoping results in less reliable programs
than static scoping.
• Another problem with dynamic scoping is the inability to type check references to nonlocals
statically. This problem results from the inability to statically find the declaration for a variable
referenced as a nonlocal.
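C++ is statically scoped, so dynamic scoping can only be simulated. The following sketch assumes an explicit run-time stack of name bindings; the names bindings and lookup, the parameter call_through_sub1, and the sentinel value -1 are all hypothetical devices of this illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical run-time binding stack: the most recent binding is at the back.
std::vector<std::pair<std::string, int>> bindings;

// Dynamic lookup: search from the most recently activated subprogram outward.
int lookup(const std::string& name) {
    for (auto it = bindings.rbegin(); it != bindings.rend(); ++it) {
        if (it->first == name) {
            return it->second;
        }
    }
    return -1;  // sentinel for "not found" in this sketch
}

int sub2() {
    return lookup("x");  // which x this is depends on who called sub2
}

int sub1() {
    bindings.emplace_back("x", 7);  // sub1's own x becomes the newest binding
    int y = sub2();
    bindings.pop_back();
    return y;
}

int big(bool call_through_sub1) {
    bindings.emplace_back("x", 3);  // big's x
    int y = call_through_sub1 ? sub1() : sub2();
    bindings.pop_back();
    return y;
}
```

Calling big(true) yields 7 and big(false) yields 3: the same reference in sub2 resolves to different variables depending on the calling sequence.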
Scope and Lifetime:
Consider a variable that is declared in a Java method that contains no method
calls. The scope of such a variable is from its declaration to the end of the method. The lifetime of
that variable is the period of time beginning when the method is entered and ending when
execution of the method terminates.

• Although the scope and lifetime of the variable are clearly not the same, because static scope is a
textual, or spatial, concept whereas lifetime is a temporal concept, they at least appear to be related in
this case.
• This apparent relationship does not hold when subprogram calls are involved.
Consider the following C++ functions:

void printheader() {
    ...
} /* end of printheader */
void compute() {
    int sum;
    ...
    printheader();
} /* end of compute */

The scope of the variable sum is completely contained within the compute function. It does not
extend to the body of the function printheader, although printheader executes in the midst of the
execution of compute. However, the lifetime of sum extends over the time during which printheader
executes. Whatever storage location sum is bound to before the call to printheader, that binding will
continue during and after the execution of printheader.
Referencing Environments:
The referencing environment of a statement is the collection of all variables that are
visible in the statement. The referencing environment of a statement in a static-scoped language is the
variables declared in its local scope plus the collection of all variables of its ancestor scopes that are visible.
In such a language, the referencing environment of a statement is needed while that statement is being
compiled, so code and data structures can be created to allow references to variables from other scopes
during run time.
The referencing environment of a statement includes the local variables, plus all of the
variables declared in the functions in which the statement is nested (excluding variables in nonlocal scopes
that are hidden by declarations in nearer functions). Each function definition creates a new scope and thus a
new environment.
Named Constants:
A named constant is a variable that is bound to a value only once. Named constants are useful
as aids to readability and program reliability. Readability can be improved, for example, by using the
name pi instead of the constant 3.14159265.
Consider the following skeletal Java program segment:
void example()
{
    int[] intList = new int[100];
    String[] strList = new String[100];
    ...
    for (index = 0; index < 100; index++) {
        ...
    }
    ...
    for (index = 0; index < 100; index++) {
        ...
    }
    ...
    average = sum / 100;
    ...
}

When this program must be modified to deal with a different number of data values, all occurrences of 100
must be found and changed. On a large program, this can be tedious and error prone. An easier and more
reliable method is to use a named constant as a program parameter, as follows:
void example()
{
    final int len = 100;
    int[] intList = new int[len];
    String[] strList = new String[len];
    ...
    for (index = 0; index < len; index++) {
        ...
    }
    ...
    for (index = 0; index < len; index++) {
        ...
    }
    ...
    average = sum / len;
    ...
}

Now, when the length must be changed, only one line must be changed (the declaration of len), regardless
of the number of times it is used in the program. This is another example of the benefits of abstraction.
The name len is an abstraction for the number of elements in some arrays and the number of iterations in
some loops. This illustrates how named constants can aid modifiability.
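The same idea can be sketched in C++, where constexpr (or const) plays the role of Java's final. The function average_of_indices and the small value 5 are assumptions made so that the sketch is complete and easy to check by hand.

```cpp
constexpr int len = 5;  // the named constant: change only this line

int average_of_indices() {
    int intList[len];
    int sum = 0;
    for (int index = 0; index < len; index++) {
        intList[index] = index;       // fill with 0, 1, ..., len - 1
    }
    for (int index = 0; index < len; index++) {
        sum = sum + intList[index];
    }
    return sum / len;                 // integer average of the stored values
}
```

Because len is bound to its value only once, an assignment such as len = 6; elsewhere in the program is rejected by the compiler.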

Data Types and Variables


Topics:
1 Introduction
2 Primitive Data Types
3 Character String Types
4 User-Defined Ordinal Types
5 Array Types
6 Associative Arrays
7 Record Types
8 Tuple Types
9 List Types
10 Union Types
11 Pointer and Reference Types
12 Type Checking
13 Strong Typing

1. Introduction:
A data type defines a collection of data values and a set of predefined operations on those
values. Computer programs produce results by manipulating data. An important factor in determining the
ease with which they can perform this task is how well the data types available in the language being used
match the objects in the real-world of the problem being addressed. Therefore, it is crucial that a language
supports an appropriate collection of data types and structures.
There are a number of uses of the type system of a programming language. The most
practical of these is error detection, which is carried out through type checking, a process directed by
the type system of the language.
A value is any entity that can be manipulated by a program. Values can be evaluated,
stored, passed as arguments, returned as function results, and so on. Different programming
languages support different types of values:
• C supports integers, real numbers, structures, arrays, unions, pointers to variables, and pointers to
functions. (Integers, real numbers, and pointers are primitive values; structures, arrays, and unions are
composite values.)
• C++, which is a superset of C, supports all the above types of values plus objects. (Objects are composite
values.)
• Java supports booleans, integers, real numbers, arrays, and objects. (Booleans, integers, and real
numbers are primitive values; arrays and objects are composite values.)
• Ada supports booleans, characters, enumerands, integers, real numbers, records, arrays,
discriminated records, objects (tagged records), strings, pointers to data, and pointers to procedures.
(Booleans, characters, enumerands, integers, real numbers, and pointers are primitive values; records,
arrays, discriminated records, objects, and strings are composite values.)

Most programming languages group values into types. For instance, nearly all languages
make a clear distinction between integer and real numbers. Most languages also make a clear distinction
between booleans and integers: integers can be added and multiplied, while booleans can be subjected to
operations like not, and, and or.

Therefore we define a type to be a set of values, equipped with one or more operations that
can be applied uniformly to all these values.
Every programming language supports both primitive types, whose values are primitive and
composite types, whose values are composed from simpler values. Some languages also have recursive types,
a recursive type being one whose values are composed from other values of the same type.

2. Primitive types:
A primitive value is one that cannot be decomposed into simpler values. A primitive type is one
whose values are primitive. Every programming language provides built-in primitive types. Some languages
also allow programs to define new primitive types.

2.1 Built-in primitive types
2.2 User-defined primitive types
2.3 Discrete primitive types

2.1 Built-in primitive types:

One or more primitive types are built-in to every programming language. The choice of built-in
primitive types tells us much about the programming language’s intended application area. Languages
intended for commercial data processing (such as COBOL) are likely to have primitive types whose
values are fixed-length strings and fixed-point numbers. Languages intended for numerical computation
(such as FORTRAN) are likely to have primitive types whose values are real numbers (with a choice of
precisions) and perhaps also complex numbers. A language intended for string processing (such as
SNOBOL) is likely to have a primitive type whose values are strings of arbitrary length.
For example, Java has boolean, char, int, and float, whereas Ada has Boolean,
Character, Integer, and Float. For the sake of consistency, we shall use Boolean, Character, Integer, and
Float as names for the most common primitive types:
Boolean = {false, true}
Character = {. . . , ‘a’, . . . , ‘z’, . . . , ‘0’, . . . , ‘9’, . . . , ‘?’, . . .}
Integer = {. . . ,−2,−1, 0,+1,+2, . . .}
Float = {. . . ,−1.0, . . . , 0.0, . . . ,+1.0, . . .}
• The Boolean type has exactly two values, false and true. In some languages these two values
are denoted by the literals false and true, in others by predefined identifiers false and true.
• The Character type is a language-defined or implementation-defined set of characters. The
chosen character set is usually ASCII (128 characters), ISO Latin (256 characters), or
Unicode (65 536 characters in its original 16-bit form; the current standard allows many more).
• The Integer type is a language-defined or implementation-defined range of whole
numbers. The range is influenced by the computer’s word size and integer arithmetic.
For instance, on a 32-bit computer with two’s complement arithmetic, Integer will be
{−2 147 483 648, . . . , +2 147 483 647}.
• The Float type is a language-defined or implementation-defined subset of the (rational) real
numbers. The range and precision are determined by the computer’s word size and
floating-point arithmetic.
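The 32-bit two's complement range quoted above for Integer can be checked in C++ with the standard numeric_limits facility. The constant names int_min and int_max are hypothetical; the sketch assumes a platform that provides std::int32_t, which mainstream platforms do.

```cpp
#include <cstdint>
#include <limits>

// Range of a 32-bit two's complement integer type.
constexpr std::int32_t int_min = std::numeric_limits<std::int32_t>::min();
constexpr std::int32_t int_max = std::numeric_limits<std::int32_t>::max();

// Checked at compile time; the minimum is written as -2147483647 - 1
// because the literal 2147483648 does not fit in a 32-bit int.
static_assert(int_min == -2147483647 - 1, "smallest 32-bit value");
static_assert(int_max == 2147483647, "largest 32-bit value");
```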

2.2 User-defined primitive types:

The ranges of built-in types such as Integer and Float are often implementation-defined, which can make
programs less portable. One way to avoid such portability problems is to allow programs to define their own
integer and floating-point types, stating explicitly the desired range and/or precision for each type.
This approach is taken by Ada.
In Ada we can define a completely new primitive type by enumerating its values
(more precisely, by enumerating identifiers that will denote its values). Such a type is called
an enumeration type, and its values are called enumerands.
C and C++ also support enumerations, but in these languages an enumeration type is
actually an integer type, and each enumerand denotes a small integer.

Example: Ada and C++ enumeration types

The following Ada type definition:

type Month is (jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec);

defines a completely new type, whose values are twelve enumerands:

Month = {jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec}

The cardinality of this type is: #Month = 12

By contrast, the C++ type definition:

enum Month {jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec};

defines Month to be an integer type, and binds jan to 0, feb to 1, and so on.

Thus: Month = {0, 1, 2, . . . , 11}
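The binding of enumerands to integers can be observed directly in C++. The helper function month_number is a hypothetical name used only for this sketch.

```cpp
enum Month {jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec};

// An unscoped enumerand converts implicitly to its integer value.
int month_number(Month m) {
    return m;
}
```

Here month_number(jan) is 0 and month_number(dec) is 11, matching the set {0, 1, 2, . . . , 11}.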

2.3 Discrete primitive types:

A discrete primitive type is a primitive type whose values have a one-to-one relationship
with a range of integers.
This is an important concept in ADA, in which values of any discrete primitive type may be
used for array indexing, counting, and so on. The discrete primitive types in ADA are Boolean,
Character, integer types, and enumeration types.
Most programming languages allow only integers to be used for counting and array
indexing. C and C++ allow enumerands also to be used for counting and array indexing, since they
classify enumeration types as integer types.
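The use of enumerands for counting and array indexing in C++ can be sketched as follows. The table days_in_month (for a non-leap year) and the function days_in_year are illustrative assumptions.

```cpp
enum Month {jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec};

// Days per month in a non-leap year, indexed by Month.
const int days_in_month[12] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};

int days_in_year() {
    int total = 0;
    // The enumerands jan and dec serve as loop bounds, and m as the array index.
    for (int m = jan; m <= dec; m++) {
        total = total + days_in_month[m];
    }
    return total;
}
```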

3. Composite types:
A composite value (or data structure) is a value that is composed from simpler values.
A composite type is a type whose values are composite. Programming languages support a huge variety
of composite values: tuples, structures, records, arrays, algebraic types, discriminated records, objects,
unions, strings, lists, trees, sequential files, direct files, relations, etc.
All these composite values can be understood in terms of a small number of structuring concepts,
which are:
• Cartesian products (tuples, records)
• mappings (arrays)
• disjoint unions (algebraic types, discriminated records, objects)
• recursive types (lists, trees).

In a Cartesian product, values of several (possibly different) types are grouped into tuples.
The structures of C and C++, and the records of ADA, can be understood in terms of Cartesian
products.
Consider the following C++ definitions:

typedef unsigned char byte; // assumed here: byte is not a built-in C++ type
enum Month {jan, feb, mar, apr, may, jun,
            jul, aug, sep, oct, nov, dec};
struct Date {
    Month m;
    byte d;
};

This structure type has the set of values:

Date = Month × Byte = {jan, feb, . . . , dec} × {0, . . . , 255}

This type models dates even more crudely than its ADA counterpart in Example 2.5.

The following code illustrates structure construction:

struct Date someday = {jan, 1};

Records:
A record is a possibly heterogeneous aggregate of data elements in which the individual elements
are identified by names.
Operations on Records:
• Assignment is very common if the types are identical
• Ada allows record comparison
• Ada records can be initialized with aggregate literals
• COBOL provides MOVE CORRESPONDING, which copies a field of the source record to the
corresponding field in the target record
Evaluation and Comparison to Arrays :
• Records are used when collection of data values is heterogeneous
• Access to array elements is much slower than access to record fields, because subscripts are
dynamic (field names are static)
• Dynamic subscripts could be used with record field access, but it would disallow type checking and
it would be much slower.

Unions:
A union is a type whose variables are allowed to store values of different types at different times during
execution.
The unions of C and C++ are not disjoint unions, since they have no tags. This makes tag tests
impossible and makes projection unsafe. In practice, therefore, C programmers
enclose each union within a structure that also contains a tag.
Consider the following C type definition:
union Untagged_Number
{
    int ival;
    float rval;
};
The set of values of this union type is: Untagged_Number = Integer ∪ Float.
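The practice of enclosing a union within a structure that also contains a tag can be sketched in C++ as follows. The names Tagged_Number, NumberTag, make_int, make_float, and to_double are hypothetical and are not part of the original notes.

```cpp
// Hypothetical tagged version of the union: the tag records which member is in use.
enum NumberTag { IS_INT, IS_FLOAT };

struct Tagged_Number {
    NumberTag tag;
    union {
        int ival;
        float rval;
    };
};

Tagged_Number make_int(int i) {
    Tagged_Number n;
    n.tag = IS_INT;
    n.ival = i;
    return n;
}

Tagged_Number make_float(float f) {
    Tagged_Number n;
    n.tag = IS_FLOAT;
    n.rval = f;
    return n;
}

// Safe projection: test the tag before reading a member.
double to_double(const Tagged_Number& n) {
    if (n.tag == IS_INT) {
        return n.ival;
    }
    return n.rval;
}
```

The language does not enforce the tag test; keeping the tag consistent with the stored member remains the programmer's responsibility.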

4. Recursive Types :
A recursive type is one defined in terms of itself. In this section we discuss two
common recursive types: lists and strings, as well as recursive types in general.

Lists:
A list is a sequence of values. A list may have any number of components, including none.
The number of components is called the length of the list. The unique list with no components is called
the empty list.
A list is homogeneous if all its components are of the same type; otherwise it is heterogeneous.
We consider only homogeneous lists here.
Typical list operations are:
• length
• emptiness test
• head selection (i.e., selection of the list’s first component)
• tail selection (i.e., selection of the list consisting of all but the first component)
• concatenation.
The following Java code constructs an IntList object with four nodes (the classes IntList and IntNode are
assumed to be defined elsewhere):
IntList primes = new IntList(
    new IntNode(2, new IntNode(3,
    new IntNode(5, new IntNode(7, null)))));
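For comparison, a list can be written as a recursive type in C++. This is a minimal sketch under stated assumptions: the names IntNode, IntList, cons, length, and is_empty are hypothetical choices of this illustration and do not reproduce the classes used in the code above.

```cpp
#include <memory>
#include <utility>

// A recursive type: each node holds a value and the (possibly empty) rest of the list.
struct IntNode {
    int head;
    std::unique_ptr<IntNode> tail;
};

using IntList = std::unique_ptr<IntNode>;  // a null pointer represents the empty list

// Hypothetical constructor function: puts value in front of rest.
IntList cons(int value, IntList rest) {
    IntList node(new IntNode{value, std::move(rest)});
    return node;
}

// Length: count the nodes by following the tail pointers.
int length(const IntList& list) {
    int n = 0;
    for (const IntNode* p = list.get(); p != nullptr; p = p->tail.get()) {
        n = n + 1;
    }
    return n;
}

// Emptiness test.
bool is_empty(const IntList& list) {
    return list == nullptr;
}
```

With these definitions, cons(2, cons(3, cons(5, cons(7, nullptr)))) builds a four-node list whose head is 2.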

Character String Type:


A string is a sequence of characters. A string may have any number of characters, including
none.
The number of characters is called the length of the string. The unique string with no characters is
called the empty string.
Strings are supported by all modern programming languages. Typical string operations are:
• length
• equality comparison
• lexicographic comparison
• character selection
• substring selection
• concatenation.
In a programming language that supports lists, the most natural approach is to treat strings
as lists of characters. This approach makes all the usual list operations automatically applicable to
strings. A slightly different and more flexible approach is to treat strings as pointers to arrays of
characters. C and C++ adopt this approach.
In an object-oriented language, the most natural approach is to treat strings as objects. This
approach enables strings to be equipped with methods providing all the desired operations, and
avoids the disadvantages of treating strings just as special cases of arrays or lists. JAVA adopts this
approach.

5. Array Types:
An array is an aggregate of homogeneous data elements in which an individual element is
identified by its position in the aggregate, relative to the first element.
Array Design Issues
• What types are legal for subscripts?
• Are subscripting expressions in element references range checked?
• When are subscript ranges bound?
• When does allocation take place?
• What is the maximum number of subscripts?
• Can array objects be initialized?
• Is any kind of slice supported?

Arrays and Indices:


Specific elements of an array are referenced by means of a two-level syntactic mechanism,
where the first part is the aggregate name, and the second part is a possibly dynamic selector consisting
of one or more items known as subscripts or indices.
Arrays are sometimes called finite mappings. Symbolically, this mapping can be shown as
array_name(subscript_value_list) → element

Subscript Binding:
The binding of the subscript type to an array variable is usually static, but the subscript value
ranges are sometimes dynamically bound.

Array Categories (Types):


There are five categories of arrays, based on the binding to subscript ranges, the binding to
storage, and from where the storage is allocated. The category names indicate the design choices of
these three. In the first four of these categories, once the subscript ranges are bound and the storage is
allocated, they remain fixed for the lifetime of the variable.
1. A static array is one in which the subscript ranges are statically bound and storage allocation
is static (done before run time). The advantage of static arrays is efficiency: no dynamic
allocation or deallocation is required.
The disadvantage is that the storage for the array is fixed for the entire execution time
of the program.

2. A fixed stack-dynamic array is one in which the subscript ranges are statically bound, but the
allocation is done at declaration elaboration time during execution. The advantage of fixed stack-
dynamic arrays over static arrays is space efficiency. A large array in one subprogram can use the
same space as a large array in a different subprogram, as long as both subprograms are not active
at the same time. The same is true if the two arrays are in different blocks that are not active at
the same time.
The disadvantage is the required allocation and deallocation time.

3. A stack-dynamic array is one in which both the subscript ranges and the storage allocation are
dynamically bound at elaboration time. Once the subscript ranges are bound and the storage is
allocated, however, they remain fixed during the lifetime of the variable.
The advantage of stack-dynamic arrays over static and fixed stack-dynamic arrays is
flexibility. The size of an array need not be known until the array is about to be used.

4. A fixed heap-dynamic array is similar to a fixed stack-dynamic array, in that the subscript ranges
and the storage binding are both fixed after storage is allocated. The differences are that both the
subscript ranges and storage bindings are done when the user program requests them during
execution, and the storage is allocated from the heap, rather than the stack. The advantage of fixed
heap-dynamic arrays is flexibility—the array’s size always fits the problem.
The disadvantage is allocation time from the heap, which is longer than allocation time
from the stack.

5. A heap-dynamic array is one in which the binding of subscript ranges and storage allocation is
dynamic and can change any number of times during the array’s lifetime. The advantage of heap-
dynamic arrays over the others is flexibility: arrays can grow and shrink during program
execution as the need for space changes.
The disadvantage is that allocation and deallocation take longer and may happen many
times during execution of the program.

Examples of the first two categories follow. Arrays declared in C and C++ functions that
include the static modifier are static. Arrays that are declared in C and C++ functions (without
the static specifier) are examples of fixed stack-dynamic arrays.
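
As a small illustration of the heap-dynamic category, Python's built-in list can grow and shrink while the program runs. The sketch below is only an example; the name `readings` and the sample values are assumptions.

```python
readings = []                  # no size is declared; the list starts empty
for value in (4, 5, 7, 83):
    readings.append(value)     # the list grows as elements arrive
grown = len(readings)          # number of elements after growing

readings.pop()                 # remove the last element
del readings[0]                # remove the first element
shrunk = len(readings)         # number of elements after shrinking
```

The price of this flexibility is the one named above: storage may be reallocated several times during execution.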
Array Initialization:
In Fortran 95, an array aggregate for a single-dimensioned array is a list of literals delimited by
parentheses and slashes, as in (/0, 5, 5/).

• C, C++, Java, and C# also allow initialization of their arrays:


int list [ ] = {4, 5, 7, 83};
char name [ ] = "freddie";

Arrays of strings in C and C++ can also be initialized with string literals. In this case, the
array is one of pointers to characters. For example,

char *names [ ] = {"Bob", "Jake", "Darcie"};

• In Java, similar syntax is used to define and initialize an array of references to String objects.

For example,
String[ ] names = {"Bob", "Jake", "Darcie"};

• Ada provides two mechanisms for initializing arrays in the declaration statement:
List : array (1..5) of Integer := (1, 3, 5, 7, 9);
Bunch : array (1..5) of Integer := (1 => 17, 3 => 34, others => 0);

Rectangular and Jagged Arrays:


A rectangular array is a multidimensional array in which all of the rows have the same
number of elements and all of the columns have the same number of elements. Rectangular arrays
model rectangular tables exactly.
A jagged array is one in which the lengths of the rows need not be the same. For
example, a jagged matrix may consist of three rows, one with 5 elements, one with 7 elements, and
one with 12 elements. Jagged arrays are made possible when multidimensional arrays are actually
arrays of arrays.
For example, a matrix would appear as an array of single-dimensioned arrays.
C, C++, and Java build multidimensional arrays as arrays of arrays, which makes jagged
arrays possible; they provide no separate rectangular array type. In those
languages, a reference to an element of a multidimensional array uses a separate pair of
brackets for each dimension.
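
The difference between the two shapes can be sketched in Python, where a multidimensional array is usually written as a list of lists. The row lengths 5, 7, and 12 follow the example in the text; the variable names are assumptions.

```python
# Rectangular: every row has the same number of elements.
rectangular = [[0] * 4 for _ in range(3)]

# Jagged: rows of 5, 7, and 12 elements.
jagged = [[0] * 5, [0] * 7, [0] * 12]

row_lengths = [len(row) for row in jagged]

# One pair of brackets per dimension selects a single element.
jagged[2][11] = 42

# A table is rectangular when all of its rows share one length.
is_rectangular = len({len(row) for row in rectangular}) == 1
```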
Slices:
A slice is some substructure of an array; nothing more than a referencing mechanism.
Slices are only useful in languages that have array operations.
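
A brief Python sketch of slicing follows. Note one assumption worth stating: slicing a Python list on the right-hand side produces a new list rather than a pure reference into the original, while a slice on the left-hand side of an assignment updates the original in place. The names are illustrative only.

```python
matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
vector = [10, 20, 30, 40, 50]

middle = vector[1:4]                       # elements at positions 1, 2, 3
second_row = matrix[1]                     # a whole row of the matrix
first_column = [row[0] for row in matrix]  # a column needs a comprehension

vector[1:3] = [0, 0]                       # slice assignment changes vector itself
```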
Implementation of Arrays
• Access function maps subscript expressions to an address in the array
• Access function for single-dimensioned arrays:
address(list[k]) = address(list[lower_bound]) + ((k - lower_bound) * element_size)
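
The access function for a single-dimensioned array can be written directly as a small Python function. This is a sketch of the arithmetic only: Python does not expose real addresses, and the base address 1000 and element size 4 are made-up values.

```python
def element_address(base_address, k, lower_bound, element_size):
    """Address of list[k], given the address of list[lower_bound]."""
    return base_address + (k - lower_bound) * element_size

# Hypothetical array with subscripts 1..10 stored at address 1000,
# 4 bytes per element.
first = element_address(1000, 1, 1, 4)
fifth = element_address(1000, 5, 1, 4)
```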
Accessing Multi-dimensioned Arrays
• Two common ways:
– Row major order (by rows) – used in most languages
– Column major order (by columns) – used in Fortran
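
The two storage orders differ only in how a pair of subscripts is turned into a position in linear memory. The following sketch assumes zero-based subscripts and hypothetical function names.

```python
def row_major_offset(i, j, n_rows, n_cols):
    """Position of element (i, j) when rows are stored one after another."""
    return i * n_cols + j

def column_major_offset(i, j, n_rows, n_cols):
    """Position of element (i, j) when columns are stored one after another."""
    return j * n_rows + i

# A 3 x 4 matrix flattened both ways.
matrix = [[r * 10 + c for c in range(4)] for r in range(3)]
by_rows = [matrix[r][c] for r in range(3) for c in range(4)]
by_columns = [matrix[r][c] for c in range(4) for r in range(3)]

row_pos = row_major_offset(1, 2, 3, 4)
col_pos = column_major_offset(1, 2, 3, 4)
```

Multiplying either offset by the element size and adding the base address gives the element's address, exactly as in the single-dimensioned case.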
Record Types:
A record is a possibly heterogeneous aggregate of data elements in which the individual
elements are identified by names.
• Design issues:
– What is the syntactic form of references to a field?
– Are elliptical references allowed?
• The fundamental difference between a record and an array is that record elements, or fields, are
not referenced by indices. Instead, the fields are named with identifiers, and references to the fields
are made using these identifiers.
• In Java and C#, records can be defined as data classes, with nested records defined as nested
classes. Data members of such classes serve as the record fields.
• In Lua, records can be simulated with tables. For example, consider the following assignment statements:
employee.name = "MAHESH BABU"
employee.hourlyRate = 10000
These assignment statements create a table (record) named employee with two elements
(fields) named name and hourlyRate, both initialized.
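
For comparison, the same record can be sketched in Python with a data class. The class name `Employee` and the field name `hourly_rate` are assumptions chosen to mirror the example above.

```python
from dataclasses import dataclass

@dataclass
class Employee:
    name: str            # fields are identified by name, not by index
    hourly_rate: float

employee = Employee(name="MAHESH BABU", hourly_rate=10000)
initial_rate = employee.hourly_rate   # field reference by identifier
employee.hourly_rate = 12000          # fields can be updated individually
```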


Pointer and Reference Types:


A pointer type variable has a range of values that consists of memory addresses and a special value, nil. Pointers:
• provide the power of indirect addressing
• provide a way to manage dynamic memory
• can be used to access a location in the area where storage is dynamically created
(usually called a heap)

Design Issues of Pointers:

• What are the scope and lifetime of a pointer variable?
• What is the lifetime of a heap-dynamic variable?
• Are pointers restricted as to the type of value to which they can point?
• Are pointers used for dynamic storage management, indirect addressing, or both?
• Should the language support pointer types, reference types, or both?

Pointer Operations
• Two fundamental operations: assignment and dereferencing
• Assignment is used to set a pointer variable’s value to some useful address
• Dereferencing yields the value stored at the location represented by the pointer’s value
– Dereferencing can be explicit or implicit
– C++ uses an explicit operation via *
j = *ptr
sets j to the value located at ptr
Pointer Assignment Illustration

Fig: The assignment operation j = *ptr

Pointers in C and C++ :


• Pointers are extremely flexible but must be used with care
• Pointers can point at any variable regardless of when or where it was allocated
• Pointers are used for dynamic storage management and for addressing flexibility
• Pointer arithmetic is possible
• Explicit dereferencing and address-of operators
• Domain type need not be fixed (void *)
– void * can point to any type and can be type checked, but it cannot be dereferenced
Pointer Arithmetic in C and C++
float stuff[100]; float *p;
p = stuff;
*(p+5) is equivalent to stuff[5] and p[5]
*(p+i) is equivalent to stuff[i] and p[i]


Reference Types
• C++ includes a special kind of pointer type called a reference type that is used primarily for formal
parameters
– Advantages of both pass-by-reference and pass-by-value
• Java extends C++’s reference variables and allows them to replace pointers entirely
– References are references to objects, rather than being addresses
• C# includes both the references of Java and the pointers of C++
Reference Counter
Reference counters maintain a counter in every cell that stores the number of pointers
currently pointing at the cell. When the counter reaches zero, the cell can be reclaimed.
– Disadvantages: space required, execution time required, complications for cells connected circularly
– Advantage: it is intrinsically incremental, so significant delays in the application execution are avoided
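
The idea, and the difficulty with circularly connected cells, can be sketched with a toy simulation in Python. This is not how any particular language runtime is implemented; `Cell`, `add_ref`, `drop_ref`, and `freed` are hypothetical names introduced for the example.

```python
class Cell:
    """A heap cell that records how many pointers currently point at it."""
    def __init__(self, name):
        self.name = name
        self.ref_count = 0
        self.next = None          # optional pointer to another cell

freed = []                        # names of cells that have been reclaimed

def add_ref(cell):
    """Record one more pointer to the cell."""
    cell.ref_count += 1
    return cell

def drop_ref(cell):
    """Remove one pointer; reclaim the cell when no pointers remain."""
    cell.ref_count -= 1
    if cell.ref_count == 0:
        freed.append(cell.name)
        if cell.next is not None:
            drop_ref(cell.next)   # the reclaimed cell no longer points onward
            cell.next = None

# Acyclic chain a -> b: dropping the only pointer to a reclaims both cells.
a = add_ref(Cell("a"))
a.next = add_ref(Cell("b"))
drop_ref(a)

# Cycle x -> y -> x: after the external pointer to x is dropped,
# each cell still has one pointer from the other, so neither is reclaimed.
x = add_ref(Cell("x"))
y = Cell("y")
x.next = add_ref(y)
y.next = add_ref(x)
drop_ref(x)
```

Because each count is adjusted at the moment a pointer changes, the work is spread through execution, which is the incremental behaviour noted above.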

Type Checking:
Type checking is the activity of ensuring that the operands of an operator are of compatible
types.
• A compatible type is one that is either legal for the operator or is allowed under language rules to be
implicitly converted, by compiler-generated code, to a legal type. This automatic conversion is called a
coercion.
• A type error is the application of an operator to an operand of an inappropriate type.
• If all type bindings are static, nearly all type checking can be static.
• If type bindings are dynamic, type checking must be dynamic.
• Definition: A programming language is strongly typed if type errors are always detected.
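
Python binds types dynamically, so it offers a compact illustration of dynamic type checking and of coercion. The helper `try_add` is a hypothetical name used only for this sketch.

```python
def try_add(left, right):
    """Apply + and report whether the operand types were compatible."""
    try:
        return left + right
    except TypeError:
        # The type error is detected at run time, when + is applied.
        return "type error"

coerced = try_add(2, 3.5)    # the int operand is implicitly converted to float
rejected = try_add(2, "3")   # no implicit conversion exists between int and str
```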
Type Compatibility:
Name type compatibility means that two variables have compatible types if they are in either
the same declaration or in declarations that use the same type name.
• Easy to implement but highly restrictive:
– Subranges of integer types are not compatible with integer types
– Formal parameters must be the same type as their corresponding actual parameters
(Pascal)
Structure type compatibility means that two variables have compatible types if their types have
identical structures.
• More flexible, but harder to implement.
Strong Typing:
Strong typing allows the detection of the misuses of variables that result in type errors.
Language examples:
– FORTRAN 77 is not strongly typed: parameters, EQUIVALENCE
– Pascal is not: variant records
– C and C++ are not: parameter type checking can be avoided; unions are not type checked
– Ada is, almost (UNCHECKED CONVERSION is a loophole); Java is similar
• Coercion rules strongly affect strong typing; they can weaken it considerably (C++ versus Ada)
• Although Java has just half the assignment coercions of C++, its strong typing is still far less effective
than that of Ada.

Tuple:
A tuple is a compound data type having a fixed number of terms. Each term in a tuple is known as an
element. The number of elements is the size of the tuple.
• A tuple can have any number of items and they may be of different types (integer, float, list,
string, etc.).
• Tuples are used to store multiple items in a single variable.
• Tuple is one of 4 built-in data types in Python used to store collections of data; the other 3
are List, Set, and Dictionary, all with different qualities and usage.
• A tuple is a collection which is ordered and unchangeable.
• In computer science, tuples come in many forms. Most typed functional programming
languages implement tuples directly as product types, tightly associated with algebraic data
types, pattern matching, and destructuring assignment.
• Many programming languages offer an alternative to tuples, known as record types,
featuring unordered elements accessed by label.

Advantages of Tuples:

Tuples offer the following advantages –

• Tuples are fixed in size, i.e., we cannot add elements to or delete elements from a tuple.
• We can search for any element in a tuple.
• Tuples are generally faster than lists, because their contents cannot change.
• Tuples can be used as dictionary keys, provided they contain only immutable values such as strings and numbers.
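
These properties can be checked with a short Python sketch. The tuple contents and the dictionary `visits` are made-up values for illustration.

```python
point = (3, 4.5, "label")      # elements may be of different types

size = len(point)              # the size of the tuple
x, y, name = point             # destructuring assignment
has_label = "label" in point   # searching for an element

try:
    point[0] = 10              # tuples are unchangeable
    mutated = True
except TypeError:
    mutated = False

# Tuples of immutable values can serve as dictionary keys.
visits = {(0, 0): "origin", (1, 2): "checkpoint"}
```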

List Types:
A list is a collection of items held in an ordered or unordered structure. Lists are typically used to
store items and to add or delete items as a program runs, so the program must have enough memory
available to accommodate the changes made to the list. In functional programming languages, the list is
the most versatile data type, used to store a collection of similar data items.
The concept is similar to arrays in object-oriented programming. List items can be written in
square brackets, separated by commas. The way of writing data into a list varies from language to language.
• There are different sorts of lists, such as linear lists and linked lists. A list can also be regarded as an abstract
data type.
• Linear List - A static abstract data type. The amount of data does not change at run time.
• Linked List - A dynamic abstract data type. It uses pointers to vary the memory used at run time.
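
A linked list can be sketched in a few lines of Python, with object references playing the role of pointers. `Node` and `to_python_list` are hypothetical names for this example.

```python
class Node:
    """One cell of a singly linked list."""
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node    # reference to the following node, or None

def to_python_list(head):
    """Walk the chain from head and collect the values in order."""
    items = []
    while head is not None:
        items.append(head.value)
        head = head.next
    return items

head = Node(2, Node(3, Node(5)))
head.next.next.next = Node(7)    # grow: link a new node at the end
before = to_python_list(head)

head = head.next                 # shrink: unlink the first node
after = to_python_list(head)
```

Only references are rewired when the list grows or shrinks; no existing element has to be moved.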

TYPE EQUIVALENCE:
Name Type Equivalence
Name type equivalence means that two variables have equivalent types if they are in either the
same declaration or in declarations that use the same type name.
• Easy to implement but highly restrictive:
– Subranges of integer types are not equivalent to integer types
– Formal parameters must be the same type as their corresponding actual parameters

Structure Type Equivalence:


Structure type equivalence means that two variables have equivalent types if their types have
identical structures.
• More flexible, but harder to implement
Consider the problem of two structured types:
– Are two record types equivalent if they are structurally the same but use different field names?
– Are two array types equivalent if they are the same except that the subscripts are different?
(e.g. [1..10] and [0..9])
– Are two enumeration types equivalent if their components are spelled differently?
– With structural type equivalence, you cannot differentiate between types of the same structure
(e.g., different units of speed, both float)
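
The last point can be illustrated in Python with two named tuple types that have the same structure but different names. This is a loose analogy rather than a description of how a compiler checks equivalence; `SpeedMph` and `SpeedKph` are hypothetical types.

```python
from typing import NamedTuple

class SpeedMph(NamedTuple):
    value: float

class SpeedKph(NamedTuple):
    value: float

mph = SpeedMph(60.0)
kph = SpeedKph(60.0)

# By name, the two types are different.
same_name = type(mph) is type(kph)

# By structure, they are identical: one field called value.
same_structure = SpeedMph._fields == SpeedKph._fields

# Tuple comparison looks only at the contents, so the two different
# units of speed are indistinguishable.
structurally_equal = (mph == kph)
```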

************************************************************
