PPL: Names, Variables, and Scopes
2.1 Introduction
Imperative programming languages are, to varying degrees, abstractions of the underlying
von Neumann computer architecture. The architecture’s two primary components are its memory,
which stores both instructions and data, and its processor, which provides operations for modifying the
contents of the memory. The abstractions in a language for the memory cells of the machine are variables.
In some cases, the characteristics of the abstractions are very close to the characteristics of
the cells; an example of this is an integer variable, which is usually represented directly in one or more
bytes of memory. In other cases, the abstractions are far removed from the organization of the hardware
memory, as with a three-dimensional array, which requires a software mapping function to support the
abstraction.
Design Issues:
The following are the primary design issues for names:
Are names case sensitive?
Are the special words of the language reserved words or keywords?
These issues are discussed in the following two subsections, which also
include examples of several design choices.
2.2 Name Forms:
A name is a string of characters used to identify some entity in a program.
Fortran 95+ allows up to 31 characters in its names. C99 has no length limitation on its internal
names, but only the first 63 are significant. External names in C99 (those defined outside functions,
which must be handled by the linker) are restricted to 31 characters. Names in Java, C#, and Ada
have no length limit, and all characters in them are significant. C++ does not specify a length limit
on names.
Names in most programming languages have the same form: a letter followed by a string
consisting of letters, digits, and underscore characters ( _ ). Although underscore characters were widely
used to form names in the 1970s and 1980s, that practice is now far less popular. In the C-based
languages, it has to a large extent been replaced by the so-called camel notation, in which all of the words
of a multiple-word name except the first are capitalized, as in myStack. Note that the use of underscores
and mixed case in names is a programming style issue, not a language design issue.
All variable names in PHP must begin with a dollar sign. In Perl, the special character at the
beginning of a variable’s name, $, @, or %, specifies its type (although in a different sense than in other
languages). In Ruby, special characters at the beginning of a variable’s name, @ or @@, indicate that the
variable is an instance or a class variable, respectively.
In many languages, notably the C-based languages, uppercase and lowercase letters in names are
distinct; that is, names in these languages are case sensitive.
For example, the following three names are distinct in C++: rose, ROSE, and Rose.
PRINCIPLES OF PROGRAMMING LANGUAGES (PPL) HEMANTH, Asst. Professor, Dept. of CSE, NRCM
Special Words:
Special words in programming languages are used to make programs more readable by
naming actions to be performed. They also are used to separate the syntactic parts of statements and
programs. In most languages, special words are classified as reserved words, which means they cannot be
redefined by programmers, but in some they are only keywords, which means they can be redefined.
A keyword is a word of a programming language that is special only in certain contexts.
FORTRAN is the only remaining widely used language whose special words are keywords.
A reserved word is a special word of a programming language that cannot be used as a
name. As a language design choice, reserved words are better than keywords because the ability to redefine
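keywords can be confusing. For example, in FORTRAN, one could have the following statements:
Integer Real
Real Integer
These statements declare the program variable Real to be of type Integer and the variable Integer to be
of type Real.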
In most languages, names that are defined in other program units, such as Java packages and
C and C++ libraries, can be made visible to a program. These names are predefined, but visible only if
explicitly imported. Once imported, they cannot be redefined.
2.3 Variables:
A program variable is an abstraction of a computer memory cell or collection of cells.
Programmers often think of variable names as names for memory locations, but there is much more to a
variable than just a name.
A variable can be characterized as a sextuple of attributes: (name, address, value, type, lifetime,
scope).
Names:
Variable names are the most common names in programs. Variables themselves can be categorized by their
storage bindings and lifetimes as follows:
a) Static Variables:
Static variables are those that are bound to memory cells before program execution begins
and remain bound to those same memory cells until program execution terminates. Statically bound
variables have several valuable applications in programming.
One advantage of static variables is efficiency. All addressing of static variables can be direct;
other kinds of variables often require indirect addressing, which is slower. Also, no run-time
overhead is incurred for allocation and deallocation of static variables, although this time is
often negligible.
One disadvantage of static binding to storage is reduced flexibility; in particular, a language
that has only static variables cannot support recursive subprograms. C and C++ allow
programmers to include the static specifier on a variable definition in a function, making
the variables it defines static.
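As a small illustration of the static specifier, the following C++ sketch keeps a counter alive between calls. The names next_ticket and ticket_demo are invented for this example and do not come from the notes.

```cpp
#include <cassert>

// Hypothetical function: hands out consecutive ticket numbers.
int next_ticket() {
    static int issued = 0;  // storage is bound once and kept until the program ends
    issued += 1;            // the value survives from one call to the next
    return issued;
}

// Usage: the effect of earlier calls is visible in later ones.
void ticket_demo() {
    assert(next_ticket() == 1);
    assert(next_ticket() == 2);
}
```

Without the static specifier, issued would be a stack-dynamic variable and every call would return 1.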
b) Stack-Dynamic Variables:
Stack-dynamic variables are those whose storage bindings are created when their declaration
statements are elaborated, but whose types are statically bound.
Elaboration of such a declaration refers to the storage allocation and binding process
indicated by the declaration, which takes place when execution reaches the code to which the
declaration is attached. Therefore, elaboration occurs during run time. As their name indicates,
stack-dynamic variables are allocated from the run-time stack.
Some languages—for example, C++ and Java—allow variable declarations to occur anywhere a
statement can appear.
The main advantage of stack-dynamic variables is support for recursion: to be useful, at least in most
cases, recursive subprograms require some form of dynamic local storage so that each active copy of the
recursive subprogram has its own version of the local variables.
The disadvantages of stack-dynamic variables, relative to static variables, are the run-time overhead of
allocation and deallocation and possibly slower accesses, because indirect addressing is required.
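The point about recursion can be seen in a short C++ sketch. The function factorial is an illustrative choice, not an example taken from the notes; each active call has its own n and result on the run-time stack.

```cpp
#include <cassert>

// Each active call of factorial owns separate copies of n and result.
long factorial(int n) {
    assert(n >= 0);                      // precondition assumed for this sketch
    long result = 1;                     // stack-dynamic: allocated on every call
    if (n > 1) {
        result = n * factorial(n - 1);   // the inner call uses fresh storage
    }
    return result;                       // storage is released when the call returns
}
```

If result were declared static, all active calls would share one cell and the recursion would no longer be correct.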
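c) Explicit Heap-Dynamic Variables:
Explicit heap-dynamic variables are nameless memory cells that are allocated and deallocated by explicit
run-time instructions written by the programmer, and that can be referenced only through pointers or
references. In C++, for example, an explicit heap-dynamic variable of int type is created by the new
operator and can then be referenced through a pointer, such as intnode.
Later, the variable is deallocated by the delete operator. C++ requires the explicit
deallocation operator delete.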
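The code for this example is not reproduced in these notes. A minimal sketch consistent with the description might look as follows; the wrapper function heap_demo and the value 42 are assumptions added so the fragment can be run.

```cpp
#include <cassert>

int heap_demo() {
    int *intnode;              // create a pointer
    intnode = new int;         // create the heap-dynamic variable
    *intnode = 42;             // the cell has no name; it is reached via the pointer
    assert(*intnode == 42);
    int copy = *intnode;
    delete intnode;            // deallocate the heap-dynamic variable
    return copy;
}
```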
d) Implicit Heap-Dynamic Variables:
Implicit heap-dynamic variables are bound to heap storage only when they are assigned values. In
fact, all their attributes are bound every time they are assigned.
For example, consider the following JavaScript assignment statement:
Marks = [74, 84, 86, 90, 71];
Regardless of whether the variable named Marks was previously used in the program or what
it was used for, it is now an array of five numeric values.
The advantage of such variables is that they have the highest degree of flexibility, allowing
highly generic code to be written.
Scope:
The scope of a variable is the range of statements in which the variable is visible.
A variable is visible in a statement if it can be referenced in that statement.
A variable is local in a program unit or block if it is declared there.
The nonlocal variables of a program unit or block are those that are visible within the program unit or
block but are not declared there.
Static Scope :
Static scoping is a method of binding names to nonlocal variables in which the scope of a variable
can be determined statically, that is, prior to execution.
There are two categories of static-scoped languages:
1. Those in which subprograms can be nested.
2. Those in which subprograms cannot be nested.
Ada and JavaScript allow nested subprograms, but the C-based languages do not.
When a compiler for a static-scoped language finds a reference to a variable, the attributes
of the variable are determined by finding the statement in which it was declared.
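For example, suppose a reference is made to a variable x in subprogram Sub1. The search for its
declaration begins in Sub1 and, if none is found there, continues in the subprogram that declared Sub1,
which is called its static parent. If a declaration of x is not found there, the search continues to the next
larger enclosing unit (the unit that declared Sub1’s parent), and so forth, until a declaration for x is found
or the largest unit’s declarations have been searched without success.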
The static parent of subprogram Sub1, and its static parent, and so forth up to and including the main
program, are called the static ancestors of Sub1.
Consider the following JavaScript function, big, in which the two functions sub1 and sub2 are nested:
function big() {
    function sub1() {
        var x = 7;
        sub2();
    }
    function sub2() {
        var y = x;
    }
    var x = 3;
    sub1();
}
Under static scoping, the reference to the variable x in sub2 is to the x declared in the function big.
This is true because the search for x begins in the function in which the reference occurs, sub2, but no
declaration for x is found there. The search continues in the static parent of sub2, big, where the declaration
of x is found. The x declared in sub1 is ignored, because it is not in the static ancestry of sub2.
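The same resolution can be tried in C++. C++ has no nested named functions, so this sketch uses lambdas as a stand-in for sub1 and sub2; that substitution, and returning y so the result can be inspected, are assumptions of the example.

```cpp
#include <cassert>

// Returns the value that sub2 reads for x.
int big() {
    int x = 3;                        // the x declared in big
    int y = 0;
    auto sub2 = [&]() { y = x; };     // x is resolved lexically to big's x
    auto sub1 = [&]() {
        int x = 7;                    // a different x, local to sub1
        (void)x;                      // not used; only here to mirror the example
        sub2();
    };
    sub1();
    assert(y == 3);                   // sub2 saw big's x, not sub1's
    return y;
}
```

Even though sub2 is called from sub1, the x declared in sub1 plays no part, because sub1 is not a static ancestor of sub2.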
Blocks:
Blocks allow a section of code to have its own local variables whose scope is minimized. Such
variables are stack dynamic, so they have their storage allocated when the section is entered and deallocated
when the section is exited.
Blocks provide the origin of the phrase block-structured language.
The C-based languages allow any compound statement (a statement sequence surrounded by
matched braces) to have declarations and thereby define a new scope. Such compound
statements are called blocks.
For example, if list were an integer array, one could write
if (list[i] < list[j])
{
    int temp;
    temp = list[i];
    list[i] = list[j];
    list[j] = temp;
}
The scopes created by blocks, which could be nested in larger blocks, are treated exactly like
those created by subprograms. References to variables in a block that are not declared there
are connected to declarations by searching enclosing scopes (blocks or subprograms) in order
of increasing size.
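A short C++ sketch of nested block scopes follows. The function block_demo and its variable names are invented for illustration.

```cpp
#include <cassert>

int block_demo() {
    int count = 1;
    int seen_inside = 0;
    {
        int count = 10;          // new scope: this count hides the outer one
        assert(count == 10);
        seen_inside = count;     // seen_inside is found in the enclosing scope
    }                            // the inner count is deallocated here
    assert(count == 1);          // the outer count is visible again
    return seen_inside + count;  // 10 + 1
}
```

The reference to seen_inside inside the block is resolved by searching outward to the enclosing function body, exactly as a nonlocal reference in a nested subprogram would be.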
Declaration order:
In JavaScript, local variables can be declared anywhere in a function, but the scope of such a
variable is always the entire function. If used before its declaration in the function, such a variable
has the value undefined.
The for statements of C++, Java, and C# allow variable definitions in their initialization
expressions. In early versions of C++, the scope of such a variable was from its definition to the
end of the smallest enclosing block. In the standard version, however, the scope is restricted to
the for construct, as is the case with Java and C#.
Global Scope:
Some languages, including C, C++, PHP, JavaScript, and Python, allow a program
structure that is a sequence of function definitions, in which variable definitions can appear outside the
functions.
Definitions outside functions in a file create global variables, which potentially can be visible to
those functions.
C and C++ have both declarations and definitions of global data. Declarations specify types and
other attributes but do not cause allocation of storage.
Definitions specify attributes and cause storage allocation. For a specific global name, a C
program can have any number of compatible declarations, but only a single definition.
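The difference between a declaration and a definition can be sketched in a single C++ file. The names total, add_to_total and total_demo are hypothetical.

```cpp
#include <cassert>

extern int total;    // declaration: gives the type, allocates no storage
extern int total;    // a second compatible declaration is allowed

int add_to_total(int amount) {
    total += amount; // legal here because total has already been declared
    return total;
}

int total = 100;     // the single definition: storage is allocated here

void total_demo() {
    assert(add_to_total(5) == 105);
}
```

Adding a second definition such as int total = 0; to the same program would be rejected, whereas further extern declarations are harmless.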
Declaration order and global variables are also issues in the class and member declarations in
object-oriented languages.
Dynamic Scope:
Dynamic scoping is based on the calling sequence of subprograms, not on their spatial
relationship to each other. Thus, the scope can be determined only at run time.
The scope of variables in APL, SNOBOL4, and the early versions of LISP is dynamic.
Perl and Common LISP also allow variables to be declared to have dynamic scope, although the
default scoping mechanism in these languages is static.
Consider the following example on dynamic scoping:
function big()
{
    function sub1()
    {
        var x = 7;
    }
    function sub2() {
        var y = x;
        var z = 3;
    }
    var x = 3;
}
The meaning of the identifier x referenced in sub2 is dynamic—it cannot be determined at compile time.
It may reference the variable from either declaration of x, depending on the calling sequence.
Dynamic scoping also makes programs much more difficult to read, because the calling sequence of
subprograms must be known to determine the meaning of references to nonlocal variables.
Subprograms are always executed in the environment of all previously called subprograms that have
not yet completed their executions. As a result, dynamic scoping results in less reliable programs
than static scoping.
A second problem with dynamic scoping is the inability to type check references to non-locals
statically. This problem results from the inability to statically find the declaration for a variable
referenced as a nonlocal.
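C++ itself is statically scoped, so dynamic scoping can only be emulated. The sketch below does this with an explicit stack of bindings for the name x; the stack x_bindings and the function big_dynamic are assumptions made for the illustration, not a description of how any language implements dynamic scope.

```cpp
#include <cassert>
#include <vector>

// Hypothetical emulation: the active bindings of the name x, newest last.
std::vector<int> x_bindings;

int sub2() {
    assert(!x_bindings.empty());
    return x_bindings.back();      // the most recently activated x still alive
}

int sub1() {
    x_bindings.push_back(7);       // sub1's own x becomes the current binding
    int y = sub2();
    x_bindings.pop_back();         // the binding ends when sub1 returns
    return y;
}

// Returns the x seen by sub2, reached either through sub1 or directly.
int big_dynamic(bool call_through_sub1) {
    x_bindings.push_back(3);       // big's x
    int y = call_through_sub1 ? sub1() : sub2();
    x_bindings.pop_back();
    return y;
}
```

Here big_dynamic(true) yields 7 and big_dynamic(false) yields 3: the same reference in sub2 means different variables depending on the calling sequence.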
Scope and Lifetime:
Consider a variable that is declared in a Java method that contains no method
calls. The scope of such a variable is from its declaration to the end of the method. The lifetime of
that variable is the period of time beginning when the method is entered and ending when
execution of the method terminates.
Although the scope and lifetime of the variable are clearly not the same, because static scope is a
textual, or spatial, concept whereas lifetime is a temporal concept, they at least appear to be related in
this case.
This apparent relationship does not hold in general. For example, scope and lifetime are unrelated
when subprogram calls are involved.
Consider the following C++ functions:
void printheader() {
    ...
} /* end of printheader */
void compute() {
    int sum;
    ...
    printheader();
} /* end of compute */
The scope of the variable sum is completely contained within the compute function. It does not
extend to the body of the function printheader, although printheader executes in the midst of the
execution of compute. However, the lifetime of sum extends over the time during which printheader
executes. Whatever storage location sum is bound to before the call to printheader, that binding will
continue during and after the execution of printheader.
Referencing Environments:
The referencing environment of a statement is the collection of all variables that are
visible in the statement. The referencing environment of a statement in a static-scoped language is the
variables declared in its local scope plus the collection of all variables of its ancestor scopes that are visible.
In such a language, the referencing environment of a statement is needed while that statement is being
compiled, so code and data structures can be created to allow references to variables from other scopes
during run time.
The referencing environment of a statement includes the local variables, plus all of the
variables declared in the functions in which the statement is nested (excluding variables in nonlocal scopes
that are hidden by declarations in nearer functions). Each function definition creates a new scope and thus a
new environment.
Named Constants:
A named constant is a variable that is bound to a value only once. Named constants are useful
as aids to readability and program reliability. Readability can be improved, for example, by using the
name pi instead of the constant 3.14159265.
Consider the following skeletal Java program segment:
void example()
{
    int[] intList = new int[100];
    String[] strList = new String[100];
    ...
    for (index = 0; index < 100; index++) {
        ...
    }
    ...
    for (index = 0; index < 100; index++) {
        ...
    }
    ...
    average = sum / 100;
    ...
}
When this program must be modified to deal with a different number of data values, all occurrences of 100
must be found and changed. On a large program, this can be tedious and error prone. An easier and more
reliable method is to use a named constant as a program parameter, as follows:
void example()
{
    final int len = 100;
    int[] intList = new int[len];
    String[] strList = new String[len];
    ...
    for (index = 0; index < len; index++) {
        ...
    }
    ...
    for (index = 0; index < len; index++) {
        ...
    }
    ...
    average = sum / len;
    ...
}
Now, when the length must be changed, only one line must be changed (the variable len), regardless
of the number of times it is used in the program. This is another example of the benefits of abstraction.
The name len is an abstraction for the number of elements in some arrays and the number of iterations in
some loops. This illustrates how named constants can aid modifiability.
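A runnable C++ counterpart of the skeletal Java segment is sketched below. The loop bodies, which the original elides, are filled with placeholder work (storing and summing the indices) purely so that the fragment executes; the function name average_of_indices is likewise an assumption.

```cpp
#include <cassert>

int average_of_indices() {
    const int len = 100;                 // named constant: bound to a value once
    int intList[len];                    // array size taken from the constant
    int sum = 0;
    for (int index = 0; index < len; index++) {
        intList[index] = index;          // placeholder work
    }
    for (int index = 0; index < len; index++) {
        sum += intList[index];
    }
    assert(sum == 4950);                 // 0 + 1 + ... + 99
    return sum / len;                    // integer division
}
```

Changing the number of elements means editing only the definition of len.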
1. Introduction:
A data type defines a collection of data values and a set of predefined operations on those
values. Computer programs produce results by manipulating data. An important factor in determining the
ease with which they can perform this task is how well the data types available in the language being used
match the objects in the real-world of the problem being addressed. Therefore, it is crucial that a language
supports an appropriate collection of data types and structures.
There are a number of uses of the type system of a programming language. The most
practical of these is error detection through type checking, which is directed by the type
system of the language.
A value is any entity that can be manipulated by a program. Values can be evaluated,
stored, passed as arguments, returned as function results, and so on. Different programming
languages support different types of values:
C supports integers, real numbers, structures, arrays, unions, pointers to variables, and pointers to
functions. (Integers, real numbers, and pointers are primitive values; structures, arrays, and unions are
composite values.)
C++, which is largely a superset of C, supports all the above types of values plus objects. (Objects are
composite values.)
JAVA supports booleans, integers, real numbers, arrays, and objects. (Booleans, integers, and real
numbers are primitive values; arrays and objects are composite values.)
ADA supports booleans, characters, enumerands, integers, real numbers, records, arrays,
discriminated records, objects (tagged records), strings, pointers to data, and pointers to procedures.
(Booleans, characters, enumerands, integers, real numbers, and pointers are primitive values; records,
arrays, discriminated records, objects, and strings are composite values.)
Most programming languages group values into types. For instance, nearly all languages
make a clear distinction between integer and real numbers. Many languages also make a clear distinction
between booleans and integers: integers can be added and multiplied, while booleans can be subjected to
operations like not, and, and or.
Therefore we define a type to be a set of values, equipped with one or more operations that
can be applied uniformly to all these values.
Every programming language supports both primitive types, whose values are primitive and
composite types, whose values are composed from simpler values. Some languages also have recursive types,
a recursive type being one whose values are composed from other values of the same type.
2. Primitive types:
A primitive value is one that cannot be decomposed into simpler values. A primitive type is one
whose values are primitive. Every programming language provides built-in primitive types. Some languages
also allow programs to define new primitive types.
One or more primitive types are built-in to every programming language. The choice of built-in
primitive types tells us much about the programming language’s intended application area. Languages
intended for commercial data processing (such as COBOL) are likely to have primitive types whose
values are fixed-length strings and fixed-point numbers. Languages intended for numerical computation
(such as FORTRAN) are likely to have primitive types whose values are real numbers (with a choice of
precisions) and perhaps also complex numbers. A language intended for string processing (such as
SNOBOL) is likely to have a primitive type whose values are strings of arbitrary length.
The names of the primitive types vary from one language to another. For example, JAVA has
boolean, char, int, and float, whereas ADA has Boolean, Character, Integer, and Float. For the sake of
consistency, we shall use Boolean, Character, Integer, and Float as names for the most common primitive types:
Boolean = {false, true}
Character = {. . . , ‘a’, . . . , ‘z’, . . . , ‘0’, . . . , ‘9’, . . . , ‘?’, . . .}
Integer = {. . . ,−2,−1, 0,+1,+2, . . .}
Float = {. . . ,−1.0, . . . , 0.0, . . . ,+1.0, . . .}
The Boolean type has exactly two values, false and true. In some languages these two values
are denoted by the literals false and true, in others by predefined identifiers false and true.
The Character type is a language-defined or implementation-defined set of characters. The
chosen character set is usually ASCII (128 characters), ISO Latin-1 (256 characters), or
Unicode (65 536 characters in its original 16-bit form; the current standard defines a larger code space).
The Integer type is a language-defined or implementation-defined range of whole
numbers. The range is influenced by the computer’s word size and integer arithmetic.
For instance, on a 32-bit computer with two’s complement arithmetic, Integer will be
{−2 147 483 648, . . . ,+2 147 483 647}.
The Float type is a language-defined or implementation-defined subset of the (rational) real
numbers. The range and precision are determined by the computer’s word size and floating-
point arithmetic.
Because these ranges are implementation-defined, a program may behave differently on
different computers. One way to avoid such portability problems is to allow programs to define their own
integer and floating-point types, stating explicitly the desired range and/or precision for each type.
This approach is taken by ADA.
In ADA we can define a completely new primitive type by enumerating its values
(more precisely, by enumerating identifiers that will denote its values). Such a type is called
an enumeration type, and its values are called enumerands. For example, the ADA type definition:
type Month is (jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec);
defines a completely new type, whose set of values is:
Month = {jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec}
C and C++ also support enumerations, but in these languages an enumeration type is
actually an integer type, and each enumerand denotes a small integer. The C++ type definition:
enum Month {jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec};
defines Month to be an integer type, and binds jan to 0, feb to 1, and so on.
A discrete primitive type is a primitive type whose values have a one-to-one relationship
with a range of integers.
This is an important concept in ADA, in which values of any discrete primitive type may be
used for array indexing, counting, and so on. The discrete primitive types in ADA are Boolean,
Character, integer types, and enumeration types.
Most programming languages allow only integers to be used for counting and array
indexing. C and C++ allow enumerands also to be used for counting and array indexing, since they
classify enumeration types as integer types.
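The use of enumerands for counting and indexing can be sketched in C++ as follows. The table days_in (for a non-leap year) and the two functions are assumptions added for the illustration.

```cpp
#include <cassert>

enum Month {jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec};

// Days per month in a non-leap year, indexed by enumerand.
const int days_in[12] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};

int days_in_month(Month m) {
    assert(m >= 0 && m < 12);     // enumerands are small integers
    return days_in[m];            // an enumerand used as an array index
}

int days_in_first_quarter() {
    int total = 0;
    // Counting with enumerands: each converts implicitly to its integer value.
    for (int m = jan; m <= mar; m++) {
        total += days_in_month(static_cast<Month>(m));
    }
    return total;                 // 31 + 28 + 31
}
```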
3. Composite types:
A composite value (or data structure) is a value that is composed from simpler values.
A composite type is a type whose values are composite. Programming languages support a huge variety
of composite values: tuples, structures, records, arrays, algebraic types, discriminated records, objects,
unions, strings, lists, trees, sequential files, direct files, relations, etc.
All these composite values can be understood in terms of a small number of structuring concepts,
which are:
Cartesian products (tuples, records)
mappings (arrays)
disjoint unions (algebraic types, discriminated records, objects)
recursive types (lists, trees).
In a Cartesian product, values of several (possibly different) types are grouped into tuples.
The structures of C and C++, and the records of ADA, can be understood in terms of Cartesian
products.
Consider the following C++ definitions:
This type models dates even more crudely than its ADA counterpart in Example 2.5.
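The C++ definitions this passage refers to are not reproduced in these notes. The following is only a sketch of what such a Cartesian-product type could look like; the field names m and d and the helpers make_date and date_demo are assumptions.

```cpp
#include <cassert>

enum Month {jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec};

// A Date value is a pair: one Month together with one small integer.
struct Date {
    Month m;
    unsigned char d;   // day number; nothing restricts it to valid days
};

Date make_date(Month m, unsigned char d) {
    Date result = {m, d};
    return result;
}

// Usage: the type happily represents a date that does not exist.
void date_demo() {
    Date impossible = make_date(feb, 30);
    assert(impossible.m == feb);
    assert(impossible.d == 30);
}
```

Because the set of values is the full Cartesian product of the two component types, pairs such as (feb, 30) are values of the type, which is what makes the model crude.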
Records:
A record is a possibly heterogeneous aggregate of data elements in which the individual elements
are identified by names.
Operations on Records:
• Assignment is very common if the types are identical
• Ada allows record comparison
• Ada records can be initialized with aggregate literals
• COBOL provides MOVE CORRESPONDING, which copies a field of the source record to the
corresponding field in the target record
Evaluation and Comparison to Arrays :
• Records are used when collection of data values is heterogeneous
• Access to array elements is much slower than access to record fields, because subscripts are
dynamic (field names are static)
• Dynamic subscripts could be used with record field access, but it would disallow type checking and
it would be much slower.
The unions of C and C++ are not disjoint unions, since they have no tags. This obviously
makes tag test impossible, and makes projection unsafe. In practice, therefore, C programmers
enclose each union within a structure that also contains a tag.
A union is a type whose variables are allowed to store different type values at different times during
execution.
Consider the following C type definition:
union Untagged_Number
{
int ival;
float rval;
};
The set of values of this union type is: Untagged_Number = Integer ∪ Float.
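The practice of enclosing a union in a structure with a tag can be sketched as follows. The names Accuracy, Number, make_exact, make_inexact and as_float are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>

enum Accuracy { exact, inexact };    // hypothetical tag type

struct Number {
    Accuracy acc;                    // tag: records which member is in use
    union {
        int ival;                    // meaningful when acc == exact
        float rval;                  // meaningful when acc == inexact
    } content;
};

Number make_exact(int i) {
    Number n;
    n.acc = exact;
    n.content.ival = i;
    return n;
}

Number make_inexact(float r) {
    Number n;
    n.acc = inexact;
    n.content.rval = r;
    return n;
}

// Projection guarded by a tag test, so the wrong member is never read.
float as_float(const Number& n) {
    return n.acc == exact ? static_cast<float>(n.content.ival)
                          : n.content.rval;
}

void number_demo() {
    assert(as_float(make_exact(3)) == 3.0f);
}
```

The language does not enforce the tag test; the discipline rests entirely on the programmer.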
4. Recursive Types :
A recursive type is one defined in terms of itself. In this section we discuss two
common recursive types: lists and strings, as well as recursive types in general.
Lists:
A list is a sequence of values. A list may have any number of components, including none.
The number of components is called the length of the list. The unique list with no components is called
the empty list.
A list is homogeneous if all its components are of the same type; otherwise it is heterogeneous.
We consider only homogeneous lists here.
Typical list operations are:
• length
• emptiness test
• head selection (i.e., selection of the list’s first component)
• tail selection (i.e., selection of the list consisting of all but the first component)
• concatenation.
The following code constructs an IntList object with four nodes:
IntList primes = new IntList(
new IntNode(2, new IntNode(3,
new IntNode(5, new IntNode(7, null)))));
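The classes IntList and IntNode used above are not defined in these notes. As a rough C++ counterpart, the sketch below defines a node type in terms of itself and implements some of the listed operations; the field names elem and succ and the function names are assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Recursive type: a node refers to another value of the same type.
struct IntNode {
    int elem;
    const IntNode* succ;   // rest of the list; nullptr marks the empty list
};

bool is_empty(const IntNode* list) { return list == nullptr; }

std::size_t length(const IntNode* list) {
    std::size_t n = 0;
    for (; list != nullptr; list = list->succ) {
        n++;
    }
    return n;
}

int head(const IntNode* list) {
    assert(list != nullptr);   // head is undefined for the empty list
    return list->elem;
}

const IntNode* tail(const IntNode* list) {
    assert(list != nullptr);   // tail is undefined for the empty list
    return list->succ;
}
```

A list of the first four primes can then be built from four nodes chained through succ, and length reports 4 for it.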
5. Array Types:
An array is an aggregate of homogeneous data elements in which an individual element is
identified by its position in the aggregate, relative to the first element.
Array Design Issues
• What types are legal for subscripts?
• Are subscripting expressions in element references range checked?
• When are subscript ranges bound?
• When does allocation take place?
• What is the maximum number of subscripts?
• Can array objects be initialized?
• Is any kind of slice supported?
Subscript Binding:
The binding of the subscript type to an array variable is usually static, but the subscript value
ranges are sometimes dynamically bound.
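There are five categories of arrays, based on the binding of subscript ranges, the binding to storage, and
the place from which the storage is allocated. In the first four of these categories, once the subscript
ranges are bound and the storage is allocated, they remain fixed for the lifetime of the variable.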
1. A static array is one in which the subscript ranges are statically bound and storage allocation
is static (done before run time). The advantage of static arrays is efficiency: No dynamic
allocation or deallocation is required.
The disadvantage is that the storage for the array is fixed for the entire execution time
of the program.
2. A fixed stack-dynamic array is one in which the subscript ranges are statically bound, but the
allocation is done at declaration elaboration time during execution. The advantage of fixed stack-
dynamic arrays over static arrays is space efficiency. A large array in one subprogram can use the
same space as a large array in a different subprogram, as long as both subprograms are not active
at the same time. The same is true if the two arrays are in different blocks that are not active at
the same time.
The disadvantage is the required allocation and deallocation time.
3. A stack-dynamic array is one in which both the subscript ranges and the storage allocation are
dynamically bound at elaboration time. Once the subscript ranges are bound and the storage is
allocated, however, they remain fixed during the lifetime of the variable.
The advantage of stack-dynamic arrays over static and fixed stack-dynamic arrays is
flexibility. The size of an array need not be known until the array is about to be used.
4. A fixed heap-dynamic array is similar to a fixed stack-dynamic array, in that the subscript ranges
and the storage binding are both fixed after storage is allocated. The differences are that both the
subscript ranges and storage bindings are done when the user program requests them during
execution, and the storage is allocated from the heap, rather than the stack. The advantage of fixed
heap-dynamic arrays is flexibility—the array’s size always fits the problem.
The disadvantage is allocation time from the heap, which is longer than allocation time
from the stack.
5. A heap-dynamic array is one in which the binding of subscript ranges and storage allocation is
dynamic and can change any number of times during the array’s lifetime. The advantage of heap-dynamic
arrays over the others is flexibility: Arrays can grow and shrink during program
execution as the need for space changes. The disadvantage is that allocation and deallocation take
longer and may happen many times during execution of the program.
Examples of some of these categories follow. Arrays declared in C and C++ functions that
include the static modifier are static.
Arrays that are declared in C and C++ functions (without the static specifier) are
examples of fixed stack-dynamic arrays.
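Several of the categories can be placed side by side in one C++ function. This is a sketch only: treating new[] as a fixed heap-dynamic array and the library class std::vector as a heap-dynamic array are choices made for this illustration, and the function name is hypothetical.

```cpp
#include <cassert>
#include <vector>

int array_categories_demo() {
    static int a[3] = {1, 2, 3};   // static: storage bound before run time
    int b[3] = {4, 5, 6};          // fixed stack-dynamic: allocated at elaboration
    int* c = new int[3];           // fixed heap-dynamic: allocated on request
    c[0] = 7;
    std::vector<int> d;            // heap-dynamic: size may change at any time
    d.push_back(10);               // grows
    d.push_back(11);               // grows again
    d.pop_back();                  // shrinks
    assert(d.size() == 1);
    int result = a[0] + b[0] + c[0] + static_cast<int>(d.size());
    delete[] c;                    // explicit deallocation of the heap array
    return result;                 // 1 + 4 + 7 + 1
}
```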
Array Initialization:
In Fortran 95+, an array aggregate for a single-dimensioned array is a list of literals delimited by
parentheses and slashes.
Arrays of strings in C and C++ can also be initialized with string literals. In this case, the
array is one of pointers to characters.
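The example for this case is missing from the notes. A minimal C++ sketch is given below; it reuses the names that appear in the Java example of this section, and the helper functions name_count and name_length are assumptions. In C++ string literals are constant, so the element type is const char*.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// An array of three pointers, each pointing at the characters of a literal.
const char* names[] = {"Bob", "Jake", "Darcie"};

int name_count() {
    return static_cast<int>(sizeof(names) / sizeof(names[0]));
}

std::size_t name_length(int i) {
    assert(i >= 0 && i < name_count());
    return std::strlen(names[i]);   // follows the pointer to the characters
}
```

The number of elements is taken from the initializer, so adding a fourth literal enlarges the array without any other change.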
In Java, similar syntax is used to define and initialize an array of references to String objects.
For example,
String[] names = {"Bob", "Jake", "Darcie"};
Ada provides two mechanisms for initializing arrays in the declaration statement:
List : array (1..5) of Integer := (1, 3, 5, 7, 9);
Bunch : array (1..5) of Integer := (1 => 17, 3 => 34, others => 0);
Pointer Operations
• Two fundamental operations: assignment and dereferencing
• Assignment is used to set a pointer variable's value to some useful address
• Dereferencing yields the value stored at the location represented by the pointer's value
– Dereferencing can be explicit or implicit
– C++ uses an explicit operation via *
– j = *ptr sets j to the value located at ptr
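Both operations can be seen in a few lines of C++. The variable value, its initial contents and the function name pointer_demo are assumptions of this sketch.

```cpp
#include <cassert>

int pointer_demo() {
    int value = 206;
    int* ptr = &value;   // assignment: ptr now holds the address of value
    int j = *ptr;        // explicit dereference: j receives 206
    assert(j == 206);
    *ptr = 17;           // dereferencing on the left side changes value itself
    return j + value;    // 206 + 17
}
```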
Pointer Assignment Illustration
Reference Types
• C++ includes a special kind of pointer type called a reference type that is used primarily for formal
parameters
– Advantages of both pass-by-reference and pass-by-value
• Java extends C++’s reference variables and allows them to replace pointers entirely
– References are references to objects, rather than being addresses
• C# includes both the references of Java and the pointers of C++
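Python variables behave much like the Java-style references described above: a name refers to an object rather than exposing an address. The sketch below is an illustration in Python, not C++ or Java, and the function names are hypothetical. It shows that a callee can change the object a reference denotes, but cannot make the caller's variable refer to something else.

```python
def append_item(items):
    # items refers to the caller's list object; the mutation is visible outside.
    items.append(99)

def rebind(items):
    # Rebinding the local name does not affect the caller's variable.
    items = [0]
    return items

data = [1, 2]
alias = data            # both names refer to the same object
append_item(data)
rebind(data)

print(alias)            # [1, 2, 99]
print(alias is data)    # True
```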
Reference Counter
Reference counters: maintain a counter in every cell that stores the number of pointers
currently pointing at the cell. When the counter reaches zero, the cell can be reclaimed.
– Disadvantages: space required, execution time required, complications for cells connected circularly
– Advantage: it is intrinsically incremental, so significant delays in the application execution are avoided
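A toy model can make both the incremental behaviour and the circular-cell problem concrete. The Cell class and the point_to and release functions below are hypothetical names invented for this sketch; a real memory manager is considerably more involved.

```python
class Cell:
    """A heap cell with a reference counter (toy model)."""
    def __init__(self, name):
        self.name = name
        self.count = 0       # number of pointers currently pointing here
        self.target = None   # at most one outgoing pointer, for simplicity
        self.freed = False

def point_to(cell):
    """Record one new pointer to cell and return the cell."""
    cell.count += 1
    return cell

def release(cell):
    """Remove one pointer to cell; reclaim it when the count reaches zero."""
    cell.count -= 1
    if cell.count == 0:
        cell.freed = True
        if cell.target is not None:   # a freed cell drops its own pointer
            release(cell.target)

# A simple chain a -> b is reclaimed as soon as the last pointer goes away.
a, b = Cell("a"), Cell("b")
root = point_to(a)
a.target = point_to(b)
release(root)
print(a.freed, b.freed)    # True True

# A cycle c -> d -> c is never reclaimed once the outside pointer is gone.
c, d = Cell("c"), Cell("d")
root = point_to(c)
c.target = point_to(d)
d.target = point_to(c)
release(root)
print(c.freed, d.freed)    # False False
```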
Type Checking:
Type checking is the activity of ensuring that the operands of an operator are of compatible
types.
• A compatible type is one that is either legal for the operator, or is allowed under language rules to be
implicitly converted, by compiler-generated code, to a legal type. This automatic conversion is called a
coercion.
• A type error is the application of an operator to an operand of an inappropriate type.
• If all type bindings are static, nearly all type checking can be static.
• If type bindings are dynamic, type checking must be dynamic.
• Def: A programming language is strongly typed if type errors are always detected.
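Python binds types dynamically, so it can serve as a small illustration of dynamic type checking and coercion. The helper try_add is a hypothetical name used only for this sketch.

```python
def try_add(a, b):
    """Return a + b, or the string 'type error' if the operand types are incompatible."""
    try:
        return a + b
    except TypeError:
        # The check happens at run time, when + is actually applied.
        return "type error"

print(try_add(1, 2.5))     # 3.5 (the int operand is implicitly converted to float)
print(try_add(1, "2"))     # type error (no coercion between int and str)
```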
Type Compatibility:
• Name type compatibility means that two variables have compatible types if they are in either
the same declaration or in declarations that use the same type name. It is easy to implement but
highly restrictive:
– Subranges of integer types are not compatible with integer types
– Formal parameters must be the same type as their corresponding actual parameters
(Pascal)
• Structure type compatibility means that two variables have compatible types if their types have
identical structures. It is more flexible, but harder to implement.
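The difference between the two rules can be sketched with a tiny, hypothetical type descriptor. RecordType, name_compatible, and structure_compatible are invented for this example and do not correspond to any particular compiler.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RecordType:
    """A hypothetical type descriptor: a type name plus (field, type) pairs."""
    name: str
    fields: tuple

def name_compatible(t1, t2):
    # Compatible only if both declarations use the same type name.
    return t1.name == t2.name

def structure_compatible(t1, t2):
    # Compatible if the structures match, whatever the types are called.
    return t1.fields == t2.fields

celsius = RecordType("Celsius", (("value", "float"),))
fahrenheit = RecordType("Fahrenheit", (("value", "float"),))

print(name_compatible(celsius, fahrenheit))       # False
print(structure_compatible(celsius, fahrenheit))  # True
```

The structural check here is deliberately simplistic. A real checker must compare nested types recursively and decide questions such as whether differing field names matter, which is one reason structure compatibility is harder to implement.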
Strong Typing:
Strong typing allows the detection of misuses of variables that result in type errors.
Language examples:
– FORTRAN 77 is not strongly typed: parameters, EQUIVALENCE
– Pascal is not: variant records
– C and C++ are not: parameter type checking can be avoided; unions are not type checked
– Ada is, almost (UNCHECKED CONVERSION is a loophole); Java is similar
• Coercion rules strongly affect strong typing; they can weaken it considerably (C++ versus Ada)
• Although Java has just half the assignment coercions of C++, its strong typing is still far less effective
than that of Ada.
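The kind of loophole that an unchecked union or an unchecked conversion opens is that the same bits are read back as a different type with no check. The sketch below imitates that effect in Python with the standard struct module. This is an analogy only: in Python the reinterpretation is explicit, and float_bits is a hypothetical helper name.

```python
import struct

def float_bits(x):
    """Reinterpret the 4 bytes of a single-precision float as an unsigned int."""
    raw = struct.pack("<f", x)           # store the value as a 32-bit float
    return struct.unpack("<I", raw)[0]   # read the same bytes as an integer

print(hex(float_bits(1.0)))   # 0x3f800000
```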
Tuple:
A tuple is a compound data type having a fixed number of terms. Each term in a tuple is known as an
element. The number of elements is the size of the tuple.
A tuple can be defined with any number of items, and they may be of different types (integer, float, list,
string, etc.); once created, its size does not change.
Tuples are used to store multiple items in a single variable.
Tuple is one of the four built-in data types in Python used to store collections of data; the other three
are List, Set, and Dictionary, all with different qualities and usage.
A tuple is a collection that is ordered and unchangeable.
In computer science, tuples come in many forms. Most typed functional programming
languages implement tuples directly as product types, tightly associated with algebraic data
types, pattern matching, and destructuring assignment.
Many programming languages offer an alternative to tuples, known as record types,
featuring unordered elements accessed by label.
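The contrast between positional and labelled access can be sketched in Python with collections.namedtuple. This is only an approximation of a record type, since a namedtuple still keeps its fields in order; the Point type is a hypothetical example.

```python
from collections import namedtuple

# A plain tuple: elements are identified by position only.
point = (3, 4)
x, y = point                   # destructuring assignment

# A record-like type: the same data, with elements accessed by label.
Point = namedtuple("Point", ["x", "y"])
p = Point(x=3, y=4)

print(point[0], p.x)           # 3 3
print(p == point)              # True
```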
Advantages of Tuples:
• Tuples are fixed in size, i.e., we cannot add elements to or delete elements from a tuple.
• We can search for any element in a tuple.
• Tuples can be faster than lists, because they hold a constant set of values.
• Tuples can be used as dictionary keys, provided their elements are immutable values such as strings and numbers.
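These properties can be checked directly in Python. The values and variable names below are arbitrary and chosen only for illustration.

```python
point = (1, 2.5, "label")        # elements of different types

# Fixed size and unchangeable: item assignment is rejected.
try:
    point[0] = 10
    mutated = True
except TypeError:
    mutated = False

# Searching: membership test and index lookup.
found = "label" in point
position = point.index(2.5)

# Dictionary key: works because every element of the key is hashable.
grid = {(0, 0): "origin"}

print(mutated, found, position)  # False True 1
print(grid[(0, 0)])              # origin
```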
List Types:
A list is a collection of items held in an ordered or unordered structure. A list can be used for
tasks such as storing, adding, and deleting items. To perform these tasks, the program must have enough
memory to keep up with the changes made to the list. The list is the most versatile data type available in
functional programming languages for storing a collection of similar data items.
The concept is similar to arrays in object-oriented programming. In many languages, list items are written
inside square brackets and separated by commas, although the syntax for writing data into a list varies from
language to language.
• There are different sorts of lists, such as the linear list and the linked list. A list can also be treated as an abstract
data type.
Linear List - A static abstract data type. The amount of data does not change at run time.
Linked List - A dynamic abstract data type. It uses pointers to vary the memory used at run time.
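A linked list can be sketched in a few lines. The Node and LinkedList classes below are a minimal, hypothetical implementation in Python, where object references play the role of pointers; they are meant only to show how the structure grows and shrinks one node at a time.

```python
class Node:
    """One element of a singly linked list."""
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node

class LinkedList:
    """Minimal singly linked list that grows and shrinks at run time."""
    def __init__(self):
        self.head = None

    def push_front(self, value):
        # The new node points at the old head.
        self.head = Node(value, self.head)

    def pop_front(self):
        # Unlink the first node; raises IndexError if the list is empty.
        if self.head is None:
            raise IndexError("pop from empty list")
        node = self.head
        self.head = node.next
        return node.value

    def to_list(self):
        # Walk the chain of nodes and collect the values in order.
        out, node = [], self.head
        while node is not None:
            out.append(node.value)
            node = node.next
        return out

items = LinkedList()
for v in (3, 2, 1):
    items.push_front(v)
print(items.to_list())     # [1, 2, 3]
print(items.pop_front())   # 1
print(items.to_list())     # [2, 3]
```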
TYPE EQUIVALENCE:
Name Type Equivalence
Name type equivalence means that two variables have equivalent types if they are in either the
same declaration or in declarations that use the same type name.
It is easy to implement but highly restrictive:
– Subranges of integer types are not equivalent to integer types
– Formal parameters must be the same type as their corresponding actual parameters
************************************************************