next up previous contents
Next: 4. Classes and Objects Up: Algorithms and Data Structures Previous: 2. Simple Algorithms and   Contents

Subsections

3. Introduction to Object-oriented Programming

The question always arises: is object-oriented programming a new paradigm? i.e. is it based on a radically different model of computing, a totally new way of programming and thinking about programming?

An analysis of everyday meaning of the words object-oriented programming would lead us to equate it to simply programming with abstract-data-types. However, to achieve what is now understood as object-orientation, we must add inheritance, and late binding (run-time binding).

Programming based on ADTs, without inheritance, is termed object-based.

Inheritance and late binding facilitate reuse, in the sense that software modules -- classes, or families of classes -- may be extended without modifying the original. Here, I use the term module in a general sense - to signify a fundamental unit of software; the components of a module, e.g. functions, individual statements, should not normally be separated from the module.

The concept of reuse is sometimes poorly defined. It doesn't mean simple use without any modification, e.g. as in a library, or library function - that form of use is successfully used all the time. Really, ease of reuse means easily extended, or modified to cope with some change of requirements; indeed, we can also include ease of maintenance, since this, also, comes about chiefly due to change of requirements.

Reuse via extension without (invasive) modification, is highly significant:

A quotation from John Vlissides (C++ Report, Feb 1997) sums it up: "I've said it before and I'll say it again: A hallmark - if not the hallmark - of good object-oriented design is that you can modify and extend a system by adding code rather than by hacking it. In short, change is additive not invasive."

We will see that the reuse concepts mentioned here are the embodiment of the open-closed principle [Meyer, 1997,Martin, 1996]. The open-closed principle allows a module to satisfy two apparently irreconcilable requirements:

Of course, I must not be seen to diminish the significance the abstraction and encapsulation offered by simple classes and modules. These offer great benefits on their own, and are major engineering principles in their own right.

On the other hand, some of the greatest problems in software development lie in the illusion that software is easy to modify - after all, the very name software seems to suggest malleability - like plasticine, as opposed to hardware.

Actually, software is easy to create, and to modify, to hack together a quick solution, but only in certain circumstances:

These can be summarised by contrasting large-scale software projects with small-scale and personal software projects: programming-in-the-small, and opposed to programming-in-the-large, [Parnas, 1972].

When software teams attempt to carry out large projects using the same practices that they used in personal software projects, the contrasts become apparent, the phrase software-crisis, see e.g [Brooks, 1995], takes on a grim reality.

3.0.0.0.1 Java - Mixture of Paradigms

Java is a superset of C, an imperative programming, not unlike Pascal or Modula-2; it is C with object-orientation extensions.

The perceptive reader will already have wondered: how are methods implemented? Surely, eventually, one must get to machine code - which definitely imperative.

Here is they key to getting to grips with the apparent paradox, how can Java be object-oriented and imperative at the same time? The answer is that, in Java, even in a strongly object-oriented system, a typical method / function is implemented in an imperative language -- most of which differs little from C.

As a consequence of its mixture of features, it is perfectly possible to write Java programs in a style that is anything from pure C, through to a style that emulates pure Smalltalk.

3.1 Software Engineering and its Requirements and Goals

In this and following subsections I will develop a justification of object-orientation as a solution to some of the requirements and goals of software engineering today.

Unless you have worked on the production of software for sale, it may be difficult to grasp some of the goals of software engineering and software production companies; hence, I will attempt to relate much of what I say to student experience.

When you set about doing a programming assignment you are likely to be primarily focussed on three issues:

Looking at the matter from an other point of view - that of a consumer, what do you look for in a software package? Presumably:

Hence, from the consumer and the producer point of view, we have recurring goals: cost, productivity, quality, schedule, efficiency, reuse.

One can lump together cost, schedule, reuse, and productivity, as simply productivity: how quickly, and with as few programmers as possible, can I produce my new Internet killer-app? In addition, how quickly and cheaply can I extend it to run on a recently introduced Pentium III with super-charged-MMX-12-speed-pipelined-over-drive?

Quality? I am going to lump efficiency in with it. Quality can be defined simply as meeting customer expectations, [Crosby, 1979]. Moreover, we can quote the title of Crosby's book: Quality is Free, [Crosby, 1979], to link quality with productivity; at its simplest, the less time spent fixing faults, the greater the productivity.

3.1.0.0.1 Software Crisis

The phrase software crisis is often used to encapsulate all the ills of the software industry, particularly poor productivity, and thus to justify constant pleas for reuse.

Actually, there are many issues involved here; e.g. maybe software engineers are just as productive as any other engineers. Or more so? Or, if software engineers are unproductive, by what standard are they measured?

Nevertheless, there is good reason to believe that some software engineering methods may be immature, and that the techniques and practices associated with object-orientation do represent progress in the industry or profession.

3.2 The Waterfall Life-cycle and Structured Systems Analysis and Design

As well as being a programming (implementation) technique, object-orientation very much also impacts on analysis and design, and hence the whole of management of software projects, [Yourdon, 1989].

Those software projects of the late 1970s and 1980s that were managed (as opposed to hacked!), were probably managed on the basis of some flavour of the waterfall life-cycle, and structured analysis and design [Yourdon, 1989].

Incidentally, let me add here, that while I believe that management - both project and people - is a crucial ingredient in any significantly sized software project, it is nonsense to think that good management alone can succeed in the absence of good engineering techniques.

3.2.0.0.1 Waterfall Life-cycle

The waterfall life-cycle seems the natural way to run an engineering project:

The phases in a typical project are:

  1. Study feasibility. Preliminary exploration of possible solutions. Take a hard look at benefits versus drawbacks and costs of candidate solutions.

  2. User Definition of System Requirement. The user documents as much as he knows about the job the system must do. Often they don't know what they want, and, even if they do, they cannot express it with any precision.

  3. Developer Definition of System Requirement. Developer analyses users requirement and, performs further investigation of requirements, produces developers version of requirements: System Requirement Document (SRD). The SRD is now a specification and part of the contract. Prepare project plan, including costing and schedule.

  4. High-Level Design. Decompose into subsystems, perhaps modules. Allocate to team or individuals. Usually called Architectural Design or Preliminary Design. Output Architectural Design Document (ADD). Design tests for sub-systems.

  5. Detailed Design. Decompose further into subsystems (units or components) such that one person can cope. Specify these subsystems. Design tests for (sub)subsystems. Outcome: Detailed Design Document (DDD).

  6. Implementation coding. Code and test components - each programmer.

  7. Integration and Test. Components and modules are brought together to form higher level systems. And tested. You should now have a working system!

  8. System Test. Assess the quality of the system as defined in SRD, above. Simple definition of quality: meets customer requirements / expectations.

  9. Operations and Maintenance.

Although this scheme is better than nothing, e.g. all hands to the keyboards, with no planning, it, along with structured analysis and design, has serious shortcomings, some of which were already noted:

3.2.0.0.2 Structured Systems Analysis

This example trivializes the methodology, nevertheless it identifies the characteristics I want to examine.

Informal requirement: Read ten numbers; sum them and compute the average; output the result to the screen. Three processes:

Input ---> Process ---> Output

is how it would appear on a top-level data-flow-diagram. If necessary, e.g. if there was detailed verification of input, it might be necessary to explode the Input process, to arrive at a number of subprocesses, e.g. four processes:

Input1 ---> Input2 ---> Input3 ---> Input4

For the sake of argument, we will assume that Process can be exploded into two: Sum, Average. Let us ignore for now the decomposition of Output.

When you get to a process which merits no further decomposition, you define its activity using some sort of pseudo-code, in a so-called mini-spec. Each statement in the mini-spec might correspond to one or more program statements, or maybe to a system function call.

The idea of structured systems analysis is that the complete system requirements are captured in the combined:

a
data flow diagrams,

b
the data dictionary which describes the (flowing) data,

c
mini-specs.

3.2.0.0.3 Structured Systems Design

The hierarchy of data-flow diagrams can now be rather easily used to form a (program) structure chart. From the example above, we can produce a structure chart:

                    +--------+
                    | Top    |
                    +---+----+
                        |
     +------------------+------------+------------------+
     |                               |                  |
 +------+                         +--+----+          +--+---+
 | Input|                         |Process|          |Output|
 +--+---+                         +-------+          +------+
    |
  +-+-------------+----------+--- etc...
  |               |
 +--------+    +-------+
 |Input1  |    |Input2 |            etc...
 +--------+    +-------+

                  Structure Chart -- Design

Thus we have a top-down decomposition.

3.2.0.0.4 Structured Systems Implementation

Implementation in code is now a straightforward matter. There is a main program, which corresponds to Top:.

Procedure Top.
Declare variables.
begin
  Input();
  Process();
  Output();
end;

The structure chart decomposition of Input is reflected likewise in its code:

Procedure Input.
Declare local variables.
  Input1();
  Input2();
  Input3();
  Input4();
  return;
end;

If Input1 etc. are at the bottom of the hierarchy, their code is developed from whatever is specified in their mini-specs.

Typically, procedures at each layer are arranged in libraries.

3.3 Shortcomings of Waterfall and Top-down Decomposition

Firstly, we must give credit where credit is due; the waterfall life-cycle is an awful lot better than nothing. When I started computing in the 1970s is wasn't widely known or practiced, and, certainly, when I became aware of it, it solved a lot of problems, and was light years of progress over: 'lets start programming at the beginning, and keep going till we're done!.

The waterfall project life-cycle and top-down decomposition are so intertwined that use of one more or less demands the other. Hence we can discuss their shortcomings together, although this means that we are intermingling technical, managerial, and social issues:

a
For any significantly complex problem, it unlikely that it will be possible to agree on exact requirements at the beginning of the project.

b
Real systems have no top [Meyer, 1997]. This is true in many systems: concurrent systems, any system dependent on the graphics-user-interface (GUI), see the discussion below on event-driven systems. In fact, instead of the top-down and library structure, what is more suitable is a framework or upside-down library [Budd, 1997a,Budd, 1999b].

c
Although the components of the top-down decomposition seem to be independent and decoupled - and hence we would expect that changes to one, would affect none or very few of the others - this is far from the case. They may in fact be tightly coupled via the (effectively) global data owned by Top. This coupling is all the more damaging for its subtlety - you don't know it's there until you start to make changes, and you have no plan of escape from it's grasp.

3.4 Key Application: Event-driven Interaction

[Foley and vanDam, 1990].

Traditionally programmed interaction works as in the following:

     begin
          Prompt user;
          Read input;
          Do processing;
          Display result;
     end;

A little thought will indicate that this will not work for any GUI (graphics-user-interface) like Windows, nor, indeed, for any direct-manipulation interface:

The basic structure of most Macintosh application programs (and probably Windows too) is as in:

  begin
    Initialise; -- build windows on screen
      While (not Done)
        SystemTask;
        if GetNextEvent(eventMask,theEvent) -- an event from a queue
           HandleEvent(theEvent); -- e.g. mouse button press
      CleanUp;
  end;
                  Event driven input-output

The important point is this: HandleEvent (the application part) is driven by the event - from the uncontrollable, unpredictable user, not the other way round.

For example this can allow an interface to be modeless: the program identifies the mode from the event.

It is difficult to identify any Top in such a system. The top-down decomposition can be done with only extreme difficulty and with extreme prejudice to the system architecture and any thought of extendibility or reuse.

Budd, [Budd, 1997a] points out the complete contrast between a top-down architecture and the framework or upside-down libraries architecture favoured by the likes of our event-driven interaction example. He uses the term upside-down library to highlight the contrast, as follows:

Hence, upside-down library - the library is at the top, though not in the simple sense of top-down decomposition.

It has been said that, had it not been invented before, object-oriented programming would have had to be invented for construction of GUIs.

3.5 Cornerstones of Software Engineering and Object-oriented Systems

Already, see section 3.1, we have identified quality, re-usability and productivity as key goals of software engineering.

We have produced some samples and evidence to indicate that the re-usability and productivity goals ultimately demand architectural design techniques and languages that produce decentralised designs, comprised of modules which communicate with other modules via well-defined, narrow interfaces, [Meyer, 1997]. As pointed out by Meyer, this is strongly suggestive of an object-oriented approach.

3.5.0.0.1 Modules and Data Abstraction.

Systems engineering has long understood the benefits of abstraction and modular design, especially for large complex systems. The systems engineering concept of the black-box is archetypal abstraction: the (sealed) black-box has some inputs, and some outputs that are clearly defined functions of the inputs. The designer who employs the black-box as a sub-system is concerned only with the function provided, and with the allowable range of inputs; he is absolutely entitled (and obliged) to consider only the external behaviour of the sub-system.

Although abstraction in software was already well accepted in the form of high-level languages and subprogram libraries, [Parnas, 1972] gave the first persuasive argument for software modularisation on the basis of data structures (data abstraction); he also pointed clearly to encapsulation - software's version of the sealed box - when he identified the crucial (but subtle) importance of information hiding: the user of a module benefits from knowing just enough to use the module, the provider of the module benefits from knowing just enough to implement the module. More knowledge on either part can only increase coupling and, hence, complexity.

3.5.0.0.2 Abstract-data-types

A data type, such as the native data types int, float, etc., is characterised by:

  1. A set of values - that can be assumed by objects of the type.

  2. A set of operations that can be performed on objects of the type.

Clearly, internal representation should concern only compiler writers - the type is a sealed black-box.

I hope it will be clear to you already that any language with a modicum of type checking does not allow objects of native types to be accessed via their representation - just via their defined operations.

The ADT concept can best be described by proceeding from the notion of a type; more prosaically, it is a user defined type that obeys the rules of a type set out in the previous paragraph - particularly that users interact with the objects strictly via the supplied operations.

3.5.0.0.3 Classes

Object-oriented languages have special modules, usually called classes, which are specifically used for defining ADTs.

Objects are instances of classes. That is, objects are to classes what variables are to types: in Java, int i; defines the variable i, an instance of the type int.

Syntactically, classes are like super-records (using record to mean a collection data structure without any coupled methods/definition of behaviour). In addition to data, they contain operations - in the form of functions. Normally, the data (representation) part is private and so inaccessible to user programs. Interaction with the object must be via the public operations:

3.5.0.0.4 Inheritance.

Inheritance allows a derived type (class) to be defined as an extension, or specialisation of a base type; i.e. derived objects inherit all the features of base, and add their own specialisations.

Inheritance allows us to build complex types/classes by reference to simpler classes. This extension to a simple type is achieved by adding extra data fields, and adding operations or replacing existing ones. However, since we do this by referring to the simpler class, rather than by altering it, the integrity of the simpler type is unaffected.

There are at least two objections to the extension of existing software modules by alteration, reuse by duplication:

Suppose an application program deals with people - there are various categories of Person: Customers, Employees, Directors, etc. Inheritance allows a (derived/extended) class, e.g. Employee, to be defined in terms of the (base) Person class. Moreover, with overloading of interface functions, client programs can treat objects of the two classes in a uniform way.

3.5.0.0.5 Late Binding

Late binding of functions allows client programs to take advantage of extended classes without alteration of the client code.

Note: Late binding is synonymous with dynamic binding and run-time binding. This contrasts with early binding -- static binding, compile-time binding -- in which a function call is bound to a definite function at compile time. Late binding becomes operable when the type of a reference (variable) may change during the execution of a program -- according to the type of the last value assigned to it.

Inheritance, with overloaded interface functions, will allow client programs to treat all classes in the inheritance family in a uniform way - polymorphism. However, with plain inheritance, all objects will be statically bound to a single class and, thereby, to interface functions of that class.

Inheritance, together with late binding permits the following:

Old code can call new code!.

Read that again! Old code can call new code. Normally, with libraries, new code, which must be upper-level functions, is the caller and the called code -- the library functions is the old code.

But how does old code know about the new code? When maybe the new code didn't exist when the old code was written.

With dynamic binding, when an overloaded operation is invoked, the language takes care of choosing the appropriate version - at run-time; thus, run-time binding. Hence, new code can be bound to a polymorphic object at run time. Hence, systems can be extended without knock-on requirements for reworking of the older parts of the system. This is the same as the upside-down library effect mentioned above.

Here we find another form of polymorphism: objects may exhibit a sort-of dynamic typing, in which and object takes the type of the last object assigned to it.

These notions are subtle - but should become clear with exposure to examples.

3.5.0.0.6 Design-by-Contract

Meyer,[Meyer, 1997], suggests a form of contract as the only proper basis for design of modules.

The supplier of the module agrees with the client:

  1. What services the module should provide. These can be expressed as post-conditions.

  2. Exclusions -- what the module is not expected to cope with. These exclusions can be expressed as pre-conditions.

It takes little imagination, or experience of software development, or of life, to see the good sense in design-by-contract.

Responsibility-driven-design, [Budd, 1999b] is a similar concept.

3.5.0.0.7 Open-closed Modules

[Meyer, 1997]

For a module to be usable by a client - whether outside the organisation it was be closed to changes of behaviour. Nevertheless, it must be open to possible extension. This remarkable property, open-closed, is facilitated by inheritance and run-time binding.

3.6 Summary


next up previous contents
Next: 4. Classes and Objects Up: Algorithms and Data Structures Previous: 2. Simple Algorithms and   Contents
jc 2005-11-16