Lecture 10: Exceptions


10.1 goto
C++ exceptions are like the C/C++ long jump, which is in turn like a global goto statement. Recall that goto labels are the sole application of "function" scope in C and C++: a goto can transfer control between any two places within a single function.
f() {
  ...
  label:
  ...
  goto label;
}
Very flexible, but it certainly can be abused, for example by transferring control from outside a loop to inside it, which will almost certainly cause problems. Two valid uses are:
  1. Break out of nested loops, for example:
for (..;..;..)
  for (..;..;..)
    for (..;..;..)
      for (..;..;..)
      {
        ...
        if (get_out) goto xit;
        ...
      }
xit:
...
This could be accomplished by defining flags and checking the flags in the control statements of the for loops, but that solution would slow things down and make the code more intricate.
  2. Exception handling, for example:
f()
{
  ...
  if (problem) { error handling setup code; goto handle_error; }
  ...
  if (problem) { error handling setup code; goto handle_error; }
  ...
  return 0; // normal return
  handle_error: // error handling code
  ...
  return error_code;
}
This allows sharing of error handling code, and doesn't complicate the flow of control too much.
Other than these applications, the traditional advice of avoiding the use of goto is sound in most cases.
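As a compilable illustration of the first use above, here is a minimal sketch that leaves a doubly nested loop with a single goto. The grid dimensions and the name find_first are made up for this example.

```cpp
#include <cassert>

const int ROWS = 3;
const int COLS = 4;

// Searches grid row by row; on success stores the position in row/col.
bool find_first(const int grid[ROWS][COLS], int wanted, int &row, int &col)
{
  int r, c; // declared before the loops so they are still in scope at the label
  for (r = 0; r < ROWS; ++r)
    for (c = 0; c < COLS; ++c)
      if (grid[r][c] == wanted)
        goto found; // leaves both loops at once, no flag needed

  return false; // loops ran to completion: value not present

found:
  assert(grid[r][c] == wanted);
  row = r;
  col = c;
  return true;
}
```

Without the goto, both loop conditions would need to test a flag on every iteration.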
10.2 longjmp()
The ANSI C long jump is accessed via the include statement
#include <setjmp.h>
which declares a type (jmp_buf) and two functions:
int setjmp(jmp_buf);
void longjmp(jmp_buf, int);
The jmp_buf type is used to store information about the current state of the run-time stack so it can be restored to that state if a long jump is executed. setjmp() initializes the buffer and returns 0. If a long jump occurs, control returns to the original setjmp() call whose return value now becomes that specified by the longjmp() call. It is a global form of goto and should be used in the same manner:
  1. Break out of nested function calls, for example:
void search(TNode *branch, TNode **found, jmp_buf buf, const char *item);

void f(TNode *tree)
{
  TNode *node;
  jmp_buf buf;

  switch(setjmp(buf))
  {
  case 0: // initiate search
    search(tree, &node, buf, "hello, world");
    break;
  case 1: // failed search
    ...
    break;
  case 2: // successful search
    ...
    break;
  }
}

void search(TNode *branch, TNode **found, jmp_buf buf, const char *item)
{
  if (!branch) longjmp(buf, 1);
  else if (strcmp(branch->value, item) < 0)
    search(branch->left, found, buf, item);
  else if (strcmp(branch->value, item) > 0)
    search(branch->right, found, buf, item);
  else { *found = branch; longjmp(buf, 2); }
}
Again, a system of flags would be another way to handle this, but it would be slower and complicate the code.
  2. Exception handling, for example:
int main()
{
  jmp_buf buf;
  switch(setjmp(buf))
  {
  case 0:
  main_loop:
    switch(request_choice(menu))
    {
      ...
    }
    break; // normal exit; do not fall through into the handlers
  case MEM_EXCEPTION:
    // handle non-fatal memory exception
    ...
    // re-enter loop
    goto main_loop;
  case DEVICE_EXCEPTION:
    // handle fatal device exception
    ...
    return code;
  }
  return 0;
}
Like the goto, it must be used in a very structured way, or the code will become unreadable and unmaintainable.
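The search example above can be turned into something compilable along the following lines. This is only a sketch: the Node type, the file-scope jump buffer and the function names are assumptions made for illustration, and the code deliberately contains no objects with destructors, since longjmp() would skip them.

```cpp
#include <cassert>
#include <csetjmp>

struct Node { int value; Node *left; Node *right; };

static std::jmp_buf search_done; // state saved by setjmp(), target of longjmp()
static const Node *search_hit;   // result slot, filled in before jumping

// Never returns normally: every path ends in a longjmp().
static void search(const Node *n, int wanted)
{
  if (!n) std::longjmp(search_done, 1);              // failed search
  if (wanted < n->value) search(n->left, wanted);
  else if (wanted > n->value) search(n->right, wanted);
  else { search_hit = n; std::longjmp(search_done, 2); } // successful search
}

// Returns the matching node, or 0 if the value is absent.
const Node *find(const Node *root, int wanted)
{
  switch (setjmp(search_done)) {
  case 0: // first return from setjmp(): start the search
    search_hit = 0;
    search(root, wanted);
    break;
  case 1: // arrived here via longjmp(search_done, 1)
    return 0;
  case 2: // arrived here via longjmp(search_done, 2)
    assert(search_hit != 0);
    return search_hit;
  }
  return 0;
}
```

However deep the recursion has gone, a single longjmp() unwinds all the nested search() calls in one step.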
10.3 C++ Exceptions
In C++, exceptions are essentially a more sophisticated and flexible implementation of the long jump idea. The jump is harder for the C++ compiler implementor, since every object which goes out of scope while the run-time stack is unwound must be destroyed. This is of course a boon for the programmer. A plain longjmp() in a C++ program does not run those destructors.
Syntax: a try block followed immediately by one or more catch blocks. The try block takes the place of setjmp(), and a throw expression takes the place of longjmp(). Usage possibilities are approximately the same as for the long jump:
  1. Break out of nested function calls, for example:
void search(TNode *branch, TNode **found, const char *item);

void f(TNode *tree)
{
  TNode *node;

  try {
    search(tree, &node, "hello, world");
  }
  catch(int return_code)
  {
    switch(return_code)
    {
    case 1: // failed search
      ...
      break;
    case 2: // successful search
      ...
      break;
    }
  }
}

void search(TNode *branch, TNode **found, const char *item)
{
  if (!branch) throw 1;
  else if (strcmp(branch->value, item) < 0)
    search(branch->left, found, item);
  else if (strcmp(branch->value, item) > 0)
    search(branch->right, found, item);
  else { *found = branch; throw 2; }
}
  2. Exception handling, for example:
int main()
{
  main_loop:
  try {
    switch(request_choice(menu))
    {
      ...
    }
  }
  catch(int exception)
  {
    switch(exception)
    {
    case MEM_EXCEPTION:
      // handle non-fatal memory exception
      ...
      // re-enter loop
      goto main_loop;
    case DEVICE_EXCEPTION:
      // handle fatal device exception
      ...
      return code;
    }
  }
  return 0;
}
The above examples mimic the long jump by throwing ints; however, any type can be thrown, and it can be caught by value or by reference. The catch clause is a bit like a one-argument function declaration, except that only a restricted set of conversions applies: a handler for a base class matches a thrown object of a derived class, but no arithmetic or user-defined conversions are attempted. Thus the first example could be rewritten as:
class FoundNode {
  TNode *m_node;
public:
  FoundNode(TNode *node) : m_node(node) { }
  TNode *node() { return m_node; }
};
 
void f(TNode *tree)
{
  try {
    search(tree, "hello, world");
  }
  catch(FoundNode node)
  {
    if (node.node()) // successful search
    {
      ...
    }
    else // failed search
    {
      ...
    }
  }
}
 
void search(TNode *branch, const char *item)
{
  if (!branch) throw FoundNode(0);
  else if (strcmp(branch->value, item) < 0)
    search(branch->left, item);
  else if (strcmp(branch->value, item) > 0)
    search(branch->right, item);
  else throw FoundNode(branch);
}
When an exception is thrown, it is handled by a matching catch block of the most closely nested enclosing try block; a catch for a base class also matches a thrown object of a derived class. When two or more catch blocks of that try block match, the first one listed after the try block is used.
If the catch handler wishes to look at the object, it has to give it a name in the header of the catch block.
catch(...) catches anything.
A catch block can "re-throw" the exception to the handlers of an enclosing try block by writing throw with no operand. The re-throw cannot land in a later catch of the same try block: once one handler of a try block has been entered, its sibling handlers are no longer candidates.
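The matching and re-throw rules just described can be checked with a small sketch. The class names StackError and Overflow and the function classify() are hypothetical; the function records which handlers ran.

```cpp
#include <cassert>
#include <string>

struct StackError { virtual ~StackError() {} };
struct Overflow : StackError {};

// Throws an Overflow and returns a trace of the handlers that were entered.
std::string classify(bool rethrow)
{
  std::string trace;
  try {
    try {
      throw Overflow();
    }
    catch (StackError &) { // first matching handler wins; base matches derived
      trace += "inner-base;";
      if (rethrow) throw;  // goes to the enclosing try, not to catch(...) below
    }
    catch (...) {          // sibling handler: skipped once the one above has run
      trace += "inner-any;";
    }
  }
  catch (Overflow &) {     // the re-thrown object is still an Overflow
    trace += "outer-derived;";
  }
  assert(!trace.empty());
  return trace;
}
```

Here classify(false) yields "inner-base;" and classify(true) yields "inner-base;outer-derived;".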
If no catch block is found for the exception, the function terminate() is called. The function
PFV set_terminate(PFV)
(typedef void (*PFV)() applies here) can be used to install a new function for terminate() to call. By default it calls abort(). Any function installed by set_terminate() should not return.
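Functions can declare the exceptions they will allow to be thrown by them or the functions they use, by appending an exception specification such as throw(X, Y) to the function declaration. By default, any exception can be thrown.
If a function throws an exception which isn't in its list of declared exceptions, the function unexpected() is called. PFV set_unexpected(PFV) can be used to tailor this. By default unexpected() calls terminate(). (Exception specifications of this form and unexpected() were deprecated in C++11 and removed in C++17.)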
There are three types of code that exceptions can create problems for (The following notes draw much from Margaret Ellis and Martin Carroll, "Tradeoffs of Exceptions", C++ Report, Vol. 7, No. 3 (March-April 1995), pp. 12-16.):
  1. Doing something (e.g. heap memory acquisition, handler function installation, changing the state of a stream or other object) which is to be undone by later code. Problems arise when an exception occurs after something has been done, but before it is undone.
This problem is fixed by making sure all these operations are carried out within the context of construction and destruction of an object. Some "weightless" code can be added to achieve this. For example, instead of coding a function as follows:
void f()
{
  PFV old_new_handler = set_new_handler(my_new_handler);
 
  ... // this code may generate an exception
 
  set_new_handler(old_new_handler);
}
It would be preferable, when the possibility exists that an exception might be thrown during the course of the function's execution, to code it as follows:
class install_new_handler {
  PFV m_old_new_handler;
public:
  install_new_handler(PFV new_handler) {
    m_old_new_handler = set_new_handler(new_handler);
  }
  ~install_new_handler() {
    set_new_handler(m_old_new_handler);
  }
};
 
void f()
{
  install_new_handler guard(my_new_handler); // named object lives until f() exits
 
  ... // this code may generate an exception
}
Even if exceptions are not a problem, this technique has a lot to recommend it since it makes sure whatever needs undoing gets undone without the programmer having to remember to write the appropriate code.
  2. An exception gets thrown inside a constructor. The destructor is not called for an object whose constructor did not complete, so the constructor must make sure any partial allocations get undone.
This can be done using a try, catch and re-throw combination where the universal catch block takes care of undoing that which must be undone.
stack::stack(int sz)
{
  val = 0; // keeps delete[] below safe if new itself throws
  try {
    val = new char[size = sz];
    notify(); // possibly generates exception
  }
  catch(...)
  {
    delete[] val;
    throw;
  }
  top = val;
}
  3. An exception gets thrown inside a member function while the object is in an invalid state.
Same solution as in 2 works here.
Exception techniques for templates: consider stack underflow or overflow. In the case of overflow, the exception object could carry the value that was being pushed. Exception classes can be templated and inherit from a common base class. This allows the application to handle stack exceptions in a general or a specific way.
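A minimal sketch of that idea follows. All names here (StackException, StackUnderflow, StackOverflow, BoundedStack and the two demo functions) are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>

struct StackException { virtual ~StackException() {} }; // common base class
struct StackUnderflow : StackException {};

template<class T> struct StackOverflow : StackException {
  T rejected; // the value that could not be pushed
  explicit StackOverflow(const T &v) : rejected(v) {}
};

template<class T, std::size_t N> class BoundedStack {
  T m_items[N];
  std::size_t m_count;
public:
  BoundedStack() : m_count(0) {}
  void push(const T &v) {
    if (m_count == N) throw StackOverflow<T>(v);
    m_items[m_count++] = v;
  }
  T pop() {
    if (m_count == 0) throw StackUnderflow();
    return m_items[--m_count];
  }
};

// Specific handling: recover the value carried by the overflow report.
int demo_specific()
{
  BoundedStack<int, 2> s;
  s.push(1);
  s.push(2);
  try { s.push(3); }
  catch (StackOverflow<int> &e) { return e.rejected; }
  return -1;
}

// General handling: any stack problem is caught through the base class.
bool demo_general()
{
  BoundedStack<int, 2> s;
  try { s.pop(); }
  catch (StackException &) { return true; }
  return false;
}
```

With this layout, demo_specific() returns the rejected value 3 and demo_general() returns true.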




Lecture 9: Templates




Templates allow the definition of generic containers which can be easily configured for a wide variety of objects. A single template definition can provide for, say, a list of Strings or a list of GUIComps just by changing a single parameter when it is invoked. Templates can also provide generic functions or type-safe interfaces to traditional C generic functions (like bsearch() or qsort()). Template functionality is also referred to as "parameterized types". Templates and exceptions aren't really part of the generic object-oriented model, and so were given their own special category in the C++ Functionality Hierarchy presented in the first lecture. Perhaps function templates could be classified as part of C++'s program for a "better C".
9.1 Class templates
A class template definition looks like any other class definition except it is preceded by a declaration of dummy arguments. For example:
template<class X, class Y, class Z> class stack {
  // ...
};
An instance or declaration of this parameterized class is obtained as with other classes, except that an appended parameter list indicates what types (not necessarily classes) or constants are to be substituted for the dummies.
HStack example.
Templates can be thought of as a sophisticated macro facility with quite possibly the same amount of code resulting. Various parameterizations don't necessarily share code with each other.
Non-inline definition of a member function combines both of these syntaxes. For example:
template<class X, class Y, class Z> X& stack<X, Y, Z>::push(Y& y) { ... }
When defining the constructor the suffix can be left off the member name:
template<class X, class Y, class Z> stack<X, Y, Z>::stack(Z& z) { ... }
In general, the class can be referred to without the angular-bracketed parameter list once inside the lexical scope of the definition.
HStack example with additional constant parameter representing stack size.
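The HStack sources are not reproduced in these notes, so here is a stand-in sketch of a class template with a type parameter and a constant size parameter, with its members defined outside the class. The name FixedStack and its interface are assumptions.

```cpp
#include <cassert>

template<class T, int Size> class FixedStack {
  T m_items[Size];
  int m_count;
public:
  FixedStack();
  bool push(const T &item); // returns false if the stack is full
  bool pop(T &item);        // returns false if the stack is empty
  int count() const;
};

// The constructor name carries no <...> suffix; the class qualifier does.
template<class T, int Size> FixedStack<T, Size>::FixedStack() : m_count(0) { }

template<class T, int Size> bool FixedStack<T, Size>::push(const T &item)
{
  if (m_count == Size) return false;
  m_items[m_count++] = item;
  return true;
}

template<class T, int Size> bool FixedStack<T, Size>::pop(T &item)
{
  if (m_count == 0) return false;
  item = m_items[--m_count];
  return true;
}

template<class T, int Size> int FixedStack<T, Size>::count() const
{
  assert(m_count >= 0 && m_count <= Size);
  return m_count;
}
```

A declaration such as FixedStack<int, 2> s; then substitutes int for T and 2 for Size.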
PtrStack template from the examples. Class PtrType must be a pointer type, otherwise the compiler won't be able to generate template code. The unsafe (PtrType) cast is insulated by the template definition. And that's the total purpose of this wrapper template: to define a type-safe interface to the generic HStack<void *> code. All instances share same code. The template results in no extra code being generated over what would be present without the template. It just allows for more type safety. An example below shows how such type-safe wrappers can be constructed for generic C functions.
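The original PtrStack is not shown here, so the following is a sketch of the same wrapper technique. VoidStack stands in for HStack<void *>, and its fixed capacity and member names are assumptions.

```cpp
#include <cassert>

// Generic, untyped stack: one copy of this code serves every pointer type.
class VoidStack {
  void *m_items[16];
  int m_count;
public:
  VoidStack() : m_count(0) {}
  bool push(void *p) {
    if (m_count == 16) return false;
    m_items[m_count++] = p;
    return true;
  }
  void *pop() { return m_count ? m_items[--m_count] : 0; }
};

// Type-safe interface: only inline forwarding functions, no new stack logic.
template<class PtrType> class PtrStack : private VoidStack {
public:
  bool push(PtrType p) { return VoidStack::push(p); }
  // The one unsafe cast lives here, hidden from users of the template.
  PtrType pop() { return static_cast<PtrType>(VoidStack::pop()); }
};
```

A PtrStack<int *> accepts only int * and hands back int *, so callers never write a cast themselves.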
If a class template needs to be escaped (specialized) for a particular type, the code for the class can be written out explicitly, treating BinaryTree<char *> (or whatever) as if it were any other type. See LTCmp() in sort2.C. Some compilers allow escaping on a member function level.
9.2 Function templates
Definition of a template for a family of functions looks a lot like a class template. There is an important restriction that every template argument (stuff in < ... >) must affect the type of at least one function argument.
This restriction means the compiler can generate the appropriate instance of the template just by looking at the types of the function arguments, so a < ... > suffix on the function identifier (required when instancing class templates) is not used when instancing from function templates. (Standard C++ relaxes the restriction: template arguments which cannot be deduced may be supplied explicitly, as in f<int>(x).)
In overloading resolution, regular (non-templated) functions are looked through first for an exact match. Unlike the case when no templates are extant, no conversions at all are attempted. Then templates are searched for an exact match (again no conversions). If there is still no match, then regular functions are reviewed again with the usual conversion attempts. No match after this is an error. So if a function template needs to be "escaped" for a certain type, the function can just be defined explicitly for that type.
Example: a template for swap()
If the restrictions imposed by the function template are too much, a class with a single static member function can be used instead.
Example: sort2.C, a type-safe wrapper for the ANSI qsort().
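sort2.C itself is not reproduced in these notes. The sketch below shows one way such a wrapper could look, using a class template with static member functions as suggested above; the name Sorter is hypothetical, and the approach is only appropriate for types that qsort() may copy byte by byte.

```cpp
#include <cassert>
#include <cstdlib>

template<class T> class Sorter {
  // Comparison callback with the untyped signature qsort() demands.
  static int compare(const void *a, const void *b)
  {
    const T &x = *static_cast<const T *>(a);
    const T &y = *static_cast<const T *>(b);
    return (x < y) ? -1 : (y < x) ? 1 : 0;
  }
public:
  // Type-safe entry point: element size and callback are filled in here.
  static void sort(T *items, std::size_t count)
  {
    assert(items != 0 || count == 0);
    std::qsort(items, count, sizeof(T), compare);
  }
};
```

Callers write Sorter<int>::sort(v, n) and can no longer pass a mismatched element size or comparison function.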




Lecture 8: Streams



Input and output. Except for binary I/O, this amounts to mapping objects from and to sequences of characters. I/O is implemented in C++ with a special set of classes. It is made to be type safe and extensible (unlike C's printf() and scanf()) with no compromise in flexibility.
The core I/O facilities are accessed by including a special header file iostream.h. (stream.h can also be included for compatibility with old I/O implementations.) There are four special objects which are pre-constructed (cin, cout, clog and cerr) which handle standard input, output and error streams. I/O is done via the overloaded operators << ("put to" or "insertion" operator for output to cerr, clog and cout) and >> ("get from" or "extraction" for input from cin), or via member functions.
Additional I/O facilities are accessed using iomanip.h (manipulators), fstream.h (files), strstream.h (streams based on C strings) and stdiostream.h (streams based on C FILEs).
8.1 C++ stream library class trees


8.2 class ios
This is the root of all the stream classes. Error state and formatting information are included in class ios.
Error State Flags
The error state consists of four bits: ios::goodbit, ios::eofbit, ios::failbit and ios::badbit. Note in C, printf(), scanf() etc. were stateless. In the C++ stream library, the state allows a multiple "put to" or "get from" statement to behave somewhat like multiple argument calls to printf() or scanf().
The four bits are stored in an int which can be read using:
int rdstate() const;
The state of the badbit could therefore be tested with an expression like:
if (cin.rdstate() & ios::badbit) // handle error ...
else // everything OK ...
A more convenient way of testing the individual bits is provided by the following member functions:
int good() const;
int eof() const;
int fail() const;        // non-fatal error - failed to read expected data
int bad() const;         // fatal error - stream can no longer be used
int operator !() const;  // returns non-zero if failbit or badbit is set
operator void *() const; // returns non-NULL if neither failbit nor badbit is set
The error state can be set using:
void clear(int = 0);
The default value of zero results in ios::goodbit being set.
clear();
is therefore equivalent to
clear(0);
which is equivalent to
clear(ios::goodbit);
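Note that ios::goodbit has the value zero: it denotes the absence of the three error bits rather than a bit of its own. clear() replaces the whole state with its argument, so it can also be used to set one of the other bits as part of a programmer's code for operator>>() for a particular object. For example: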
if (bad_char) is.clear(ios::badbit); // set istream's badbit
Formatting Information
A fair amount of the formatting information is in the form of flags which are contained in a long int. These flags retain their settings until explicitly changed. The crudest access is:
long flags() const; // reads the flags
long flags(long);   // sets flags and returns previous setting
This sort of access should be used only to read the flags as a whole so they can be restored later. A function might do this if it needed specific formatting, but also needed to leave the state of the formatting flags of a stream argument unchanged.
Example:
void f(ostream &os)
{
  long flags = os.flags(); // record original state of formatting flags
  ...
  os.setf(ios::oct, ios::basefield); // flag first, then the group it belongs to
  ...
  os.flags(flags); // restore original state of formatting flags
}
Control of individual flag settings can be achieved via:
long setf(long);   // set one or more individual flags
                   // (not a member of a group)
                   // or use setiosflags(long) manipulator
long unsetf(long); // reset one or more individual flags
                   // (not a member of a group)
                   // or use resetiosflags(long) manipulator
Both the setf() functions and unsetf() return the previous value of the flags. This applies to:
ios::boolalpha // insert/extract boolean type in alphabetic format
ios::skipws    // this is the only one that defaults to true
ios::showbase  // add 0 or 0x to non-decimal printouts
ios::showpos   // adds + sign to positive values
ios::uppercase // use X, A-F for hex and E for exponential
ios::showpoint // trailing zeros and decimal point always appear in floats
               // formatted as if all trailing digits were non-zero
ios::unitbuf   // flush ostream after each output operation
A two-argument version of setf() is used to set a flag which is a member of a group of flags. The second argument specifies the group and is a bitwise OR of all the flags in the group. The specified bit is set and the rest of the group is unset. This function is:
long setf(long, long group);
The groups are:
ios::adjustfield  // padding position
  ios::left       // left aligned
  ios::right      // right aligned (this is the default)
  ios::internal   // between sign or base and value
 
ios::basefield    // or use setbase(int = 0, 10, 8 or 16) manipulator
  ios::dec        // or use dec manipulator (default)
  ios::oct        // or use oct manipulator
  ios::hex        // or use hex manipulator
 
ios::floatfield   // neither flag is set by default!
  ios::scientific
  ios::fixed
The last group is a little odd in that it makes sense for both flags to be unset. This is a shadow "automatic" state and is comparable to the %g format of printf().
Other formatting information is set by the following functions. For width(), this information is temporary, and the default width (0) is returned to after a field is inserted.
The effect of the ios::precision(int) method differs depending on which of the three possible ios::floatfield states governs floating-point formatting. In the default "automatic" state when neither bit is set, it represents the total number of significant digits used. When ios::fixed is set, it is the number of digits after the decimal point. When ios::scientific is set, it is the number of digits after the decimal point of the mantissa.
char fill(char);       // set fill char; or use setfill(char) manipulator
char fill() const;     // find current value of fill char (default is ' ')
int precision(int);    // number of floating point digits displayed
int precision() const; // or use setprecision(int) manip (default is 6)
int width(int);        // or use setw(int) manip
int width() const;     // default is 0 (use just the amount of space needed)
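The three interpretations of precision() can be seen in a short sketch. It uses the standard <sstream> header rather than the older headers named in these notes, and the helper name format_pi is made up.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Formats 3.14159265 with precision 3 under the given floatfield setting.
// Passing an empty flag value leaves both bits unset (the "automatic" state).
std::string format_pi(std::ios_base::fmtflags float_mode)
{
  std::ostringstream os;
  os.precision(3);
  os.setf(float_mode, std::ios_base::floatfield); // two-argument setf()
  os << 3.14159265;
  assert(!os.fail());
  return os.str();
}
```

In the automatic state the result is "3.14" (three significant digits), while with fixed it is "3.142" (three digits after the decimal point). With scientific the mantissa is likewise "3.142", followed by an exponent whose exact width depends on the library.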
Class ios also provides an extra set of format flags and long ints where user information can be stored by derived classes.
Two miscellaneous capabilities:
A method for tying an istream to an ostream is available so that the ostream gets flushed before any input operation. cout and cin are tied by default. The ostream *ios::tie(ostream *) method takes and returns a pointer to an ostream. The pointer returned is the previous tie. Tying to 0 breaks any existing tie. It can only be tied to one ostream at a time.
ios::sync_with_stdio() resets cin, cout, cerr, clog to use stdiobufs and thus they are synchronized with the corresponding FILEs stdin, stdout and stderr. I/O can only be mixed on a line-by-line basis.
8.3 Input
Pre-defined object cin (class istream with public base ios).
Result of >> operator is istream &. In combination with left-to-right associativity of >>, this means the right thing happens. For example:
cin >> x >> y;
Notice that, though non-const objects must be used, pointers are not used as in C since >> is overloaded using reference arguments. There is also an ios converter (to void *) which allows istream objects to appear as control expressions. It converts to 0 pointer if fail or bad bit is set. fail flag must be cleared to continue input (bad flag set in addition indicates more fundamental problem). Use clear() member function to do this:
int x; char c;
 
while (cin) // get stream of integer values
{
  while (cin >> x) { cout << x; process(x); }
  cin.clear();
  while (cin.get(c) && c != '\n'); // flush line from stream on error
}
Without the cin.clear() call, cin.get(c) would just be a no-op. This is different from the behavior of C's scanf() routine where succeeding calls are not affected by failures in previous calls. As in C, a failed operation must be registered before flag is set, so it should be tested after an input operation, not before.
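The recovery loop above can be packaged as a function taking any istream, which makes it easy to try on a string. This is a sketch using the standard <sstream> header; sum_integers is a made-up name.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Adds up the integers in the stream. When a token is not an integer,
// the rest of that line is discarded and reading resumes on the next line.
int sum_integers(std::istream &in)
{
  int total = 0;
  int x;
  char c;
  while (in) {
    while (in >> x) total += x;
    in.clear(); // without this, the get() below would do nothing
    while (in.get(c) && c != '\n') { /* flush line from stream on error */ }
  }
  assert(in.eof());
  return total;
}
```

For the input "1 2 x 3\n4\n" the function returns 7: 1 and 2 are read, the bad token x causes the rest of the first line (including 3) to be dropped, and 4 is read from the next line.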
Member functions:
istream(streambuf *);
istream &ignore(streamsize, int = EOF);
int peek();
istream &putback(char);
int get(); // like C getchar();
istream &unget(); // putback most recent char read
istream &get(char &);
istream &get(char *, streamsize, char = '\n');
              // always terminates buffer with '\0'
              // doesn't extract terminator char from stream
istream &getline(char *, streamsize, char = '\n');
              // same except extracts terminator char from stream
istream &read(char *, streamsize); // binary input
streamsize readsome(char *, streamsize); // binary input
istream &seekg(streampos);  // set position indicator
istream &seekg(streamoff, seek_dir); // dir is beg, cur or end
streampos tellg() const; // g suffixes stand for "get"
int gcount() const; // number of chars extracted by last unformatted
                    // input function (get, getline, ignore, read)
8.4 Note on random access
Notice that class ios doesn't support random access as one might suspect given the behavior of the C stdio routines fseek() and ftell() which work on all files. This is because in read/write situations separate positions are maintained for doing input and output. So these methods are supported at the istream and ostream level.
8.5 Output
Pre-defined objects cerr, clog and cout (class ostream with public base ios). Note cerr is not line buffered as in C while cout retains line buffering. clog is line buffered and is an alternative interface to the error stream.
In printing expressions, be careful to use parens if the expressions' operators have lower or equal precedence compared with << (for example the comparison, bitwise, conditional and assignment operators). Result of << operator is ostream &. In combination with left-to-right associativity of << this means the right thing happens with (for example):
cerr << "x = " << x;
The type of character constants is char in C++ version 2.0 and after, not int as in C and C++ version 1.0. Putting a char to an ostream results in the character corresponding to the code being printed. To get the integer code printed, the character must be cast to an int.
User-defined types are output by overloading the << operator. Since an ostream & is the first argument for this binary operator, it can't be implemented as a member function of the object being output, but is done using a free-standing function whose second argument is the object being output.
Implementing input for user-defined types is like output, except it may be appropriate to change the state of the input stream if an operation fails. For example, an input routine that reads a complex value in the form (2.2, 3.3) would fail and set the bad bit if the sequence of characters retrieved from the istream doesn't fit the prescribed format. Perhaps setting the fail bit would be appropriate if it had saved the characters and put them back in the event of a failure. The ios::clear() function is used to set the state of a stream.
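A sketch of both operators for a small user-defined type follows. The Point type is hypothetical, the standard headers are used in place of iostream.h, and the choice of the fail bit (rather than the bad bit) on a format mismatch is an assumption of this example.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

struct Point { double x, y; };

// Free-standing function: the ostream is the left operand, Point the right.
std::ostream &operator<<(std::ostream &os, const Point &p)
{
  return os << '(' << p.x << ", " << p.y << ')';
}

// Reads the form (x, y); on a mismatch the target is left unchanged and
// the fail bit is added to the stream state.
std::istream &operator>>(std::istream &is, Point &p)
{
  char open = 0, comma = 0, close = 0;
  double x = 0, y = 0;
  is >> open >> x >> comma >> y >> close;
  if (is && open == '(' && comma == ',' && close == ')') {
    p.x = x;
    p.y = y;
  } else {
    is.clear(is.rdstate() | std::ios_base::failbit); // keep other bits
  }
  return is;
}
```

Because each operator returns its stream, Point values chain with other insertions and extractions, and a failed read can be detected with the usual stream tests.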
Member functions:
ostream &put(char);
ostream &flush();
ostream &write(const char *, streamsize);   // binary output
ostream &seekp(streampos);
ostream &seekp(streamoff, seek_dir); // seek_dir is ios::beg,
                                     // ios::cur or ios::end
streampos tellp() const;  // p suffixes stand for "put"
ostream &operator <<(streambuf *); // inserts characters read from the given
                                   // streambuf until no more can be read
8.6 Files (devices)
The header fstream.h contains definitions for fstream (derived from iostream which is in turn derived from istream and ostream), ofstream (derived from ostream) and ifstream. These are constructed with a character string containing the name of the file and an optional mode composed of bitwise-ORd flags derived from an enum defined in ios. Flags are:



Lecture 7: Inheritance and Polymorphism





Now we begin looking at the powerful facility C++ has for defining relationships and sharing code between objects.
Example:
employ1.cpp without VIRTUAL_FUNCTIONS defined.
First look at Employee class. Before going any farther though, we diverge from the topic at hand to consider a special C++ feature.
7.1 static declarator for class members
We saw this previously in the date example.
Notice the static qualification for the head and list() members of the Employee definition. This is new to C++. These class members are really just encapsulated global objects. They are global variables and non-member functions which the class declaration controls access to. public items are accessible by all code. private items are accessible only by implementation code.
static member functions can operate directly only on static members. They are not called on an instance of the class, so there is no this pointer and no unqualified access to non-static member functions or data. A static member function can still access non-static members if it obtains an instance of its class some other way, for example through an argument. The static keyword should appear only in the declaration inside the class.
A static member function can be accessed via class (e.g. Employee::list()) or the usual way. static data members don't make instances bigger, and they have global linkage and must be defined explicitly outside the class. As with the static member function, no static qualifier should appear when the static data member is defined. Example:
Employee *Employee::m_head;
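Since employ1.cpp is not reproduced here, the following sketch uses a simpler static member than the list head: a counter of live objects. The members m_count, m_id and count() are made up for this illustration.

```cpp
#include <cassert>

class Employee {
  static int m_count; // one shared counter, not stored in each instance
  long m_id;
public:
  explicit Employee(long id) : m_id(id) { ++m_count; }
  Employee(const Employee &other) : m_id(other.m_id) { ++m_count; }
  ~Employee() { --m_count; assert(m_count >= 0); }
  static int count(); // no this pointer: touches only static members
  long id() const { return m_id; }
};

// Definitions outside the class; the static keyword is not repeated.
int Employee::m_count = 0;
int Employee::count() { return m_count; }
```

The function can be called as Employee::count() with no object at hand, and sizeof(Employee) is unaffected by the static data member.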
7.2 Inheritance
Three types of Employee: Manager, Clerk, Typist. They all build on the Employee class. Manager adds data for group managed. Clerk adds data for ten-key score. Typist adds data for wpm score.
Note the method for specifying the construction of the base class in the derived class constructor. If nothing is specified, the compiler assumes a no-argument constructor should be used and will expect to find one. public inheritance is the only strictly object-oriented type of inheritance: it allows the relationship to be exploited by all code. What is there to exploit? First, access to the public interface of the base class, as long as the derived class doesn't redefine it. Second, polymorphism, usually through pointers and references, which removes the need for unsafe pointer casts in many situations where C would require them.
private means only implementation code can use the relationship. Just including an instance of the class as a data member seems a more straight-forward way to go.
7.3 Polymorphism and virtual functions
virtual member functions are the provision for dynamic binding in C++. Allows function code used to fulfill a function invocation to depend on the dynamic type of the object instead of the type it is being manipulated as (static type). Unlike other member functions, this member function makes instances of object bigger, typically by the size of one pointer regardless of number of virtual member functions. This pointer points to a table of function pointers for the virtual functions of the class. See Section 7.5 for more information on this table.
Example:
employ1.cpp with VIRTUAL_FUNCTIONS defined.
Right now Employee::list() just prints out the generic Employee data for each Employee. If instead, we wanted the data printed out to depend on the dynamic type of the Employee (dynamic binding), we should change the declaration of print() in Employee to:
virtual void print(ostream &) const;
Just adding the virtual keyword to the base class declaration does the trick. Now a list() call results in complete data being printed out for every Employee. Notice there is more overhead associated with virtual function calls, typically the time needed for dereferencing two pointers. In the Manager's print function, we just want to print the generic Employee data. So we could change the call to print() to:
group[i]->Employee::print(os); // don't need all information here
This call does not incur the virtual function call overhead.
"Pure" virtual functions allow definition of abstract classes (classes for which no object can be created). This allows objects to be built in stages. In these classes, a member function is declared virtual with the = 0 suffix and need not be defined. Basically it defines an interface which will be used by all derived classes.
Shape example: Shape.H, Shape.C, OsScreen.H, OsScreen.C, XScreen.H, XScreen.C and MyShape.C.
Sometimes the term polymorphism is used loosely to refer to the idea of parameterized types which is implemented in C++ with templates.
virtual functions can be used to get rid of switch statements, and they are much easier to maintain. When a new derived class is defined, all that need be done is to define a new version of the virtual function instead of going through and updating umpteen switch statements, some of which might be inadvertently overlooked. Objects become "smarter": each object knows how to process itself in each situation, so the receiving function is relieved of this responsibility. This is where polymorphism really becomes useful for simplifying code.
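To make the Employee and Manager discussion in this section concrete, here is a minimal sketch of dynamic binding next to a qualified, statically bound call. It is not the employ1.cpp code; the data members and the two describe functions are assumptions.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

class Employee {
  std::string m_name;
public:
  explicit Employee(const std::string &name) : m_name(name) {}
  virtual ~Employee() {}
  virtual void print(std::ostream &os) const { os << m_name; }
};

class Manager : public Employee {
  int m_group_size;
public:
  Manager(const std::string &name, int group_size)
    : Employee(name), m_group_size(group_size) {}
  void print(std::ostream &os) const // overrides Employee::print
  {
    Employee::print(os); // qualified call: generic part, no virtual dispatch
    os << " manages " << m_group_size;
  }
};

// Dynamic binding: the function run depends on the object's dynamic type.
std::string describe(const Employee &e)
{
  std::ostringstream os;
  e.print(os);
  assert(!os.fail());
  return os.str();
}

// Qualified call: always the Employee version, whatever the dynamic type.
std::string describe_generic(const Employee &e)
{
  std::ostringstream os;
  e.Employee::print(os);
  return os.str();
}
```

Given Manager m("Ann", 3), describe(m) produces "Ann manages 3" even though describe() only knows about Employee, while describe_generic(m) produces "Ann".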
7.4 virtual destructors
Another problem: what happens when we want to delete a heap object when the reference we wish to delete through refers to a different type than the one it was created with? Notice if a Manager is deleted through an Employee pointer, only the Employee part will be destroyed, and the memory for the Manager's Employee list will never get returned to the heap. The solution is a virtual destructor for the base class. All derived classes will have a destructor set up for them whether they define one or not. Destructor is now accessed via indirection. This should be done whenever at least one function in a class is declared virtual. Just by doing this, all destructors in inheriting classes become virtual (even though they don't share the same name as with the non-destructor virtual member functions).
virtual constructors? No. But if you want an object to be able to create a new instance of its type or a clone of itself when its creation type has been hidden via a cast, just create a virtual function which uses new to create an instance. When creating clones, the usual precautions for duplicating pointers initialized with new must be taken.
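One way this might look is sketched below. The clone(), kind() and get_name() members are hypothetical names, and the deep copy in the copy constructor is the "usual precaution" for the new-allocated name.

```cpp
#include <cassert>
#include <cstring>

class Employee {
public:
    explicit Employee(const char* n) : name(dup(n)) {}
    Employee(const Employee& other) : name(dup(other.name)) {}  // deep copy
    Employee& operator=(const Employee&) = delete;  // not needed for this sketch
    virtual ~Employee() { delete[] name; }

    // "Virtual constructor": builds a copy of whatever the object really is.
    virtual Employee* clone() const { return new Employee(*this); }
    virtual const char* kind() const { return "Employee"; }
    const char* get_name() const { return name; }

private:
    static char* dup(const char* s) {
        char* p = new char[std::strlen(s) + 1];
        std::strcpy(p, s);
        return p;
    }
    char* name;
};

class Manager : public Employee {
public:
    explicit Manager(const char* n) : Employee(n) {}
    Manager* clone() const override { return new Manager(*this); }
    const char* kind() const override { return "Manager"; }
};

// Clones a Manager that is only known through an Employee pointer.
void demo_clone() {
    Manager m("Ann");
    Employee* hidden = &m;
    Employee* copy = hidden->clone();
    assert(std::strcmp(copy->kind(), "Manager") == 0);
    assert(copy->get_name() != m.get_name());  // separate storage for the name
    delete copy;
}
```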
7.5 The virtual table
The virtual function capability is typically implemented using what is called a "virtual table". When a member function of employee has been declared virtual, a memory image of an employee instance would appear as follows:
vtable *;  // actual type depends on compiler
char *;
long;
The memory image of an employee's virtual table (what vtable * points to) looks like:
void (*print)(); => Employee::print
void (*x)(int);  => Employee::x
long (*y)(int, double);  => Employee::y
where x() and y() are two other virtual functions of Employee which have been hypothetically added to Employee class declaration.
Memory image of Manager instance:
vtable *;
char *;
long;
Employee **;
int;
Memory image of Manager's virtual table looks like:
void (*print)(); => Manager::print
void (*x)(int);  => Manager::x
long (*y)(int, double);  => Employee::y  // Manager doesn't define this one
double (*a)();  => Manager::a    // two new virtual functions hypothetically
long (*b)(int);  =>  Manager::b  // added to declaration of Manager class
                                 // these don't appear in Employee, but
                                 // will be dynamically bound for
                                 // classes which inherit from Manager
7.6 Multiple inheritance
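Sometimes it's useful to combine classes horizontally: create a menu of classes, then create objects that inherit attributes from several base classes.
Derived classes can be treated as if they are any of their base classes when inheritance is public. Because the base class parts sit at different offsets within the derived object, converting a derived pointer to a base pointer may require adjusting the address. So casts in C++ can have run-time overhead associated with them!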
Problems arise when classes being combined share a common base class.
Example:
ClerkTypist class in employ2.cpp.
Now there are two instances of generic Employee data. We can avoid this by making Employee a virtual base class of Clerk and Typist. Thus when we combine, only one instance of the Employee data gets created, and it would only get constructed once. This one instance gets constructed per our instructions in ClerkTypist, and the Employee constructor calls in Clerk and Typist are ignored. The compiler expects these instructions.
We still have to pass on all the args to the Clerk and Typist constructors though even though they may use some of them only to construct Employee base instances. virtual base classes are usually implemented by including a pointer to an instance of the base class in the derived class's memory image rather than an instance as is the usual case. As with virtual functions, a performance hit is taken here since base class members must be accessed indirectly, so the virtual facility should not be used unless it is needed.
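The following is only a sketch of the arrangement described above, not the contents of employ2.cpp. The id member and the constructor counter are hypothetical, and the dummy values passed to Clerk and Typist show that their Employee initializers are ignored when a ClerkTypist is built.

```cpp
#include <cassert>

// Hypothetical counter showing how often the shared base is constructed.
static int employee_ctor_calls = 0;

class Employee {
public:
    explicit Employee(long id_) : id(id_) { ++employee_ctor_calls; }
    long id;
};

class Clerk : public virtual Employee {
public:
    explicit Clerk(long id_) : Employee(id_) {}
};

class Typist : public virtual Employee {
public:
    explicit Typist(long id_) : Employee(id_) {}
};

class ClerkTypist : public Clerk, public Typist {
public:
    // The most derived class constructs the virtual base itself; the
    // Employee(...) initializers inside Clerk and Typist are skipped here.
    explicit ClerkTypist(long id_) : Employee(id_), Clerk(-1), Typist(-2) {}
};

// Builds one ClerkTypist and checks there is a single Employee part.
void demo_virtual_base() {
    employee_ctor_calls = 0;
    ClerkTypist ct(42);
    assert(employee_ctor_calls == 1);  // constructed exactly once
    assert(ct.id == 42);               // unambiguous: only one Employee::id
}
```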




Lecture 6: Operator Overloading



Operator overloading completes the data abstraction facilities. "data abstraction" means programmer-defined types which can be used with the same flexibility as built-in types. The class concept was the first step toward this capability. Operator overloading completes the implementation. This facility allows the programmer to define how meaningful standard C++ operators can be used with a class.
6.1 It's All Notation
Really all that's happening is a notational trick. By giving a member or non-member function a special name, we indicate to the compiler that it should consider it when it runs into the use of a particular operator involving a class. When the compiler is looking at a+b it's as if it saw a.operator+(b) or operator+(a,b). +a can be interpreted as a.operator+() or operator+(a).
A non-member function implementation of a binary operation has the additional flexibility of being able to apply conversions to the first argument.
References are commonly used as arguments. This is where the reference syntax becomes valuable, since it allows the usual operator notation while providing the efficiency that comes with passing pointers to classes instead of (possibly constructed) copies. Be careful when returning references though. Don't return references to local objects, which are destroyed when the function exits.
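To make the notation concrete, here is a small sketch using a hypothetical Meters class. It shows a member unary operator, a non-member binary operator taking const references, and the conversion of the first argument that only the non-member form allows.

```cpp
#include <cassert>

// Hypothetical class used only to illustrate the notation.
class Meters {
public:
    Meters(double v) : value(v) {}  // single-argument constructor: double -> Meters
    Meters operator-() const { return Meters(-value); }  // -a is a.operator-()
    double get() const { return value; }
private:
    double value;
};

// a + b is operator+(a, b). The const references avoid copying the operands;
// the result is returned by value, never as a reference to a local.
Meters operator+(const Meters& a, const Meters& b) {
    return Meters(a.get() + b.get());
}

void demo_notation() {
    Meters a(1.5), b(2.0);
    assert((a + b).get() == 3.5);
    assert(operator+(a, b).get() == 3.5);  // same call, written as a function
    assert((1.0 + a).get() == 2.5);        // first argument converted to Meters
    assert((-a).get() == -1.5);
    assert(a.operator-().get() == -1.5);
}
```

Had operator+ been a member of Meters, the expression 1.0 + a would not compile, since no conversion is applied to the object a member function is called on.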
6.2 General Restrictions
The usual rules of function call overloading apply, plus it must be remembered that the built-in operators are prospective candidates for resolving any call, and there are the following general restrictions:
  • The standard C++ precedences and associativities determine how any expression will be parsed and can't be changed.
  • No new operators can be invented, e.g. @ can't be overloaded as an operator.
  • The operator must be written as unary or binary depending on its use with the built-in types, e.g. a binary operator can't be overloaded in a unary way for a class if there's no built-in type for which it's used in a unary manner.
  • For non-member functions overloading operators, at least one function argument must be an instance or a reference to a class. This ensures no C++ code can change the built-in behavior of operators with built-in types.
  • An operator function can't have default argument values (the function call operator () is the one exception).
  • There are also operator-specific restrictions, notations and features which are enumerated below.
The following operators can't be overloaded: `.', `.*', `::', sizeof, () value construction and the conditional expression operator `?:'. (Some older compilers did not allow delete[] to be overloaded either; standard C++ does.)
The following operators must be implemented as member functions:
= -> () []
6.3 Guidelines
The good news is that the programmer has complete control over the return type and argument types of the function. This flexibility must be used with some sensitivity. For example, + should be overloaded for a class only when replacing functional notation for an operation analogous to addition for the class. Unless there is a good reason for doing otherwise, it should mimic the behavior of the built-in operator as much as possible, meaning (for a function overloading +):
  • It shouldn't change the value of either operand, since + doesn't do that when applied to built-in types.
  • It should be commutative.
As another example, it might be a good idea for overloaded = to return a reference to the left operand.
Each operator has a sort of "culture", a way programmers expect it to behave given their experience with built-in types. The more a new version of the operator fits in with this experience, the easier it will be for the programmer to utilize it in an effective and error-free manner. In addition, the class will be more likely to fit into and work properly in templates that require the operator.
6.4 Operator-specific restrictions, notations and features
Copy assignment operator
One operator which frequently needs to be explicitly defined for a safe class interface is the copy assignment operator. It takes a reference to, or a value of, its own class type as its argument. If one is not defined explicitly, the compiler generates a default version which does a member-wise copy. This "feature" is for backward compatibility with C.
It can have the same terrible problems in conjunction with dynamically allocated members as the copy constructor does.
In designing assignment operation, the effects of assigning an object to itself must be taken into account. This may seem silly at first glance, but it is a very real possibility when indirection or references are being used.
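A minimal sketch of a copy assignment operator that copes with a dynamically allocated member and with self-assignment follows. The Name class is hypothetical; the self-assignment in the demo goes through a reference, which is how it tends to arise in practice.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical class owning a heap-allocated string.
class Name {
public:
    explicit Name(const char* s) : text(dup(s)) {}
    Name(const Name& other) : text(dup(other.text)) {}
    ~Name() { delete[] text; }

    Name& operator=(const Name& other) {
        if (this != &other) {               // guard against self-assignment
            char* fresh = dup(other.text);  // allocate before releasing the old copy
            delete[] text;
            text = fresh;
        }
        return *this;  // reference to the left operand, as with built-in =
    }

    const char* c_str() const { return text; }

private:
    static char* dup(const char* s) {
        char* p = new char[std::strlen(s) + 1];
        std::strcpy(p, s);
        return p;
    }
    char* text;
};

void demo_assignment() {
    Name a("alpha"), b("beta");
    a = b;
    assert(std::strcmp(a.c_str(), "beta") == 0);
    assert(a.c_str() != b.c_str());  // each object owns its own storage
    Name& alias = a;
    a = alias;                       // self-assignment through a reference
    assert(std::strcmp(a.c_str(), "beta") == 0);
}
```

Without the guard, a version that deleted text first would go on to copy from memory it had just released.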
Other assignment operators
The behavior of overloaded += is independent of the behavior of overloaded + and =; it must be defined separately and explicitly before it can be used. This applies to the rest of the assignment operators as well. It often makes sense to define binary + using the (usually simpler) += operation. += will usually be faster, since the final result is stored in an existing variable and no temporary has to be constructed to hold it.
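One common way of building + on top of += is sketched below with a hypothetical Vec2 class. Taking the left operand by value supplies the temporary that + needs, while += works in place.

```cpp
#include <cassert>

// Hypothetical two-component vector.
class Vec2 {
public:
    Vec2(double x_, double y_) : x(x_), y(y_) {}

    // Works in place: no temporary is constructed.
    Vec2& operator+=(const Vec2& rhs) {
        x += rhs.x;
        y += rhs.y;
        return *this;
    }

    double x, y;
};

// lhs is already a copy, so it can be modified and returned as the result.
Vec2 operator+(Vec2 lhs, const Vec2& rhs) {
    lhs += rhs;
    return lhs;
}

void demo_plus() {
    Vec2 a(1, 2), b(3, 4);
    Vec2 c = a + b;
    assert(c.x == 4 && c.y == 6);
    assert(a.x == 1 && a.y == 2);  // + leaves its operands unchanged
    a += b;
    assert(a.x == 4 && a.y == 6);  // += changes the left operand
}
```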
operator++ and operator--
These two unary operators have prefix and postfix versions. The prefix version is implemented as with the other unary operators. The postfix version must be implemented as a.operator++(int) or operator++(a, int). The int argument is a dummy which is not used for anything.
Incidentally, in C++, the assignment operators and prefix ++ and -- result in l-values, contrary to their behavior in C. Again, overloaded versions of these operators should behave as they do for the built-in types if not overly inconvenient. Here inefficiency can be a big concern, as the built-in behavior for the postfix version of these operators is to return by value a snapshot preserving the state of the object before it is operated on. For a large object where copy construction is expensive, this could be prohibitive.
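The two forms might be written as follows for a hypothetical Counter class. The unnamed int parameter marks the postfix version, which has to copy the old state before changing the object.

```cpp
#include <cassert>

// Hypothetical class wrapping an int.
class Counter {
public:
    explicit Counter(int v = 0) : value(v) {}

    // Prefix: modify, then return the object itself as an l-value.
    Counter& operator++() {
        ++value;
        return *this;
    }

    // Postfix: the dummy int is never used; a snapshot is returned by value.
    Counter operator++(int) {
        Counter old(*this);
        ++value;
        return old;
    }

    int get() const { return value; }

private:
    int value;
};

void demo_increment() {
    Counter c(5);
    Counter before = c++;   // calls c.operator++(0)
    assert(before.get() == 5 && c.get() == 6);
    Counter& same = ++c;    // calls c.operator++()
    assert(&same == &c && c.get() == 7);
}
```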
operator() function call
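Overloading this enables instances of the class to be treated as if they were function identifiers. It can be overloaded several times for varying argument lists. This can be very useful, for example when a piece of behavior has to be passed to another function together with its own state.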
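A small sketch with a hypothetical Scale class, overloading the call operator for two different argument lists:

```cpp
#include <cassert>

// Hypothetical function object that remembers a scale factor.
class Scale {
public:
    explicit Scale(double f) : factor(f) {}

    // Two overloads of the call operator with different argument lists.
    double operator()(double x) const { return factor * x; }
    double operator()(double x, double y) const { return factor * (x + y); }

private:
    double factor;
};

void demo_call() {
    Scale twice(2.0);
    assert(twice(3.0) == 6.0);       // looks like a function call
    assert(twice(1.0, 2.0) == 6.0);  // picks the two-argument overload
}
```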
operator->
Overloaded versions of this operator could well be termed "pass the buck" operators. This is implemented as a unary member function and allows -> to be applied to an instance of a class, whereas the built-in version always requires a pointer to a class or struct.
A name of a data or function member must appear on the right side of the operator. If the operator returns a pointer to an object, that data or function access must make sense for the object. If the operator returns a reference or instance of a class, that class must in turn have the -> operator overloaded. Thus, one usage of this operator could potentially result in many functions being called if the buck is passed extensively, though in practice it is usually just one layer deep.
This is useful for smart pointers. A program can defer construction of an object or loading of data until access makes it necessary.
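As an illustration of deferred construction, the hypothetical LazyRecord class below creates its Record only when -> is first applied. Both class names and the value member are assumptions made for the sketch.

```cpp
#include <cassert>

// Hypothetical payload standing in for something expensive to build or load.
struct Record {
    int value;
    Record() : value(42) {}
};

class LazyRecord {
public:
    LazyRecord() : rec(nullptr) {}
    ~LazyRecord() { delete rec; }
    LazyRecord(const LazyRecord&) = delete;             // keep ownership simple
    LazyRecord& operator=(const LazyRecord&) = delete;

    // Unary member: p->value means (p.operator->())->value.
    Record* operator->() {
        if (rec == nullptr) {
            rec = new Record;  // construction deferred until first access
        }
        return rec;
    }

    bool loaded() const { return rec != nullptr; }

private:
    Record* rec;
};

void demo_lazy() {
    LazyRecord p;
    assert(!p.loaded());      // nothing built yet
    assert(p->value == 42);   // first access creates the Record
    assert(p.loaded());
}
```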
new and delete
The sort of overloading which can be done here is restricted to determining where run-time memory is obtained when an object is created via these operators. Typically they are used to optimize heap memory allocation for a class. For example, if a Tree class created many instances of a Node class, it may be more economical to allocate them in batches, or to check a list of "used" nodes before trying to allocate a new one.
The format of the delete member function is
void operator delete(void *, size_t)
The second argument may be omitted.
The format of the new member function is
void *operator new(size_t)
Additional arguments of arbitrary type may be added after the initial size_t type. If these arguments appear, they are supplied after the new invocation in function call format, i.e. new(...args...) MyClass. It always returns a void * type. Even so, the expression where it is applied has the type of a pointer to the type which is being allocated as is the case with the built-in operator.
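The "list of used nodes" idea could be sketched as below. This is one possible arrangement under stated assumptions, not the lecture's Tree code: the FreeBlock overlay and the recycled counter are hypothetical, the pool is never returned to the heap, and the sketch is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

class Node {
public:
    Node() : next(nullptr), value(0) {}

    static void* operator new(std::size_t size);
    static void operator delete(void* p, std::size_t size);

    static int recycled;  // allocations served from the free list (hypothetical)

    Node* next;
    int value;

private:
    struct FreeBlock { FreeBlock* next; };  // overlay placed in unused Node storage
    static FreeBlock* free_list;
};

Node::FreeBlock* Node::free_list = nullptr;
int Node::recycled = 0;

void* Node::operator new(std::size_t size) {
    // Reuse a released block if one is waiting and the request is Node-sized.
    if (size == sizeof(Node) && free_list != nullptr) {
        FreeBlock* block = free_list;
        free_list = block->next;
        ++recycled;
        return block;
    }
    return ::operator new(size);  // otherwise fall back on the global heap
}

void Node::operator delete(void* p, std::size_t size) {
    if (p == nullptr) return;
    if (size != sizeof(Node)) {   // not one of ours, e.g. a derived class
        ::operator delete(p);
        return;
    }
    // Keep the storage: thread it onto the free list instead of freeing it.
    free_list = new (p) FreeBlock{free_list};
}

void demo_node_pool() {
    Node* a = new Node;
    int before = Node::recycled;
    delete a;            // storage goes onto the free list
    Node* b = new Node;  // and is handed out again here
    assert(Node::recycled == before + 1);
    assert(b->value == 0);
    delete b;
}
```

The expressions new Node and delete a look exactly as they would with the built-in operators; only the source of the memory has changed.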
Conversion functions
Casts (and conversions in general, whether via assignment, C-style cast or C++-style cast) can be "overloaded" as well. This is conceptually distinct from operator overloading, though it looks similar: the return type is fixed by the name of the function (it can't be chosen by the programmer as in the general case), and conversion functions can only be defined as member functions, not friends. An int conversion function is defined as the member function operator int(). It is declared with no return type and no arguments, and it returns an int. If a conversion function is to be written for a complex type such as a pointer to function, a typedef must be used so the type can be manipulated as a single word.
The array of needed implementations of an operator can be reduced by defining conversions for the classes involved. Conversions to a class are defined implicitly with single-argument constructors. Conversions from the class are defined using conversion functions; conversions from a derived class to a public base class exist by virtue of inheritance.
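Both directions of conversion, and the typedef case, can be sketched as follows. Fraction, Handler, Callback and twice are hypothetical names chosen for the illustration.

```cpp
#include <cassert>

// Hypothetical class with a conversion in each direction.
class Fraction {
public:
    Fraction(int n, int d = 1) : num(n), den(d) {}  // to the class: int -> Fraction
    operator double() const {                       // from the class: Fraction -> double
        return static_cast<double>(num) / den;
    }
private:
    int num, den;
};

// A pointer-to-function type needs a typedef to be usable as a single word.
typedef int (*Callback)(int);

int twice(int x) { return 2 * x; }

class Handler {
public:
    explicit Handler(Callback f) : fn(f) {}
    operator Callback() const { return fn; }  // no declared return type, no arguments
private:
    Callback fn;
};

void demo_conversions() {
    Fraction half(1, 2);
    double d = half;    // uses operator double()
    assert(d == 0.5);
    Fraction three = 3; // uses the single-argument constructor
    assert(static_cast<double>(three) == 3.0);
    Handler h(&twice);
    Callback f = h;     // uses operator Callback()
    assert(f(4) == 8);
}
```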


