In C and C++, there are three other data types that need to be
discussed.  All three are low level, and all three are essential for
advanced programming.  They are the Pointer, the Reference (which exists
only in C++ and not in C) and the Function, or Method in Object Oriented
parlance.  Let's first look at the pointer.

What is Memory?  What is a Variable?  What is an Address?

The pointer is a variable that stores a memory address.  In a simplified
picture, imagine that your computer has one megabyte of RAM.  Each
location in RAM has an address that the CPU understands, and you can
ask the CPU to go to any specific location in RAM and read the data at
that point.  From the perspective of RAM, it doesn't matter what the
data stored at any memory location represents.  To the RAM and the CPU,
it is just binary bytes of information.  The CPU can read a byte
regardless of what that data represents to the user, to other parts of
the computer, or to other programs.  It just reads bytes.  And after
it reads a word of data at some specific address, it can easily read
the next address, and the next, and so on.  This is the case all the
time, in any programming language.  We're not going to get into the
details of all this here.  We'll wait for the Assembly Language Workshop
for a discussion of how a CPU fetches and processes data from RAM.
But we can understand that all the memory in a computer is mapped, and
that addresses themselves are data that has to be stored somewhere in
the hardware for use by programs and the Operating System.

So as programmers, not just in C++ but in any language, how do we
instruct the CPU to go out into RAM and acquire some data?  And more
importantly, how do we instruct the CPU to take some data from a source
and send it to RAM for storage?  Well, we can give the CPU the exact
memory address of our data for storage or retrieval, and in fact Assembly
Language does nearly that.  But that is really hard to write and nearly
impossible to debug.  Instead, our languages give us SYMBOLIC VARIABLES.
This is the creation of a symbol that allows us access to a machine
language memory address, and lets us retrieve and store data in that
location.  So our programs have these symbols, and data associated with
them that sits in RAM.  Forgetting C++ for a moment, let's make up our
own language.  In this language we create variable symbols like this:

MYVAR(“This is a string of data”);

In our imaginary language, this creates a symbol “MYVAR”, which is
stored in the computer for later use.  Our language, with the help of
the CPU, then allocates some finite memory in RAM and stores the address
for that data in a very fast lookup table in association with the symbol
“MYVAR”.  And then we put the string “This is a string of data”, all
24 bytes of data by my count, in RAM starting at the memory address
that is associated with MYVAR.

Now I chose the syntax

MYVAR(“This is a string of data”);

but I could have used any other syntax rules that I think might be
useful and understandable to programmers (and in fact not enough
thought, IMO, is given to this function of language design).  We could
have created a syntax that looks like this:

MYVAR = “This is a string of data”:

or

MYVAR{:This is a string of data:}

or

MYVAR := “This is a string of data”;

or

MYVAR<'This is a string of data'>

or

Let String MYVAR eq “This is a string of data”

are all possible syntax rules in our own made up programming language.
I bring this up because the fluidity of syntax rules becomes an important
concept in Object Oriented programming languages like C++.  But I also
make the point that no matter the syntax, the resulting effect and
internal sequence of low level events need to be the same.  A symbol
is stored.  An address is associated with the symbol, and data is stored.

Now what if we took a shortcut?  Instead of creating a symbol that the
CPU associates with some address in RAM, and then storing human useful
data at that address, what if we just ignored the useful human data
and associated only the memory address with the symbol, or alternately,
stored at the address associated with the symbol yet another machine
language memory address?  This can give us several advantages, creates
several potential dangers and pitfalls, and even if it seems to be
driving us closer to assembly language like code, can actually simplify
our coding.  If we can avoid having to write real binary addressing
code, this gives us a decent level of flexibility.  We can indirectly
point our symbol at any kind of data, for one thing, although limited
by the rules of C++ syntax.  Certain operations, like incrementing
serially through a segment of memory, can be sped up since the CPU is
engineered to handle binary memory address operations very efficiently.
We can pass access to very large segments of memory from one variable
to another without having to copy the whole memory segment to a new
location.  And remember that CODE itself is actually data, and we can
gain access to that code and pass it around like chars, ints and longs.

But in order to retrieve the actually useful information that the stored
memory address points at, we need to take an extra step.  First we have
to read the address associated with our variable.  Then we have to
fetch the data at the location in RAM with which our symbol is
associated.  And since that data is itself a machine language memory
address, and not otherwise useful data, we then need to take the address
that was stored in the symbol's RAM location and dereference it to reach
the useful data that we ultimately want to retrieve.

Does this seem confusing?  It is and it isn't.  Students of C and
C++ choke on this all the time, and yet understanding this concept
is absolutely the key to understanding how to read, program, design
and analyze good C++ code.  What leads students astray is that, because
this is a computer specific abstraction, they overthink it.
A pointer is really quite simple.  It is a symbolic variable that stores a
memory address.  What the heck that stored address is pointing at, though,
is where real world programming gets interesting and often confusing.
C++ and C have tools to help make this a bit easier.  One is that C++
is what is called a TYPED language, and another is key words like CONST
that help keep you from screwing yourself up.  And we will look at all
of this as we go forward.


One thing you will note is that I haven't actually written any C++
syntax yet.  That isn't an accident.  I want to teach the concepts
first before we look at the implementations.  I believe that often,
especially at the beginning, trying to teach the syntax simultaneously
with the concepts is a big teaching mistake.

Next – References (I hate references).