
C++ Syntax

Before exploring how we actually represent data in our C++ programs,
I want to introduce a formal discussion of basic C++ syntax, which,
like data types, doesn't get enough attention in most standardized texts.

All programming languages require syntax rules in order for the compiler
to parse the source and create working machine code.  These syntax rules
require a basic understanding of core components.  These components
include files, structure, statements, data and operators.

Starting from the top, first you have files, and files usually have
naming rules.  C++ inherits nearly all of its file structure from C, and
actually requires that you know it in more detail and at an earlier
level of expertise.  C++, because of its object oriented design, depends
heavily on library creation.  In fact, even for beginners, most of your
work happens on the library level.

All C programs inherit from the Unix environment, which was developed
alongside C, the need for a definition of the main function.  All C
programs start with main, and main then calls on all the other parts of
the system libraries and your own code to produce your completed
program.  Main is located in your top most programming file.

Standard Programming Files:  A standard programming file is where the
top most programming takes place.  In the C language, most of your
code, especially as a beginner, takes place in this file.  Most commonly
these files have a suffix of either .cc or .C (.cpp and .cxx are also
widely used).  file.C, for example, is a standard C++ file name.

A standard programming file will have several components:

1) Include Preprocessor Directives - These import header files, which
declare the symbols that your program will use but that you are not
creating yourself.

An include directive might look like this:

#include <iostream>

This tells the compiler to load up the declarations of all the functions
and objects declared in the iostream header.

Standard C++ libraries are included using the angle bracket notation
as above.  They are searched for by your compiler in a set of standard
locations which are defined by your compiler and programming environment
(something I wouldn't mind understanding better on modern Linux and
GNU environments).

If you use the syntax

#include "myheader"

with double quotes, the compiler will look for these headers in the
local directory first, and typically falls back to the standard
locations if it doesn't find them there.

C libraries are accessible in C++ and can either use the standard C
language notation

#include <assert.h>
#include <stdio.h>

***Note the .h suffix being included*** or use the C++ version

#include <cassert>
#include <cstdio>

2) Macro and other Preprocessor Compiler Directives - Help set up
conditions in which libraries and header files are brought into our
program, to help prevent duplication and to create different versions
of a program as might be needed for differing architectures or
conditions.

The most common Preprocessor Directives are as follows:

#define
#endif
#ifdef
#ifndef
#include (as discussed above)

A Macro directive might look like this:

#ifndef HEAD
#define HEAD
#include <iostream>
#include <string>

#endif

This Macro is telling the compiler to include the declarations
for iostream and string from the core C++ library if and ONLY IF the
symbol HEAD has not already been defined.

Developing skill with these directives, which are a language within a
language, is one of the things that separates advanced C and C++ coders
from amateurs.

There are also predefined macros which the compiler adds to your code,
which include

__cplusplus
__DATE__
__FILE__
__LINE__
__STDC__
__TIME__

__DATE__ and __TIME__ are the date and time the program is compiled.
__FILE__ and __LINE__ expand to the current source file name and line
number.

3) Original Code and runtime directives starting with main.

C++ has added a new programming feature called namespaces, along with
the "using" directive which brings the names of a namespace into your
code.  Namespaces give finer grained control over which symbols your
code recognizes in a specified space.  It's really important and in many
ways was a long time coming to the C family of languages.  Most
importantly it prevents you from accidentally stepping on library
symbols or words that you might not have been aware of or that
programmers after you might not be aware of.  It also allows you to
define the same symbol in multiple locations of your code without
stepping on your own toes.

So today's modern C++ main program files might look something like
this:

#ifndef TOP_H
#include <iostream>
#define TOP_H
#endif

#ifndef INTARRAY_H
#include "intarray.h"
#define INTARRAY_H
#endif

using namespace std;

int main( int argc, char* argv[] )
{
        //YOUR PROGRAMMING CODE
}

There is a catch to the namespace usage though.  It might very well be
that your library header files, especially if you are creating them
yourself, which you will in C++, have the using directive.  If so, every
file that includes them inherits that directive, and your code will
likely come to depend on it.

Header Files: Header files normally have a .h suffix.  file1.h would
be an example of a header file for C or C++.  These are the files that
are being included by your #include preprocessor directives.  These files
are often distributed with a program and you can examine them.  They are
useful for discovering how the programming objects in libraries are
declared and used, and often programmers will point them out to you as a
form of documentation, which itself is a practice I'm not happy about
because many programmers mistake them for a substitute for real
documentation.

Library Files:  After researching this, it has occurred to me that there
is an ambiguity about the structure of C and C++ Programming files.
Professional programs generally have header files as described
above, but there is no proper name for the coding files that are
associated with the headers and which produce object binary files and
static or shared libraries.  For a beginner this is all confusing and
the lack of proper nomenclature makes this all the harder to learn.  A
little bit of compiler theory is needed to understand the file structure
and binary construction of your program.  For now, I just want to
point out that programming objects declared in your header file for use
in your program have to have source code to produce the actual
machine code that is represented by the symbols in your header file.
Those library source files will not have the main function.  But the
compiler can be asked to create what are called object files, which are
compiled but not yet linked binary code for later inclusion in your
program.  When we look closer at the gcc compiler we will examine these
object files and learn why they are so important.

What is important to say, however, is that in C++, because of its
object orientation and its emphasis on creating Application Programming
Interfaces (API), most of the C++ coding you will do is taking place in
these library C++ source files (which I will refer to as Library Code
from here on out).

There are two kinds of Library code files that you will work with,
that which you create, and that which you borrow from your system for
inclusion in your programs.

User defined:

User defined library files define the code to create working programming
objects that are normally declared in your matching header files.
These programming source code files look just like your main programming
file except they don't have the main function.  Your top most main
programming file is dependent on these library code files.  The code
they produce has to be linked into your program by your compiler.

Standard C++ or Packaged third party: These are the standard libraries,
either in source or in object files, that define standard language needs
and are usually found somewhere in /lib or /usr/lib on your system.

Standard C++ File Creation:

All our C++ programs have to be created with a standard text editor.
The code that the compiler works on, also known as translation units for
the compiler, is straight ASCII text.  You can NOT use a word processor.
My preferred text editor is VIM or GVIM, which is a derivative of VI.
VI is the standard text editor on Unix like systems and there are many
tutorials for it around the internet.  Other editors include EMACS, and
then there are C++ working environments like Anjuta, which I strongly
discourage.  I discourage the Integrated Development Environments
because with GNU and Unix like systems, your OS is your
integrated environment, and I believe one should learn to use the standard
tools that are on your GNU/Linux system.

A standard C++ program needs to have at least one function defined.
We will look at functions (also called methods) more closely later,
but a new programmer should get used to looking at them from the start,
since every C++ program starts in a function called main.
Functions are defined by the following structure

"return type" "function name (the symbol)" ( Argument list) {

        Statements that end in semi-colon;

}

Functions do not take a semi-colon after the closing curly brace.

The main function looks like this

int main(int argc, char * argv[]){
        return 0;
}

A realistic C++ main program file, including preprocessor directives
would look as follows

#include <iostream>
using namespace std;

int main(int argc, char * argv[]){

        return 0;
}

The curly braces form a block in which the coder can add really as
many instructions as they choose to.  These blocks of statements are
seen in many C++ syntax structures including functions, if statements,
for loops and other structures.  While the above is a minimal C++ source
file structure, generally most of the heavy lifting of your code takes
place outside of main in user defined functions which you create, as
well as objects.  A more realistic first program skeleton might well
look something like this:

#include <iostream>
using namespace std;

void oxygen(){ cout << "oxygen()\n";}
void hydrogen(){ cout << "hydrogen()\n";}
void helium(){ cout << "helium()\n";}
void neon(){ cout << "neon()\n";}

int main(int argc, char * argv[]){
        oxygen();
        hydrogen();
        helium();
        neon();

        return 0;
}

Here we can see the declaration and definition of 4 user defined
functions that are outside of main, and get run only when called.
The four functions are called oxygen, hydrogen, helium and neon.

And notice that we are using the standard namespace called std.

Statement Structure:

All C and C++ statements (although not all syntax) end with a semi-colon.
You can even put two statements on a single line, separated by a
semi-colon, but in general this isn't recommended.

Statements are constructed with Data, Operators and Keywords.  C++ has
a larger set of Keywords than C.

Keywords:

Keywords are any symbols that Standard C++ recognizes as having
instructional meaning, that is, they tell the compiler to do something.
The Keywords in C++ are as follows, and learning the exact meaning of
all the keywords is essential to learning C++.

These are inherited from C:

auto   const     double  float  int       short   struct   unsigned
break  continue  else    for    long      signed  switch   void
case   default   enum    goto   register  sizeof  typedef  volatile
char   do        extern  if     return    static  union    while

These are the extended set added to C++

asm         dynamic_cast  namespace  reinterpret_cast  try
bool        explicit      new        static_cast       typeid
catch       false         operator   template          typename
class       friend        private    this              using
const_cast  inline        public     throw             virtual
delete      mutable       protected  true              wchar_t

and Standard C++ also recognizes the following alternative Keywords for
some of the operators

and      bitand   compl   not_eq   or_eq   xor_eq
and_eq   bitor    not     or       xor

Keywords are completely reserved and can not be used as symbols by any
user defined variables in your program.  They are exclusive to the
language and compilers.

There are other important predefined symbols that C++ uses as well.
These are not strictly exclusive to the Language, however, overloading
them or using them as symbols for variables is a very bad idea.

There are a lot of them, but some of them include

cin
endl
INT_MIN
iomanip
main
npos
std
cout
include
INT_MAX
iostream
RAND_MAX
NULL
string
not to mention the Macros like __DATE__ and __TIME__

Operators:

Operators are very much like functions or methods in that they define
processes, taking in arguments and returning outputs (and having side
effects).  In the C Language, Operators are immutable.  You can't change
their meaning.  In C++ many of them can be overloaded, that is, you
can define their meaning for your own types.  A lot of C++ study involves
discussing the overloading of Operators.

All Operators, as they do in Mathematics, have precedence and
associativity.   For example, in arithmetic:

4 x 3 - 10 = 2

and not -28.  That is because multiplication has a higher
precedence than subtraction, so it is done first.  Operators of equal
precedence are grouped by their associativity, which for arithmetic is
left to right.  A complete list of C++ operators is considerable and as
follows:

┌────────────────────────┬─────────────────────────────────────────┬────────────────┐
│ Operator               │ Type                                    │ Associativity  │
├────────────────────────┼─────────────────────────────────────────┼────────────────┤
│ ::                     │ binary scope resolution                 │                │
│ ::                     │ unary scope resolution                  │                │
├────────────────────────┼─────────────────────────────────────────┤                │
│ ()                     │ parentheses                             │                │
│ []                     │ array subscript                         │                │
│ .                      │ member selection via object             │                │
│ ->                     │ member selection via pointer            │ left to right  │
│ ++                     │ unary postincrement                     │                │
│ --                     │ unary postdecrement                     │                │
│ typeid                 │ run-time type information               │                │
│ dynamic_cast< type >   │ run-time type-checked cast              │                │
│ static_cast            │ compile-time type-checked cast          │                │
│ reinterpret_cast       │ cast for non-standard conversions       │                │
│ const_cast             │ cast away const-ness                    │                │
├────────────────────────┼─────────────────────────────────────────┼────────────────┤
│ ++                     │ unary preincrement                      │                │
│ --                     │ unary predecrement                      │                │
│ +                      │ unary plus                              │                │
│ -                      │ unary minus                             │                │
│ !                      │ unary logical negation                  │                │
│ ~                      │ unary bitwise complement                │                │
│ ( type )               │ C-style unary cast                      │ right to left  │
│ sizeof                 │ determine size in bytes                 │                │
│ &                      │ address                                 │                │
│ *                      │ dereference                             │                │
│ new                    │ dynamic memory allocation               │                │
│ new[]                  │ dynamic array allocation                │                │
│ delete                 │ dynamic memory deallocation             │                │
│ delete[]               │ dynamic array deallocation              │                │
├────────────────────────┼─────────────────────────────────────────┼────────────────┤
│ .*                     │ pointer to member via object            │                │
│ ->*                    │ pointer to member via pointer           │                │
├────────────────────────┼─────────────────────────────────────────┤                │
│ *                      │ multiplication                          │                │
│ /                      │ division                                │                │
│ %                      │ modulus                                 │                │
├────────────────────────┼─────────────────────────────────────────┤                │
│ +                      │ addition                                │                │
│ -                      │ subtraction                             │                │
├────────────────────────┼─────────────────────────────────────────┤                │
│ <<                     │ bitwise left shift                      │                │
│ >>                     │ bitwise right shift                     │                │
├────────────────────────┼─────────────────────────────────────────┤                │
│ <                      │ relational less than                    │                │
│ <=                     │ relational less than or equal to        │ left to right  │
│ >                      │ relational greater than                 │                │
│ >=                     │ relational greater than or equal to     │                │
├────────────────────────┼─────────────────────────────────────────┤                │
│ ==                     │ relational is equal to                  │                │
│ !=                     │ relational is not equal to              │                │
├────────────────────────┼─────────────────────────────────────────┤                │
│ &                      │ bitwise AND                             │                │
├────────────────────────┼─────────────────────────────────────────┤                │
│ ^                      │ bitwise exclusive OR                    │                │
├────────────────────────┼─────────────────────────────────────────┤                │
│ |                      │ bitwise inclusive OR                    │                │
├────────────────────────┼─────────────────────────────────────────┤                │
│ &&                     │ logical AND                             │                │
├────────────────────────┼─────────────────────────────────────────┤                │
│ ||                     │ logical OR                              │                │
├────────────────────────┼─────────────────────────────────────────┼────────────────┤
│ ?:                     │ ternary conditional                     │                │
├────────────────────────┼─────────────────────────────────────────┤                │
│ =                      │ assignment                              │                │
│ +=                     │ addition assignment                     │                │
│ -=                     │ subtraction assignment                  │                │
│ *=                     │ multiplication assignment               │                │
│ /=                     │ division assignment                     │ right to left  │
│ %=                     │ modulus assignment                      │                │
│ &=                     │ bitwise AND assignment                  │                │
│ ^=                     │ bitwise exclusive OR assignment         │                │
│ |=                     │ bitwise inclusive OR assignment         │                │
│ >>=                    │ bitwise right shift assignment          │                │
│ <<=                    │ bitwise left shift assignment           │                │
├────────────────────────┼─────────────────────────────────────────┼────────────────┤
│ ,                      │ comma                                   │ left to right  │
└────────────────────────┴─────────────────────────────────────────┴────────────────┘

We will walk through this complete list of operators later.

Data Assignments in C++ Statements:

We've now bootstrapped enough background information in order to examine
data assignment and variables in C++ statements, and to compile small
programs to explore C++ data better.

C and C++ are known as compiled, typed languages.  What this means is that
the C source code itself does not directly run as a program, unlike
scripting languages like Perl, Python, Ruby, Bourne Shell scripting and
such.  The program has to be compiled, linked with the standard
binary libraries, and output into a binary file that the machine and
operating system can run.

The TYPED aspect of C++ means that all variables have to be defined as
some specific data type, one of the data types such as we discussed
earlier (such as int, char, short, long, double, float).  Functions and
Methods are data and have a type.  It's a code type.  But for the
purposes of Syntax they are typed according to the kind of data that
they return.

When we create variables in C and C++ they have to be typed.  In
addition, there are 3 phases in variable creation, which can be either in
separate statements or combined into one statement.  The three phases
are:

Declaration
Definition (needed for functions and methods)
Initialization

We declare a variable using its type and its name (its symbol), and in
initializing it, we include an assignment of data.  For example:

int i;

declares a variable i which is of type int, and most usually defines it.

float j;

declares a variable j of type float, as it is defined specifically to
your architecture and environment.  We can then assign an integer to i
and a float to j using the following syntax.

#include <iostream>
using namespace std;

int i;
float j; //Declarations of variables

int main(int argc, char * argv[])
{
        i = 10;
        j = 5.6;

        return 0;
}

<program file1.cc>

Now here is the catch.  WHEN WE DECLARE A VARIABLE in an ordinary
C++ coding file, one that has the function main within it, placing the
variable in the global namespace above or outside of the "main"
function, then we are both declaring and defining the variable.  The
variable exists globally for your program, from the point that the
variable is declared on to the end of the file, including all the other
functions and code that you bring into your program from that point
forward.  However, it is normal in real programming to use at
least three files for your program: your main C++ code file, your
library source file which is used to create reusable objects in your
code base, and finally HEADER files, the .h files that are
imported with

#ifndef MYOWNLIB
#include "myownlibrary"
#define MYOWNLIB
#endif

statements.

However, it is a programming error to ***define*** variables within your
headers.  It can confuse the linker, because then the compiler will
make space for all the defined variables in every object file that uses
the header file in your program, and all those object files may not
agree as to their values later on.  So you want to declare it, so that
everyone knows what the common type and symbol is, but you don't want to
define it, so that the linker doesn't come along and fill in all those
definitions, possibly incorrectly, and separately if not independently,
in all the object files of your complex program.  In order to allow for
the declaration without the definition of a variable in the header file,
you must use the "extern" keyword in header files to create proper
global scope of variables.

This is not true with functions or classes, which we will see later.
They have implicit "extern"s.

To beat a dead horse, since this is one of the most common areas of
programming bugs, understand this:

Where we declare a variable is important and determines where your
variable is viewable in your program.  Complex programs have so many
variables in them that it is critical to try to restrict their access
and usage to the smallest subset of need as practical.  This is
accomplished through two mechanisms, Scope and Namespace.  We've seen
namespace already with the using directive

using namespace std;

imports into our section of programming all the reserved symbols of the
standard namespace.  Scope, on the other hand, restricts variable access
to specific blocks of our programs, not just the symbols.  When we
declare our variables outside of main, as we did in the program
file1.cc, then the scope is considered to be global and the variables i
and j are available throughout our program.

Scope is a bit more complex.  We've looked at global scope already and we
will run up against it time and again.  According to Lippman there are three
kinds of Scope: local, namespace and class.   Scope in C++ is fairly
complicated, and not well described in the standard texts.  Anywhere you
have a block you can declare and define variables, and they are local to
that block.  This is particularly important in functions.  A block is a
group of statements which are surrounded by curly braces.  Like the unix
shell, a block inherits the variables in its scope from the name space
it sits in, including blocks that wrap it.  So you can have
blocks within blocks.  If you declare a variable whose name is already
in play, it creates a new temporary variable that hides the outer one
until the block ends.  Variables in
functions, while they might be declared outside the function, and
defined within a function, would be created by the compiler, but don't
become active until the function is called.  Each function call adds a
layer to the runtime stack.  The top most layer, ignoring threading and
forked processes for the moment, needs to run its course and end before
the calling function can continue.  Since everything begins with a function
called main, a runtime stack might look like this, each platter on the
stack having its own unique environment and variables scoped within,
excluding global variables which they can affect and see.

                       function2
                        |
        function1-------|
        |
main ---|

In this scenario, when function2 finishes, all of its local variables
die and no longer exist in runtime, and control returns back to function1.
function1 can then call function3, and add new plates to the runtime
stack.  We will explore this more when we look at functions and methods.

Now let's return to the representation of data in C++ and look at
programming examples.

I will now increasingly be showing example code on the NYLXS website
because of a nice HTML tool in VIM that can show code in syntax color.
See http://www.nylxs.com/docs/workshops/?C=M;O=D for the coding examples
mentioned here.

We can start by building a nice example program, which will have 4
working files, a header file, a main programming file, a library
programming file and a make file.  There are easier ways of showing
these data type examples, but I feel it is important that new C and C++
programmers become comfortable and familiar with multiple file
programming projects, especially for C++.  Also, I might mention a tip
or two with VIM and GVIM, in order to outline more general GNU tool sets
to help the C and C++ coder.  There are also, no doubt, likewise
examples in EMACS and other editors, but I'm not going to mention them,
and I really don't want an editor war to break out in the workshop.

Data Representation in Detail:

chars:  Characters are the most basic data type in C and C++ being
stored in memory in a single signed or unsigned byte.

chars can be represented in several ways, and internally are represented
as small ints, either signed or unsigned.  We can create character
constants by putting keyboard characters into single quotes ==> 'B' and
assigning them to either a char or an unsigned char variable, or
indirectly through a pointer to a char variable or some other memory
construction that stores a char or unsigned char.

Characters can be represented in programming code, not only with the

char var = 'X';

syntax, but also by Integer, Octal Code and Hexadecimal code which
represent ASCII mapped characters.  The syntax for these codes includes
the following:

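Integer constants and escape codes: (note the single quotes)

char letter = 65;          //stores an ASCII A.
char letterOct = '\102';   //octal escape, an ASCII B
char letterHex = '\x43';   //hexadecimal escape, an ASCII C
char letterOct2 = 0102;    //octal integer, the same B
char letterHex2 = 0x43;    //hexadecimal integer, the same C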

Octals have the generic form of up to 3 digits preceded by a backslash if
quoted, but the integer syntax can be used too.  An integer that starts
with a 0 is interpreted as non-decimal.  Remember that you feel more
comfortable with decimal numbers, but the machine couldn't care less.
If the 0 is followed by an x or X, then it is hexadecimal; otherwise it
is octal.  If you put it in single quotes, it needs a backslash first.

There is also a shorthand for special characters that can not be
readily typed from a US 105 key keyboard.  These are also backslashed.
I'll show a complete list in the code example, but the most important
two, by far, are '\n', the line feed, and '\t' the horizontal tab.  A
'\n' can also be written to an output stream in C++ with the endl symbol
(which stands for end of line), which also flushes the stream.  I'm not
going to discuss the broken MS end of line.

A character literal can also have an 'L' in front of it for use with
wide characters, such as in Chinese etc.  It has to be stored in
an appropriate variable of type wchar_t.

You can't use any of these literal representations on the left side of
an assignment operator.  For some, it might seem obvious that you can't
do this assignment:

'0' = 'G';

but trust me, some day you will do just this for some twisted reason.

Here is a sample program to show all the character variations:

Three Files:  First the Header File

http://www.nylxs.com/docs/workshops/cpp/data.h.html
http://www.nylxs.com/docs/workshops/cpp/data.h

#ifndef DATA_H
#define DATA_H

void show_chars();
void show_ints();
void show_floats();
void show_arrays();
void show_cstrings();
void show_strings();

#endif /* DATA_H */

The Library Source File:
http://www.nylxs.com/docs/workshops/cpp/file_1.cc.html:
http://www.nylxs.com/docs/workshops/cpp/file_1.cc

#include <iostream>

using namespace std;


void show_chars()
{
        //declare and define char types in a function
        char letter;
        unsigned char letterU;
        //assigning chars means using a single quote mark
        letter = 'R';
        letterU = 'u';
        //declare,define and assign
        char letterNull = 0;
        char letterZero = '0';
        char letterINT = 65;
        char letterAlert = '\a';
        char letterBackspace = '\b';
        char letterFormFeed = '\f';
        char letterNewLine = '\n';
        char letterCarriageReturn = '\r';
        char letterVertTab = '\v';
        char letterHorzTab = '\t';
        char letterBackslash = '\\';
        char letterQuestionMark = '\?';
        char letterSingleQuote = '\'';
        char letterDoubleQuote = '\"';
        char letterOct = '\103';
        char letterHex = '\x44';
        char letterOct2 = 0104;
        char letterHex2 = 0x45;
//deprecated and causes a segfault     char * letterptr = "\x48";
        cout << "signed char ==> " << letter << endl;
        cout << "unsigned char==> " << letterU << endl;
        cout << "NULL char ==> " << letterNull << endl;
        cout << "ZERO char ==> " << letterZero << endl;
        cout << "INT char ==> " << letterINT << endl;
        cout << "Alert char ==> " << letterAlert << endl;
        cout << "Backspace char ==> ::" << letterBackspace << "end" << endl;
        cout << "FormFeed char ==> " << letterFormFeed << "end" << endl;
        cout << "New Line char ==> " << letterNewLine << "end" << endl;
        cout << "Carriage Return char ==> " << letterCarriageReturn << "end" << endl;
        cout << "Vertical Tab char ==> " << letterVertTab << "end" << endl;
        cout << "Horizontal Tab char ==> " << letterHorzTab << letterHorzTab << "end" << endl;
        cout << "Backslash char ==> " << letterBackslash << endl;
        cout << "Question Mark char ==> " << letterQuestionMark << endl;
        cout << "Single Quotes char ==> " << letterSingleQuote << endl;
        cout << "Double Quote char ==> " << letterDoubleQuote << endl;
        cout << "Octal char ==> " << letterOct << endl;
        cout << "Hexadecimal char ==> " << letterHex << endl;
        cout << "Octal char 2 ==> " << letterOct2 << endl;
        cout << "Hexadecimal char 2 ==> " << letterHex2 << endl;
//      cout << "NOT REALLY a pointer to  char but a string ==> " << *letterptr << endl;

//can't assign a value to a string literal or const char:        *letterptr = 'd';
//      cout << "pointer to  char ==> " << *letterptr << endl;

}

void show_ints()
{
}

void show_floats()
{
}

void show_arrays()
{
}

void show_cstrings()
{
}

void show_strings()
{
}

The Main programming file:
http://www.nylxs.com/docs/workshops/cpp/file_1.cc.html
http://www.nylxs.com/docs/workshops/cpp/file_1.cc

#include <iostream>
#include "data.h"

int main(int argc, char * argv[]){
        show_chars();
        show_ints();
        show_floats();
        show_arrays();
        show_cstrings();
        show_strings();
}

and the Makefile to compile

http://www.nylxs.com/docs/workshops/cpp/makefile.html
http://www.nylxs.com/docs/workshops/cpp/makefile

data : data.o data_main.o
         g++ -o data data.o data_main.o

data.o  : file_1.cc data.h
        g++ -Wall -o data.o -c file_1.cc

data_main.o :   file_1_main.cc
        g++ -Wall -o data_main.o -c file_1_main.cc

Just to say it, there are some differences in character handling between
modern C++ and C, specific to pointers.  This syntax, which is almost
always wrong

char * ptr = "A"; //Double QUOTES THERE

needs to declare itself as a const

const char * ptr = "A";

and you should be aware that text in double quotes is not a character, but
a string with the size of 2 chars, because a null char is implied,
something we will be exploring more closely in the near future.

Furthermore, there is no direct way to assign the address of a literal
char to a pointer.

char * ptr = 'A'; // Wrong
char * ptr = "A"; // Wrong and deprecated and a string

You can do this:

char letter = 'A';

char * ptr = &letter; //we will look at this syntax when looking at
                      //pointers in full

Workshop Assignment:  Print out a complete set of ASCII chars and the
decimal numbers associated with them.  You can use a for loop

int i = 0;
for( i = 0; i < 128; i++){

//your code in here

}

and then, using unsigned chars, for fun, extend it to a complete set of
256 chars.

C++ Integer types - Data Representation

Constant and Literal Integers are represented as numbers without any
quotes.  They can be represented in decimal, octal, and hexadecimal
forms.  They can be assigned to int, short, long and signed and unsigned
variables.  They are literals, and therefore can not be left values, i.e.
values that go on the left of an assignment operator.

Decimal forms look like these examples:

int i = 255;
short s = 255;
long l = 4000;
unsigned short us = 32;
signed int si = -273;

Octal examples are similar to the integers that we saw with chars and
begin with a zero:

int o = 042;
short os = 0101;
long ol = 0076;
unsigned ou = 0101;
signed long osl = 011102;

Hexadecimal examples begin with 0x or 0X and again are without quotes:

int h = 0xFF;
short hs = 0x12;
long hl = 0xA2E44D;
unsigned hu = 0xA2;
signed long hsl = 0xAF23E4;

Integers also have the ability to be expressed as Unsigned or Long
literal values, if the need arises to do so.

unsigned regist = 121U;
long lightspeed = 123456789L;
#define MAXVAL 1234567U

While discussing integer values, the programmer also needs to be aware
of the size_t typedef that C and C++ use for the sizeof operator.
The sizeof operator returns the size of any data object or type in
bytes.  Its result is an unsigned integer of type size_t.  Because in C
and C++ we manage memory directly, the sizeof operator plays a significant
role in your programming.  An example of sizeof is:

size_t  i = sizeof(int);

One of the unusual properties of integers and chars is that both represent
real integer values.  As a result, many of the mathematical operators can be
used with them and they can be assigned to each other.  There are automated
rules for "recasting" the data types as they interact with each other.  We'll
look at these rules in detail later.  But for now one should be aware
that statements like these are common in C and C++.

char letter, letter2;
int number;
short num2;
long num3;

letter = 'c';
letter2 = 'G';
letter++; //now stores 'd'

cout << letter << " " << endl; //prints 'd' and a line feed

num2 = letter;

cout << num2 << " " << endl; //prints '100' and a line feed

cout << letter + letter2; //prints '171', the sum is an int, not a char

Floating Point

Floating Point data essentially can be represented in the two styles
already discussed, as decimal and scientific notation.  A literal with
no suffix is a double.  It can be followed by an 'F' or 'f' for single
precision (float), or by an 'L' or 'l' for a long double (there is no
'D' for double).  Here are some examples:

double avog = 6.022E23;
float pizza = 0.125;
float trip = 102.7F;
long double population = 8323456.0L;
float pop_brklyn = 8.32E6;

Casting:
C and C++ are typed languages, but they have some flexibility built into
their design in this regard.  This can be good and bad, because this
also means that the language will give you room to hang yourself if you
don't learn the explicit rules for the accommodations that C++ will make
for you.  For example, one can assign a float into an integer - but then
you are left to understand what the resulting outcome is.  And these
are very difficult bugs to catch because the code looks correct, seems
logical, and is not a syntax error, so the compiler accepts it.

For example, what does this legal code do?

#include <iostream>
#include <cmath>
using namespace std;

int main(){
        int a,b,c,z;
        float d,e,f;

        a = 1.25;
        b = 2.50;
        c = 5;

        d = 0.6125;
        e = 0.30625;
        f = 0.153125;

        z = 25U + 75;

        z = z * a;
        cout << b/2 << endl;
        z=((pow(b,2)) + e)/((pow(f,e)) * z);

        cout << "I have no clue what this results in and can't be paid enough to debug it " << z << endl;
        return 0;
}

Rules for Casting:

Implicit Rules:

When the compiler is confronted with two different data types, it tries
to work operations by casting one data type to another which is
compatible to the expression.  The programmer can also manually do such
casting.  The Implicit casting rules are as follows:

There are 4 events that trigger C and C++ to do implicit type
conversion:

A) When the operands in a mathematical or logical expression are of two
different types: example

char a = 'A';
int b = 9;
long c;
c = a + b;
if(c == b){
//do something
}

B) When the value on the right side of an assignment doesn't match
the type of the variable or lvalue on the left: example

char a = 'a';
double b;
b = a;

C) When the argument to a function or method doesn't match the parameter.
We will see this when we look at functions.

D) When the return statement doesn't match the function type.  Again, we
will see this when we examine functions.

Rules:

Generally the implicit cast rules are designed to lose the least amount of
precision possible.  If an arithmetic operation includes floating
point data then the implicit casting of data follows these rules:

float --promotion-->double-->long double

Neither operand in an arithmetic operation has a float:

int-->unsigned int-->long-->unsigned long  :  Note that these promotions
can lose their sign (lose negativity).  I also note that when attempting
to verify these promotion rules in C and in C++, they don't seem to hold.
I can not create an example program that will promote the signed data
type to an unsigned one, and then lose the sign.

According to Lippman the C++ rules for implicit promotion are as follows:

In arithmetic operations involving binary operators, data of mixed
type will convert to the widest data type present.  All arithmetic
expressions involving types less wide than an int are promoted to an int
before processing.  When data is present in the expression as a long
double, everything is converted to long double...

        otherwise, if neither is a long double and one is of type
double, then everything is converted to double...

                otherwise, if neither is a double, then if one type is a
float, everything is converted to a float...

                        Otherwise, when there are no floats involved,
then integer promotion is evaluated.  At the beginning of evaluation all
integers smaller than an int are promoted to an int.  Unsigned short ints are
promoted to ints as well unless they can't fit, then they are promoted to
unsigned ints.  Now, after the floats are done being evaluated, the
larger ints are evaluated for promotion.  If there is an unsigned long,
all are converted to unsigned long (note the loss of negative values)...

                                Otherwise, if there are no unsigned
longs, and we have a long, the others are converted to long, and an
unsigned int is converted to long if long is large enough to hold all
of its values, otherwise both are converted to an unsigned long ...

                                        Otherwise...if there is an
unsigned int, then everything is promoted to unsigned int.

Again, I will repeat that I have not been able to confirm the documented
implicit conversion in arithmetic operations in cases where sign is lost
(conversions from unsigned ints to long for example), unless the value
is being assigned to an unsigned variable.

Explicit Cast:

For a variety of reasons, one might need to cast data intentionally.
There are two styles to do this, the older C style and the newer C++
standard.    First the new style.

The kinds of New Style Casting:

static_cast, dynamic_cast, const_cast, and reinterpret_cast.  Syntax for
these casts follows the following conventions:

int int_variable = static_cast<int>(char_variable);

cast_name<type>(variable) in the general form.  I'm not going to
explain the differences at this point, but will come back to them soon
enough.  I will say that the result is to forcefully convert the data
from one type to another, in the case above, from a char to an int.

In the C style, parentheses are used to make the cast:

char letter;
int var = (int) letter;

casts the value of letter to an int.

Aggregate Data Types:
Most of the action in your program will involve more than a single
independent integer, char or float.  Groups of data together create
most of what is useful.  C and C++ give multiple tools for handling these
aggregate data types.  The key element is the C style array.  An array
is declared, defined and assigned like the elementary data types,
and looks like this, using the square bracket operator:

char mystring[]; // Declares an array of chars without dimension, only
                 // legal where the size comes from elsewhere, such as
                 // an extern declaration or a function parameter
char mystring[100]; //Declares an array of chars with 100 chars within
                    //it
char * mystring[]; //Declares an array of pointers to chars similar to
                   //the parameter of main char * argv[];

One can declare and assign your array with a single statement.  When
doing so, C and C++ have several syntax tools to help you create the many
necessary, subtle data constructions that you need for your programming.
The comments below outline these examples and behaviors.

char mystring[] = "This is our first string"; //Declares a char array of
                        //25 chars which is terminated with a null value

char mystring[] = {'a','b','c','d','e'}; //Creates an array of 5 chars.

int matrix[100] = {1,2,3,4,5};  //This creates an array of 100
                //integers filling the first 5 locations with 1,2,3,4,5
                //and then adds 0's or NULLS to the remaining 95 indexed
                //locations

int matrix[100] = {'1','2','3','4','5'};  //This creates an array of 100
                        //integers holding the integer values
                        //which represent the ascii codes for
                        //the characters '1' and '2' etc, and then fills
                        //the rest of the array with zeros.  It is
                        //similar to the next statement (but not
                        //exactly)

char matrix[100] = "12345"; // This example creates a string literal
                           //"12345" which ends in a null, and then
                           //pads the rest of the array with nulls.  The
                           //result is the same as above, but via a
                           //different mechanism because all string
                           //literals end in null.  The above example
                           //has implicit promotion from char to integer
                           //types.  This example must be a char type,
                           //otherwise the compiler will not accept
                           //the assignment.  Furthermore, only the char
                           //type will print a string when asked.  The
                           //top example needs an explicit cast.  See
                           //and try this example for a demonstration.

#include <iostream>
using namespace std;

int main(int argc, char * argv[]){
        unsigned short int matrix[100] = {'1','2','3','4','5'};
        char matrix2[100] = "12345";
        cout << "First Martix "<< matrix << endl;
        cout << "Second Matrix " << matrix2 << endl;
        for(int i=0;i<5;i++){
                cout << matrix[i] << endl;
        }
        for(int i=0;i<5;i++){
                cout << static_cast<char>(matrix[i]) << endl;
        }
        return 0;
}

ruben@www2:~/cplus> g++ -Wall test.cc -o test.bin
ruben@www2:~/cplus> ./test.bin
First Martix 0xbfc98c3c
Second Matrix 12345
49
50
51
52
53
1
2
3
4
5
ruben@www2:~/cplus>

Notice that the first matrix prints a seemingly random number.  That
number is actually the memory address where matrix starts.  The array
name acts like a pointer in the context of cout.  The for loop itself
will be looked at more closely when we discuss flow control operators.

We can not mix data types in an array.  An array is defined as a
single data type only.

Arrays are indexed starting with zero.  You have to know the size of
your arrays, otherwise you can walk past the end of them into the
undefined sections of your memory.  Usually this will cause a
segmentation fault, but not always.  Arrays have syntax that allows them
to be converted to pointers.   Pointers are the next section, after we
look at arrays, and we will look closely at pointers and arrays soon.

Arrays can have two dimensions like this:

float matrix[4][7];

That declares an array of 4 rows of 7 columns (row before column).

For example, we can initialize such an array like this:

float matrix[4][7] = {
        { 2.11, 2.22, 2.33, 2.44, 2.55, 2.66, 2.77 },
        { 3.11, 3.33, 3.33, 3.44, 3.55, 3.66, 3.77 },
        { 4.11, 4.44, 4.33, 4.44, 4.55, 4.66, 4.77 },
        { 5.11, 5.55, 5.33, 5.44, 5.55, 5.66, 5.77 }
};

or you can drop the inside curly braces and the compiler will do the
rest.

float matrix[4][7] = {
         2.11, 2.22, 2.33, 2.44, 2.55, 2.66, 2.77,
         3.11, 3.33, 3.33, 3.44, 3.55, 3.66, 3.77,
         4.11, 4.44, 4.33, 4.44, 4.55, 4.66, 4.77,
         5.11, 5.55, 5.33, 5.44, 5.55, 5.66, 5.77
};

Although we stupid humans conceptualize this as columns and rows, in RAM
this is stored as a single linear block of memory.

There are a lot of minefields with two dimensional arrays, and this
program shows some of them:

#include <iostream>
using namespace std;

int main(int argc, char * argv[]){
        unsigned short int matrix[100] = {'1','2','3','4','5'};
        char matrix2[1000] = "12345";
        float dmatrix[4][7] = {
        { 2.11, 2.22, 2.33, 2.44, 2.55, 2.66, 2.77 },
        { 3.11, 3.33, 3.33, 3.44, 3.55, 3.66, 3.77 },
        { 4.11, 4.44, 4.33, 4.44, 4.55, 4.66, 4.77 },
        { 5.11, 5.55, 5.33, 5.44, 5.55, 5.66, 5.77 }
        };

        float * track;

        cout << "First Martix "<< matrix << endl;
        cout << "Second Matrix " << matrix2 << endl;
        for(int i=0;i<5;i++){
                cout << matrix[i] << endl;
        }
        for(int i=0;i<5;i++){
                cout << static_cast<char>(matrix[i]) << endl;
        }
        for(int i=0;i<100;i++){
                cout << &matrix[i] << endl;
        }
        for(int i=0;i<5;i++){
                cout << "STRING " << reinterpret_cast<int *>(&matrix2[i]) << endl;
        }
        track = *dmatrix;
        float * last = &dmatrix[3][6];

        for(int count = 0; track <= last; track++){
                cout << "Position ==>" << count++ << "\tMemory Location==>"<<track << "\tValue==>" << *track <<endl;
        }
        return 0;
}

The difficulty in this example code is that the symbol for a
multidimensional array is often thought to be equal to a pointer.
It isn't.  Array names are implicitly converted to pointers
as needed by the compiler.  But with a multi-dimensional array, the
symbol converts to a ***pointer to an array of some type***, which is
not the same as a pointer to a pointer, or even a pointer
to the first float of the two dimensional array.  So examining the
statement:

track = *dmatrix;

We are attempting to acquire the first address of the entire
multi-dimensional array.  The safest and most obvious way of doing this
is to just index the first position and request the address.  So a
functionally equivalent statement is as follows:

track = &dmatrix[0][0];

There is a difference, however, between the two statements.  In the
first statement, *dmatrix is processed in the following
order:

1) First dmatrix is evaluated, which is a two dimensional array.

2) The compiler implicitly converts the array symbol dmatrix to a
pointer to its first element.  That first element is itself an array of
floats (the first of the 4 inner arrays of 7 floats), so the result is a
pointer to an array of floats, not a pointer to a pointer to float.
Though similar, they are not the same.

3) Then this pointer to an array of floats is dereferenced by the
(*) operator, which yields the first inner array of floats.

4) The compiler then again implicitly converts that
array to a pointer to its first float element, which is the address
assigned to track.

A third way of gaining this function is as follows:

track = dmatrix[0];

If you're not convinced that the array symbol and the pointer symbol are
not the same, use the sizeof operator to return the size of each, and
you will see the compiler clearly knows when it is dealing with an array
rather than a pointer to a data type.

#include <iostream>
using namespace std;

int main(int argc, char * argv[]){
        unsigned short int matrix[100] = {'1','2','3','4','5'};
        char matrix2[1000] = "12345";
        float dmatrix[4][7] = {
        { 2.11, 2.22, 2.33, 2.44, 2.55, 2.66, 2.77 },
        { 3.11, 3.33, 3.33, 3.44, 3.55, 3.66, 3.77 },
        { 4.11, 4.44, 4.33, 4.44, 4.55, 4.66, 4.77 },
        { 5.11, 5.55, 5.33, 5.44, 5.55, 5.66, 5.77 }
        };
        float * track;

        cout << "First Martix "<< matrix << endl;
        cout << "Second Matrix " << matrix2 << endl;
        for(int i=0;i<5;i++){
                cout << matrix[i] << endl;
        }
        for(int i=0;i<5;i++){
                cout << static_cast<char>(matrix[i]) << endl;
        }
        for(int i=0;i<100;i++){
                cout << &matrix[i] << endl;
        }
        for(int i=0;i<5;i++){
                cout << "STRING " << reinterpret_cast<int *>(&matrix2[i]) << endl;
        }

        track = *dmatrix;
        float * last = &dmatrix[3][6];

        for(int count = 0; track <= last; track++){
                cout << "Position with track ==>" << count++ << "\tMemory Location==>"<<track << "\tValue==>" << *track <<endl;
        }

        float * track2 = dmatrix[0];
        float * last2 = &dmatrix[3][6];

        for(int count = 0; track2 <= last2; track2++){
                cout << "Position with track2 ==>" << count++ << "\tMemory Location==>"<<track2 << "\tValue==>" << *track2 <<endl;
        }

        int size_dmatrix = sizeof(dmatrix);
        int size_dmatrix_row = sizeof(dmatrix[0]);
        int size_track = sizeof(track);
        int size_track2 = sizeof(track2);
        cout << endl << endl << "Size of Data Types:\ndmatrix ==>" << size_dmatrix << endl;
        cout << "Size of Track ==>" << size_track << endl;
        cout << "Size of Track2==>" << size_track2 << endl;
        cout << "Size of dmatix[0] ==>" << size_dmatrix_row << endl;

        return 0;
}

A particularly special array is the character array, which in C forms the
basis for strings.  We already know that a single char is a C and C++
built in data type and we can have an array of chars, and lastly that we
can have string literals, which are constant.  For review, let's look at
code examples of each:

Example A:
char car = 'A'; //a single character assigned to a char variable.  Note
                //the single quote
Example B:
char cararray[] = {'A', 'B', 'C', 'D'}; //The definition and assignment of
                                        //an array of 4 chars,
                                        //which got its size from the
                                        //initialization of the array
                                        //and of which each
                                        //element can be accessed
                                        //through indexing ie:
                                        //char b = cararray[3] or
                                        //pointers such as
                                        //char b = *(cararray + 3)

Example C:
const char *stringo = "ABCD";    //This is the assignment of a string
                                 //constant literal to a pointer to a
                                 //char constant.  This is a real string
                                 //that differs from the above example
                                 //because it creates an array of chars,
                                 //not 4 chars long but 5 chars long
                                 //because it appends a NULL character
                                 //to the end

Example D:
char test[] = "My Dog Has Fleas\n";//similar to above with 18 chars
                                    //assigned to the array ending with
                                    //a NULL char but not a constant
                                    //literal

There are some important but subtle differences between "true" string
literals and strings formed by manually creating arrays of chars as
shown in the technique of Example B and Example D.   We can see an
example of this difference in the following code.

Make a new directory and in GVIM or the editor of your choice create the
following files:

test.cc

------------------------------------------
#include <iostream>
#include "test.h"
using namespace std;

int main(int argc, char * argv[]){
        stringexample();
        return 0;
}

----------------------------------------------

test.h

----------------------------------------------

#ifndef TEST_H
#define TEST_H

void stringexample();

#endif /* TEST_H */

--------------------------------------------------

string_ex.cc

--------------------------------------------------

#include <iostream>
#include "test.h"
using namespace std;

void stringexample(){

        char test[] = "My Dog has Fleas\n";
        const char * test2 = "My Dog has Fleas\n";
        cout << test;
        test[3] = 'C';
        test[4] = 'a';
        test[5] = 't';
        cout << test;
        cout << test2;
        const_cast<char>(test2[3]) = 'C';
        const_cast<char>(test2[4]) = 'a';
        const_cast<char>(test2[5]) = 't';
        cout << test2;

}

--------------------------------------------------

and create the following makefile

----------------------------------------------------

test.bin : test.o string_ex.o
        g++ -Wall -o test.bin test.o string_ex.o

test.o : test.cc
        g++ -Wall -c test.cc

string_ex.o : string_ex.cc test.h
        g++ -Wall -c string_ex.cc

----------------------------------------------------

Note that the makefile MUST have those TABS and not spaces

Then run 'make'

gcc gives you the following output

g++ -Wall -c string_ex.cc
string_ex.cc: In function ‘void stringexample()’:
string_ex.cc:15: error: assignment of read-only location ‘*(test2 + 3u)’
string_ex.cc:15: error: invalid use of const_cast with type ‘char’, which is not a pointer, reference, nor a pointer-to-data-member type
string_ex.cc:16: error: assignment of read-only location ‘*(test2 + 4u)’
string_ex.cc:16: error: invalid use of const_cast with type ‘char’, which is not a pointer, reference, nor a pointer-to-data-member type
string_ex.cc:17: error: assignment of read-only location ‘*(test2 + 5u)’
string_ex.cc:17: error: invalid use of const_cast with type ‘char’, which is not a pointer, reference, nor a pointer-to-data-member type
make: *** [string_ex.o] Error 1

This is a very useful error message and the GCC compiler is now taking
the programmer to school.  Let's look at the complaints of the compiler
about our code.  The first complaint gcc makes is about line 15 in
string_ex.cc which is this line:

const_cast<char>(test2[3]) = 'C';

The compiler is telling us that the array (or string) that test2 points
to is read only.  That variable is defined on line 8:

const char * test2 = "My Dog has Fleas\n";

It is obvious from the code that the data is defined as a "const"; less
obvious is that the compiler will complain if you
do NOT make test2 a "const".  Because of the assignment of the string
literal to the char pointer, it must be a const.  Therefore, we tried to
cast the const away with const_cast<char>, and that fails as well
because, as the compiler says to us:

invalid use of const_cast with type ‘char’, which is not a pointer,
reference, nor a pointer-to-data-member type.

const_cast only works on pointers and references, not on a plain char,
so we can not just cast away the constness of the string literal
assigned to test2 this way.

So again we see that arrays and pointers have differences, and arrays
and strings have even greater differences.  The standard iostream object
"cout" will recognize both as strings for printing to standard output.

Incidentally, we can compile this substitution for string_ex.cc

--------------------------------------------------
#include <iostream>
#include "test.h"
using namespace std;
void stringexample(){

        char test[] = "My Dog has Fleas\n";
        const char * test2 = "My Dog has Fleas\n";
        char * test3;
        cout << test;
        test[3] = 'C';
        test[4] = 'a';
        test[5] = 't';
        cout << test;
        cout << test2;
        test3 = const_cast<char *>(test2);
        test3[3] = 'C';
        test3[4] = 'a';
        test3[5] = 't';
        cout << test2;
}
---------------------------------------------

but it creates a segmentation fault on the line:

        test3[3] = 'C';

because writing into a string literal is undefined behavior, and here
the literal sits in read-only memory.  If you add the compiler option
-ggdb to your g++ commands in your makefile, you can trace this error.
This UNDERSCORES how dangerous explicit casting can be, even when
allowed.

The next aggregate data types that C++ provides are structs and unions.
Structs and unions have largely been superseded in C++ by classes.  They
are inherited from C and are designed as the key multi-type aggregate
data type that C uses to create related groups of data, similar to what
a database can provide.  Through coding algorithms, their importance
has grown far greater than just mere static records, but in this
section we will look only at their basic syntax and usage.

The real limitation of arrays is that data must be all of the same type.
structs create a new type which contains many types within.  The
basic format of a struct is as follows:

Declaration:

struct struct_type_name {
        data type;
        data type;
        ....
}[optional object name];

Notice the semicolon on the end of the struct declaration, which is
unusual in that it comes after the closing curly brace.

structs create a new user defined data type, and after their declaration
can be used as any other data type:

Example:

struct birds{
        char species[30];
        char gender[1];
        char color[10];
        int size_in_inches;
        double weight;
        char diet[30];
        };

birds african_grey;
birds canary;

or you can make the instances with the declaration:

struct birds{
        char species[30];
        char gender[1];
        char color[10];
        int size_in_inches;
        double weight;
        char diet[30];
        } canary, african_grey;

Now it might be noted here an important difference between C and C++.  In
C, you need the struct keyword to create instances of your structs.

C EXAMPLE:

struct birds african_grey;
struct birds canary;

Otherwise, you need to use typedef in C

typedef struct birds parrot;
parrot african_grey, scarlet_macaw, conour, monk, bundie;

or even use typedef in the declaration:

typedef struct {
        char species[30];
        char gender[1];
        char color[10];
        int size_in_inches;
        double weight;
        char diet[30];
        } birds;

birds african_grey;
birds canary;

This is a variation of typedef, which we haven't covered.  In general
typedef creates an alias for a datatype of any kind:

typedef data_type alias;

typedef int BOOL;
typedef char buffer[300];

thus the above example:
typedef struct {
        char species[30];
        char gender[1];
        char color[10];
        int size_in_inches;
        double weight;
        char diet[30];
        } birds;

is not the same as
struct birds {
             char species[30];
             char gender[1];
             char color[10];
             int size_in_inches;
             double weight;
             char diet[30];
        } canary, african_grey;

In C, one creates a new data type name, "birds", and the other creates
new instances of the struct, "canary" and "african_grey".

structs can be initialized in C++ as follows, with the use of the assignment
operator and curly braces:

birds african_grey = { "African Grey Congo", 'M', "Grey", 13, 8.34, "Fruit and Seed" };
birds canary = { "Red Factor Canary", 'M', "Orange", 2, 1.2, "Seed and Grass" };

Each string must fit in its member with room for the null char, so
color, a char[10], can hold at most 9 characters.  In C remember to add
the struct keyword.

The internal data members are reached by use of the dot operator.

char * color = canary.color;
cout << canary.color << endl;

strcpy(canary.color, "Deep Red"); //can not assign to a char[], so copy
                                  //into it (needs #include <cstring>)

You can make a pointer to a struct, and this is often useful

birds * canary;

but remember that you have no memory allocated for members yet.

birds *finches, parrot={"Conure", 'F', "Green", 6, 60, "Oranges and Peanuts"};

finches = &parrot;

When you use a pointer, access to members is gained using the infix
operator "->"

Example:

struct birds{
         char species[30];
         char gender[1];
         char color[10];
         int size_in_inches;
         double weight;
         char diet[30];
         } *finches, parrots = {"Conure",'F',"Yellow",6,60,"Peanuts and Oranges"};

finches = &parrots;

cout << "Your " << finches->species << " is " << finches->color << endl;

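By creating arrays of structs, large databases of records can be stored
in your program

birds finches[100];

strcpy(finches[0].species, "Zebra Finch");

birds *parrots[100];    //100 pointers, none pointing at a struct yet

birds amazon;
parrots[0] = &amazon;   //point it at real memory before using it
strcpy(parrots[0]->species, "Amazon");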