List: New Yorkers Linux Scene
Admin: To unsubscribe send unsubscribename-at-domian.com to hangout-request-at-www2.mrbrklyn.com
This needs editing
HTML File Attached
and seeable at
http://www.nylxs.com/journal/c_for_beginners.html
--
__________________________
Brooklyn Linux Solutions
__________________________
DRM is THEFT - We are the STAKEHOLDERS http://www.nyfairuse.org
http://www.mrbrklyn.com - Consulting
http://www.nylxs.com/radio - Free Software Radio Show and Archives
http://www.brooklynonline.com - For the love of Brooklyn
http://www.nylxs.com - Leadership Development in Free Software
http://www.nyfairuse.org - The foundation of Democracy
http://www2.mrbrklyn.com/resources - Unpublished Archive or stories and articles from around the net
http://www2.mrbrklyn.com/mp3/dr.mp3 - Imagine my surprise when I saw you...
http://www2.mrbrklyn.com/downtown.html - See the New Downtown Brooklyn....
1-718-382-5752
Introduction to C on GNU/Linux
When working with GNU-based free operating systems, most implementations are
based on the original work of earlier Unix systems. Unix was originally
developed at AT&T. One of the key developments which made Unix an
early success was the co-development of the C programming language.
This article hopes to introduce C to beginning users of GNU/Linux and BSD.
When C was developed, part of its goal was to make a highly portable syntax
which still gave low-level access to the memory and the CPU. The result was
a three-tier development system. All C programs are compiled from source text
into a binary program. It is the binary which runs on your computer.
The program which creates the binary is called a compiler. The compiler
on GNU systems is Richard Stallman's gcc. The compiler parses a text file,
or a series of text files, processes all the instructions, and builds a binary
program from the instructions. It does this by combining binary code. Some
of the code it produces is imported from external libraries. Some of it is
new binary code. Patched together, this most often produces a single binary
application.
The three-tier system of C includes libraries, source code and header files.
Header files give the compiler the declarations of code which is defined
elsewhere. Sometimes you also need to tell the compiler where to find the
libraries which are used by the source code or the headers. While this can seem
confusing, as you become familiar with C, it will become more natural. Let's
look at a simple example to see how these three tiers interact with each other.
We start by opening a simple text file called prog1.c with the vi editor:
#include <stdio.h>
int main(int argc, char **argv){
    printf("Welcome to NYLXS\n");
    return 0;   /* 0 tells the shell that the program succeeded */
}
Save the file, exit the editor, and run the compiler with the following command:
ruben$: gcc prog1.c
This command starts the compiler and creates a new file called a.out.
a.out is the executable program. Run it from the command line:
ruben$: ./a.out
Welcome to NYLXS
ruben$:
We can now examine all three components of our C program.
The first line in our program tells the compiler to look for a file called
stdio.h and to bring it into our program. stdio.h is the header file for the
standard C input and output library. It declares many functions in C, including
the printf function. Without this file our compiler cannot find printf. After
this line we drop into our own code. In our case we begin with the
definition of the main function. All C programs have a function called 'main'.
Main has a defined prototype
int main (int argc, char* argv[]);
This should never change. Main is the launcher of all activity within
your C program. Lastly, our compiler accessed libraries on the system in
order to build your binary. Despite the fact that our command to gcc did
not explicitly name any libraries, our C program was built from them
anyway. Sometimes the compiler needs libraries it cannot find on its own.
Under these conditions our gcc command needs an option to tell it which
library to link. For example, if we need to use an advanced math function, we
need to tell gcc to link with the math library (libm) like this:
gcc program1.c -lm
Let's examine the nature of C more closely by looking at a slightly
more complex program:
#include <stdio.h>
#include <string.h>
char name[255] = {'\0'};
int main(int argc, char **argv){
    printf("Welcome to NYLXS\n");
    printf("Enter your name-->\n");
    fgets(name, sizeof(name), stdin);
    while(strcmp("\n", name) != 0){
        /* sizeof yields a size_t, which printf prints with %zu */
        printf("value ->%s size->%zu\n", name, sizeof(name));
        fgets(name, sizeof(name), stdin);
    }
    return 0;   /* 0 tells the shell that the program succeeded */
}
This program includes two external header files to declare library functions.
The first one we saw before, stdio.h. The second include file, string.h,
declares the standard C library functions for strings. The function strcmp is
used to test each string we receive from standard input.
Before we declare main(), we define and initialize a symbol called 'name'. C
is a strongly typed language. Every variable in C needs to be declared in advance
as one which stores a particular kind of data. If we try to assign to the
variable data which is different from its declared type, the gcc compiler
will complain and may refuse to create a binary file.
In this case, the symbol 'name' is marked as a variable of type char. The words int,
char, double and float are examples of keywords in C which define data types. In our editor
they are marked in green. char name means that this variable is marked as a character
type variable. It stores only characters. In the example of 'name' the declaration also declares
this variable as an array. An array is a group of data accessible through an index.
Let's look at this line more closely
char name[255] = {'\0'};
- name is declared as a char data type through the keyword char
- name is declared as an array because of the square brackets to the right of the symbol.
- name is declared as an array with 255 chars because of the number in the square brackets in the
declaration. Different data types are stored in different sized memory locations. A char is by
definition 1 byte, which is 8 bits on practically every system. By declaring name to be an array of
255 characters in length, we essentially tell the computer to please allocate a space in memory with
255 bytes. We will look at this closer in a minute.
- When we declare the array, we can fill it with data. This is done through the curly braces {}
- The array is initially filled with the 'zero' byte: 00000000. We do this by initializing
the array with the character constant '\0', the null character.
- Character constants are defined using single quotes. The \0 is a special character which means
00000000
- When we initialize the array with fewer entries than the array has elements, C fills the rest
of the array with null characters.
- It is not necessary to initialize an array in a declaration. It is usually necessary to
define the size of the array when you declare it, with a few exceptions as will be noted.
- One such exception to the above rule would be if we initialize the array and declare it together
like this:
int numbers[] = {1,2,3,4,5,6,7,8};
In this case, the array is declared with 8 elements, even without the number in the square brackets.
The next line is where we define our main function. As we said before, all C programs require a
main function. Main is the jumping off point for all C programs. However, in most regards, main
looks like any other function in C. Let's look more closely at the main declaration:
int main(int argc, char **argv){
The int in green before the symbol 'main' tells C that main returns an integer. In fact, this
integer is returned to the shell when you run a program on the command line. You can check its value
after your program is finished by entering: echo $? on the command line of a bash shell.
All functions are defined with a symbol(). The parentheses tell C this symbol is a function, just as
the square brackets tell C a symbol is an array. Within the parentheses we put parameters which are
expected to be passed to our function. Unlike other languages, such as Perl, the parameters defined in
our function must be supplied whenever the function is called. In the case of main, the function is
called by the operating system or shell, and our two arguments (argc and argv) are automatically
filled in by the operating system or shell when the program is run.
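argc represents the number of arguments with which the program is called. The count includes the
name of the program itself, which is stored in argv[0]. argv holds the arguments themselves, each
represented as an array of chars. Hence, argc is declared as an int data type and argv as a
char ** data type: a pointer to a list of pointers to chars.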
Inside of main, our program begins to work. Our program now processes these lines from top to bottom
in order. The first line prints the greeting, "Welcome to NYLXS", and adds a line feed. The \n is a
special character combination, in some ways like \0, which means add a line feed and start at the new
line. We will look at the printf function in more detail later. The next line prints to standard out
a prompt for user input: "Enter your name-->". The next line retrieves information from the
Standard Input Device, most often a keyboard, and stores that information into the array
of characters which we previously asked to be allocated with the symbol 'name'. Our array has room for
255 chars; because fgets keeps one of them for a terminating '\0', we can store up to 254 characters
of input.
Let's look at the fgets function. Like most C functions, fgets is documented in the manual pages of
your GNU/Linux system. Let's look at the manual page:
ruben$: man fgets
GETS(3) Linux Programmer's Manual GETS(3)
NAME
fgetc, fgets, getc, getchar, gets, ungetc - input of characters
and strings
SYNOPSIS
#include <stdio.h>
int fgetc(FILE *stream);
char *fgets(char *s, int size, FILE *stream);
int getc(FILE *stream);
int getchar(void);
char *gets(char *s);
int ungetc(int c, FILE *stream);
DESCRIPTION
fgetc() reads the next character from stream and returns
it as an unsigned char cast to an int, or EOF on end of
file or error.
getc() is equivalent to fgetc() except that it may be
implemented as a macro which evaluates stream more than
once.
getchar() is equivalent to getc(stdin).
gets() reads a line from stdin into the buffer pointed to
by s until either a terminating newline or EOF, which it
replaces with '\0'. No check for buffer overrun is performed
(see BUGS below).
fgets() reads in at most one less than size characters
from stream and stores them into the buffer pointed to by
s. Reading stops after an EOF or a newline. If a newline
is read, it is stored into the buffer. A '\0' is stored
after the last character in the buffer.
ungetc() pushes c back to stream, cast to unsigned char,
where it is available for subsequent read operations.
Pushed-back characters will be returned in reverse
order; only one pushback is guaranteed.
Calls to the functions described here can be mixed with
each other and with calls to other input functions from
the stdio library for the same input stream.
For non-locking counterparts, see unlocked_stdio(3).
RETURN VALUE
fgetc(), getc() and getchar() return the character read as
an unsigned char cast to an int or EOF on end of file or
error.
gets() and fgets() return s on success, and NULL on error
or when end of file occurs while no characters have been
read.
ungetc() returns c on success, or EOF on error.
CONFORMING TO
ANSI C, POSIX.1
BUGS
Never use gets(). Because it is impossible to tell without
knowing the data in advance how many characters gets()
will read, and because gets() will continue to store characters
past the end of the buffer, it is extremely dangerous
to use. It has been used to break computer security.
Use fgets() instead.
It is not advisable to mix calls to input functions from
the stdio library with low-level calls to read() for the
file descriptor associated with the input stream; the
results will be undefined and very probably not what you
want.
SEE ALSO
read(2), write(2), ferror(3), fopen(3), fread(3),
fseek(3), puts(3), scanf(3), unlocked_stdio(3)
The man page tells us several important things about this function and
its use in C. All functions (in all programming languages) represent
a process. A process has 3 components: input, output and side effect.
Diagram of a process
The inputs of functions are the parameters. The output is the return value, which
for main is an int. The side effects are all the work the function does which is not
its return value.
From the man page, we can see that fgets is one of a group of C functions which include
gets, getc and others. In addition, the man page tells us that fgets is in the stdio library.
It tells us to put #include <stdio.h> into our code to gain access to the function.
It defines the function for us as follows:
char *fgets(char *s, int size, FILE *stream);
fgets takes 3 parameters for input: a pointer to character data, an integer, and a file stream.
Let's look at all three definitions:
- char *s: A pointer to character data: A pointer is a symbol which has a
memory address as its value. In this case, the memory address has to be an allocated area in memory
which is typed as a char set of data. In our example, we have a char array called 'name'. With
arrays, C will convert the symbol of an array to a pointer to the address where the array is located.
C does this for us automatically. This is a specific property of arrays and cannot be depended upon
to happen with other kinds of data constructions unless specified in the C programming specification.
- int size: An integer which gives the size of the buffer. In our program we compute it with sizeof,
which yields a value of type size_t. size_t is a special data type in C which is used to store and
describe the size of data constructions in our programs; here it is converted to an int.
- FILE *stream: File streams are pointers to devices and/or other programming constructions which
provide a stream of data into and out of our program. All programs in Unix inherit three streams:
  - Standard In (stdin) - usually the keyboard
  - Standard Out (stdout) - most normally the screen
  - Standard Error (stderr) - another output, most normally to the screen, but it is used only for
    error messages and the like
Because C has strict data typing, a function definition is very clear and specific about the use of
a function. Other information which is described in the man page mostly concerns the side effect of
the function. In the case of fgets we are told it reads in at most one less than size characters
from stream and stores them into the buffer pointed to by s. In addition, we are told it
stops reading when it receives an End of File marker (EOF) or a new line (line feed) character. We are
told the line feed character is added to the buffer, and then fgets adds an additional character '\0'.
Let's now see how we used it in our program:
fgets(name, sizeof(name), stdin);
We call fgets with the parameter 'name', which is the symbol which defines our array of chars. It
is automatically converted for us to a pointer to a char data construction, our array of chars. The
second argument is sizeof(name). sizeof is an operator in C (it looks similar to a function) which
returns the size of a data construction. In this case, that data construction is name, which is of
size 255 (which means it has 255 bytes). The third parameter is stdin. stdin is the default symbol for
our Standard Input File Stream pointer. We inherit it from the environment.
Finally, you might notice that we disregard the return value of fgets. Since the function stores the
input into 'name', we can do this. However it is often prudent to test the return value of a function
to assure that it worked properly. If fgets returns NULL, it means that our program either
reached the end of its input before reading any characters or encountered an error.
It is critical that fgets cannot try to put more characters into our array than is allocated for
it in memory. If it did, we would create a security problem, because the extra characters would
overwrite memory which does not belong to the array. This is bad. Therefore, we limit the input
ability of fgets by the size of our array. This is good and proper programming practice which you
must adopt.
The next section of our program introduces looping and flow control. Much of our time programming
involves working on conditional actions (do this if you hear a click) or loops (do this over and over
until the user says uncle). The while keyword in C creates a conditional loop. The
expression inside the parentheses is tested. If it evaluates to any non-zero value, the program
enters the loop; a zero (or a null character) ends it. The actions within the loop are inside the
curly braces. When the last action within the braces is evaluated, the program returns to the top and
tests the expression in the parentheses again.
Inside the parentheses of our while loop, we call a function called strcmp (do a man strcmp now).
strcmp looks at two strings and compares them. It then returns a positive number, a negative number or
a 0 (zero) depending upon whether the first string is greater than, less than or equal to the second.
Characters in a string are represented by integer numbers which are one byte in size. Since there are
eight bits in a byte, at most, you can represent 256 characters in a char. There is a standard integer
which represents each key on the keyboard. This standard association of characters to byte integers
is called the ASCII standard table. In this table, the letter A is 65 and Z is 90. All the rest of
the capital letters fall in between in order. The letter 'a' is 97, and 'z' is 122. Again, all the
lower case letters fall in between in order. In this manner, strcmp can compare the strings by their
ASCII representation. It is important to note at this point that there is a very tight relationship
between small integers (integers stored in a single byte) and characters in C. It should be also noted
that strcmp reads the arrays of chars until it reaches a '\0' (nul) character. Anything stored after
the nul is ignored.
Our program checks if our input buffer (name) is equal to "\n". "\n" is a string constant. All string
constants add a '\0' to the end of their allocated array. So the comparison is actually to the two
characters '\n' and '\0'.
Since fgets adds the null to the end of the string, everyone is happy with this comparison.
Our program now repeats all the steps in the curly braces until the user presses Enter
on an empty line, which puts a lone "\n" into the buffer.
This sample program and the explanation is a good introduction to C for a beginner. But there is far
more to learn, even for a beginner, which I hope to explore in the coming months of the NYLXS journal.
In the meantime, I challenge you to try a few things with this program.
First, change the size integer in fgets to 5 and try to enter 10 characters into your keyboard. What
happens with your program?
Second, try rewriting this program so that you fill the char array with 255 characters and NO NULL value
at the end. How does this affect the strcmp function?
(hint: try adding this code into your program and comment out the fgets:
char *i;   /* declare this at the top of main */
for(i = name; i < (name + sizeof(name)); i++){
    *i = getchar();
    printf("Char entered->%c\n", *i);
}
)
Third, try changing the size of the name array to 5 and enter 10 characters on the prompt.
What happens?
____________________________
New Yorker Free Software Users Scene
Fair Use -
because it's either fair use or useless....