Note: Although relatively complete, it is still a work in progress. Check back every now and then for updates. Last update 07-22-2005 (minor fixes on 4-6-2007).
This crash course is entirely self contained in one HTML file. Save this file to disk for offline viewing. If you haven't visited this page online in a while, hit the refresh button on your browser to ensure you're viewing the latest version.
Please, please, please! If there is anything in here that you're having trouble understanding, or if you have any suggestions, please email me at dbmoyes@gmail.com so that I can make this crash course better. Please include your age/grade level with your comments so that I have a better idea of how I need to change things in this crash course, and others that I may write later. This information will only be stored as an email, and possibly printed, and will not be transferred to anyone else for any reason. Also be sure to include the title of this document in the email subject line so I won't think your email is spam.
You can find this document at http://www.stellimare.com/~aragorn/scouting/mb/computers/c-cc.html, or you can go to my Boy Scout Merit Badge Web page for information on this, and other merit badges.
Introduction
I wrote this document to help kids learn the C programming language when they may not have a book readily available and may not have the money to buy one. Back in the early 1980s, there were many books for kids who wanted to learn a programming language; computer magazines even had special sections just for them. Today, there are few books written for kids about a mainstream language like C, and few programming books for kids in general. This was also written to help Boy Scouts working toward the "Computers" merit badge complete some of the requirements for the badge. Ideally, any 6th grader with a knowledge of computers should be able to start making working programs in C after reading and using this crash course.
Because this is a crash course, I have left many things out that you would find in a book on C. The good news is, with this crash course, you don't have to buy one of those Learn C in No-Time-Flat or C for not very smart people books first. Instead, you can buy a good reference book on C, like Herbert Schildt's C: The Complete Reference, and use it, along with other books, to fill in many of the gaps I've left here.
The C compiler used in this document is the GNU C/C++ Compiler (gcc.gnu.org), which ships with nearly every Linux distribution, such as Fedora (fedora.redhat.com) and Slackware (www.slackware.com), is part of the Mac OS X Xcode development tools, and is used by MinGW (www.mingw.org), a C/C++ development package for Microsoft Windows. (You can also use DJGPP for Windows or DOS, but it's harder to set up and has some trouble running on Windows NT/2000/XP due to bugs in the Windows operating system.)
For those installing MinGW and MSYS, you need to first install MinGW, then MSYS. Use all the default values when installing MinGW. When installing MSYS, answer 'y' to all yes/no questions. When it asks you to enter the directory name, type:
c:/mingw/
You need to use the forward slash, not the backslash that DOS and Windows normally use for directory names.
You do not need the GNU C compiler to use this document, but you will need an ANSI C compliant compiler; the Borland C/C++ compilers are a good choice.
In addition to an ANSI/ISO compliant C/C++ compiler, you will want a good ASCII text editor. For Windows users, Notepad will work, but there are better ones out there. Unix users should consider using some VI clone, such as Vim or Elvis, which come on nearly all Unix systems. For everyone else, here is a list of editors you can download and use:
Both of these editors have context highlighting, which means they highlight some parts of your text in different colors to make things such as comments, strings, and other statements stand out while editing a program or HTML document. It's a very handy feature that helps you catch syntax errors as you type, and makes your code even easier to read. If you're just starting out, the best editor for you from this list is jEdit as it is the simplest to use.
If you have any comments about this document please send them to dbmoyes@gmail.com. Be sure that only the title of this document appears in the subject line, otherwise I may disregard your message as spam.
Now let's get started!
Language Overview
Don't get discouraged if this section is confusing to you; it'll all make sense in later sections as we begin writing programs. Think of this more as a dictionary description of a word; in the later sections, we'll start showing you how to use it.
A C program consists of a few basic elements: keywords, operators, code blocks, functions, and compiler directives. The C language also comes with something called The Standard C Library, which provides a standard way of talking to the operating system and other hardware in a computer, and also performs many useful tasks.
Keywords are words reserved by the language, and each one has a specific job, such as telling the compiler to reserve some space for a variable, performing some evaluation, or looping. The following is a list of C keywords:
| auto | do | goto | signed | unsigned |
| break | double | if | sizeof | void |
| case | else | int | static | volatile |
| char | enum | long | struct | while |
| const | extern | register | switch | |
| continue | float | return | typedef | |
| default | for | short | union | |
We'll be talking more about some of these later, but covering all of them is beyond the scope of this crash course.
Code blocks contain C statements, and can be created anywhere within a function. Variables can only be defined at the top of a code block, before any other C statements are used, or outside of all code blocks to define a global variable. An open brace '{' marks the beginning of a code block, and a closed brace '}' marks the end of a block.
{ /*marks the beginning of a code block*/
int var; /*define some variable*/
/*Look at me I'm a code block!*/
} /*marks the end of a code block*/
Compiler directives begin with a pound sign '#' and exist only on a line by themselves. They tell the compiler to parse the code differently, or to do something else different during compiling. For instance, #include tells the compiler to parse another file as if it were part of the current file. In effect, it includes, or inserts, a file at that spot. There are also many others.
Functions are an important part of C. In fact, you couldn't do much with C at all without them. One function, main(), is a very special function that you define: it is the first function that is called when your program is run, so it is basically the starting point of your program. It is also a function you never need to write a function prototype for, since your own code normally never calls it.
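/*A definition for the main function*/
int main(){
        return 0; /*tell the operating system all went well*/
}
/* a prototype for a user-defined function */
int func(int a);
/*a user-defined function*/
int func(int a){
        return a; /*hand a value back to the caller*/
}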
Whenever you start a code block, you should always type the closing brace on the next line, then go back up and fill in the rest. This way, you're sure not to forget a closing brace. Forgetting to close a code block is a very tough syntax error to find and fix, especially in a large program.
C comments are started with a /* and ended with a */. It is important to include these in your program to explain to others, and yourself when you go back to look at your code, what exactly you're trying to do in that part of your program. All comments are ignored by the compiler. The ANSI C standard does not allow comments to be nested:
/* This comment is okay*/
/*************
so is this one
**************/
/* This one /* is not */ okay */
Every statement in C, with the exception of function definitions (not their prototypes) and compiler directives (which are not C statements), ends with a semicolon ';'. A good way to think of statements is as sentences in C: the semicolon marks the end of a sentence, while code blocks mark the paragraphs of the C language.
Now it's time to write our first C program, and to make sure we can compile our C code. Start up your favorite ASCII text editor (NOT WORDPAD, OR ANY OTHER WORD PROCESSOR) and type in the code below (you can leave out the comments):
Caution: C is case sensitive. That means an upper case letter and a lower case letter might as well be entirely different letters to C.
/* This is a C comment, and this is our first C program */
#include <stdio.h> /* include some function prototypes from the standard
library*/
int main(){ /* this line defines a function, and the '{' indicates a
start of a code block */
/* call the printf function in the standard library to print
the message on the screen. The '\n' is an escape sequence
that the compiler translates into a newline character. */
printf("Hello World!\n");
return 0; /* tell the operating system that the program ended normally */
} /* This line indicates the end of a code block, in this case, the
end of the main() function*/
Notice that I indent everything within a code block by a tab? This is a good programming practice and will make your code a lot easier to read. If you had a code block within a code block, it should be indented as well, like this:
{
        Some statements;
        {
                more statements;
                {
                        Even more statements;
                }
        }
}
It is helpful to add a comment after each closed brace that marks the end of a code block to let you know which code block it belongs to; otherwise you could lose track very quickly in your program.
Some editors default to using 4, or even 3 spaces for indentation when you press the tab key. This can make code hard to read, so I, and many other programmers, prefer to use the standard 8 spaces for a tab. We also use editors that default to using the tab character (such as Vim), rather than just using spaces, to make editing easier. When an editor encounters a tab, it will jump right to the next tab stop, rather than having to move back or forward so many spaces.
You should also try to keep your lines shorter than 70 characters, and definitely no longer than 80. If you don't, then you'll have trouble printing out your code, and it will be harder to read. jEdit has a box that surrounds your text; when you begin to type outside of it, you know your lines are longer than 80 characters. Vim, when it starts up, defaults to an 80x24 line display. This is one of many reasons why you should use one of these editors, or another generic programmer's editor, for your code.
Now that you've entered your code, save it to disk as first.c. If you're using MinGW, then you want to save it in c:\msys\1.0\home\[your_login] (if you're using a newer version of MSYS, then you'll have to replace the "1.0" with your version number). On Mac OS X, you'll want to save it in /Users/[your_login] (you'll also want to add the "Terminal" program to your application bar below). To compile your program, open a terminal (all Unix systems and Mac OS X) or the MSYS terminal (MinGW/MSYS users), and type the following (everyone else, refer to your compiler documentation first):
gcc first.c -o first
The -o parameter tells GCC to save the compiled binary as "first." It is very important that you don't ever type something like this:
\/ THIS IS VERY BAD!!! \/
gcc first -o first.c
/\ THIS IS VERY BAD!!! /\
This will cause your source file to be overwritten because, instead of using "first" as your output file, you told gcc to use "first.c" as the output file, which is your code! Yes, I've made this mistake a few times... Be careful. Later, we'll talk about how to use Makefiles, which will greatly reduce the chance you'll make this kind of boo-boo.
Now, to run our program, all you need to do is type:
./first
That's it! All done! If you're using MSYS, you'll want to use the MSYS shell for compiling your programs, but you'll want to use the Windows command shell (cmd.exe, or command.com) to run your programs due to some bugs in MSYS. Here are a few things you need to know about the Windows command shell, if you're going to use it:
So, if you're using the Windows command shell, you'll want to type in the bold text to move you into the correct directory, and run your program (some directories will have different names on your system, so take that into account as you type):
Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.
C:\Documents and Settings\aragorn>cd \msys
C:\msys>dir
Volume in drive C has no label.
Volume Serial Number is F09B-B76D
Directory of C:\msys
06/02/2005 02:26 AM <DIR> .
06/02/2005 02:26 AM <DIR> ..
06/02/2005 02:36 AM <DIR> 1.0
0 File(s) 0 bytes
3 Dir(s) 19,877,498,880 bytes free
C:\msys>cd 1.0\home
C:\msys\1.0\home>dir
Volume in drive C has no label.
Volume Serial Number is F09B-B76D
Directory of C:\msys\1.0\home
06/02/2005 02:27 AM <DIR> .
06/02/2005 02:27 AM <DIR> ..
06/02/2005 02:41 AM <DIR> aragorn
0 File(s) 0 bytes
3 Dir(s) 19,877,482,496 bytes free
C:\msys\1.0\home>cd aragorn
C:\msys\1.0\home\aragorn>first
Hello World!
C:\msys\1.0\home\aragorn>
Naturally, you'll want to replace "aragorn" with the name that shows up in your directory listing, same goes for "1.0". Once you're in this directory, you'll stay there until you exit the command shell.
Variables and Basic Data Types
In algebra, just as in programming languages, we use symbols to represent variables, and constant values. Unlike algebra, variables in C, and other languages, have a limited size, and type of data they can store. These types include floating point numbers, integers, strings (arrays of characters), single characters, and a reference to some place in computer memory (see pointers below).
Here is a table of data types, and what kind of data they store (The minimum range is the minimum level of precision the data type can store as defined by the ANSI C standard-- but your system may be able to support a far greater range for these data types):
| C Data Type | Type of data stored | Minimum range |
|---|---|---|
| int | integers | -32,768 to 32,767 |
| short | same as int (usually smaller than int, if int is larger than 16 bits) | |
| short int | same as short | |
| signed | same as int | |
| signed int | same as int | |
| unsigned | positive integers | 0 to 65,535 |
| unsigned int | same as unsigned | |
| short unsigned | same as unsigned (usually smaller than unsigned, if unsigned is larger than 16 bits) | |
| short unsigned int | same as unsigned short | |
| long | integers | -2,147,483,648 to 2,147,483,647 |
| long int | same as long | |
| signed long | same as long | |
| signed long int | same as long | |
| unsigned long | positive integers | 0 to 4,294,967,295 |
| unsigned long int | same as unsigned long | |
| char | a single character, or a really small signed integer | any 8-bit character or -128 to 127 |
| unsigned char | a single character, or a really small unsigned integer | any 8-bit character, or 0 to 255 |
| float | a floating point number | six digits of precision, ±1e±37 |
| double | a floating point number | typically 15 digits of precision, and ±1e±308 |
| long double | a floating point number | typically 19 digits of precision, and ±1e±4932 |
Your compiler may support more data types, and may have even larger ranges. For instance, GCC, and other C99 (named for the year 1999) compliant C compilers, support the long long type, which is at least 64 bits long.
However, the actual size of these data types varies from system to system, and from compiler to compiler. For instance, with GCC on 32-bit systems, an int and a long are the same size, 32 bits. With GCC on most 64-bit Linux and Unix systems, an int is still 32 bits, but a long is 64 bits. This is perfectly fine within the definition of ANSI C, but it is something you should be aware of if you ever want to port your programs to another system (which you will probably want to do at some point).
To define a variable to use in your program, you must declare it. First, you specify the type of variable you're declaring, then give it a name (or a list of names for the variables you want of that type), and then end your statement with a semicolon, like so:
int a;
float c,d,e;
long x;
long double y;
short z;
C is case sensitive. That means that the labels 'a' and 'A' are treated as two different labels by the C compiler. A label may contain the characters a-z, A-Z, 0-9, and _ (underscore). A label name may not begin with a number. That means that a3 is fine for a label name, but 3a is not. Also, you should not use labels longer than 31 characters. Some C compilers allow labels to be longer than that, but ANSI C only promises that the first 31 characters count. If the label is too long, then the extra characters at the end of the name may be ignored by the compiler.
To assign values to these variables, we use the equal sign. We can also add an 'L' or an 'l' suffix to the end of a number to let the compiler know that it is a long or long double (if you don't, the compiler will assume that whole numbers are ints and numbers with a decimal point are doubles). The upper case 'L' is easier to read, and less likely to be confused with the number '1', so we use it instead of the lower case 'l'. We can also put a prefix in front of a number: a zero tells the compiler that we want to enter an octal number (base 8), and 0x tells the compiler that we want to enter a hexadecimal number (base 16). To specify an exponent on a floating point number, you add an 'e' or an 'E' to the end, followed by the exponent. This 'e' is the same thing as "x10" in scientific notation. Graphing calculators use the same notation for floating point numbers. Here are a few examples:
int x=1234; /*you can assign a value to a variable
when it is declared*/
int1=123456; /*a normal integer*/
longint1=1234567890L; /*a long integer*/
hexint=0xAF03; /*a hexadecimal integer*/
octal=0773; /*an octal number*/
floating=1.2342e-10; /*a floating point number*/
pi=3.1415926535897932384L; /*a long double */
Caution: It is very important to remember that you must always initialize (assign a value to) a variable before you use it. This is because that variable may contain any random data that was in memory at the time of its creation. This becomes even more critical when working with pointers, as you can cause a program to crash if you do not initialize them beforehand (Microsoft operating systems may issue a "general protection fault" error and can become unstable; Linux will generally kill the program if it accesses memory that doesn't belong to it).
Arrays
We can also define an array of variables, either in one, two, three, or however many dimensions we wish. The way we do this is by adding brackets at the end of the name of a variable we're declaring. Inside the brackets, we give a number to tell the compiler how many elements are in this array. When we access an element of an array, we start counting at zero. This means that, if you declare an array of 10 elements, you access all ten elements by using the numbers 0-9. It is very important you remember this as you write your programs to avoid a common mistake beginners make in working with arrays and pointers (pointers will be covered later).
Tricking a program into accessing elements of an array outside of the size declared by the program is a way crackers break into computer systems. This is called a "buffer overrun attack." Basically, the attempt is to overwrite the binary instructions of a program by sending so much data that it overruns the data space of the program and spills into its code space. So yes, it is possible, however unlikely, to assign some values outside of an array that will tell the computer to format your hard disk. Most likely, the computer will crash. Linux systems are generally pretty good about detecting this, and on 64-bit AMD systems, Linux instructs the processor to "lock" the code space and prevent it from being overwritten by the program. This feature is unique to AMD-64, running in native 64-bit mode, and is not present on other 32-bit 80x86 processors. But it still doesn't prevent your program from overwriting other data in its data space.
And as usual, here are a few examples:
/*****************
arrays.c
*****************/
int main(){
int a[10]; /*one dimensional array of 10 elements*/
int b[10][5]; /*two dimensional array of 10 rows, and 5
columns-- a total of 50 elements*/
a[0]=0; /* assign 0 to the first element in the array*/
a[9]=5; /* assign 5 to the tenth element of the array*/
b[2][3]=1; /*assign the 3rd row, 4th column, the value of 1*/
a[3]=b[2][3]; /*assign the 4th element of a, with the value
stored at the 3rd row and 4th column of b*/
a[10]=10; /*BAD!!! Attempt to access the 11th element of
a 10 element array will assign some value to
some location in memory that may be in use
by some other variable, or the machine
instructions themselves! */
}/** end main() **/
We can initialize an array when we create it, like so:
int array[10]={1,2,3,4,5,6,7,8,9,10};
int array2d[3][3]={
1,2,3,
1,2,3,
1,2,3
};
We end the closed brace with a semicolon to mark the end of the variable definition. This ending semicolon is required because these braces are not marking a code block, they are just containing an initializer list.
Characters and Strings
Traditionally, a single text character occupies one byte of memory (8 bits). In C, you can expect the char data type to be one byte in size (Java is different: its char is 16 bits, and it has a separate byte type for single bytes). This is important since being able to use some files relies on being able to access individual bytes of a file, such as for graphics and sound, as well as compressed files (gzipped, zipped, bzipped, etc). Some compilers support a wider character type to represent "wide" or "UNICODE" characters and have their own data type for them (standard C calls it wchar_t, and some compilers use other variants), but you can still pretty much expect the char data type to always be one byte in size. We can assign a value to a char data type by either giving it a number (as long as it fits in 8 bits, otherwise the high bits will be lost), or as a character in single quotes such as 'a' (this quotation mark appears on the same key as the double quotation mark on QWERTY keyboards, which are used almost exclusively in English-speaking countries). There will be examples later.
Some people have some trouble with the concept that the computer doesn't know whether or not the data stored in the char data type is actually a character, or just data. In fact, to the CPU, it's all just data. It is up to the program to decide how to use and represent that data, whether it be as a number, a character, or a pixel on the screen. Just something to keep in mind. On most modern computers, printable text is encoded using ASCII for the lower 128 (0-127) characters and control codes (the upper 128 are determined by whatever character encoding the system is using, and these are called "extended" or "high-bit" characters). An ASCII Table is given in the Appendix at the end of this crash course.
Now a text string in C is just an array of characters terminated by a zero (null). This zero at the end of the string indicates the end of the printable string. This means, if you want to have a string of 80 characters, you need to add one more character to allow for the null-terminator (the zero at the end), so it would have to be a full 81 element array. Now, if your 81 element array is only storing a string of 8 bytes, then the null terminator would be in the 9th element of that array (but still, the entire 81 elements would occupy memory). C++ has a way around this with the string class, but it is a bit slow (as is Java's String class). You can assign the value of a string to an array of characters by enclosing the entire string in double-quotes, such as: "Hello World." But you can only do this to initialize an array when you create it, not afterward, because C does not let you assign to a whole array at once; the name of an array behaves like a fixed pointer to its first element (more on this when we talk about pointers).
Now for a few examples:
#include <stdio.h> /*Include the standard IO header file*/
int main(){ /*Define function main(), the first function a
program runs, and the programs entry point.*/
char a; /* a single character */
char s[30]="Hello world!"; /*an array of characters */
a=120; /*Assign 120 to a, which also is the value of the
ASCII character x */
a='A'; /*assign the character value 'A' to a (65 in
decimal, 101 in octal)*/
s[2]=0; /*Assign the value of 0 to the third element of the
character array s. Now the string will read "He"
because the zero in the third element indicates the
end of the string (The array is still 30 elements
in length).*/
printf("%s\n",s); /*And now we'll prove it! The "%s" tells
printf to print a string and that the
next parameter is a string variable.
This will be explained in more detail
later.*/
} /*the end of function main*/
Pointers
Pointers are a special kind of variable. Basically, they store the location in memory of, or reference to, where some data is stored. They have special uses, but, if care is not taken, they can introduce some very nasty bugs in a program that can be hard to trace. A pointer is defined just like any other variable, except the name of the variable is prefixed by an asterisk '*' to indicate that the variable is actually a pointer to some data of that type. You can get a pointer to a normal variable by prefixing its name with an ampersand '&' during assignment or function calls. To get back the data a pointer points to, you prefix the pointer with an asterisk '*' (asterisks are also called "stars", but "asterisk" is the formal name for the character). Confusing? You bet, and it took me a while to figure all this out when I started learning C at 13. But here is an example to try to explain things a little better:
#include <stdio.h>
int main(){
int *ip1,*ip2; /*two pointers, but we can't safely use
them yet.*/
int i1,i2,i3; /*a few normal variables*/
i1=10; /*assign some values to our three integers*/
i2=11;
i3=12;
ip1=&i1; /*Assign the reference to the integer i1 to the
pointer ip1.*/
ip2=ip1; /*assign the value of ip1 to ip2 (which is only a
reference)*/
/*The %d tells printf to display an integer. The stars before
the two pointers tells the compiler to return the value of
the data stored at the reference which is stored in the
pointer.*/
printf("ip1:%d ip2:%d\n",*ip1,*ip2);
i1=100; /*assign 100 to i1*/
printf("ip1:%d ip2:%d\n",*ip1,*ip2);
}
The output will look like:
ip1:10 ip2:10
ip1:100 ip2:100
The reason why the numbers displayed are different even though the ip1 and ip2 weren't changed after the first call to printf is that we changed the value of the data both ip1 and ip2 were pointing to, which was the data in i1.
Whenever you use the name of an array without the brackets to reference a single element, you are using that array just as you would any other pointer.
You can also have a pointer to a pointer. That is a pointer that points to another pointer, which points to some data in memory. To define one of those, you use two stars rather than just one. One of the places these are used is for arrays of strings (or an array of arrays). The main function, the first function that is called when your program runs, uses one such pointer to a pointer for holding the parameters that were passed to it when executed.
Pointer Arithmetic
There's another very interesting thing that we can do with pointers: we can add integers to them and subtract integers from them (but we can't multiply or divide them). When we do this, we're telling the reference to point to some other location in memory. Now, special care must be taken to ensure that you only access memory that your program has allocated for its use, otherwise some nasty things can happen, such as your computer crashing or the operating system killing your program abruptly.
The only time you can safely use pointer arithmetic is when your pointer is pointing to an array, such as a string. With strings (a character array, not to be confused with the string class in C++ or Java), each element is exactly one byte in length, so adding or subtracting one moves to the next or previous byte. Elements of other data types are usually larger than one byte, but the compiler accounts for this: adding one to a pointer always moves it to the next element of the array, whatever the size of that element. If you need to know the size of a data type, you can get it with the sizeof operator. To use the sizeof operator, you just type something like "sizeof(int)" and the compiler will replace that expression with the number of bytes that data type uses (in this case, an integer).
Here is an example:
char s[10]="abcdefghi",a; /*a character array, and a character variable*/
char *p=s; /*a pointer to the first element of s (the name of an
array can't be reassigned, so we move p instead)*/
a=*p; /*assign a the first element of s*/
a=*(p+1); /*assign a the second element of s*/
a=*(p+2*3); /*assign a the seventh element of s*/
p=p+1; /*Make p now start referring to the second element of the
array (now we're only using the last 9 of the
original 10 elements-- but they're all still allocated
in memory).*/
*(p+666)=a; /*Ummm... cause your program to crash really bad :-) */
Type casting, and cast conversion
Unlike languages such as Java, C and C++ will convert any data type to another (with the exception of pointers) without making a fuss about it. Even if the value is too big to fit into the new type, the compiler will just lop off whatever doesn't fit, and put what remains into the smaller data type. So, if you're trying to put a 16-bit integer into an 8-bit integer (a char), only the lower 8 bits of the 16-bit integer will be stored in the 8-bit integer; the higher 8 bits will be lost.
For floating point numbers being converted to integers, the fractional part (everything to the right of the decimal point) is simply dropped, without any rounding, and what is left is stored in the integer. If the floating point number has a large positive exponent, then you're out of luck in trying to fit that number into your integer (but the compiler won't complain or issue any warnings if you try).
You can force a data type to be treated as something else. Basically, it is a forced conversion, and we call this type-casting. It can be useful, especially when you're doing some math and you want to ensure all the numbers are being treated the same way. To do this, we put the data type we want to convert to in parentheses before the variable. Here is an example:
int main(){
double fa=0.12,fb=0.006;
int ia=1,ib=2;
fa=(float)ia+fb;
}
Most of the time you won't have to use pointer arithmetic, but sometimes you can make your code a little faster and cleaner if you use it. It's just an extra tool to add to your programming toolbox.
const for Constant data
const is a keyword you can add before specifying a data type to indicate that the data in that variable can not be changed by the program. This means, you must initialize constant variables at the time you create them such as:
const float version=9.9;
You can achieve similar functionality using the #define
compiler directive, which, instead of allocating space for a
variable, simply replaces every occurrence of that defined label
with whatever follows that statement. All #define
compiler directives should appear at the beginning of your
code, right after all #include compiler directives.
#define VERSION 9.9
We'll talk more about compiler directives later, but let's first concentrate on learning how to do something useful with C.
The const keyword finds its use most in function declarations where you want to be sure you don't alter the variables that are passed to it.
Global vs. Local variables
In C, we have something we call the scope of a function. That
is, a variable can only be used within the scope it is defined.
In the examples we've provided, all of our variables have been
defined within the scope of the main() function,
and hence, only exist within that scope of that function. As soon
as we leave that scope by exiting the function, or coming to the
end of a code block, those variables defined within that scope
cease to exist.
If you want all of your functions to be able to use a variable, then you will have to define it outside the scope of your functions. To do this, you simply define the variable outside of all functions. If a variable defined within a code block has the same name as a global variable, the local variable is used instead. Here is an example to show how this all works:
#include <stdio.h>
int global=10; /*I'm a global variable*/
int main(){
int xy=14;/*I'm local to main()*/
printf("global:%d xy:%d\n",global,xy);
{ /*start code block*/
int global=60,xy=70; /*local to this code block*/
printf("global:%d xy:%d\n",global,xy);
}/*end code block*/
printf("global:%d xy:%d\n",global,xy);
} /*end main()*/
The output of this program is:
global:10 xy:14
global:60 xy:70
global:10 xy:14
The reason this happens is that all variables have a scope. If two variables share the same name, the compiler uses the more local variable, hence the data in the other variable is not altered. As soon as we leave a code block, including a function, those variables that were defined within that code block cease to exist. The variables are recreated as soon as we enter a code block again.
Arithmetic | top |
Your calculator can do math, so definitely your computer can. We can perform several mathematical operations in C, such as addition, subtraction, division, multiplication, and modulus division (division, but the result is the remainder of the division). More complicated mathematical operations are performed by other functions in the standard library, along with the ability to work with complex numbers.
The following symbols are used to represent various mathematical operations. Note that there is no power symbol. The power symbol used on many graphing calculators, the caret '^', is a bitwise xor operator in C and will give you a very different result than you expect.
| Operator | Operation |
|---|---|
| + | Addition |
| - | Subtraction/Negation |
| * | Multiplication |
| / | Division |
| % | Modulus division (remainder after division) |
The order of operations is just the same as was drilled into your head by your Algebra teacher: Parentheses, Multiplication, Division, Addition, Subtraction. First, the expressions within parentheses are evaluated, then multiplication, division, and modulus division (from left to right), and finally additions and subtractions are performed (also from left to right). It is helpful to add parentheses even when they are not technically needed, because they make the code easier to read.
If you are familiar with a graphing calculator, these symbols and uses are already very familiar to you. In either case, they are very easy to understand and follow. So rather than boring you to death, and me writing more than I have to, let's move on.
But, we have two special operators for increasing and decreasing the value of a variable by one. We call these the increment and decrement operators and they are ++ and --. They can appear before or after a variable and do not require any assignment operators. Here is a brief example:
a++; /*increase a by one*/
a--; /*decrease a by one*/
Many CPUs have a special machine instruction that performs these operations, which is one reason C has them. With a modern optimizing compiler, though, a++, a+=1, and a=a+1 all produce the same machine code, and incrementing twice is no faster than adding two, so use whichever form reads most clearly.
When the increment or decrement operator is before the variable, then the increment or decrement operation is performed first, before anything else. If it comes after, then that operation is performed last. Confusing? Okay well here's a quick little example:
#include <stdio.h>
int main(){
int a=0;
/*first decrease a, then pass it to printf*/
printf("%d",--a);
printf(" %d",a);
/*pass a to printf, then increase a*/
printf(" %d",a++);
printf(" %d\n",a);
}
The code will output:
-1 -1 -1 0
You will see the increment and decrement operators often in loops.
Another set of operators is the arithmetic assignment operators (also called shorthand operators). Basically, they perform some mathematical operation on the variable to the left of the operator and the value to the right, and then store the result in the variable to the left of the operator. Here is a list of these assignment operators, followed by an example (there are others, but we're only going to cover these in this document):
| Operator | Operation |
|---|---|
| += | Addition |
| -= | Subtraction |
| *= | Multiplication |
| /= | Division |
#include <stdio.h>
int main(){
int a,c;
a=1;
c=3;
a+=a; /*a is now equal to 2 (1+1=2)*/
a*=c; /*a is now equal to 6 (2*3=6)*/
a/=2; /*a is now equal to 3 (6/2=3)*/
}/*end main()*/
Use the shorthand, increment, and decrement operators wherever they fit. They make your code shorter and easier to read; a modern optimizing compiler will generally produce the same machine code either way.
Making decisions | top |
A program would be pretty worthless if it couldn't make decisions. I'm not talking about complex decisions like whether to have pizza or Chinese food, but decisions based upon the state of the program (the data stored in the variables in the program). So let's discuss how to make a program choose its own path.
The if, if-else statements, and logical operators | top decisions |
The if statement is pretty straightforward: If such and such, then go do such and such. It's that simple. A more complex form is: if such and such, then such and such, otherwise (else) do such and such. Simple eh? Well, there is a little bit of a catch: we need to use something called logical operators to make comparisons between one number and another (or logical tests). Here's a list of them:
| Operator | Meaning |
|---|---|
| == | equal to |
| != | not equal to |
| > | greater than |
| < | less than |
| >= | greater than or equal to |
| <= | less than or equal to |
| || | or (logical) |
| && | and (logical) |
Notice that two equal signs are used for the "equal to" logical test? That's because the single equal sign is the assignment operator and never performs a logical test. The GNU C/C++ compiler will warn you whenever you use the assignment operator within an if statement because, in most cases, you really intended to use the logical "equal to" operator, which is the double equal sign.
When these expressions are evaluated, the result is 0 if the expression is false and 1 if it is true. The if statement itself treats any non-zero value as true. When all the logical expressions are processed, if the final answer is non-zero then the statement, or code block, after the if statement is processed; otherwise those in the else statement are processed, if it exists.
And, of course, we must have an example:
#include <stdio.h>
int main(){
int yaba,daba,poo;
yaba=1;
daba=10;
poo=-5;
if(1) printf("This will always display\n");
if(0) printf("This never will\n");
else printf("\tBut this will\n"); /*the \t is an escape
code for a tab */
if(yaba!=daba)printf("Yaba doesn't equal daba\n");
if( (yaba<daba) && poo==5){
printf("Yaba is less than daba,");
printf("and poo equals 5\n");
}/*end if( (yaba...*/
}/* end main() */
The switch Statement | top decisions |
You can nest if and else statements all day, but if all you're doing is trying to find out whether a is 1, 2, or 3, then you're better off using a switch statement. It takes the following form:
switch(var){
case 1:
do-something;
/*If we don't have a break statement here,
then we'd start processing the code in the
following statement.*/
break;
case 2:
do-something;
break;
case 3:
do-something;
break;
default:
/*var isn't 1,2, or 3. so now we stop here*/
do-something;
break;
}
The equivalent code using if else statements is below:
if(a==1){
do-something;
}else if(a==2){
do-something;
}else if(a==3){
do-something;
}else {
do-something;
}
As you can see, the above switch statement is a lot easier to read.
The switch statement can't compare strings. It only can compare
integers and characters. If you want to compare strings, then you'll
have to use nested if-else statements and the strcmp() function
described in the Strings section.
Loops | top decisions |
Well that's cool, but how do we make our program go back and run again? We use loops for that (there is a goto statement, but it is considered very bad form to use in structured programming and is never needed).
The first loops we're going to talk about are the while and do-while loops; after that we'll cover the for loop.
The while and do-while loops | top loops |
The while loop takes a similar form to the if statement, except there is no "else." The while loop will continue to execute the statements in the code block that immediately follow the while statement until the logical expression becomes false.
The do-while statement is similar, except it executes
everything between the do{ and the
}while(expression) until
that expression becomes false. Unlike the while loop, the do-while
loop evaluates the logical expression after each pass through the loop,
so the loop body always runs at least once.
while(1){
printf("I'll run forever!\n");
}/*end while(1)*/
while( (2+2)==5){
printf("Only runs if your computer fried\n");
}/*end while( (2+..*/
do{
printf("I will be displayed only once.\n");
}while(0);
There are two special statements that are used to break
out of a loop, or to start the loop over again. They are
break; and continue;. I see little
use for the continue; statement, and I've
personally never used it in any program I have written. The
break; statement finds its place in many of the
loops I write, and is the only way to break out of an otherwise
infinite loop (aside from issuing the return statement
which will break out of the current function).
The for loop | top loops |
The for loop is a little bit more complicated in its syntax, and hence, a little more confusing. It takes the form of:
for( initialization; expression; operation) statement;
The initialization is performed first, the expression is the logical comparison that takes place before each pass through the loop, and the operation is something that is done every time the loop executes (in addition to the statement, or code block that follows the for loop, if it is present). Usually, the operation is an increment or decrement operation. It makes no difference here whether the increment or decrement operator comes before or after the variable name: either way, the operation is performed after the statement, and before the logical expression is evaluated again. The for loop will continue until the logical expression is false, or it encounters a break statement.
On 80x86 (Intel/AMD and clones), and some other processors, the compiler can take advantage of a special case: a loop that counts a single integer down to zero and stops looping when zero has been reached.
A quick example:
#include <stdio.h>
int main(){
int a,b;
/*We're using a as our counter*/
for(a=0;a<10;a++)printf("%d \n",a);
/*You can initialize more than one variable if you
separate the initializations with commas as so: */
for(a=0,b=1;0;); /*this will never execute, as the
logical expression is false (zero)*/
for(;;){ /*This loop will run forever*/
break; /*unless broken*/
}/*end for(;;)*/
}/*end main()*/
A key point to remember that some people new to C can get confused about: Even though these statements require parameters within parentheses, they are not functions. They are keywords and a part of the basic building blocks of the language. Functions will be discussed next.
Another thing about loops: If your loop is counting up or down, you should never use the == operator for testing the counter. You should always use either <, >, <=, or >= to prevent your program from going into an infinite loop. If you use the straight equal-to comparison and you accidentally miss that number for some reason, then your loop will continue on forever. Yup, done that more than once.... And that loop was spawning a new instance of itself each time it looped! My computer didn't crash, it just got VERY slow (I was running Linux at the time, of course).
Functions | top |
Every time you program, there is some code that you will want to run over again, or reuse in another program. In order to do this, we can write our own functions, and call them rather than having to retype our code over and over again. This allows you to extend the usefulness of the language. Using functions also makes for cleaner code. As a general rule of thumb: If you run any code more than once, or think you may want to reuse that piece of code in another program, then you should write a function for that code. Functions should be dedicated to a single task, not many, largely unrelated tasks such as: "load database and display list on screen." Function names should also reflect what they do.
C comes with many functions for your use in what is called the
standard C library. These functions perform many
common, and not so common, tasks that allow you to write useful
programs much faster than you could if you had to write them
yourself. Also, because this is a standard library,
you can safely assume that these functions will exist on all systems
that have a standard C compiler. This means that your code will
compile on all systems, whether it be a PC running Windows or Unix, or
a Mac, or even a super computer. Some of these functions you've
already seen, such as printf(). We will be covering many
more in the sections to come, and in the appendix. We will only be
covering a subset of the functions in the ANSI/ISO C library.
That said, let's write and use our own functions! In fact,
you already have written one function on your own: main().
The main() function is a very special one in that it is
the first function that is called when your program is run. Basically,
it is the starting point of program execution. You can, if you
really want to, call the main() function again, but
that's probably not something you want to do, and can cause an infinite
loop if you're not careful.
To define a function, you need to decide on a function name, using
the same naming rules for variable label names (see Variables section).
The compiler must know about a function before you call it, so you should
give a function prototype for every function besides main(). When you use a header
file, such as stdio.h, it includes many definitions, including function
prototypes so that you may use functions in that library (in this case,
the standard C library). A function prototype
basically tells the C compiler how it is supposed to call a function,
and what type of parameters it accepts.
In addition to parameters, a function can have a return value. The
data type of the return value is given to the left of the function
name, much like you would define a variable. This allows your function
to return some value based upon whatever process it did to the
data inside. If a function will not return a value, then we
specify the void data type. The main()
function always returns an integer. You can, if you like,
put void in the parentheses of the function definition
if the function has no parameters, but you don't have to (Java doesn't
allow this).
When writing a function prototype, you only need to give the data types that the function takes as parameters, not the label names (but you can if you really want to). Also, the function prototypes should appear towards the beginning of the file. Here is an example to make things a little more clear:
/****
Writing our first functions!
****/
/* Load some function prototypes in the standard C library, including
the one for printf()*/
#include <stdio.h>
/******
int square(int a);
Squares the number a, and returns the result.
*******/
int square(int a); /* we could have also written int square(int);*/
/****
int power(int x, unsigned y);
Returns x raised to the y power
***/
int power(int,unsigned);
int main(){
int num=5,i;
i=square(num); /*square the number num*/
printf("5 squared is %d\n",i);
printf("2 to the 8th power is %d\n",power(2,8));
printf("2 to the 0th power is %d\n",power(2,0));
printf("2 to the 2nd power is %d\n",power(2,2));
}/*end function main()*/
/****** square() ******/
int square(int a){
return a*a; /*returns a*a */
}/*end square()*/
/***** power() *****/
/*Because we didn't give label names in the function prototype above,
we can use whatever labels we like to here. */
int power(int num, unsigned power){
int i=power; /*i will be our counter*/
int ret=1; /*ret will keep the result*/
if(power==0) return 1;
for(;i>0;i--)
ret*=num; /*multiply ret by num, and store result in ret*/
return ret;
}/*end power()*/
Okay, the boring part is over, now time for the fun stuff!
Console Input/Output | top |
Whew! Now that we've covered the basic elements of the C language, we can start talking about how to make programs that actually do something useful for a change.
One of the things that makes a program useful is its ability to interact with the outside world. In this case, the user. When we display something on the screen or send it to a printer, we call this output. When we read something from the keyboard, or some other source, we call this input. What we're going to discuss now is console input and output: the no frills, plain text input and output that was once our only way to interact with the computer. This interface is also the simplest to program for, so that's the one we will be covering (99% of the programs out there don't need a graphical interface, and most need no human interface at all). Other books cover programming for the windowing systems on different systems (Windows, Unix, and Mac, all have their own weird way of handling GUIs).
You'll hear a few terms: standard output (stdout), standard input (stdin). They basically refer to the standard way a program receives input and sends output, which is usually through the console (the terminal screen and keyboard). Some operating systems (notably Unix-like operating systems, and, to some extent, DOS and Windows), allow redirecting the standard output to a file, or standard input from a file or another program. That's beyond the scope of this document, so we won't get into that. But they usually use the < symbol to direct input into a program from a file, and the > symbol to direct output to a file. The | symbol (pipe), is used to direct output from one program into another program (On Unix systems, the programs run simultaneously).
We refer to standard input and standard output as stdin and stdout, respectively. These are called input and output streams, or simply I/O streams and are buffered. The buffered nature of these streams isn't important to think about until you're reading data from stdin because the program won't continue running until after the enter/return key is pressed-- even if you're trying to read one character (there are some operating system specific functions which allow you to bypass this buffer and read, and test for text in real time). This means that, after you read one character, extra characters may still be in the buffer. There's a very simple solution to this problem, which we'll talk about in a minute.
Escape codes | top console IO |
You're going to be encountering a lot of escape codes in this section of the form \[some letter]. Until now, I've pretty much been avoiding the subject to discuss other topics, but now seems like a good time to discuss them in more detail.
There are many characters that you can't type on your keyboard, or can't otherwise use directly in your program. For example, how do you put a newline character in a string? Press the return key? How about issuing the bell or backspace character? What if you want to use the double quote character in your string? Well, we use escape codes to do this. And here is a description of these escape codes:
| Code | Meaning |
|---|---|
| \\ | A single backslash |
| \n | newline |
| \r | carriage return |
| \a | Bell |
| \b | backspace |
| \t | Tab |
| \" | double-quote |
| \[newline] | treats two lines of text as one |
| \NNN | octal (base-8) code for a character |
| \xNN | hexadecimal (base-16) code for a character |
| \f | form feed |
| \v | vertical tab |
| \0 | The null character, which has a value of zero |
The bell character, when printed to the screen or an old line printer (the dot matrix or daisy-wheel printers we all used to use ages ago) will usually cause the terminal, or printer, to emit a beep or the terminal may flash instead (or both beep and flash). The bell can be thought of as an "alert" to the user.
The backspace character, when sent to a line printer, or the screen, will move the cursor back one space and usually leaves the text under the character in place.
The tab character will move the cursor over to the next tab stop. Tab stops are, by default, every eight columns, and so there are ten tab stops on a standard 80-column display.
The form feed character, if sent to a printer, tells the printer to spit out whatever page it currently is on, or spit out a new page if it's not printing on any page. On some systems, the form feed character, if sent to a terminal screen, will clear the screen. Other times, it will do nothing. On PC systems running DOS or Windows, it will usually print the Greek symbol for female on the screen. You're not likely to ever need to use this at all.
The carriage return character tells the terminal, and line printer, to move the "carriage", or, in our case, the cursor, back to the beginning of the line. It does not move the cursor down a line.
The newline character moves the cursor down one line. Unix will send a carriage return to the terminal automatically if a newline character is sent to it. C/C++ on Windows systems will add the carriage return character for you when writing a newline to a file in text mode, or when writing to the screen. So, all you need to do, if you want to print on a nice new, clean line of text, is issue the newline character.
And that pretty much wraps it up for escape codes.
Standard Output (stdout) | top console IO |
printf(char *format, ...) | top stdout |
You've already seen the printf() function in action,
so let's explain
it in more detail. printf() allows you to output
formatted, and unformatted text to the screen, and will convert
numbers into printable text for you. The printf()
function has a very special definition which allows you to
specify as many, and whatever type, of variable after the
initial formatting string. That formatting string, which
you provide, tells printf() what the other
parameters you passed it are, along with how to display them.
The format string can have whatever text in it you like,
but every time it comes across the percent sign, %, it treats
it as an escape character. The characters that follow the
escape character tell
printf() what type of data to display (another string,
a single character, or a number), and how to display it. If you want
to just display a percent sign by itself, you have to give
two percent signs as so to display just one: %%. A letter following
a percent sign indicates what type of data is to be displayed (s for
string, d for signed integer, etc).
A positive number x between the percent sign and the letter specifying the data type indicates that the text should be right justified (padded on the left with spaces) to fit in a minimum of x characters. For right-justified numbers, you can put a zero between the percent sign and the number x to pad with zeros instead of spaces (zero is the only other padding character printf() offers, and it is ignored when left justifying). A negative number indicates left justification (what you normally see on a printed page): the text is padded on the right. Confused? Yeah, it took me a while to figure it out too.
Here's a partial list of format codes, and what they do, followed by an example:
| Code | Function |
|---|---|
| %s | display a string |
| %d or %i | display a signed integer |
| %u | display an unsigned integer |
| %ld | display a signed long integer |
| %lu | display an unsigned long integer |
| %lld | display a signed long long integer |
| %hd | display a short integer |
| %hu | display an unsigned short integer |
| %f | display a float or double |
| %Lf | display a long double |
| %x | display an unsigned int in hexadecimal (lower case) |
| %X | display an unsigned int in hexadecimal (upper case) |
| %e | display a float in scientific notation |
| %E | same as above, but the 'e' is upper case |
| %.2f | display a float, rounded to the nearest 1/100th place |
| %5.2f | Same as above, but also right justify to fit in a minimum space of 5 characters |
| %8d | right justify an int to fit in 8 characters |
| %08d | Same as above, but pad space with zeros |
| % 8d | Same as %8d, but always leave a space in front of a positive number where the minus sign would go |
| %-10d | Left justify the number to fit in 10 characters |
| %% | displays a single percent sign |
#include <stdio.h>
int main(void){
int num1=12345; /*%d expects an int, not a long*/
int num2=123456789;
double num3=12345.6789;
double num4=123456789.123456789;
char *string="Hello!";
printf("\n123456789+123456789+123456789+123456789\n");
printf("%9d % 13d\n",num1,num2);
printf("%-13d%08d\n",num2,num1);
printf("%15s\n",string);
printf("%f %.2f %e %.4e\n",num3,num3,num4,num4);
}
And its output should look like this:
123456789+123456789+123456789+123456789
    12345     123456789
123456789    00012345
         Hello!
12345.678900 12345.68 1.234568e+08 1.2346e+08
puts(char *s) | top stdout |
puts() prints a single string to the stdout, along
with a newline character (so we don't need to add that '\n' escape
code when printing). And, all you have to do to use it is this:
#include <stdio.h>
int main(){
char *s="World!";
puts("Hello");
puts(s);
}
putchar(int ch) | top stdout |
Prints a single character to stdout. Nothing fancy. Here's an example:
#include <stdio.h>
int main(){
char *s="World!";
int i;
/*print all the characters in the string, until the
end is reached, as indicated by the null-terminator
which is zero, or \0)*/
for (i=0;s[i];i++)
putchar(s[i]);
putchar('\n');
}
Standard input (stdin) | top console IO |
scanf(char *format, ...) | top stdin |
scanf() is used to read data in from stdin, and
also takes a similar format to printf. The catch here is that
all parameters passed to scanf() must be pointers.
This allows scanf() to alter the data in the
parameters passed it. How can a function do this? Well, here's
a quick example.
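#include <stdio.h>
void a_func(int); /*we must have a function prototype */
void b_func(int *);
int main(){
int i;
i=10;
printf("%d ",i);
a_func(i); /*passes a copy of i, so i is not changed*/
printf("%d ",i);
b_func(&i); /*passes a pointer to i, so i can be changed*/
printf("%d\n",i);
}
void a_func(int i){
i=100; /*only changes the local copy*/
}
void b_func(int *i){
*i=12345; /*changes the variable that i points to*/
}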
If you compiled and ran this program, the output would be:
10 10 12345
Remember, to get a pointer to a non-pointer variable, you prefix the name with an ampersand &. Arrays (including strings) are treated as pointers if you're referring to the entire array, and not a single element of it (given array ar, ar would refer to the entire array while ar[2] would just refer to one element of it).
scanf() also takes a format string, but it's used, not
for printing formatted text to the screen, but to describe how data
should be read in. Here is a partial list of flags:
| Code | Function |
|---|---|
| %s | read a string up to the first whitespace |
| %d | read an integer |
| %ld | read a long integer |
| %f | read a float |
| %u | reads an unsigned integer |
And a quick example:
#include <stdio.h>
int main(){
int age;
char name[40];
printf("What is your name?");
scanf("%s",name);
printf("What is your age?");
scanf("%d",&age); /*We must pass a reference to scanf
for non-pointers*/
printf("\nHello %s!",name);
if (age>=50)
printf("Wow, you're an antique!");
else if (age<=16)
printf("Hey there genius!");
printf(" See ya!\n");
printf("\n\nRead age as:%d read name as:\"%s\"\n",age,name);
}
If you were to run this program, then you'd get the following output (Douglas and 26 are the text that you type in):
What is your name?Douglas
What is your age?26

Hello Douglas! See ya!


Read age as:26 read name as:"Douglas"
Cool eh? Well, remember that little note about buffered I/O, and how, if you don't read all the characters in from the buffer, there's still data left in it? Well, here's a good example of the problems that can cause when we run our program with different input:
What is your name?Johnny Doe
What is your age?
Hello Johnny!Hey there genius! See ya!


Read age as:0 read name as:"Johnny"
What happened? Well, scanf() stopped reading characters
after the first white space, and so "Doe" was left in the buffer
when it came time to read the age in. Since "Doe" isn't a number,
scanf() just barfed and left age unchanged (age was never given a
starting value, so it merely happened to contain zero).
It gets worse:
the text scanf() could not use is still sitting in the input buffer,
so every later attempt to read a number will trip over the same
"Doe" and fail again until something else reads it out of the way.
This is a problem with pretty much all C and C++
compilers that I've used (including when using the C++
cin interface), hence I never use scanf() in any
program I write (and only use std::cin for reading
a single character or string in C++).
So what is the solution? Well, use different functions (and write your own utility functions)!
char *gets(char *s) | top stdin |
gets() reads a string of text in from the terminal,
and puts it into s. It will continue reading until
a newline is encountered (as generated by the return or enter key),
or until the end of the input stream is reached (EOF
end of file/input).
gets(), like scanf() with %s,
performs no bounds checking. That means if you type more characters
than your character array can hold, it will begin overwriting other
data in your program, so make sure your array is
big enough to hold the data you want to read (or use a safer function).
In fact, GCC will warn you that you should use a different function
when you compile your code using it, and gets() was removed from the
C standard entirely in the 2011 revision (C11).
#include <stdio.h>
int main(){
char name[255];
printf("What is your name?");
gets(name);
printf("\nHello %s!\n",name);
}
And its output:
What is your name?Johnny Doe

Hello Johnny Doe!
Some of the example code in this crash course will use the gets()
function even though it is unsafe. This is only for simplicity's sake; however,
if you make your own program for real use, you should use one of the
other functions below, or write your own safer function.
char *fgets(char *s, int size, FILE *stream) | top stdin |
This function was designed for reading from any stream,
including files. It works similar to gets(),
except it will read no more than size-1 characters
(saving room for the null terminator), leaving the rest in the buffer, and it also
reads in the newline character. This means that
fgets() is safer to use than gets()
when reading input.
FILE isn't a built-in data type; it is actually a defined type that contains information about a stream. You can make your own streams when you open a file, however, a few streams already exist when you start your program. They are stdin, stdout, and stderr. We've already talked about stdin and stdout. stderr is another output stream, but it is used for printing errors to the screen. On Unix systems, text sent to stderr will still display on the screen, even if the output from stdout is being redirected to a file or another program.
Okay, now for a little example:
#include <stdio.h>
int main(){
char name[50];
printf("What is your name?");
fgets(name,50,stdin);
printf("\nHello %s!\n",name);
}
And its output:
What is your name?Johnny Doe

Hello Johnny Doe
!
Wait a second, why did the exclamation point appear on a new line? That is because fgets() also reads in the newline character, whereas gets() strips the newline character off the end. If you don't want to have the newline character at the end, you'll have to remove it as a separate step.
int getchar() | top stdin |
Reads a character from stdin, pretty simple.
#include <stdio.h>
/*****
sgets(char *s, int size)
A safer gets.
s is the string to be read
size is the size of the string
*****/
char *sgets(char *,int);
int main(){
char name[30];
printf("What is your name?");
sgets(name,30);
printf("Hello %s!\n",name);
}
char *sgets(char *s, int size){
int i;
int ch; /*getchar() returns an int so that it can also report EOF*/
ch=getchar();
/*We have size=size-1, because we need space to add the
null terminator. */
size--;
for(i=0;ch!='\n' && i<size;i++){
/*make sure we haven't reached the end of
the input stream*/
if(feof(stdin))break;
s[i]=ch;
ch=getchar();
}
s[i]=0; /*add the null terminator*/
/* If we stopped short of reading up to the end of the
input line, clear the extra data in the buffer */
if(ch!='\n' && !feof(stdin)){
for(;ch!='\n' && !feof(stdin); ch=getchar());
}
return s;
} /*end of sgets()*/
Now your program output will be the same as it would be for
gets() and it will be safe to use. Guess what, you
just wrote yourself a safe version of gets()!
Now you can add this code to your own personal library
of functions to use, and reuse over and over again.
Converting text to numbers | top console IO |
Every now and then, we may want to convert a text string to a number. C has a few very useful functions for doing this. They are pretty straightforward, and you should know enough about C by now to know how to use them from their function prototypes:
int atoi(const char *)
long atol(const char *)
long long atoll(const char *) (for C99 compliant compilers)
double atof(const char *)
There are other ways, but these are the easiest, and most useful
functions to use. Be aware that they have no way to report an error: if the text isn't a number, they simply return 0. The strtol(), strtod(),
strtoll(), strtold(), strtof()
and other like-named functions allow for more options. They take
the form of:
float strtof(const char *nptr, char **endptr);
long int strtol(const char *nptr, char **endptr, int base);
Some of these functions are described by the ANSI C standard;
others, including strtoll() and atoll(), are
defined by the ISO C99 and POSIX.1 (1996) standards. In order to use
these functions, you
also need to include the stdlib.h header file, which contains
the prototypes for these functions.
Writing your own function libraries | top |
Now that we've gotten this far, you're probably wanting to write
your own software library: a collection of functions for you to use,
and reuse in your programs. After all, who wants to keep typing
in the same code over and over again for use in their programs?
Remember that sgets() function we wrote above? Wouldn't
you like to reuse that function in your programs without having
to type the code, and function prototype, all over again?
Well, writing your own
software libraries lets you do that.
Writing software libraries makes your life a lot easier as a programmer, and can make your code look cleaner by putting useful functions, or classes of functions, into their own files. Even better, there isn't much extra you need to know to write your own!
To write your own software library, you need to write two files:
one is the header file (those ".h" files we've been including in
our programs) and a source file (the ".c" files we've been writing and,
up until this point, all contained a function main()).
Since function libraries are to be linked to another program,
like the standard libraries, you do not write a main()
function. Header files contain the function prototypes, and other
definitions required by your library. It's also a good idea to
describe how to use the functions in your library at the top
of your library file for quick reference later. You must
include this file in both your library source code, and in
the source code of the program that is to use your library.
Compiler Directives and Header Files | top writing libs |
Before we go on, let's talk a little bit about the
#include compiler directive, and compiler directives
in general. Compiler directives aren't part of your C program, that
is, they aren't converted into machine code later. They are
processed by the compiler and, depending on which ones you use,
will alter the way the compiler processes your code. All compiler
directives begin on a line by themselves, and begin with
the pound sign '#'.
The #include compiler directive tells the compiler
to literally include another file in your program as if it
were just one continuous file. It takes two forms:
#include <filename.h>
#include "filename.h"
The difference between the two is subtle. The first one will look only in the include file search path for filename.h, which normally contains only the include files required by the standard and other system libraries. The second one will look for the file first in the current directory, and then in the include file search path. When using your own libraries, you'll want to use the second form and put the include file in the same directory as your source file, or within the program's source directory tree.
The #define compiler directive takes the following
forms:
#define LABEL blablabla
#define MACRO(x) x*2
What this tells the compiler is: on every occurrence of LABEL in your program, replace LABEL with whatever text follows it in the #define line (the exceptions are inside other compiler directive statements, and inside quoted text making up strings).
The macro form isn't a function, even though it looks like one. The parameters you give to a macro tell the compiler to replace those labels in the macro with those parameters and then insert the result in your program. So if I typed MACRO(bee), then it would replace MACRO(bee) with bee*2 in my program. Macros have their place, but it is better to use functions whenever possible.
One use for a macro might be to simplify calling the
fgets()
function to get input from stdin. Here's an example:
#define fgetsstd(string,size) fgets(string,size,stdin)
Now, you only have to use two parameters to use the
fgets()
function, because this macro inserts the last one, stdin, for you
when you use it. Few programmers make use of macros, but they do
have their place every now and then. They are also a good
thing to have in header files, and require no additional
code.
I've used macros most when I want to make sure my program can compile on different systems where a few things might be a little different from compiler to compiler. One example is a C++ compiler from 1990 versus a C++ compiler that follows the newer ISO standard (first published in 1998). They can also be used to make programs compile, and work, on different operating systems. For instance, Windows, Mac OS X, and other Unix systems keep user files in different places. You could alternately configure various defined and constant values to account for these differences.
There are a few extra compiler directives that you will want to
use: #ifndef and #endif. These are
conditional compiler directives. #ifndef means "if not defined": if the
label that follows it has not been defined yet, then the code between the #ifndef and
#endif will be processed by the compiler; otherwise
it will be ignored, just as comments in your program are
ignored by the compiler.
There are many other compiler directives, but
we won't be covering them here. Almost any respectable book
on C will cover most of these compiler directives.
By using these directives, you can ensure that your header file is only
processed once by the compiler (processing it multiple times in
the same code can easily result in errors), and thus it is
a good practice. Let's make a header file for the sgets()
function we showed above (save it as safeio.h):
/****
safeio.h -- Safer I/O functions
char *sgets(char *s, int size)
Reads a string of size [size] and stores it in [s], returns
a pointer to the read string. The trailing newline is not stored.
long sgeti()
Reads in an integer from stdin.
double sgetf()
Reads in a floating point number from stdin.
char sgetc()
Gets a single character from stdin, and removes all
trailing characters from stdin to the first newline in the
buffer.
char spromptc(char *prompt)
Prints [prompt] to stdout, and reads in a single character
from stdin, and removes all trailing characters to
the first newline from the buffer.
char *sprompts(char *prompt,char *string,int size)
Prints [prompt] to stdout, and reads in a string of size
[size] and stores it in [string]. Returns a pointer
to the stored string.
****/
#ifndef __SAFEIO_H__ /*is __SAFEIO_H__ defined? */
/*Nope, so let's process the rest of the file*/
/****
An empty definition only to serve as an indicator that this
file has already been processed by the compiler at least once
before. The label name should contain the name of the header
file. If it was "zippy.h"
then "__ZIPPY_H__" would be the label you should use. This
convention avoids confusion, and possible conflicts later.
We can't use periods in labels, so we use the underscore instead.
****/
#define __SAFEIO_H__
/*Since our library requires functions in the stdio and stdlib
libraries, we should also include the required header
files here.*/
#include <stdio.h>
#include <stdlib.h>
/*our function prototypes*/
char *sgets(char *,int);
long sgeti();
double sgetf();
char sgetc();
char spromptc(char *);
char *sprompts(char *, char *, int);
#endif /*forget this, and the rest of your program won't compile!*/
That's it! Nothing to it!
If you wanted to have a global variable
accessible by other programs that use your library, you need to
declare that variable in your source file, and again in your header
file. In your header file, you prefix the declaration with the
extern keyword. This is the only way to let the
compiler know that the variable exists outside of the source file
you defined it in.
So, if in your source file, you have a global variable you defined as:
const char *zippy_version="ZippyLIB Version 10.7";
You would add the following to your header file, if you wanted programs that use your library to access it:
extern const char *zippy_version;
When writing header files be careful: If you mess up something in your header file, your compiler will spit out a screen full of errors. So, if that happens, and you know the problem isn't in your source code file, then check your header file for syntax errors.
One thing you can be sure of whenever you write your own programs: You're going to make mistakes somewhere, even in small programs. Like, when writing the code in this section, I had to go back and fix a few things. The more you program, the faster you'll get at finding errors, and fixing them, so don't get discouraged if it takes you hours to find one little character you missed.
Finding syntax errors | top header files |
If the compiler finds a syntax error, it will give you hints on finding those errors, but sometimes the compiler can get confused and not be able to give you any help. All it will know is that you made an error somewhere. If you're using a text editor that has syntax highlighting, such as jEdit (www.jedit.org) or Vim (www.vim.org), then it can make finding some syntax errors faster (if the colors don't look right, then you know you made a typo).
Here are a few hints on finding those kinds of syntax errors:
Most hard to find errors come from failing to close a statement correctly, and that's what really confuses a compiler. Vim will highlight mismatched parentheses and braces in red to help you find those errors while programming. It can't tell you where the error is, only that one exists.
Aside from syntax errors, there are "logical" errors. Those are program bugs that result from a mistake in reason. No compiler can figure those out for you-- if they could, then they would be able to write programs for you :-).
One thing you can do to find logical errors, is to put "printf" statements in your code to let you know what stage your program is in before it barfed. It can help a lot, especially in loops and conditional statements.
Writing and Using Function Libraries | top writing libs |
Okay, now that we have our header file that we typed in above, let's write our function code (you'll want to call this file safeio.c):
#include "safeio.h"
/***************
** sgets() **
***************/
char *sgets(char *s, int size){
int i;
char ch;
ch=getchar();
/*We have size=size-1, because we need space to add the
null terminator. */
size--;
for(i=0;ch!='\n' && i<size;i++){
/*make sure we haven't reached the end of
the input stream*/
if(feof(stdin))break;
s[i]=ch;
ch=getchar();
}
s[i]=0; /*add the null terminator*/
if(ch!='\n' && !feof(stdin)){
for(;ch!='\n' && !feof(stdin); ch=getchar());
}
return s;
} /** end of sgets() **/
/***************
** sgeti() **
***************/
long sgeti(){
char s[80];
/*read in a string, convert it to a long integer*/
return atol(sgets(s,80));
}/** end sgeti()**/
/***************
** sgetf() **
***************/
double sgetf(){
char s[80];
/*read in a string, convert it to a double*/
return atof(sgets(s,80));
}/** end sgetf()**/
/***************
** sprompts() **
***************/
char *sprompts(char *prompt, char *s, int size){
printf("%s",prompt);
return sgets(s,size);
}/** end sprompts() **/
/***************
** sgetc() **
***************/
char sgetc(){
char ch,c;
c=ch=getchar();
/*Uh oh, no more data in the input stream*/
if(feof(stdin))return 0;
/*Keep reading characters until we reach the end of the
line, or the end of the buffer.*/
for(;c!='\n' && !feof(stdin);c=getchar());
return ch; /*Return the first read character*/
}/*end sgetc()*/
/***************
** spromptc() **
***************/
char spromptc(char *prompt){
printf("%s",prompt);
return sgetc();
}/** end spromptc() **/
Whew! All done. Now that we have our library all written, let's write ourselves a program to test it. Call this file safeio-test.c:
/**
safeio-test.c -- a program to test out our safe I/O library.
**/
#include <stdio.h>
#include "safeio.h"
/*Why not see the #define directive in action :-)*/
#define STRSIZE 80
int main(){
char s[STRSIZE];
long i;
double d;
printf("sgets:");
sgets(s,STRSIZE);
printf("\tI got \"%s\"\n",s);
printf("sgetc:");
s[0]=sgetc();
printf("\tI got %c\n",s[0]);
printf("sgeti:");
i=sgeti();
printf("\tI got %ld\n",i);
printf("sgetf:");
d=sgetf();
printf("\tI got %f\n",d);
sprompts("sprompts:",s,STRSIZE);
printf("\tI got \"%s\"\n",s);
s[0]=spromptc("spromptc:");
printf("\tI got %c\n",s[0]);
}
Now it's time to compile this code. If you copied and pasted the code into your text editor, then there should be no errors at all. How you compile your code depends entirely on the compiler you're using. In Borland Builder and Microsoft Visual C++, you have to go through several steps in creating a project, adding files, etc to use a library. Using command line development tools, like GCC and MinGW, it's a lot simpler. The instructions here are exactly the same for Mac OS X Xcode, MinGW, Linux, BSD (using GCC), and all other systems using GNU tools.
The first thing you want to do is compile safeio.c, but not link it. Instead, you want the compiler to generate an object file. GCC gives object files the .o extension (Borland and Microsoft use the .obj extension). To do this, you need to use the "-c" flag when compiling and specify the output file as safeio.o.
Caution: Make sure you do not use the ".c" extension for your output file, otherwise you will overwrite your source code file!!
The next thing that you want to do is compile your test program, and link it to your new safeio.o object file. So here is everything that you type:
gcc -c safeio.c -o safeio.o
gcc safeio-test.c safeio.o -o safeio
Now you're all set to run your test program! On Unix systems, you'll probably need to specify the current directory when running a program (unless you've added "." to your path). So type ./safeio to run your program.
If you're using MinGW, you will want to use the MSYS shell to compile your program, but you don't want to run your program in the MSYS shell because of a bug in the way it handles console IO. Instead, you want to open a command prompt by running either cmd.exe (Windows NT/XP) or command.com (All Microsoft operating systems), or you can run the program from explorer, by locating it, and clicking on it.
Makefiles | top writing libs |
There's another way to compile your programs that will make your life easier. GNU has a handy program called "make" which, based on what's inside of your Makefile, will compile your program for you with one command: make. Also, it will only recompile those parts of your program whose source files have changed since the last time you ran make.
A Makefile (the filename should always be "Makefile" and be in the same directory as your program and code) takes the following form:
# I'm a comment
File-To-Make: File It depends on
	commands needed to make File-To-Make
File-To-Make2: Files It Depends On
	commands needed to make File-To-Make2
The indentation you see there isn't a bunch of spaces, it's actually a single tab. You must always use one or more tabs to offset the commands needed to make your file otherwise GNU make will give you an error like this:
*** missing separator (did you mean TAB instead of 8 spaces?).
Some text editors (not any of the ones I've mentioned) convert tabs to spaces. You don't want this. Borland's Builder editors can strip out tabs and convert them to spaces, so use Notepad, Vim (or any Vi clone, like Elvis), or jEdit instead. Here's a Makefile for the above program (save it as Makefile):
# Makefile for safeio-test
safeio: safeio-test.c safeio.o safeio.h
	gcc safeio-test.c safeio.o -o safeio
safeio.o: safeio.c safeio.h
	gcc -c safeio.c -o safeio.o
What's neat about this is that, if you change safeio.c, then all of the code is recompiled. If you just change safeio-test.c, then only safeio-test.c is recompiled and linked to make the safeio executable. This isn't such a big deal for a small program like this, but when you have many source files, this really cuts down on the recompile time of your program if you make changes to only a few source files. It also dramatically cuts down on the amount of typing you have to do to recompile your program, and reduces the chances you'll make a mistake when doing so.
Now that you have your makefile, and have named it "Makefile", you can compile and run your program with the following commands:
make
./safeio
Even though that is a neat little library, I will not use the
functions in that library without including them in the source
listings below to avoid confusion. Instead, for simplicity,
and example purposes only, I'll be using the unsafe
gets() function.
Strings | top |
The Standard C Library has several useful functions for working with strings. They are:
char *strcpy(char *dest, const char *src);
char *strncpy(char *dest, const char *src, size_t n);
int strcmp(const char *s1, const char *s2);
char *strcat(char *dest, const char *src);
size_t strlen(const char *s);
There are more, but these are the only ones we will cover here.
In order to use these functions, you need to first include the string.h header file.
char *strcpy(char *dest, const char *src) | top strings |
The strcpy() function copies the contents of src into
dest. If you want one string to hold the contents of
another, you can't do it using the assignment operator '='. This
is because, if you use the assignment operator, all you're doing
is copying a reference (a pointer) to the data, not the data itself.
The only exception to this is when you're initializing a string as
such:
char name[80]="Ima String";
Here, we've created a string with the contents, "Ima String", and the array name is still 80 elements long. Now, if we later do this:
name="Ima String";
If name is the character array we declared above, the compiler will
refuse this line with an error, because you can't assign to an array.
If name were a char * pointer instead, the line would compile, but
we would not have copied the string anywhere: we would only have
made name refer to a different string, which is an array of only 11 elements.
So, the right
way to copy this string is to use the strcpy
function:
strcpy(name,"Ima String");
char *strncpy(char *dest, const char *src, size_t n) | top strings |
strncpy() is almost the same as strcpy()
except that it will only copy up to n bytes of
src into dest and hence, is a safer function
to use than strcpy() because it will prevent you
from copying more data into the array than it has space to hold.
However, if the length of the string src is
n or more, then the null-terminator will not
be added to the end of dest. This means you
will have to add it on your own (which is easy to do).
size_t is a defined data type used for storing the size of things in memory, or on disk. It is always an unsigned integer type, and exactly which one depends on your compiler. A definition for the size_t type might look like:
typedef unsigned long size_t;
This is how you create a user defined data type. We won't be
discussing any more about the typedef keyword in this
crash course.
char *strcat(char *dest, const char *src) | top strings |
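The strcat() function joins, or concatenates, two strings
together. Basically, it tacks src onto the end of
dest. It does not check whether dest has room for the extra characters, so make sure the array is big enough.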
int strcmp(const char *s1, const char *s2) | top strings |
This function returns 0 if the strings match, less than zero if s1 is less than s2, and greater than zero if s1 is greater than s2. This can be troublesome because it doesn't return a boolean true (a non-zero value) if the strings are equal. Instead, it returns a value that can assist in determining how one string relates to another, and hence assist in sorting strings.
Here is some code that illustrates this:
#include <stdio.h>
#include <string.h>
int main(){
char s1[80]="",s2[80]="";
for(;;){
printf("Enter a string (type \"exit\" to exit):");
gets(s1);
/*check to see if the user typed in exit, exit loop
if he did*/
if(!strcmp(s1,"exit"))break;
printf("Enter another string:");
gets(s2);
if(strcmp(s1,s2)<0)
printf("\"%s\" < \"%s\"\n", s1,s2);
if(strcmp(s1,s2)>0)
printf("\"%s\" > \"%s\"\n",s1,s2);
if(strcmp(s1,s2)==0)
printf("\"%s\" == \"%s\"\n",s1,s2);
}/*end for*/
}
The output looks like this:
Enter a string (type "exit" to exit):s1
Enter another string:s2
"s1" < "s2"
Enter a string (type "exit" to exit):aa
Enter another string:bb
"aa" < "bb"
Enter a string (type "exit" to exit):bb
Enter another string:cc
"bb" < "cc"
Enter a string (type "exit" to exit):bb
Enter another string:aa
"bb" > "aa"
Enter a string (type "exit" to exit):aa
Enter another string:aa
"aa" == "aa"
Enter a string (type "exit" to exit):a 
Enter another string:a
"a " > "a"
Enter a string (type "exit" to exit):aa
Enter another string:a 
"aa" > "a "
Enter a string (type "exit" to exit):a 
Enter another string:aa
"a " < "aa"
Enter a string (type "exit" to exit):exit
size_t strlen(const char *s) | top strings |
You've probably already guessed it: This function returns the length of a string, not counting the null terminator. For example:
/*Removes the newline from a string, if it exists*/
length=(int)strlen(string)-1; /*length is an int*/
if(length>=0){
if(string[length]=='\n')string[length]=0;
}
Structures | top |
A structure allows us to clump a lot of different data into one nice neat container, and reuse that container over and over again. To access variables inside a structure you need to use the dot-operator ".". If you're accessing a structure through a pointer, then you need to use the arrow-operator "->" to access elements of that structure. We can define a structure in a few ways:
/*Create a definition for the Record structure*/
struct Record{
char name[40];
int age;
};
/*Now lets create a few Record structures to use in our program*/
struct Record rec1;
struct Record rec2;
struct Record recs[20]; /*An array of Record structures*/
When we define the Record structure above, we've basically created a new data type, but have not used that data type yet, until we declare the rec1, rec2, and recs variables below it.
Another way is:
struct {
char name[40];
int age;
}rec;
With this second example, we've created a structure variable, and space has been created for it. We can't create any more structures using this definition because we haven't given the structure a name. We've only given it a variable name to use.
struct Record{
char name[40];
int age;
}rec;
This one is a little different: we've declared the rec variable, and also given the structure a name. We can now declare more Record structure variables.
Here's how to access members of this structure:
struct Record rec;
struct Record *r;
r=&rec;
rec.age=13;
strcpy(rec.name,"Johnny Doe");
printf("%d",r->age);
We can also have structures in structures:
struct {
char name[40];
struct {
char year;
char month;
char day;
}birthday;
}rec;
rec.birthday.year=94;
rec.birthday.month=1;
rec.birthday.day=2;
Or like this:
struct Date{
char year;
char month;
char day;
};
struct Record{
char name[40];
struct Date birthday;
};
Here is a simple phone book program that you can write using structures to store data:
/*******************
phonebook.c -- a simple phone book program demonstrating the use
of structures.
********************/
#include <stdio.h>
#include <string.h>
#include <stdlib.h> /*for atoi()*/
#define ENTRIES 20
struct Record{
char name[40];
struct {
unsigned char year; /*years after 1900*/
char month;
char day;
}dob;
char phone[20];
}records[ENTRIES];
char chget();
void display(struct Record *);
void edit(struct Record *);
void list();
/*************
** chget() **
*************/
char chget(){
char ch,x;
x=ch=getchar();
for(;x!='\n';x=getchar());
return ch;
}/*end chget()*/
/************
** main() **
************/
int main(void){
char cmd;
int index;
/* Initialize array*/
for(index=0;index<ENTRIES;index++){
records[index].name[0]=0;
records[index].phone[0]=0;
records[index].dob.year=0;
records[index].dob.month=0;
records[index].dob.day=0;
}/*end for(..I<ENTRIES,, */
index=0;
do{
int x; /*a temporary variable*/
printf("\nRECORD %02d OF %02d\n",index,ENTRIES-1);
display(&records[index]);
puts("\n(E)dit Entry");
puts("(N)ext Entry");
puts("(P)revious Entry");
puts("(L)ist entries (brief)");
puts("(Q)uit");
printf("\nCommand:");
cmd=chget();
/*process the command*/
switch(cmd){
case 'E':
case 'e':
edit(&records[index]);
break;
case 'N':
case 'n':
index++;
/*loop around if we've come to the
end*/
if(index>=ENTRIES)index=0;
break;
case 'P':
case 'p':
index--;
/*loop around if we've come to the
beginning*/
if(index<0)index=ENTRIES-1;
break;
case 'L':
case 'l':
list();
break;
case 'Q':
case 'q': break;
default:
puts("BAD COMMAND");
break;
} /*end switch(ch)*/
}while(cmd!='q' && cmd!='Q');
} /*end main()*/
/************
** list() **
************/
void list(){
int x;
puts("*** LISTING ***");
for(x=0;x<ENTRIES;x++){
printf("%02d:%-40s|%-20s|", x,records[x].name,
records[x].phone);
/*we need to type cast these as ints to ensure they
are passed to printf as integers*/
printf("%04d-%02d-%02d\n",
(int)records[x].dob.year+1900,
(int)records[x].dob.month,
(int)records[x].dob.day);
}/*end for()*/
printf("** Pause **");
chget();
}/* end list*/
/***************
** display() **
***************/
void display(struct Record *r){
int x;
/*print out a bar of stars*/
for(x=0;x<60;x++)putchar('*');
putchar('\n');
printf("Name: %s\nPhone:%s\n",r->name,r->phone);
printf("DOB:%04d-%02d-%02d\n", (int)r->dob.year+1900,
(int)r->dob.month, (int)r->dob.day);
for(x=60;x;x--)putchar('*'); /*draw a bar of 60 stars*/
putchar('\n');
}/*end display()*/
/************
** edit() **
************/
void edit(struct Record *r){
char s[128];
int i;
printf("Name:");
/*You should never use gets() in a program that matters */
gets(s);
s[39]=0; /*Ensure the null terminator is in the right place*/
strcpy(r->name,s);
printf("Phone:");
gets(s);
s[19]=0;
strcpy(r->phone,s);
puts("--Birthday--");
printf("Year:");
gets(s);
i=atoi(s);
if(i>=1900)i-=1900;
r->dob.year=i;
printf("Month:");
gets(s);
r->dob.month=atoi(s);
printf("Day:");
r->dob.day=atoi(gets(s)); /*Yup, you can do that too*/
}/*end edit()*/
File I/O | top |
After a program quits, all of its data is lost, and you'll have to enter that data all over again if you want to use it, unless you save that data to, and load it from, a disk or other mass storage device. That's where file I/O comes in. There are several facilities for reading and writing files, and manipulating them, but we'll only be covering basic buffered file I/O here.
Before you can access a file, you must open a stream for reading, writing, or reading and writing to a file. After you are finished with that file, you must close that stream. Unlike Java, when you close a file stream, all the data is automatically written to disk, you don't have to flush the data to disk as a separate step (I don't know why it is this way-- I think the people who wrote Java had a few "magic Java beans" in their Java [coffee] mix).
Let's talk about the functions you'll be using, and then write a program that utilizes some of them.
File I/O Functions | top File IO |
FILE *fopen(const char *path, const char *mode) | top file functions |
Opens a file, and returns a pointer to that stream if successful, otherwise it returns a null pointer (zero).
path is the path to the file (including the file name), and mode describes the access mode for the file. Remember, Windows, and other Microsoft operating systems, use the backslash '\' for the directory separator. Unix systems, Mac OS X, and basically the majority of all other operating systems in existence, use the forward slash '/' (the slash below the question mark) for the directory separator (URLs on the Internet also only use the forward slash).
Here is a list of values you can use for mode:
| Mode string | Access Mode |
|---|---|
| r | Open an existing text file for reading. |
| r+ | Open an existing text file for reading and writing. |
| w+ | Create, or overwrite, a text file for reading and writing. |
| w | Create, or overwrite, a text file for writing. |
| a | Write to the end of a text file, creating it if it doesn't exist. |
| rb | Same as r, except for binary files |
| r+b | Same as r+, except for binary files |
| wb | Same as w, except for binary files |
| ab | Same as a, except for binary files |
When you open a file as a text file, as opposed to a binary file, additional translations occur for that system. In the case of Microsoft Systems, and some very early operating systems, the carriage return character is removed from the input stream on reading, and added to the output stream on writing. Unix systems require no such translations, hence the 'b' option is ignored on Unix systems.
Caution: Some operating systems have restrictions on what characters you may use in a file name (Unix has almost none: only '/' and the null character are off limits). Operating systems written by Microsoft don't allow "*?+\/" characters for use in filenames (the '+' was reserved because they wanted to use it for the "copy" command to combine two files together, a command that was hardly ever used), as well as many others. You shouldn't use "*?\/" characters in file names on any system, even if they are allowed, and you should avoid the "$&%!" characters as well.
int fclose(FILE *stream) | top file functions |
Closes a stream you opened with fopen() and returns
0 if successful. Trying to use that stream again after it has
been closed, without opening it again with fopen()
could have any number of unknown results. In other words: It's
a really bad idea to use a stream after you have closed it.
int fgetc(FILE *stream) | top file functions |
Reads a single character/byte from the stream. Returns EOF if there is nothing left to read.
char *fgets(char *s, int size, FILE *stream) | top file functions |
Reads up to size-1 bytes of a string, as covered in the Standard Input section.
size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream) | top file functions |
Reads in nmemb records, each of size bytes, from
the stream stream and stores the data into ptr.
A void pointer is a pointer of no type. This way, the
fread() function can read in data of any type, whether it
be an integer, a character string, a structure, whatever.
The return value is the number of complete records actually read in, which can be less than nmemb if the end of the file is reached or an error occurs.
size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream) | top file functions |
fwrite() is similar to fread(), but writes
data to a stream, rather than reading it.
int fputc(int c, FILE *stream) | top file functions |
Writes a single character/byte to a stream.
int fputs(const char *s, FILE *stream) | top file functions |
Writes a string to a file: everything up to, but not including, the null terminator.
void rewind(FILE *stream) | top file functions |
Moves the read/write position of a file back to the beginning of the stream.
long ftell(FILE *stream) | top file functions |
Returns the current position in the open stream.
int fseek(FILE *stream, long offset, int whence) | top file functions |
Moves the file position indicator to some other location
in the stream, based on offset and whence.
Here is a table of valid values for whence and how it
affects fseek():
| Whence | Effect |
|---|---|
| SEEK_SET | Moves the file position by offset relative to the beginning of the file |
| SEEK_CUR | Moves the file position by offset relative to the current position in the file |
| SEEK_END | Moves the file position by offset relative to the end of the file |
So, if you use the function like this:
fseek(fp,0,SEEK_END);
That would take you to the very end of the file. And if you
used fseek() like this:
fseek(fp,0,SEEK_SET);
It would move to the start of the file.
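One common use of fseek() together with ftell() is finding out how big a file is: jump to the end, then ask for the current position. The sketch below shows the idea. The function name file_size() is invented for this example, and although this trick works on the usual desktop systems, the C standard does not promise that SEEK_END works on every binary stream.

```c
#include <stdio.h>

/* Hypothetical helper: returns the size of a file in bytes,
   or -1L if the file can't be opened or searched. */
long file_size(const char *path)
{
    FILE *fp = fopen(path, "rb");
    long size;

    if (!fp)
        return -1L;

    /* fseek() returns 0 when the move worked */
    if (fseek(fp, 0L, SEEK_END) != 0) {
        fclose(fp);
        return -1L;
    }

    size = ftell(fp);   /* position at the end == number of bytes */
    fclose(fp);
    return size;
}
```

If you wanted to keep reading from the file afterwards instead of closing it, a call to rewind(fp) would put you back at the start.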
int feof(FILE *stream) | top file functions |
Returns a zero value (false) if the end of the file has
not yet been reached. Returns a non-zero value (true) if it has.
The EOF flag is only set after a read tries to go past the end
of the file. A successful call to fseek() or rewind() clears the
flag for you; you can also clear it yourself with the clearerr()
function.
void clearerr(FILE *stream) | top file functions |
Clears the error flag of the stream, as well as the EOF flag.
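A typical reading loop does not call feof() before each read. Instead, it reads until fgetc() returns EOF, and only then uses feof() to find out why the loop stopped. Here is a small sketch of that pattern; the function name count_bytes() is made up for this example.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes in a file by reading them
   one at a time. Returns -1L if the file can't be opened or if
   reading stopped for some reason other than the end of the file. */
long count_bytes(const char *path)
{
    FILE *fp = fopen(path, "rb");
    long count = 0;
    int c;     /* int, not char, so EOF can be told apart from data */

    if (!fp)
        return -1L;

    while ((c = fgetc(fp)) != EOF)
        count++;

    /* fgetc() returns EOF for the end of the file AND for errors;
       feof() tells us which one it was. */
    if (!feof(fp)) {
        fclose(fp);
        return -1L;
    }

    fclose(fp);
    return count;
}
```

For an empty file the loop never runs, the very first read sets the EOF flag, and the function returns 0.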
int fflush(FILE *stream) | top file functions |
Writes out all buffered data of an output stream, or forces an update of the stream. For files, this forces a write to disk of all data still in the buffer and not yet written to disk. For stdout and stderr, it forces a write to the screen. Returns 0 if successful.
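One place fflush() comes in handy is a log file that you want to be up to date even if the program crashes before it gets around to calling fclose(). The sketch below shows one way that might look; the name log_line() is invented for this example.

```c
#include <stdio.h>

/* Hypothetical helper: writes one line to an open stream and
   pushes it out of the buffer right away.
   Returns 0 on success, -1 on failure. */
int log_line(FILE *fp, const char *msg)
{
    if (fputs(msg, fp) == EOF || fputc('\n', fp) == EOF)
        return -1;

    /* Without this, the text could sit in the buffer until the
       buffer fills up or the stream is closed. */
    return fflush(fp) == 0 ? 0 : -1;
}
```

The same idea applies to prompts on the screen: if you print a question without a newline at the end, calling fflush(stdout) makes sure the user sees it before the program waits for an answer.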
Improved Phone Book Program | top file IO |
Now that we know how to read and write to files, let's improve our phone book program to be able to read and write to a file. Make a copy of phonebook.c, name it phonebook2.c, and get ready to make a few changes.
You'll want to change the initial comment in phonebook2.c to read something like:
/*******************
phonebook2.c -- an improved phone book program, with file support.
********************/
We're going to need two new functions to load and save the database, so add these two function prototypes with the others in your program:
void savebook();
void readbook();
Next, we will want a way for the user to use these new functions, so add two more lines to our display code right between "(L)ist entries (brief)" and "(Q)uit" as shown below:
puts("(S)ave database");
puts("(R)ead database");
Now we need to add a few lines to the switch()
statement code to recognize these new options. Adding them
right after the line containing switch(cmd){ is a good
place.
case 'S':
case 's': savebook(); break;
case 'R':
case 'r': readbook(); break;
Now that the user interface is finished, let's write our
file access functions. Remember when we talked about the
sizeof keyword? We're going to need to use it here.
A sizeof expression is effectively replaced by the size, in bytes,
of whatever data type comes after it. A type name must be put in
parentheses, and because the result is an unsigned type called
size_t, we cast it to int to print it with %d. So, the statement:
printf("%d\n",(int)sizeof(char));
Should print 1 on your screen, because a char is one byte in length. However:
printf("%d\n",(int)sizeof(int));
May print 2, 4, even 8 or more, depending on your system and compiler.
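To make this a bit more concrete, here is a short sketch that uses sizeof on a structure and on an array. The struct Entry type and both function names are made up for this example; your real phone book record will have its own layout and its own size.

```c
#include <stdio.h>

struct Entry {              /* hypothetical record, for illustration */
    char name[32];
    char phone[16];
};

/* Bytes needed to hold count records. fread() and fwrite() do this
   multiplication for you when given a size and a record count. */
size_t bytes_for(size_t count)
{
    return count * sizeof(struct Entry);
}

/* Prints a few sizes. The (int) casts let us use plain %d. */
void show_sizes(void)
{
    struct Entry book[10];

    printf("char:  %d\n", (int)sizeof(char));
    printf("int:   %d\n", (int)sizeof(int));
    printf("Entry: %d\n", (int)sizeof(struct Entry));

    /* Number of elements in an array: total size / size of one. */
    printf("slots: %d\n", (int)(sizeof book / sizeof book[0]));
}
```

Notice that sizeof works on variables as well as types. With a variable, like book above, the parentheses are optional.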
Now let's write our file access functions:
/****************
** readbook() **
****************/
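void readbook(){
    FILE *fp;
    char name[256];
    printf("File to read:");
    /*fgets() can't overflow name the way gets() can;
      we leave 4 bytes free for the extension*/
    if(!fgets(name,sizeof(name)-4,stdin))return;
    name[strcspn(name,"\n")]='\0'; /*remove the newline*/
    /*add our custom extension*/
    strcat(name,".phn");
    fp=fopen(name,"rb");
    if(!fp){
        fprintf(stderr,"ERROR: Can't open %s\a\n",name);
        return;
    }
    /*read up to ENTRIES records in one go*/
    fread(records,sizeof (struct Record),ENTRIES,fp);
    fclose(fp);
}/*end readbook()*/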