Copyright (c) 1998-1999 by Tom Torfs
(tomtorfs@pandora.be (preferred),
tomtorfs@mail.dma.be,
2:292/516 (Fidonet))
Current document size: HTML: approx. 230 KB
text: approx. 202 KB
This document, in both HTML and text format, may be distributed freely. Modifications made by anyone but the author must be clearly marked as such and a reference must be provided as to where the reader can obtain the original, unmodified version. This copyright information may not be removed or modified. It will apply only to the original text, not to any text added by others.
The author cannot be held responsible for any damage caused directly or indirectly by the use of information in this document.
Any suggestions for additions, improvements and/or error corrections are welcome at any of the addresses mentioned above.
All the source code in this document should be distributed together with it as plain text files. If you don't have them, you can obtain them, and the latest version of this document, at:
http://members.xoom.com/tomtorfs/c.html
1. Overview and rationale
2. Hello, world!
   2.1. The Hello, world! program
   2.2. Comments
   2.3. #include
   2.4. Functions and types
   2.5. return
3. Using variables
   3.1. The 42 program
   3.2. Defining variables
   3.3. printf()
4. Doing calculations
   4.1. The calc program
   4.2. Operators and typecasts
   4.3. Using the functions from math.h
5. Enums and typedefs
   5.1. The colours program
   5.2. Defining and using enums and typedefs
6. Structs, unions and bit-fields
   6.1. The cars program
   6.2. Using structs, unions and bit-fields
7. Conditionals (if/else)
   7.1. The yesno program
   7.2. Obtaining random numbers
   7.3. Using if/else
   7.4. Using the ?: operator
8. switch/case/default
   8.1. The animals program
   8.2. Using the switch() statement
9. Loops (do,while,for)
   9.1. The loops program
   9.2. Using loops
   9.3. Using break,continue,goto
10. Arrays, strings and pointers
   10.1. The strings program
   10.2. Arrays and strings
   10.3. The pointers program
   10.4. Addresses and pointers
   10.5. Example: using command line arguments
11. Using files
   11.1. The fileio program
   11.2. Using disk files
   11.3. The interact program
   11.4. Using the standard I/O streams
12. Dynamic memory allocation
   12.1. The persons program
   12.2. Dynamically allocating memory
13. Preprocessor macros/conditionals
   13.1. The preproc program
   13.2. Using preprocessor macros/conditionals
   13.3. Using the assert() macro
14. Variable argument functions
   14.1. The vararg program
   14.2. Using variable argument functions
   14.3. Example: a printf() encapsulation
15. Modular approach
   15.1. The module program
   15.2. Using different modules
16. Overview of the standard library
   16.1. Introduction to the standard library
   16.2. assert.h (diagnostics)
   16.3. ctype.h (character handling)
   16.4. errno.h (error handling)
   16.5. float.h (floating-point limits)
   16.6. limits.h (implementation limits)
   16.7. locale.h (localization)
   16.8. math.h (mathematics)
   16.9. setjmp.h (non-local jumps)
   16.10. signal.h (signal handling)
   16.11. stdarg.h (variable arguments)
   16.12. stddef.h (standard type/macro definitions)
   16.13. stdio.h (input/output)
   16.14. stdlib.h (general utilities)
   16.15. string.h (string handling)
   16.16. time.h (date and time)
17. Epilogue
   17.1. Credits
   17.2. Other interesting C-related online material
This document is intended to give people who are interested in learning C, whether they already know another programming language or not, a quick introduction to the language. It does not pretend to be complete, but it should get you familiar with most concepts of the C language. It is not intended to replace a good introductory book on the subject; on the contrary, it is probably best used together with such a book; if you haven't programmed before some of the explanations in this introduction may be a bit short and getting more detailed information from a book would often be useful.
This introduction discusses the standard C language as defined by the International Organization for Standardization (ISO) [*], also commonly referred to as "ANSI C" (the American National Standards Institute had standardized the language before ISO). It does not discuss platform/compiler-specific extensions such as accessing a screen in either text or graphics mode, interacting with external devices, direct access to memory, using operating system features, etc.
[*] ANSI/ISO 9899-1990 aka C89/C90; normative addenda are not discussed here; C9X proposed features may be referred to occasionally; the implementation is assumed to be hosted
Rationale
It has been - rightfully - reported that the material in this tutorial
may not be very easy reading for a complete beginner to the C language,
mainly because I try to be as complete as possible even in early stages
of this tutorial.
This has two reasons: first, this tutorial can also be useful as reference
material, e.g. to look up what printf() specifier can be used for what
purpose, etc.
The second, more fundamental reason is that the principal objective
of this tutorial was to prefer accuracy over simplicity. In other words,
I feel there are already enough tutorials and introductory books that
don't adhere to the standard in several cases to keep things simple.
This has the consequence that some people who learned C from such a
tutorial/book will have a hard time adjusting their programming habits
to write conforming code whenever possible. Since this tutorial is
partly intended as a "cure" for those people, it requires some
degree of accuracy which also means that it may not read very
fluently if you're completely new to C. This only confirms the
above statement that this tutorial is not meant to be used on
its own, but rather as a complement to a good introductory book.
When reading this tutorial, it is OK to skip certain descriptions and listings which you feel are not required yet. Examples may include the complete list of printf() specifiers, standard types, etc.
/* prog2-1.c: Hello, world! */
#include <stdio.h>
int main(void)
{
   puts("Hello, world!");
   return 0;
}
If you compile and run this small program [*], you'll see that it writes
the message Hello, world! onto your screen (or printer, or whatever output
device you're using) and then ends the program successfully.
[*] see your compiler documentation / online help for instructions
Also, keep in mind that you are free to choose your own formatting style. E.g. the above could have been written as: (the reasons it hasn't should be obvious)
/* prog2-1.c: Hello, world! */
#include <stdio.h>
int main(void) {puts("Hello, world!"); return 0;}
/* prog2-1.c: Hello, world! */
Everything that is inside /* and */ is considered a comment and will be ignored by the compiler. You must not include comments within other comments, so something like this is not allowed: [*]
/* this is a /* comment */ inside a comment, which is wrong! */
[*] your compiler may allow this, but it's not standard C
You may also encounter comments after //. These are C++ comments and while
they will be allowed in the next revision of the C standard, you shouldn't
use them yet. They are more limited than the flexible /* */ comments anyway.
2.3. #include
#include <stdio.h>
In C, all lines that begin with # are directives for the preprocessor, which means that all these directives will be processed before the code is actually compiled. One of the most commonly used preprocessor directives is #include. It has two forms. The first form is:
#include <header>
This form should only be used to include a standard header, such as stdio.h. This gives you access to all the goodies in the header. In this program we are going to use puts() (see below), and to be able to do that we need to [*] include the standard header stdio.h (which stands for standard input/output header).
[*] in some cases you may get away with not including the right standard header, but in other cases it may cause nasty things to happen (ranging from a simply malfunctioning program to a core dump or even a system crash: this is called undefined behaviour, in other words anything might happen); basically you should always include the right headers.
The following standard headers are available: (see 16. Overview of the standard library for a detailed discussion of them)
Commonly used headers:
stdio.h    input/output
stdlib.h   general utilities
string.h   string handling
time.h     date and time
math.h     mathematics
ctype.h    character handling
Standard definitions:
stddef.h   standard type/macro definitions
limits.h   implementation limits
float.h    floating-point limits
errno.h    error handling
assert.h   diagnostics
More advanced headers:
stdarg.h   variable arguments
setjmp.h   non-local jumps
signal.h   signal handling
locale.h   localization
The second form is:
#include "file"
This directive will be replaced with the contents of file. Usually this form is used for non-standard headers that are part of your program (see 15. Modular approach).
Now onto the next line: (the { and } braces belong to this line as well)
int main(void)
{
}
This is a function definition. A function is a part of a program that can be
called by other parts of the program. A function definition always has the
following form:
type name(parameters)
{
/* the function code goes here */
}
The function will return a value of type 'type' to the caller. C supports the
following types: (note that the ranges indicated are minimum ranges, they may
(and will!) be larger so you shouldn't rely on them having a certain size)
Integer types: (non-fractional numbers)
signed char      minimum range: -127..+127
unsigned char    minimum range: 0..255
signed short     minimum range: -32767..+32767
unsigned short   minimum range: 0..65535
signed int       minimum range: -32767..+32767
unsigned int     minimum range: 0..65535
signed long      minimum range: -2147483647..+2147483647
unsigned long    minimum range: 0..4294967295
The type char may be equivalent to either signed char or unsigned char (that depends on your compiler), but it is always a separate type from either of these. Also notice that in C there is no difference between storing characters or their corresponding numerical values in a variable, so there is also no need for a function to convert between a character and its numerical value or vice versa (this is different from languages like Pascal or BASIC).
Floating point types: (fractional numbers)
float         minimum range: +/- 1E-37..1E+37   minimum precision: 6 digits
double        minimum range: +/- 1E-37..1E+37   minimum precision: 10 digits
long double   minimum range: +/- 1E-37..1E+37   minimum precision: 10 digits
With several compilers double and long double are equivalent. That, combined with the fact that most standard mathematical functions work with type double, is a good reason to always use the type double if you have to work with fractional numbers.
Special-purpose types: (don't worry too much about these yet)
Commonly used ones:
size_t         unsigned type used for storing the sizes of objects in bytes
time_t         used to store results of the time() function
clock_t        used to store results of the clock() function
FILE           used for accessing a stream (usually a file or device)
Less commonly used ones:
ptrdiff_t      signed type of the difference between 2 pointers
div_t          used to store results of the div() function
ldiv_t         used to store results of the ldiv() function
fpos_t         used to hold file position information
More advanced ones:
va_list        used in variable argument handling
wchar_t        wide character type (used for extended character sets)
sig_atomic_t   used in signal handlers
jmp_buf        used for non-local jumps
If the function does not return anything you can use the pseudo-type void as the return value type (this is not allowed for main(), see below).
The function name follows the rules for all names (formally: identifiers): it must consist entirely of letters (uppercase and lowercase are different! [*]), digits and underscores (_), but may not begin with a digit.
[*] during linking, case differences in external names might be ignored; so it's often not a good idea to use 2 variable/function names that only differ in their case (this would also reduce clarity)
The parameter list may be void in which case the function won't take any parameters. Otherwise it should be a list of the following form:
type1 name1, type2 name2, type3 name3 etc.
The possible types are the same as for the return value, and the names follow the same rules as those for the function name. An example:
double myfunction(int foo, long bar)
{
}
This is a function called 'myfunction' that accepts a parameter of type int and
a parameter of type long and returns a value of type double.
The parameters' values will be filled in by the callers. The above function might be called as follows:
myfunction(7,10000);
In this case the return value is ignored. If it's needed we may assign it to a variable as follows: (see 3. Using variables)
somevariable = myfunction(7,10000);
A function that takes no parameters (which has void as its parameter list) would be called as follows:
myfunction();
Notice that unlike in e.g. Pascal you can't do something like: [*]
myfunction; /* wrong! */
[*] This will probably compile, but it does something entirely different than you think (basically, it does nothing).
Here's an example of a situation where using a function provides advantages, apart from the readability advantage that using a function instead of lumping all code together always provides.
do_something();
check_error_condition();
do_something_else();
check_error_condition();
etc.
Suppose the do_something() and do_something_else() functions perform some useful tasks. However, it is possible that an error occurs while performing these tasks, and if that happens these functions set some sort of error condition to a specific value. By looking at this value, the calling code can then print an appropriate error message and/or take appropriate actions. But if the code to do this had to be copied everywhere a function like do_something() is used, the program would get enormously bloated, not to mention the fact that if something has to be changed in this error handling code, it may have to be changed in hundreds of places. Putting that code in a single function such as check_error_condition() avoids both problems.
main() is a very special function: it is the function that is called at the beginning of our program's execution. Because it's so special we are not free to choose its return value or parameters. Only two forms are allowed:
int main(void)
{
}
Which you should use if you don't need access to command line arguments.
int main(int argc, char *argv[])
{
}
Which allows access to the command line arguments
(see 10.5. Example: using command line arguments).
In some source code you may find other definitions of main(), such as returning void or taking 3 parameters etc. These may work on certain compilers but they are not standard so you should avoid them.
The actual function code (called the function body) goes inside the { and } braces. In this case the first line of the function body is:
puts("Hello, world!");
The puts() function is declared in the standard header stdio.h as follows:
int puts(const char *s);
This is not a function definition, because it is not followed by { and } braces but by a semicolon, which in C means 'end of statement'. Instead, it is a function declaration (also known as 'prototype') which informs the compiler of the return value and parameter types of this function. The actual function definition can be found somewhere in your compiler's libraries, but you needn't worry about that because the declaration provides all the information you need in order to be able to call this function.
The return value of the puts() function is an int, and its meaning is rather straightforward: it returns EOF (which is a negative value defined in stdio.h) if an error occurs, or a nonnegative value if it's successful. We ignore this return value in our small program, but in 'real' code it's often a good idea to check these return values, and if an error occurs deal with it properly.
But the parameter, that's something else. To understand it, we must first know what puts() does: it writes a string (= text) to the standard output device (usually the screen, but it may be another device or a file) and then moves on to the next line. So obviously this 'const char *s' parameter must accept the string we want to print.
If you look back to the types listed above, you will notice that there is no string type among them. That is because C doesn't have a special string type. A string in C is just a series of chars, and the end of the string is indicated by a char with value 0. Such a series is called an array. But instead of passing the array to the puts() function, a pointer to it is passed. The pointer is indicated by the * and the const means that the puts() function will not modify the string we pass it. If this seems like Chinese for now (provided you don't live in China, of course), don't worry: this will be more thoroughly explained later in this document (see 10. Arrays, strings and pointers). All you need to know for now is that the above declaration means that puts() will take a string as parameter, and that that string will not be modified.
In our program, we pass puts() the parameter "Hello, world!". The double quotes indicate that this is a string literal (which basically comes down to a string that may not be modified [*]). Since puts() accepts a string that it will not modify, this works out perfectly. If we had attempted to call puts() with e.g. a number as parameter, the compiler would most likely have complained.
[*] Attempting to modify a string literal will cause undefined behaviour. Some compilers/operating systems may allow string literals to be modified, but this will fail miserably on others (or even the same ones with different compiler options set), so should definitely never be done.
You must be careful to use double quotes ("), which indicate a string literal, because single quotes (') indicate a character literal (e.g. 'A') and these are not interchangeable. [*]
[*] Another peculiarity is that character literals in C are not of type char, but of type int. This is different from C++.
String and character literals may contain special escape codes to perform some implementation-defined actions:
\a alert (causes an audible or visual alert when printed)
\b backspace
\f formfeed (may clear the screen on some systems, don't rely on this)
\n newline (goes to the beginning of the next line; will be automatically
translated to/from \r\n or another (if any) end-of-line
sequence on systems where this is necessary)
\r carriage return (goes to the beginning of the current line)
\t horizontal tab
\v vertical tab
\\ backslash (be careful when using backslashes in literals!)
\' single quote (useful inside character literals)
\" double quote (useful inside string literals)
\? question mark (useful to avoid unwanted trigraph translation when two
question marks are followed by certain characters)
\<octal digits> character value in octal
\x<hex digits> character value in hexadecimal
The next, and final, line of our small program is:
return 0;
return exits a function and unless the function has return type void it must be followed by a value corresponding to its return type.
As we mentioned above, main() always returns an int value. In the case of main(), this value indicates to the operating system whether the program ended successfully or because of an error. Success is indicated by returning the value 0. Also, the constants EXIT_SUCCESS and EXIT_FAILURE may be used; they are defined in the standard header stdlib.h. [*]
[*] Usually, other return values than 0, EXIT_SUCCESS or EXIT_FAILURE are allowed as well for the return value of main(), but this is not standard so you should avoid it unless you have a specific reason to do so, provided you're aware this may restrict the range of operating systems your program will work on.
Note that the above is specific to the main() function; other functions that you write are of course not subject to these restrictions, and may take any parameters and return any value they like. [*]
[*] Some other functions, like the compare function for qsort() or the function passed to atexit() have to have a certain form as well, but those few exceptions are of a more advanced nature.
/* prog3-1.c: 42 */
#include <stdio.h>
int main(void)
{
   int answer = 42;
   printf("And the answer is: %d\n", answer);
   return 0;
}
When compiled and run, this program should write the following onto
your output device:
And the answer is: 42
int answer = 42;
This line defines a variable named 'answer' of type int and initializes it with the value 42. This might also have been written as:
int answer; /* define uninitialized variable 'answer' */
/* and after all variable definitions: */
answer = 42; /* assigns value 42 to variable 'answer' */
Variables may be defined at the start of a block (a block is the piece of code between the braces { and }), usually this is at the start of a function body, but it may also be at the start of another type of block (unlike in C++ where variables may be defined anywhere in the code).
Variables that are defined at the beginning of a block default to the 'auto' status. This means that they only exist during the execution of the block (in our case the function). When the function execution begins, the variables will be created but their contents will be undefined (unless they're explicitly initialized like in our example). When the function returns, the variables will be destroyed. The definition could also have been written as:
auto int answer = 42;
Since the definition with or without the auto keyword is completely equivalent, the auto keyword is obviously rather redundant.
However, sometimes this is not what you want. Suppose you want a function to keep count of how many times it is called. If the variable would be destroyed every time the function returns, this would not be possible. Therefore it is possible to give the variable what is called static duration, which means it will stay intact during the whole execution of the program. For example:
static int answer = 42;
This initializes the variable answer to 42 at the beginning of the program execution. From then on the value will remain untouched; the variable will not be re-initialized if the function is called multiple times!
Sometimes it is not sufficient that the variable be accessible from one function only (and it might not be convenient to pass the value via a parameter to all other functions that need it), but you need access to the variable from all the functions in the entire source file [*] (but not from other source files).
[*] The term the standard uses is "translation unit". On most implementations such a translation unit is simply a text file containing source code, often marked with a ".c" extension or something similar. Therefore in the rest of this document the term "source file" will be used where "translation unit" is meant.
This can also be done with the static keyword (which may be a bit confusing), but by putting the definition outside all functions. For example:
#include <stdio.h>
static int answer = 42; /* will be accessible from entire source file */
int main(void)
{
   printf("And the answer is: %d\n", answer);
   return 0;
}
And there are also cases where a variable needs to be accessible from the entire program, which may consist of several source files (see 15. Modular approach). This is called a global variable and should be avoided when it is not required. This is also done by putting the definition outside all functions, but without using the static keyword:
#include <stdio.h>
int answer = 42; /* will be accessible from entire program! */
int main(void)
{
   printf("And the answer is: %d\n", answer);
   return 0;
}
There is also the extern keyword, which is used for accessing global variables in other modules. This is explained in 15. Modular approach.
There are also a few qualifiers that you can add to variable definitions. The most important of them is const. A variable that is defined as const may not be modified (it will take up space, however, unlike some of the constants in e.g. Pascal; see 13. Preprocessor macros/conditionals if you need an equivalent for those). For example:
#include <stdio.h>
int main(void)
{
   const int value = 42; /* constant, initialized integer variable */
   value = 100; /* wrong! - will cause compiler error */
   return 0;
}
Then there are two more modifiers that are less commonly used:
The volatile modifier requires the compiler to actually access the variable every time it is read; in other words, it may not optimize the variable by putting it in a register or so. This is mainly used for multithreading and interrupt processing purposes etc., things you needn't worry about yet.
The register modifier requests the compiler to optimize the variable
into a register. This is only possible with auto variables and in many
cases the compiler can better select the variables to optimize into
registers itself, so this keyword is obsolescent. The only direct
consequence of making a variable register is that its address cannot
be taken (see 10.4. Addresses and pointers).
3.3. printf()
Then there is the line:
printf("And the answer is: %d\n", answer);
printf(), like puts(), is a function that is declared in stdio.h. It
also prints text to the standard output device (often the screen).
However, there are two important differences:
1) While puts() automatically adds a newline (\n) to the text, this must be done explicitly when using printf() [*]
[*] If the newline is omitted, the output may not show up. If you really want no newline, you must add fflush(stdout); afterwards. (see 11.4. Using the standard I/O streams)
2) While puts() can only print a fixed string, printf() can also be used to print variables etc. To be able to do that, printf() accepts a variable number of parameters (later we'll discuss how to write such functions yourself, see 14. Variable argument functions). The first parameter is the format string. In this string % signs have a special meaning (so if you need a % sign to be printed using printf() you must double it, i.e. use %%): they are placeholders for variable values. These variables are passed as parameters after the format string. For every variable passed, there must be a corresponding format specifier (% placeholder) in the format string, of the right type and in the right order. These are the possible format specifiers:
%d signed int variable, decimal representation (equivalent to %i)
%u unsigned int variable, decimal representation
%x unsigned int variable, lowercase hexadecimal representation
%X unsigned int variable, uppercase hexadecimal representation
%o unsigned int variable, octal representation
(there are no format specifiers for binary representation)
%f float/double, normal notation
%e float/double, exponential notation (%E uses E instead of e)
%g float/double, notation %f or %e chosen depending on value (%E if %G)
%c character (passed as int), text representation
%s string (see 10. Arrays, strings and pointers)
%p pointer (see 10. Arrays, strings and pointers)
%n number of characters written up to now will be written to int
that the corresponding argument points to
You can change the type of the printed variable by inserting one of
the following characters between the % sign and the type character
(for example: %ld for long int instead of an int).
h for d,i,o,u,x,X: short int instead of int
(the short int will be promoted to int when passed anyway)
for n: store result in short int instead of int
l for d,i,o,u,x,X: long int instead of int
for n: store result in long int instead of int
Do NOT use l with e,E,f,F,g,G, e.g. for printing doubles.
L for e,E,f,F,g,G: long double instead of float/double
There are some flags and modifiers that can be put between the % and the
type character:
- left alignment, pad on the right with spaces (default=right alignment)
+ print plus sign if positive (default=only print minus sign if negative)
(for signed numbers only)
space print space if positive (default=only print minus sign if negative)
(for signed numbers only)
0 pad with zeros instead of with spaces (for numbers only)
# "alternate form": - o: 0 will be prepended to a non-zero result
- x/X: prepends 0x/0X to result
- f/F,e/E,g/G: decimal point even if no decimals
- g/G: trailing zeros are not removed
<nonzero decimal value> specify field width to which result will be padded
(this can be used together with the 0 flag)
* field width will be passed as int parameter before the actual argument
.<nonzero decimal value> specify precision (default for f/F,e/E = 6)
(for s, precision will limit the number of printed characters)
.0 no decimal point is printed for f/F,e/E
.* precision will be passed as int parameter before the actual argument
Here's an example:
printf("Record %lX: name = %s, age = %d, hourly wage = %.3f\n",
record_num, name, age, hourly_wage);
This would print the hexadecimal record number, the name, age, and hourly
wage with 3 digits precision, provided record_num is of type unsigned long,
name is a string (see 10. Arrays, strings and pointers), age is
an integer (a smaller type would also be OK since it would automatically be
promoted to an integer when passed to printf()) and hourly_wage is a
double (or float, for the same reason).
Now you should understand how this program works, so we can move on to new territory...
/* prog4-1.c: calc */
#include <stdio.h>
#include <math.h>
int main(void)
{
   double a, pi;
   int b;
   a = 500 / 40; /* Fraction is thrown away (integer math) */
   printf("a = %.3f\n", a);
   a = 500.0 / 40; /* Fraction is NOT thrown away (floating point math) */
   printf("a = %.3f\n", a);
   a++;
   printf("a now = %.3f\n", a);
   b = (int) a;
   b ^= 5;
   printf("b = %d\n", b);
   pi = 4 * atan(1.0);
   printf("pi ~= %.10f\n", pi);
   return 0;
}
The output of this program is: [*]
a = 12.000
a = 12.500
a now = 13.500
b = 8
pi ~= 3.1415926536
[*] If you get "undefined symbol" or similar errors, you have to
enable floating-point support in your compiler (see manual / online help)
4.2. Operators and typecasts
The first lines of the body of main() should be obvious:
double a, pi;
int b;
Two uninitialized double-precision floating-point variables, called a and pi, and an uninitialized integer variable called b are defined. The next line contains something new:
a = 500 / 40; /* Fraction is thrown away (integer math) */
We've encountered the assignment operator = before. The / is the division operator. So what this code does is divide the value 500 by the value 40 [*] and store the result in the variable a.
[*] Because both operands are constants, the compiler will most likely optimize this by calculating the result at compile-time.
The following operators are available (in order of precedence):
Precedence Operator Explanation
1. Highest () Function call
(see 2.4. Functions and types)
[] Array subscript
(see 10. Arrays, strings and pointers)
-> Indirect member selector
(see 6. Structs, unions and bit-fields)
. Direct member selector
(see 6. Structs, unions and bit-fields)
2. Unary ! Logical negation: nonzero value -> 0, zero -> 1
~ Bitwise complement: all bits inverted
+ Unary plus
- Unary minus
++ Pre/post-increment (see below)
-- Pre/post-decrement (see below)
& Address
(see 10. Arrays, strings and pointers)
* Indirection
(see 10. Arrays, strings and pointers)
sizeof Returns size of operand in bytes; two forms:
1) sizeof(type)
2) sizeof expression
3. Multi- * Multiply
plicative / Divide
% Remainder (only works for integers)
4. Additive + Binary plus
- Binary minus
5. Shift << Shift bits left, e.g. 5 << 1 = 10
>> Shift bits right, e.g. 6 >> 1 = 3
6. Relational < Less than
<= Less than or equal to
> Greater than
>= Greater than or equal to (not =>)
7. Equality == Equal to (not =)
!= Not equal to
8. Bitwise AND & Bitwise AND, e.g. 5 & 3 == 1
9. Bitwise XOR ^ Bitwise XOR, e.g. 5 ^ 3 == 6
10. Bitwise OR | Bitwise OR, e.g. 5 | 3 == 7
11. Logical AND && Logical AND
12. Logical OR || Logical OR
13. Conditional ?: Conditional operator
(see 7.4 Using the ?: operator)
14. Assignment = Simple assignment
*= Assign product (see below)
/= Assign quotient
%= Assign remainder
+= Assign sum
-= Assign difference
&= Assign bitwise AND
^= Assign bitwise XOR
|= Assign bitwise OR
<<= Assign left shift
>>= Assign right shift
15. Comma , Separates expressions
The precedence can be overruled by using parentheses. E.g. 5+4*3 would
give 17, while (5+4)*3 would give 27.
The unary (2), conditional (13) and assignment (14) operators associate from right to left, the others from left to right. E.g. 4/2*3 groups as (4/2)*3, not 4/(2*3), but e.g. a = b = 7 groups as a = (b = 7), so you can assign a value to multiple variables in one line, but this reduces the clarity, at least in my opinion.
The left operand of the assignment operators must be what is called an "lvalue", in other words something to which a value can be assigned. Of course you can't put a constant or a calculated value such as a+1 on the left side of an assignment operator, because they are not variables that have a location in which a value can be stored. For the same reason it cannot be, for example, the name of an array or string (see 10. Arrays, strings and pointers).
Don't worry if you're a bit overwhelmed by this list. It's normal if you don't understand most of them yet. The more advanced ones will be explained later in this document (see the forward references), and the prefix/postfix and special assignment operators are explained below. The mathematical operators should be obvious, and the bitwise operators too if you know something about binary arithmetic. Just watch out for the precedence of the bitwise operators: they have higher precedence than the logical operators but lower precedence than the relational and equality operators. For example:
a!=b&c
You may think the above expression performs a bitwise AND of b and c, and then compares this value to a for inequality. However, since the bitwise & operator has lower precedence than the equality operators, in fact a is compared to b for inequality, and the result of this comparison (1 or 0, see below) is then bitwise ANDed with c.
The logical AND/OR, relational and equality operators are usually used in conditions, such as those in an if statement (see 7. Conditionals (if/else)). The relational and equality operators return 1 if the relationship or equality is true, 0 if it isn't. The logical AND/OR operators have the following logic:
logical AND: (&&)
left operand   right operand   result
zero           zero            0
zero           nonzero         0
nonzero        zero            0
nonzero        nonzero         1
logical OR: (||)
left operand   right operand   result
zero           zero            0
zero           nonzero         1
nonzero        zero            1
nonzero        nonzero         1
Onto the next line of code:
printf("a = %.1f\n", a);
This line outputs the value of a, which is 12.0. But 500/40 is 12.5, and since a double is used one could reasonably expect the decimal digits to be correct. The reason for the wrong result is that the constants 500 and 40 are both interpreted as integer values by the compiler. At least one of the constants should be explicitly made a floating-point constant by writing it as e.g. 500.0. That's what's done in the next line of code:
a = 500.0 / 40; /* Fraction is NOT thrown away (floating point math) */
printf("a = %.1f\n", a);
And this time the printf() will print the correct result.
a++;
This is an example of the postfix increment operator. This is a special operator because it has the side effect of actually changing the value of the variable. The ++ (double plus) operator adds 1 to the value, the -- (double minus) operator subtracts 1 from the value. Whether the operator is placed before (prefix) or after (postfix) the variable determines whether the value used in the expression is the one after or before the modification, respectively. An example will probably make this clear: (assume a and b are integer variables)
a = 5;
b = a++; /* b will now be 5, a will be 6 */
a = 5;
b = ++a; /* b will now be 6, a will be 6 */
You can't use something like:
a = a++; /* wrong! use a++; or a = a + 1; instead */
This is because the code changes the variable a twice in the same line: in the assignment, and with the ++ operator. The result of such code is undefined. This has something to do with the so-called sequence points. For more information, see question 3.8 in the c.l.c FAQ; the URL can be found in 17.2. Other interesting C-related online material.
However, in our example program we don't use the result of the expression, so the only effect is that a will be incremented by one, and will contain the value 13.5.
b = (int)a;
This line assigns the value of a to b. But there is something special: the int between parentheses. This is called a typecast (or conversion). It basically tells the compiler "turn this double value into an integer value" [*]. If you simply wrote b = a; the conversion would happen automatically, but the compiler would probably give a warning because an int can't hold all the information that a double can (for example, the fractional digits we just calculated will be lost). Typecasting can be useful, but also dangerous, so you should avoid it when you can.
[*] Note that a typecast does NOT reinterpret the bit pattern of a variable for the new type, but converts the value of the variable.
b ^= 5;
This line uses a special assignment operator. It is actually a shorthand notation for:
b = b ^ 5; /* bitwise XOR of b with 5 */
Such a "shorthand" assignment operator exists for most operators (see table above).
printf("b = %d\n",b);
This will output the value of b. Since the value of a that was assigned
to b was 13 (1101b), the bitwise XOR operation with 5 (0101b) results in the
value 8 (1000b) for b (you'll also need to know binary to understand this).
pi = 4 * atan(1.0);
printf("pi ~= %.10f\n", pi);
This piece of code calculates and prints an approximation for pi. It
uses a function from the standard header math.h to do this: atan(),
which calculates the arc tangent in radians. The arc tangent of 1
(written as 1.0 to make clear this is a floating-point constant)
is pi/4, so by multiplying by 4 we get an approximation of pi. There
are many useful functions like this in math.h
(see 16. Overview of the standard library).
They operate on doubles. For example, there is a pow() function in
case you were worried that there was no power operator in the above
table.
/* prog5-1.c: colours */
#include <stdio.h>
enum colours {RED, ORANGE, YELLOW, GREEN, BLUE, INDIGO, VIOLET,
NUMBER_OF_COLOURS};
typedef enum colours colour_t;
int main(void)
{
colour_t sky, forest;
printf("There are %d colours in the enum\n", NUMBER_OF_COLOURS);
sky = BLUE;
forest = GREEN;
printf("sky = %d\n", (int)sky);
printf("forest = %d\n", (int)forest);
return 0;
}
The output of this program is:
There are 7 colours in the enum
sky = 4
forest = 3
enum colours {RED, ORANGE, YELLOW, GREEN, BLUE, INDIGO, VIOLET,
NUMBER_OF_COLOURS};
This is the definition of an enumeration type (enum). Every one of
the word constants (here colour names) will be assigned a numeric value,
counting up and starting at 0 [*]. So the constants will have the
following values:
[*] You can override this default numeric value by adding an explicit =value after the constant; the counting for subsequent constants will then continue from there.
word constant        numeric value
RED                  0
ORANGE               1
YELLOW               2
GREEN                3
BLUE                 4
INDIGO               5
VIOLET               6
NUMBER_OF_COLOURS    7
NUMBER_OF_COLOURS is placed at the end of the enum because, thanks to the counting system, a constant added at the end will always correspond to the number of constants defined before it [*].
[*] Of course this will not be valid when different values are explicitly assigned to the constants as mentioned in the above note.
typedef enum colours colour_t;
If we wanted to define a variable of the enum type defined above, we would normally have to do that like this:
enum colours somevariable;
However, sometimes we may prefer to make the name "fit in" better with the rest of the types, so that we could simply use it like int etc. without the enum required. That's what the typedef keyword is for. If we add typedef before what would otherwise be a variable definition, instead of creating a variable with a certain name we are creating a new type with that name.
In our example this means that everywhere where we use colour_t this is compiled as if there was an enum colours instead. So:
colour_t sky, forest;
Is equivalent to:
enum colours sky, forest;
Onto the next line of code:
printf("There are %d colours in the enum\n", NUMBER_OF_COLOURS);
C has a rather relaxed way of dealing with types (this is called weak
typing). This is especially noticeable when it comes to enums: they are
equivalent [*] with int variables.
[*] That doesn't necessarily mean that enums and ints will be exactly the same type, but they are compatible types
So we could also have defined the variables like this:
int sky, forest;
And still have used the assignment of enum constants below.
Another consequence of this equivalence is that there is no special printf() format specifier to print an enum value. We have to use the format specifier for an integer: %d. An enum constant needn't even be cast as it is guaranteed to be of type int.
printf("There are %d colours in the enum\n", NUMBER_OF_COLOURS);
This also means that when we print an enum value, a number will be
output, not the symbolic name.
sky = BLUE;
forest = GREEN;
Here we assign values to the variables using our defined constants. These statements are equivalent to:
sky = 4;
forest = 3;
But obviously that's much less clear, and when the enum's definition is modified this code may no longer be equivalent to the above.
printf("sky = %d\n", (int)sky);
printf("forest = %d\n", (int)forest);
The values of the enum variables are printed, again as integers. [*]
However, in this case we have typecast the enum to int first.
This is because, although unlikely, the compiler may have chosen
another (larger) type for the enum itself (not the constants).
[*] If you would really like the symbolic names to be printed, you'll have to write code to do that yourself. You may want to use either an array of strings (see 10.2. Arrays and strings) or a switch() statement (see 8. switch/case/default).
/* prog6-1.c: cars */
enum brands {VOLKSWAGEN, FORD, MERCEDES, TOYOTA, PEUGEOT};
enum colours {RED, GREEN, BLUE, GREY, BLACK, WHITE};
struct car {enum brands brand;
enum colours colour;
int second_hand;
union {unsigned long mileage; int months_guarantee;} carinfo;
unsigned diesel:1, airbag:1, airconditioning:1;
};
int main(void)
{
struct car mycar = {VOLKSWAGEN, GREY, 1, 50000, 0, 0, 0};
struct car yourcar;
yourcar.brand = MERCEDES;
yourcar.colour = BLACK;
yourcar.second_hand = 0;
yourcar.carinfo.months_guarantee = 12;
yourcar.diesel = 1;
yourcar.airbag = 1;
yourcar.airconditioning = 1;
return 0;
}
This program has no output.
enum brands {VOLKSWAGEN, FORD, MERCEDES, TOYOTA, PEUGEOT};
enum colours {RED, GREEN, BLUE, GREY, BLACK, WHITE};
But the next few lines contain several new things:
struct car {enum brands brand;
enum colours colour;
int second_hand;
union {unsigned long mileage; int months_guarantee;} carinfo;
unsigned diesel:1, airbag:1, airconditioning:1;
};
OK, this is the definition of a structure type. By writing something
of this kind:
struct structure_tag {type1 member1; type2 member2; /* etc. */ };
You define a structure type which from this point on you can use
to define structure variables, e.g.: [*]
struct structure_tag mystructure; /* don't forget the struct keyword */
[*] This could be written in one line as struct structure_tag {type1 member1; type2 member2;} mystructure; (in which case the structure_tag is optional)
A structure is a collection of different variables (called members) grouped into a single type. It is usually used to describe all properties of a certain item in a single variable. In our cars example we could store information like the brand, colour etc. in a different variable for every car. But it is much clearer to concentrate all the information about one specific car into a type of its own, analogous to e.g. Pascal records. That's why a structure is used.
The first two members are familiar enums:
enum brands brand;
enum colours colour;
The next variable:
int second_hand;
Is used to store whether the car is a second hand car or a new car.
In the former case a 1 is stored in this member variable, in the
latter case a 0.
But the information that we need to store about a second hand car needn't be the same as that for a new car. For example, we only need to store the mileage information for second hand cars and guarantee information for new cars. So sometimes it may be a good idea to not have unneeded information take up space in our structures. That's why unions are available:
union {unsigned long mileage; int months_guarantee;} carinfo;
This is a union definition inside our struct definition (you can put
structs within structs etc.). The union definition looks very similar
to a struct definition. There is one very important difference though:
in a struct space is allocated for every member. In a union enough
space is allocated so that the largest member can be stored (and
therefore the other members as well). But only one member can hold
a value at a time; by assigning a value to one member of a union the
values of the other members of a union are destroyed. You can only
use the value of the union member that a value was last assigned to. [*]
[*] Many systems allow reading other members than the one last assigned to, but this is not standard.
Because of this, we have to keep track of whether the mileage member or the months_guarantee member contains a valid value. This can be determined by looking at the second_hand variable. If that is 1, the mileage member contains a valid value; if that is 0, the months_guarantee member contains a valid value.
unsigned diesel:1, airbag:1, airconditioning:1;
Sometimes member variables in a structure only need to store a few
values, often only 1 or 0 which would fit in a single bit. To avoid
having to allocate an entire int or char for every such variable,
you can use bit-fields. The above definition will allocate an unsigned
int and define three variables (diesel, airbag and airconditioning),
each a single bit wide. [*]
[*] Bit-fields should always be of type signed int or unsigned int. If you use plain int, it is implementation-defined whether they'll be signed or unsigned. Many systems support other types, but this is not standard.
OK, now we move into the body of main():
struct car mycar = {VOLKSWAGEN, GREY, 1, 50000, 0, 0, 0};
struct car yourcar;
As mentioned above, these are definitions of two struct car variables,
called mycar and yourcar. What's special is that mycar is initialized.
This is comparable to how regular variables are initialized, except
for the form of the initializer. For a struct this should be between
{ and } braces, and initializer values for the members should be
separated by commas. For a union (such as our carinfo member,
which may contain either mileage or months_guarantee information)
it is always the first member that is initialized, so in our case
the 50000 will be used to initialize mileage, not months_guarantee.
Bit-fields are initialized like normal member variables.
yourcar.brand = MERCEDES;
yourcar.colour = BLACK;
yourcar.second_hand = 0;
yourcar.carinfo.months_guarantee = 12;
yourcar.diesel = 1;
yourcar.airbag = 1;
yourcar.airconditioning = 1;
Each member of yourcar is initialized manually. A member of a structure is accessed by appending .membername to the structure variable name. The same goes for a union; that's why we have to use yourcar.carinfo.months_guarantee in the fourth line instead of simply yourcar.months_guarantee.
The reason yourcar can't be initialized in the same way as mycar is that, as mentioned above, we can only initialize the first member variable of a union in that way, not the others (in this case the months_guarantee member variable).
/* prog7-1.c: yesno */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(void)
{
int yesno;
srand((unsigned)time(NULL));
yesno = rand() % 2;
if (yesno==0)
printf("No\n");
else
printf("Yes\n");
return 0;
}
This program outputs either:
No
or
Yes
However, the oversize calculator that your computer really is cannot return really random numbers [*]. Instead, it uses something called pseudo-random numbers: they are numbers that are calculated using a mathematical algorithm, and which are therefore not really random at all, but which have the appearance of being random.
[*] Some specialized hardware may be used to measure random factors from e.g. the environment, but this falls way outside of the scope of the standard C language.
Obviously, if a program uses a certain mathematical algorithm to obtain these numbers, it will produce the very same numbers every time it runs. Not very random. That's why we must first initialize the random number generator with a value that is different nearly every time the program is run. A good value for that seems to be the current date and time.
The current date and time is returned when we call the time() function with NULL as parameter (see the section on time() in 16. Overview of the standard library if you want to know what this parameter is for). The value it returns is a time_t, and we don't really know what that is (it can be different on every system). All that interests us here is that the value will probably be different every time the program is run. [*]
[*] On some systems without a clock time() may return the same value (time_t)-1 every time, and on those this approach will of course fail.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
The standard header stdlib.h contains the declarations for the srand() and rand() functions, and the standard header time.h contains the declaration for the time() function.
srand((unsigned)time(NULL));
The srand() function initializes the random number generator. It takes an unsigned integer value as parameter. As explained above, we use the return value of time(NULL) as the initializer (often called the seed of the random number generator), which is this unknown type time_t. We explicitly typecast (see 4.2. Operators and typecasts) this type to an unsigned integer [*] to suppress possible compiler complaints (since we know what we're doing and don't really care if for example we may lose some of the time information in the conversion; all that matters is that the value should be different every time the program is run, when possible).
[*] On some rare systems this may cause an overflow error.
yesno = rand() % 2;
The rand() function returns a pseudo-random number between 0 and RAND_MAX (the value of RAND_MAX is defined in stdlib.h and may be different on every system). You can't force rand() to return random numbers in a certain range. Instead, we convert its return value so that it fits in this range. The most obvious method to do this is the one we use above: using the remainder operator. For example, rand() % 10 will always give a value in the range 0..9. [*] So in our case yesno will contain either the value 0 or 1.
[*] This method is not very good, but used here for its simplicity.
A better method (taken from the c.l.c FAQ) would be to use something like:
(int)((double)rand() / ((double)RAND_MAX + 1) * N)
For more information, see question 13.16 in the c.l.c FAQ, the URL can be
found in: 17.2. Other interesting C-related online material
7.3. Using if/else
if (yesno==0)
printf("No\n");
else
printf("Yes\n");
In the preceding code, yesno was set randomly to either 0 or 1.
This code now prints "No" when the value of yesno was 0, and
"Yes" if it was any other value (which would be 1 in this case).
The if() statement works as follows:
if (condition)
statement_that_is_executed_if_the_condition_is_true;
else
statement_that_is_executed_if_the_condition_is_false;
Note that, unlike in Pascal, there must be a semicolon after the code in the if part of the statement.
If the code in the if or else part of the statement consists of more than one statement, a block must be used:
if (condition)
{
possibly_more_than_one_statement;
}
else
still_one_statement;
When a block is used, no semicolon may be put after the closing }.
You can chain if statements as follows:
if (condition)
some_code;
else if (another_condition)
some_other_code;
else
even_different_code;
a = c + (b<4 ? 1 : 2);
Is equivalent to:
if (b<4)
a = c + 1;
else
a = c + 2;
So, the expression:
condition ? expr1 : expr2
Will give expr1 if condition is true and expr2 if condition is false.
You have to be careful with precedence: the ?: operator has very low precedence (only assignment and comma operators have lower precedence). If you're not sure, it's safest to use enough parentheses.
/* prog8-1.c: animals */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
enum animal {CAT, DOG, COW, SNAIL, SPIDER, NUMBER_OF_ANIMALS};
void animalsound(enum animal animalnumber)
{
switch(animalnumber)
{
case CAT:
printf("Miaow!\n");
break;
case DOG:
printf("Woof!\n");
break;
case COW:
printf("Moo!\n");
break;
default:
printf("Animal makes no audible sound.\n");
break;
}
}
void animalfood(enum animal animalnumber)
{
switch(animalnumber)
{
case COW:
case SNAIL:
printf("Eats plants.\n");
break;
case CAT:
case SPIDER:
printf("Eats beasts.\n");
break;
}
}
int main(void)
{
enum animal myanimal, youranimal;
srand((unsigned)time(NULL));
myanimal = rand() % NUMBER_OF_ANIMALS;
animalsound(myanimal);
animalfood(myanimal);
youranimal = rand() % NUMBER_OF_ANIMALS;
animalsound(youranimal);
animalfood(youranimal);
return 0;
}
This program's output varies. It might be something like this:
Moo!
Eats plants.
Animal makes no audible sound.
Eats beasts.
enum animal {CAT, DOG, COW, SNAIL, SPIDER, NUMBER_OF_ANIMALS};
This is yet another familiar enum. Again, NUMBER_OF_ANIMALS will
be set to the number of defined animals.
void animalsound(enum animal animalnumber)
{
}
This time we put part of our program in a separate function. This
way the same code can be used several times without having to be
completely written out every time (see also 2.4. Functions and types).
In this case, the animalsound() function will print out the corresponding sound for a given animal of our above enum type.
switch(animalnumber)
{
case CAT:
printf("Miaow!\n");
break;
case DOG:
printf("Woof!\n");
break;
case COW:
printf("Moo!\n");
break;
default:
printf("Animal makes no audible sound.\n");
break;
}
This is the new part in our program. A switch statement executes
certain code depending on the value of the variable that is switched
upon (the variable between the switch() parentheses, in this case
animalnumber). For every different possible case a "case value:" is
present, followed by the code that is to be executed in case the
switched variable is equal to the specified value.
Every case should end with a break; statement, because otherwise the code simply keeps continuing into the next case statement. This is known as fallthrough, and has both its advantages and disadvantages. Once you're used to it it shouldn't pose any problems, you just have to remember to put the break; in there. See below for a useful example of fallthrough.
A special case is the default: case. If a default: case is present, its code is executed when none of the other cases match. In our case the animals SNAIL and SPIDER don't have explicit case labels, and for them the default: case of "Animal makes no audible sound." will be executed.
The above code would be equivalent to: (apart from the fact that the animalnumber is evaluated multiple times in the if form)
if (animalnumber==CAT)
printf("Miaow!\n");
else if (animalnumber==DOG)
printf("Woof!\n");
else if (animalnumber==COW)
printf("Moo!\n");
else
printf("Animal makes no audible sound.\n");
But the switch() form is obviously clearer and easier to expand.
Another difference between the if() and switch() form is that in the if() form non-integral types like double may be used. However, floating point variables may not be able to store an exact value, so you should not compare them with the equality operator (==). Instead you should limit the absolute value of the difference between the two to a certain percentage of the values, something like the following (fabs() is declared in math.h; taking the absolute value of x as well keeps the limit positive when x is negative):
if (fabs(x-y) <= fabs(x) * 0.00001) /* x and y may be considered equal */ ;
The following function prints out the type of food the specified animal number will consume:
void animalfood(enum animal animalnumber)
{
}
It also uses a switch() statement:
switch(animalnumber)
{
case COW:
case SNAIL:
printf("Eats plants.\n");
break;
case CAT:
case SPIDER:
printf("Eats beasts.\n");
break;
}
However, there are some differences here. This is where a useful
example of fallthrough is demonstrated: the COW and CAT case labels
are not ended with a break; statement so they will continue right
into the following case labels (SNAIL and SPIDER, respectively).
Also, there is no default: statement, which means that for the animals for which no explicit case is present (such as DOG here), no code will be executed, and therefore nothing will be printed.
enum animal myanimal, youranimal;
srand((unsigned)time(NULL));
myanimal = rand() % NUMBER_OF_ANIMALS;
animalsound(myanimal);
animalfood(myanimal);
youranimal = rand() % NUMBER_OF_ANIMALS;
animalsound(youranimal);
animalfood(youranimal);
The rest of the program is equivalent to the code in the yesno program, except that there are NUMBER_OF_ANIMALS possible values instead of 2, and our own functions animalsound() and animalfood() are called to handle the generated random animal numbers.
/* prog9-1.c: loops */
#include <stdio.h>
int main(void)
{
int i;
i = 1;
while (i<4)
{
printf("%d\n",i);
i++;
}
printf("\n");
i = 1;
do
{
printf("%d\n",i);
i++;
}
while (i<4);
printf("\n");
for (i=15; i>=10; i--)
printf("%d\n",i);
return 0;
}
This program should output:
1
2
3

1
2
3

15
14
13
12
11
10
i = 1;
while (i<4)
{
}
The first line assigns 1 to the i variable. What follows is
a while() loop. A while loop has this form:
while (condition)
statement;
If the code inside the loop consists of multiple statements, a block must be used, just like with if statements:
while (condition)
{
code;
}
Be careful not to put a semicolon after the closing parenthesis
of the while(condition): it would be taken as an empty loop body.
The code will be repeatedly executed (looped) for as long as the condition is true. Obviously the code somehow needs to do something which may change this condition, or the loop will execute forever.
In our case the code in the loop body is:
printf("%d\n",i);
i++;
Which prints out the current value of i, and then increments
its value by 1. This way the condition i<4 in the while loop
will no longer be true as soon as i reaches 4. Since i
was originally set to 1, the output will become:
1
2
3
Next, a printf() call to insert a blank line:
printf("\n");
Now, the next loop:
i = 1;
do
{
printf("%d\n",i);
i++;
}
while (i<4);
This seems quite similar, and indeed its output will be
identical in this case. Then what's the difference?
Well, in a while() loop the condition is at the start,
which means that if i was, for example, 5 at the beginning,
the loop would never have executed and nothing would have
been output.
However, in a do-while() loop the condition is at the end, which means that the loop will always be executed at least once, even if the condition was false to begin with. So in the above example where i was 5 at the beginning, this loop would still have output 5 before ending.
for (i=15; i>=10; i--)
printf("%d\n",i);
This is a for loop. A for loop is a sort of shorthand form
for a loop which consists of three parts: an initialization, a condition and a modification. A for loop has the following form:
for (initialization; condition; modification)
code;
Or, if multiple statements need to be put inside the loop:
for (initialization; condition; modification)
{
code;
}
And is equivalent to: [*]
initialization;
while (condition)
{
code;
modification;
}
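[*] Not 100% equivalent, because the continue statement will work differently in the code using while() than in that using for(): in the for() form the modification is still executed after a continue, whereas in the while() form it would be skipped.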
You can put more than one expression in the initialization, condition and/or modification part by separating them by the comma operator (,).
In our case:
for (i=15; i>=10; i--)
printf("%d\n",i);
The initialization statement assigns 15 to the variable i.
The loop condition is i>=10.
The modification decrements i by 1.
So this loop will start with i=15, check that the condition i>=10 is fulfilled, print out the value 15, decrement i by 1, check the condition again, etc. until the condition is no longer fulfilled.
This will therefore produce the following output:
15
14
13
12
11
10
Which is, of course, the same as what would have been produced by the equivalent while() loop:
i = 15;
while (i>=10)
{
printf("%d\n",i);
i--;
}
Sometimes you will encounter something like this:
for (;;)
{
code;
}
Or:
while (1)
{
code;
}
These are infinite loops. They never end until they are aborted
by, for example, a break; statement. The for(;;) version will
work because when the condition is not filled in, it will default
to "always true". The while(1) version will work because the
condition 1 is always true.
The break; statement, which we have already encountered in the switch() statement (see 8.2. Using the switch() statement), works in a similar way for loop statements, such as while(), do-while() or for(). It aborts the execution of the loop without checking the condition. It's a way of saying "exit now, no questions asked". You should try to avoid this when it is not really necessary, because it may cause the program's execution flow to be difficult to follow. Also be aware that a break; statement can only exit the innermost loop or switch() block, not any surrounding ones.
The continue; statement is a slightly more 'civilised' version of the break; statement. It says "abort this round of the loop execution, jump immediately to the condition evaluation" (in a for loop, the modification part is executed first). If the condition is still fulfilled, the loop will continue executing at the next round. If it is no longer fulfilled, the loop will be aborted. For example, if you want to process all elements i,j of a matrix except those on the main diagonal line (those where i==j), you could do something like this: (assume i and j are integer variables)
for (i=0; i<10; i++)
{
for (j=0; j<10; j++)
{
if (i==j)
continue; /* skip elements on main diagonal line */
/* code to process element i,j */
}
}
The goto statement can be used to jump to anywhere in the current function [*], but it's best to use it only to jump out of blocks, never into blocks, and jumping past variable initializations is usually a bad idea.
[*] To jump between functions - if you really must! - you should use setjmp()/longjmp() (see 16. Overview of the standard library)
You have to prefix the statement you wish to jump to by a label as follows:
mylabel: printf("some statement\n");
The statement may also be an empty statement, like this:
mylabel: ;
You can then immediately jump to this statement from anywhere in the function by using:
goto mylabel;
You have to be very careful about using goto. Thoughtless use can make your code look like a bowl of spaghetti, i.e. very difficult to follow. A good use for it might be to jump out of a set of nested loops when an error condition is encountered, for example: (assume i, j and k are integer variables)
for (i=1; i<100; i++)
{
for (j=1; j<100; j++)
{
for (k=1; k<100; k++)
{
if (some_error_condition)
goto abort_for_loops;
}
}
}
abort_for_loops: ;
/* prog10-1.c: strings */
#include <stdio.h>
#include <string.h>
int main(void)
{
char s[20];
strcpy(s, "strings");
printf("s = %s\n", s);
printf("s[3] = %c\n", s[3]);
printf("s[7] = %d\n", s[7]);
printf("strlen(s) = %lu\n",(unsigned long)strlen(s));
strcat(s, " program");
printf("s now = %s\n", s);
printf("strlen(s) = %lu\n",(unsigned long)strlen(s));
return 0;
}
The output of this program is:
s = strings
s[3] = i
s[7] = 0
strlen(s) = 7
s now = strings program
strlen(s) = 15
#include <string.h>
This header file declares several useful functions for dealing with strings, such as strcpy(), strlen() and strcat() which we use in this program (see below).
As usual, at the beginning of main() we find a variable definition:
char s[20];
The new thing here is the [20]. This is a definition of an array. An array is simply a contiguous series of variables of the same type. In this case we define s to be an array of 20 chars. The number inside the brackets [ ] must be a constant. [*]
[*] Variable length arrays are proposed for the new C standard
Multi-dimensional arrays (e.g. one that has rows and columns) can be defined as, for example, int multiarray[100][50]; etc.
You can have arrays of any type of variable. However arrays of char are often used for a special purpose: strings. A string is basically just text. Some languages have direct support for a 'string' type, but C doesn't. A string is an array of char with a special feature: the end of the string is marked by a char with value 0. That means we need to allocate one additional byte to store this 0 terminator. The consequence for our example is that s is able to hold a string of at most 19 characters long, followed by the 0 terminator byte.
strcpy(s, "strings");
The string s we've defined above doesn't contain any useful contents yet. As we've seen before (see 2. Hello World!) a string literal between double quotes (") can be used as contents for the string. You might be tempted to do something like this:
s = "strings"; /* wrong! */
However, this doesn't work. C doesn't support assigning arrays (including strings) in this way. You must actually copy all the characters in the string literal into the string you wish to initialize. Fortunately there is a standard function to do this: strcpy(), declared in string.h.
To copy string1 to string2 you would write:
strcpy(string2, string1); /* notice the order of the parameters! */
When you fill in the string variable (array of char) s for string2 and the string literal "strings" for string1 (not vice versa; as we've seen before you can't write to string literals, see 2. Hello World!), we have the code from our program.
printf("s = %s\n", s);
printf("s[3] = %c\n", s[3]);
printf("s[7] = %d\n", s[7]);
The first line prints out the entire text contained in string variable s,
by using printf() format specifier %s.
The second line prints out the character (element) with index 3 in the array. Such an element is written as arrayname[indexnumber]. You have to be careful with the index numbers, though: in C all counting starts from 0. So the 1st element really is element number 0, the 2nd element is element number 1, etc. This also means that s[20] would be the 21st element of our array, and since the array is only 20 elements long, this value is out of range and therefore not valid!
In our case the string variable contains the text "strings" followed by the 0 byte. Since s[3] is the 4th element of the array, thus the 4th character of the string, an 'i' is printed (the %c printf() format specifier is used to print a character).
The third line prints s[7], which is the 8th character of the string. Since the text is only 7 characters long, the 8th character is our 0 terminator byte. That's why it's printed using %d (integer value) instead of %c (the 0 terminator usually doesn't have a character representation).
strcat(s, " program");
The strcat() function is used to append text to an already existing string variable. To append string1 to string2 you would write:
strcat(string2, string1); /* notice order of parameters */
You must make sure the string (character array) is large enough to hold all characters + the 0 terminator byte after the append!
In our case we add the string literal " program" to our already existing string "strings". The resulting string will be 7 + 8 = 15 characters long, plus the 0 terminator this requires 16 elements. Since our array is 20 elements long, there is no problem.
printf("s now = %s\n", s);
printf("strlen(s) = %lu\n",(unsigned long)strlen(s));
Finally the new text contained in s is printed. This will of course be
"strings program". The new string length is also printed. This will of
course be larger than the previous value.
10.3. The pointers program
/* prog10-2.c: pointers */
#include <stdio.h>
int main(void)
{
int a[] = {1,2,3,5,7,11,13,17,19};
int *p;
printf("a[4] = %d\n", a[4]);
printf("the address of the array = %p\n", (void *)a);
printf("the address of element nr. 0 of the array = %p\n", (void *)&a[0]);
printf("the address of element nr. 4 of the array = %p\n", (void *)&a[4]);
p = a;
printf("the value of the pointer = %p\n", (void *)p);
printf("the value the pointer points at = %d\n", *p);
p += 4;
printf("the value of the pointer now = %p\n", (void *)p);
printf("the value the pointer now points at = %d\n", *p);
return 0;
}
The output of this program may be something like the following:
(the address values and even their notations will most likely be
entirely different on your system)
a[4] = 7
the address of the array = 00030760
the address of element nr. 0 of the array = 00030760
the address of element nr. 4 of the array = 00030770
the value of the pointer = 00030760
the value the pointer points at = 1
the value of the pointer now = 00030770
the value the pointer now points at = 7
10.4. Addresses and pointers
What's a pointer ? Well, every variable has to be stored somewhere in
your computer's memory. The place where they are stored is called the
address of the variable. Such an address is some sort of position value
(its exact representation depends on your system, and shouldn't really
matter).
Such an address value can of course be stored in another variable (which
has its own, different, address). That's called a pointer. A pointer
contains the position in memory (address) of another variable, so in
effect 'points' to this variable.
Now let's analyze the program line by line and explain things on the way:
int a[] = {1,2,3,5,7,11,13,17,19};
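This is a definition of an array (see previous section). However, it's
different from the one we used in our strings example in 3 ways:
1) it is an array of int instead of an array of char
2) the array is initialized: the values between { and }, separated by commas, become the contents of its elements
3) no size is given between the [ ]; the compiler counts the initializers and makes the array just large enough to hold them (9 elements in this case)
int *p;
Another variable definition: the * indicates "is a pointer to". So we've defined p as a pointer to int, which means that p can hold the address value of an int variable. [*]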
[*] If you need a generic pointer, not to any specific type of variable, you should use a pointer to void (void *). However, you can't do the sort of calculations with this sort of pointer that we're doing in this program.
printf("a[4] = %d\n", a[4]);
This prints the value of element number 4 (the 5th element) of the array,
which is 7. Nothing new here.
printf("the address of the array = %p\n", (void *)a);
We've seen before that by using something of the form arrayname[indexnumber]
we can access an element of an array. But what happens if we simply use the
arrayname itself (in this case a) ? That will return the address of the
first element of the array (element number 0). That's also why assigning
arrays simply using the assignment operator or passing the entire contents
of an array as a parameter is not possible.
OK, so we have the address of the array's first element, which is the same value as the address of the array (the type is different; the former is a pointer to int, the latter a pointer to an array of int). Now how do we print it ? That's something special. The printf format specifier %p prints the value of a pointer (thus an address). What it prints exactly will be different on every system, but the concept will be the same.
But %p is used to print a generic pointer as seen in the note above: a void * instead of the int * that the address of the array will be. Therefore we have to typecast (see 4.2. Operators and typecasts) the pointer value (address) to a void *. [*]
[*] On many systems simply passing an int * to the printf() format specifier %p will work fine; but the standard requires this conversion.
printf("the address of element nr. 0 of the array = %p\n", (void *)&a[0]);
printf("the address of element nr. 4 of the array = %p\n", (void *)&a[4]);
These two lines are similar to the previous one: they print out an
address using the %p format specifier, after typecasting that address
value to a generic void *.
However, instead of using the arrayname a to get the address of the array's first element, an element index is used (0 and 4 respectively) between the [ ], and another operator is prepended: the address operator (&). Putting & before something takes the address of that something (when that something has an address of course; something like &5 will not work since the constant 5 doesn't have an address in memory; it is not an lvalue). So &variable takes the address of the variable, and &arrayname[indexnumber] takes the address of an element of an array.
In our case, the first line will always print the same address value as the previous one, because &a[0] (the address of element number 0, i.e. the first element of the array) will always be the same as just a (the address of the array's first element). But &a[4] (the address of element number 4, i.e. the fifth element of the array) will be a bit higher. [*]
[*] The example output printed above on my system can be explained as follows: an int is 4 bytes on my system (this may be different on yours), and my %p prints out addresses as hexadecimal values (that may also be different on yours), so 00030770 (&a[4]) is 16 (10h) bytes past 00030760 (&a[0] or just a).
p = a;
What's this ? We assign an array of int (a) to a pointer to int (p). Like we've said in our strings example, assigning an array using the = operator does not copy the actual elements of the array (that's why we needed the strcpy() function for strings). We've seen above that the arrayname a on its own returns the address of the first element of the array. So it is that address (of type pointer to int) that we assign to p, which is a pointer to int. That's why it works. The result now is that p contains the address of array a's first element; we therefore say that p now points to a.
printf("the value of the pointer = %p\n", (void *)p);
We print the address value stored in the pointer, which is of course the
address of the array.
printf("the value the pointer points at = %d\n", *p);
Ah, another operator. The * is called the dereference operator [*]. To
dereference a pointer means to take the value of the variable to which
the pointer is pointing. So *p means the value of the int that p
is pointing at (in other words, the int whose address value p contains).
[*] As seen above, this * is also used for defining pointers. You could also read the definition int *p; as "p dereferenced is int".
Since p points at the first element of the array, the value of element number 0 of the array will be printed: 1.
p += 4;
This is equivalent to p = p + 4; (see 4.2. Operators and typecasts). But how can you calculate with a pointer ? Well, the address value contained in the pointer variable can be modified using the operators + and -. However, there's one important difference to calculating with normal integral values: adding 4 to the pointer as in this example does not necessarily add 4 to the address value; instead it moves the pointer 4 elements of the type it points at further (so if your ints are 4 bytes like mine, this will add 4*4=16 to your address value). [*]
[*] You can't do this with the generic pointers to void, because the compiler doesn't know how many bytes the elements you're pointing at are.
So in this case p will now no longer point to element number 0 of the array, but to element number 4.
printf("the value of the pointer now = %p\n", (void *)p);
printf("the value the pointer now points at = %d\n", *p);
Which is verified by the output of these printf() calls.
Note: you can make pointers constants, too, just like regular variables; but watch out:
int *p;              /* p itself and what p points to may be modified */
const int *p;        /* p itself may be modified; what p points to may not */
int * const p;       /* p itself may not be modified; what p points to may */
const int * const p; /* neither p itself nor what it points to may be modified */
Something else to watch out for is this:
char *a,b;
This defines a char pointer (a) and a char (b), not 2 char pointers! On the other hand:
typedef char *char_pointer;
char_pointer a,b;
does define 2 char pointers.
There is also a special pointer value, which may be used for all pointer types, that is used to indicate an invalid pointer value. This is the NULL pointer. NULL is defined in stdio.h, stdlib.h, stddef.h and in other headers. You should use NULL only for pointer values (to indicate an invalid pointer), not for other types of variables (e.g. not for an integral 0 value). You can use a plain 0 instead of NULL as well for this purpose, but most people prefer using NULL as a stylistic hint. [*]
[*] You shouldn't worry about what pointer value NULL is exactly. At the source code level 0 will evaluate to a NULL pointer value, but this does definitely not mean that the actual NULL pointer value must be a value with all bits 0.
Pointers are also useful in simulating "call by reference", which means that a function can change the value of its parameters and these changes will be passed on to the caller as well. With normal parameters (other than arrays) this is not possible in C, because they are always passed as "call by value", which means that the function can only access a copy of the actual values, and this will have no effect on the calling code.
By using a parameter of type "pointer to type" instead of just "type" this is possible. The calling code must then insert the address operator & before the variable passed as the parameter. The function can then access and modify the value using the dereference operator (*). This is also referred to as "pass by pointer" and you should not confuse it with the call by reference that is supported in C++, Pascal etc. because they don't require the calling code to explicitly insert an address operator.
An example may shed some light on this:
void swapintegers(int a, int b)
{
int temp;
temp = a;
a = b;
b = temp;
}
You would expect this function to swap the two integer variables a and
b that it is passed as parameters. It does that, but only within the
function itself. If the caller calls this function, e.g. like this:
swapintegers(myvariable, yourvariable);
then after the call, myvariable and yourvariable will not be swapped. The solution is to do it like this:
void swapintegers(int *a, int *b)
{
int temp;
temp = *a;
*a = *b;
*b = temp;
}
And in the calling code:
swapintegers(&myvariable, &yourvariable);
10.5. Example: using command line arguments
/* prog10-3.c: command line arguments */
#include <stdio.h>
int main(int argc, char *argv[])
{
int i;
printf("There are %d command line arguments\n", argc);
for (i=0; i < argc; i++)
{
printf("Command line argument number %d = %s\n", i, argv[i]);
}
return 0;
}
The output on my system is, provided I invoke the program as
"prog11 test 123" (but this may be entirely different on yours):
There are 3 command line arguments
Command line argument number 0 = C:\CINTRO\PROG11.EXE
Command line argument number 1 = test
Command line argument number 2 = 123
The first thing that stands out is the other definition of main():
int main(int argc, char *argv[])
{
}
The first parameter (argc) is the number of command line arguments, and
the second parameter (argv) is an array (because of the []) of pointers
(because of the *) to char. So argv[indexnumber] is a pointer to char,
which in this case points to a string (array of char). Strings are
often passed as pointers to char, because just using the arrayname of
the string will give you the address of the array as a pointer to char.
This does not mean that strings (arrays of char) and pointers to char
are the same, just that in some cases they have equivalent behaviour.
This may be a bit confusing, but it is something you'll soon get familiar
with.
On most systems (not all!) argc is usually at least 1. If that is the case argv[0] is either an empty string or the filename (with or without the full path) of the executable. If argc is at least 2 there are command line arguments. In that case argv[1] is the first command line argument, argv[2] the second one, etc. until argv[argc-1].
printf("There are %d command line arguments\n", argc);
This line simply prints out the number of arguments. As noted above,
be aware that even if argc is 1 that still means there are no command
line arguments other than the program name, argv[0], if supported.
So if for example your program requires 1 command line argument,
you should check for argc==2.
int i;
/* ... */
for (i=0; i < argc; i++)
{
printf("Command line argument number %d = %s\n", i, argv[i]);
}
This for loop goes through the argv array from index 0 to argc-1 using
the integer variable i as counter. For every command line argument
the number and content is printed. Notice that if argc==0 the loop
is never executed.
Note: as said above, the definition:
char *argv[]
defines an array of pointers to char. Because an array parameter is always passed as pointer, in this case this is equivalent to:
char **argv
which is a pointer to a pointer to char, that can be used as an array of pointers to char (see 12. Dynamic memory allocation for another example of this handy similarity between pointers and arrays).
These sorts of definitions and declarations can soon become confusing, especially when pointers to functions etc. start coming into play. A handy rule to understand complex definitions is the so-called right-to-left rule. You can read all about it in rtlftrul.txt in the Snippets collection (see 17.2. Other interesting online C-related material).
11. Using files
11.1. The fileio program
/* prog11-1.c: fileio */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
char s[20];
char *p;
FILE *f;
strcpy(s,"file I/O test!");
printf("%s\n",s);
f = fopen("testfile","w");
if (f==NULL)
{
printf("Error: unable to open testfile for writing\n");
return EXIT_FAILURE;
}
fputs(s,f);
fclose(f);
strcpy(s,"overwritten");
printf("%s\n",s);
f = fopen("testfile","r");
if (f==NULL)
{
printf("Error: unable to open testfile for reading\n");
return EXIT_FAILURE;
}
if (fgets(s, sizeof s, f) != NULL)
{
p = strchr(s,'\n');
if (p!=NULL)
*p = '\0';
printf("%s\n",s);
}
fclose(f);
if (remove("testfile") != 0)
{
printf("Error: unable to remove testfile\n");
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
This program's output should be:
file I/O test!
overwritten
file I/O test!
11.2. Using disk files
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
We need stdio.h not only for printf(), but also because it declares all file I/O functions that we're going to use here.
We need stdlib.h because we're no longer going to simply use return 0; but we're going to return a success/failure status to the operating system and EXIT_SUCCESS and EXIT_FAILURE are defined in stdlib.h (see 2.5. return).
We need string.h because we're going to work with strings, and need the functions strcpy() and strchr().
char s[20];
char *p;
FILE *f;
The first line defines an array of 20 characters, which we'll use to hold a string of up to 19 characters plus 0 terminator byte.
The second line defines a pointer to char that will be used to store the return value of the strchr() function (see below).
The third line defines a pointer to FILE. Such a pointer to FILE is used by almost all the file I/O functions in C. It is also referred to as a stream. This pointer is the only form the FILE type is used as; you may not define variables of type FILE by themselves.
strcpy(s,"file I/O test!");
printf("%s\n",s);
The text "file I/O test!" is copied into s and printed.
f = fopen("testfile","w");
OK, this is new. The fopen() function opens a file and returns a
pointer to FILE that will be used by all other file I/O functions
you'll use to access this file.
The first parameter to fopen() is the filename (a string). The second parameter is also a string, and can be any of the following:
"w"    open for writing (existing file will be overwritten)
"r"    open for reading (file must already exist)
"a"    open for appending (file may or may not exist)
"w+"   open for writing and reading (existing file will be overwritten)
"r+"   open for reading and updating (file must already exist)
"a+"   open for appending and reading (file may or may not exist)
Append mode ("a") will cause all writes to go to the end of the file, regardless of any calls to fseek(). By default the files will be opened in text mode (which means that '\n' will be translated to your operating system's end-of-line indicator, other sorts of translations may be performed, EOF characters may be specially treated etc.). For binary files (files that must appear on disk exactly as you write to them) you should add a "b" after the above strings. [*]
[*] Some compilers also support a "t" for text files, but since this is non-standard and not required (since it is the default) you shouldn't use it.
if (f==NULL)
{
printf("Error: unable to open testfile for writing\n");
return EXIT_FAILURE;
}
As mentioned above, fopen() returns a pointer to FILE that will be used
by the other file I/O functions. However, in case an error occurred
(fopen() was unable to open the file for some reason) the returned
pointer to file will be NULL (the invalid pointer value, see
10.4. Addresses and pointers).
If that happens, this program prints an error message and returns to the operating system with the EXIT_FAILURE status.
fputs(s,f);
The fputs() function writes a string to a file. It is similar to the puts() function (see 2.4. Functions and types), but there are two differences:
1) it takes a second parameter: the pointer to FILE of the file to write to
2) unlike puts(), it does not automatically append a '\n' character to the output
fclose(f);
The fclose() function closes the file associated with a pointer to FILE, which has previously been opened by fopen(). After the fclose() you may not use the pointer to FILE as a parameter to any of the file I/O functions anymore (until a new file handle returned by e.g. fopen() has been stored in it, of course).
strcpy(s,"overwritten");
printf("%s\n",s);
This code copies a new text into s, thereby overwriting the previous
text, and prints this new text.
f = fopen("testfile","r");
if (f==NULL)
{
printf("Error: unable to open testfile for reading\n");
return EXIT_FAILURE;
}
This code opens the previously written "testfile" for reading. The
only difference with the above code is that the "w" has been
replaced with "r".
if (fgets(s, sizeof s, f) != NULL)
{
}
The fgets() function reads a string from a file. The middle
parameter is the size of the string. We use the sizeof operator
for this purpose, which will in this case return 20, the size
of the s array. Be aware that this way of using sizeof will
only work for arrays, not for e.g. a pointer to dynamically
allocated memory (see 12. Dynamic memory allocation).
It is important that you use the correct value here, because
if the value is too large, too many characters can be read so
that the array is overflowed, with unpredictable consequences
(e.g. a core dump or system crash). The fgets() function will
return NULL if an error occurred or if the end of the file was reached
before any characters could be read, hence the above if() statement.
p = strchr(s,'\n');
if (p!=NULL)
*p = '\0';
The fgets() function does not automatically remove the '\n' character that terminates a line. That's why it is manually removed by the above code. The return value of strchr(s,'\n') will be a pointer to the first occurrence of the character '\n' in the string, or NULL if none is found. So we store this return value in the char pointer p, and if it is not NULL we set the newline character it is pointing at to the '\0' terminating character, thereby effectively terminating the string in such a way that the newline character is stripped off. ('\0' is an octal character code constant (see 2.4. Functions and types), and is therefore equivalent to 0. It is often used instead of plain 0 to make clear that it is used as a character constant.)
printf("%s\n",s);
The just read string is printed. If all went well, this should
print the original text again.
fclose(f);
Again, the file is closed.
if (remove("testfile") != 0)
{
printf("Error: unable to remove testfile\n");
return EXIT_FAILURE;
}
The file we used to store our string is no longer needed, so we
remove it using the remove() function. The remove() function
will return 0 if it is successful, and nonzero if an error
occurs (hence the above if() statement).
return EXIT_SUCCESS;
If we got here, all was successful, so we return to the operating system with status EXIT_SUCCESS.
Note: fgets() and fputs() are used for reading/writing strings, usually from/to text files. When interfacing with binary files, often non-string data should be read/written. That's what the fread()/fwrite() functions are for (see 16. Overview of the standard library) [*].
[*] When reading/writing a specified file format you may be tempted to use fread()/fwrite() to read/write structures from/to a file. However, there may (and often will!) be padding bytes in between member variables and at the end of the structure. So in that case you'll have to resort to reading/writing every member variable separately or, if portability isn't your greatest concern, to a way of turning off this padding that your system may support.
For reading/writing single characters the fgetc()/fputc() functions are available (see 16. Overview of the standard library) [*]. There are also the equivalent getc()/putc() functions which may be implemented as a macro (see 13. Preprocessor macros/conditionals).
[*] When using these functions you should store the return value
in a variable of type int, not of type char, because they return
EOF if an error occurred or the end of the file was reached.
11.3. The interact program
/* prog11-2.c: interact */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
char name[20];
char buffer[10];
char *p;
int age, ok;
printf("Enter your name: ");
fflush(stdout);
if (fgets(name, sizeof name, stdin) == NULL)
{
printf("Error reading name from stdin\n");
return EXIT_FAILURE;
}
p = strchr(name,'\n');
if (p!=NULL)
*p = '\0';
fprintf(stdout, "Welcome, %s!\n", name);
do
{
printf("Enter your age [1-150]: ");
fflush(stdout);
if (fgets(buffer, sizeof buffer, stdin) != NULL
&& sscanf(buffer, "%d", &age) == 1
&& age>=1 && age<=150)
{
ok = 1;
}
else
{
ok = 0;
printf("Invalid age! Enter again...\n");
}
}
while (!ok);
printf("Your age now is %d. In 10 years your age will be %d.\n",
age, age+10);
return EXIT_SUCCESS;
}
The output of this program should be something like this:
Enter your name: John Doe
Welcome, John Doe!
Enter your age [1-150]: 32
Your age now is 32. In 10 years your age will be 42.
11.4. Using the standard I/O streams
char name[20];
char buffer[10];
char *p;
int age, ok;
Here we define an array of 20 characters, which will be used to hold the name string, and an array of 10 characters, which will be used as an input buffer for the age, another char pointer which will be used for the return value of strchr(), and two integer variables: one for the age, and one that will be used as a flag variable (0 for not OK, 1 for OK).
printf("Enter your name: ");
fflush(stdout);
As seen in 3.3. printf(), if we don't end the text with a newline
(which we don't do here because we want to stay on the same line),
we must add an explicit fflush(stdout). fflush() makes sure all
output is sent through (it flushes the buffers). stdout is the
standard output stream, which is a pointer to FILE that corresponds
to the output device that is also used by printf() etc. [*]
[*] fflush() should only be used for output streams, because when used on input streams it produces undefined behaviour; on some systems it clears any pending input characters but this is definitely not standardized behaviour.
if (fgets(name, sizeof name, stdin) == NULL)
{
printf("Error reading name from stdin\n");
return EXIT_FAILURE;
}
p = strchr(name,'\n');
if (p!=NULL)
*p = '\0';
An fgets() [*] with error checking, followed by the stripping of the
newline (if any), very similar to what we did in the fileio program.
[*] If the user attempts to enter more characters than allowed, the excess characters will be left on stdin to be read later. This may be undesirable, but isn't dealt with here to keep the code simple.
The only difference is that this time we don't use a pointer to FILE which we opened ourselves, but the stdin standard input stream, which is a pointer to FILE that is already opened for us. The following are available:
stdin    standard input stream (often keyboard)
stdout   standard output stream (often screen)
stderr   standard error stream (often screen)
[*] Some systems support more, such as e.g. stdprn for a standard printer stream, but this is not standard.
You may notice that for obtaining input from the stdin stream, instead of fgets() some people use the gets() function. However, this function does not accept a size parameter, so it can't control the input to fit within the array and will therefore happily overwrite any memory past the array thereby causing undefined behaviour (core dump, system crash, ...). Concerning the gets() function I have only one piece of advice: do NOT use it. At all.
fprintf(stdout, "Welcome, %s!\n", name);
The fprintf() function is equivalent to the printf() function, except that its first parameter is a pointer to FILE where the output will be written to. In this case stdout is used, which is the default for printf() anyway, so this is equivalent to:
printf("Welcome, %s!\n", name);
Onto the next construct:
do
{
}
while (!ok);
The code inside this loop is repeated until the ok variable
becomes non-zero at the end of the loop.
printf("Enter your age [1-150]: ");
fflush(stdout);
Another prompt, with appropriate fflush(stdout).
if (fgets(buffer, sizeof buffer, stdin) != NULL
&& sscanf(buffer, "%d", &age) == 1
&& age>=1 && age<=150)
{
ok = 1;
}
else
{
ok = 0;
printf("Invalid age! Enter again...\n");
}
First another string is read from stdin using fgets() and the
return value checked (this time we don't bother to remove the newline
because we are going to convert this buffer to an integral variable
anyway).
printf() has an opposite: the scanf() function. It also uses a format string like printf(), and is passed a variable number of arguments consisting of variables that correspond to a specific format specifier in the format string. But because the variables will be used to store an input result in, they must be prepended with the & address operator, so that their address is passed instead of their value (this is not required for arrays, such as strings, because they are always passed via their address anyway).
The format specifiers are mostly the same as those for printf() (see 3.3. printf). However, because an address is passed instead of a value there is no more float -> double, char -> int or short -> int promotion, so the following new format specifiers or modifiers are added/changed:
Format specifiers:
%i unlike %d, allows for entering hex (0x...) or octal constants (0...)
%n number of characters read up to now will be stored in signed int
%c exact number of characters specified by width specifier will be read
(when no width specifier is present, 1 character will be read)
%[character list] longest string consisting of the characters in the
character list will be read
%[^character list] longest string consisting of characters other than
those in the character list will be read
Modifiers:
h for d,u,x,o: short int instead of int
for n: result will be stored in short int
l for d,u,x,o: long int instead of int
for n: result will be stored in long int
for f,e,g: double instead of float
(note that this is different from printf(), where %lf etc.
may not be used for printing a double)
* assignment suppression
But the scanf() function itself is not very reliable. If the user
enters wrong data, it can leave things on the input stream etc.
which may be very annoying.
A better approach is to read a string using the fgets() function, and then process the input from this string using the sscanf() function. The sscanf() function works like the scanf() function, except that it takes as its first parameter the string where it has to read its input from. So in our case:
sscanf(buffer, "%d", &age)
Reads a decimal integer (%d) from the string buffer and stores it in
the integer variable age (notice the address operator!).
However, we have to check whether all of this was successful because the user might also have entered something like "foo bar" instead of a valid age. Hence the code:
sscanf(buffer, "%d", &age) == 1
sscanf(), like scanf(), returns the number of variables it has been
able to assign to properly. In our case this should be 1 if all
went well.
&& age>=1 && age<=150
Furthermore we check that the entered age is between
1 and 150, otherwise we reject it as well.
We use a special feature of the logical operators && and || here: together with the comma operator and the ?: operator, they are the only operators whose operands are guaranteed to be evaluated in a fixed order, from left to right. So we can be sure that by the time the sscanf() is reached the fgets() has been executed, and by the time we check the value of age the sscanf() has been executed. You may not rely on this with any other operator. The && and || operators also have something called short-circuiting: if the left operand of the && operator is false, the right operand is guaranteed not to be evaluated. Similarly, if the left operand of the || operator is true, the right operand is guaranteed not to be evaluated.
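The short-circuit behaviour can be made visible with a small sketch. The counter and the helper names note(), evaluated_by_and() and evaluated_by_or() are invented for this illustration only.

```c
static int calls;   /* how many times note() has been called */

/* Hypothetical helper: counts the call and passes the value through. */
static int note(int value)
{
   calls++;
   return value;
}

/* Returns how many operands of (left && right) were evaluated. */
int evaluated_by_and(int left, int right)
{
   calls = 0;
   (void)(note(left) && note(right));   /* result itself is not needed */
   return calls;
}

/* Returns how many operands of (left || right) were evaluated. */
int evaluated_by_or(int left, int right)
{
   calls = 0;
   (void)(note(left) || note(right));
   return calls;
}
```

With a false left operand, evaluated_by_and() reports a single evaluation; with a true left operand, evaluated_by_or() does the same. In all other cases both operands are evaluated, left one first.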
if (fgets(buffer, sizeof buffer, stdin) != NULL
    && sscanf(buffer, "%d", &age) == 1
    && age>=1 && age<=150)
{
   ok = 1;
}
else
{
   ok = 0;
   printf("Invalid age! Enter again...\n");
}
So if all these conditions are met, we set ok to 1, which will
cause the loop to end. If they are not, we set ok to 0,
which will cause the loop to be executed again, and we write
an error message to tell the user to enter the age again.
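The same fgets()/sscanf() test can be wrapped in a small function so that it can be reused and tried on any input stream. This is only a sketch: the name read_age() and the FILE * parameter are assumptions made for the example, while the buffer size and the limits 1 and 150 mirror the code above.

```c
#include <stdio.h>

/* Hypothetical helper: reads one line from 'in' and tries to get an
   age from it. Returns 1 and stores the age on success, 0 otherwise.
   On failure *age may have been overwritten (e.g. by an out-of-range
   value), so the caller should only use it when 1 is returned. */
int read_age(FILE *in, int *age)
{
   char buffer[30];

   return fgets(buffer, sizeof buffer, in) != NULL
          && sscanf(buffer, "%d", age) == 1
          && *age>=1 && *age<=150;
}
```

A loop such as do { ... } while (!read_age(stdin, &age)); would then keep asking until a valid age has been entered, although it would also loop forever once end-of-file is reached, so real code should check for that separately.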
printf("Your age now is %d. In 10 years your age will be %d.\n",
       age, age+10);
To make use of the fact that we now have the user's age in
integer format, we do a little calculation with it.
Another function that is often used to read input from stdin is getchar(). It reads a single character, but on most systems the user still has to press enter before the program receives it. To avoid this, compiler/operating system specific functions have to be used. Be careful when using getchar() in code like this:
if ((c=getchar()) != EOF) /* code */ ;
In the above code, the variable c must be of type int, not char! The getchar() function returns an int, which is either the character that was read, as an unsigned char value converted to int, or EOF (a negative value) if end-of-file was reached or an error occurred.
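A minimal sketch of the usual reading loop follows. It uses getc() with a stream parameter instead of getchar() so that it can be tried on any stream; getchar() is equivalent to getc(stdin). The function name count_chars() is made up for this example.

```c
#include <stdio.h>

/* Hypothetical helper: counts the characters on 'in' up to end-of-file
   (or up to the first read error). */
long count_chars(FILE *in)
{
   int c;            /* int, not char, so EOF stays distinguishable */
   long count = 0;

   while ((c = getc(in)) != EOF)
      count++;
   return count;
}
```

If c were a char, a valid character with value 255 could compare equal to EOF and end the loop too early (where char is signed), or EOF could never be detected at all (where char is unsigned).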
/* prog12-1.c: persons */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct person {char name[30];        /* full name */
               unsigned birthday,    /* 1..31 */
                        birthmonth,  /* 1..12 */
                        birthyear;   /* 4 digits */
              };

int main(void)
{
   struct person *personlist;
   unsigned number_of_persons, num;
   char buffer[30];
   char *p;
   int year, month, day;
   int ok;

   do
   {
      printf("Enter the number of persons: ");
      fflush(stdout);
      if (fgets(buffer, sizeof buffer, stdin) != NULL
          && sscanf(buffer, "%u", &number_of_persons) == 1)
      {
         ok = 1;
         if (number_of_persons>0)
         {
            personlist = malloc(number_of_persons * sizeof(struct person));
            if (personlist==NULL)
            {
               printf("Not enough memory to store that many persons!\n");
               ok = 0;
            }
         }
      }
      else
      {
         ok = 0;
         printf("Invalid number! Enter again...\n");
      }
   }
   while (!ok);

   if (number_of_persons==0)
   {
      printf("OK, perhaps another time!\n");
      return 0;
   }

   for (num=0; num<number_of_persons; num++)
   {
      printf("\nEnter the information for person #%u:\n", num);
      printf("Name: ");
      fflush(stdout);
      if (fgets(buffer, sizeof buffer, stdin) == NULL)
      {
         printf("Error reading from stdin; input aborted\n");
         number_of_persons = num;
         break;
      }
      p = strchr(buffer,'\n');
      if (p!=NULL)
         *p = '\0';
      if (strlen(buffer)==0)
      {
         printf("Input stopped\n");
         number_of_persons = num;
         break;
      }
      strcpy(personlist[num].name, buffer);
      do
      {
         printf("Birthday [YYYY-MM-DD]: ");
         fflush(stdout);
         if (fgets(buffer, sizeof buffer, stdin) != NULL
             && sscanf(buffer, "%d-%d-%d", &year, &month, &day) == 3
             && year>=1000 && year<=9999
             && month>=1 && month<=12
             && day>=1 && day<=31)
         {
            ok = 1;
         }
         else
         {
            ok = 0;
            printf("Invalid birthday! Enter again...\n");
         }
      }
      while (!ok);
      personlist[num].birthyear = year;
      personlist[num].birthmonth = month;
      personlist[num].birthday = day;
   }

   printf("\nOK, thank you.\n");
   printf("\nYou entered the following data:\n");
   printf("\n%-10s%-30s%s\n", "Number", "Name", "Birthday");
   for (num=0; num<number_of_persons; num++)
   {
      /* %u because the birthday members are unsigned */
      printf("%-10u%-30s%04u-%02u-%02u\n",
             num,
             personlist[num].name,
             personlist[num].birthyear,
             personlist[num].birthmonth,
             personlist[num].birthday);
   }

   free(personlist);
   return 0;
}
An example output of this program is:
Enter the number of persons: 10

Enter the information for person #0:
Name: John Doe
Birthday [YYYY-MM-DD]: 70-5-31
Invalid birthday! Enter again...
Birthday [YYYY-MM-DD]: 1970-5-31

Enter the information for person #1:
Name: Foo Bar
Birthday [YYYY-MM-DD]: 1948-1-1

Enter the information for person #2:
Name:
Input stopped

OK, thank you.

You entered the following data:

Number    Name                          Birthday
0         John Doe                      1970-05-31
1         Foo Bar                       1948-01-01
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
As usual, stdio.h is needed for our input/output, and string.h for the string functions we need, like strchr() and strcpy(). However, stdlib.h is also needed because it contains the declarations for the malloc() and free() functions, which we'll use for our dynamic memory allocation.
struct person {char name[30];        /* full name */
               unsigned birthday,    /* 1..31 */
                        birthmonth,  /* 1..12 */
                        birthyear;   /* 4 digits */
              };
As seen in 6. Structs, unions and bit-fields, this is a definition
of a structure type. It consists of an array of 30 chars, and 3 unsigned
variables. The comments indicate what each member variable will hold.
Don't forget that what we've defined here is just a structure type; we don't have any actual variables capable of holding any information yet.
struct person *personlist;
unsigned number_of_persons, num;
char buffer[30];
char *p;
int year, month, day;
int ok;
Now, to hold that information we could have used an array, for example:
struct person personlist[100];
But this way we would limit the number of persons to 100, and someday someone will need more. If we used a very large number, say 10000, we would greatly reduce the chance of the array being too small, but in most cases we'd also be wasting a lot of space (and it is even possible that your compiler doesn't support very large arrays like that).
How do we solve this problem? Instead of determining the number of persons in our program's source code, we let the user determine it at runtime, dynamically. That's why we'll need dynamic memory allocation. To be able to do that we must define a pointer (see 10.4. Addresses and pointers) to our type (in this case struct person). This pointer doesn't yet point anywhere valid and no memory is allocated yet, so you shouldn't make the mistake of attempting to use the pointer before any memory has been allocated for it.
That explains the following definition of a pointer to struct person:
struct person *personlist;
The other variable definitions are:
unsigned number_of_persons, num;
char buffer[30];
char *p;
int year, month, day;
int ok;
Here we define two unsigned integers, number_of_persons and num, of which the former will hold, as its name suggests, the number of persons, and the latter will be used as a counter. The buffer array will, as in previous examples, be used to buffer the user's input before processing, and the char pointer will be used to hold the return value of strchr(). The year, month and day variables will be used to temporarily hold a person's birthday, and the ok variable will again be used as a flag variable (1 for ok, 0 for not ok).
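The idea of choosing the array size at runtime can be sketched in isolation as follows. The helper alloc_persons() is a made-up name for this illustration; the struct is the one from the persons program, and malloc() and free() are the standard functions declared in stdlib.h.

```c
#include <stdlib.h>

struct person {char name[30];        /* full name */
               unsigned birthday,    /* 1..31 */
                        birthmonth,  /* 1..12 */
                        birthyear;   /* 4 digits */
              };

/* Hypothetical helper: allocates room for 'count' persons.
   Returns NULL if count is 0 or if there is not enough memory.
   The caller must pass the result to free() when done with it. */
struct person *alloc_persons(unsigned count)
{
   if (count == 0)
      return NULL;
   return malloc(count * sizeof(struct person));
}
```

After a successful call, list = alloc_persons(n), the pointer can be indexed like an array, from list[0] up to list[n-1]. The members of the allocated structures hold indeterminate values until they are assigned.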
do
{
}
while (!ok);
As we've done before, this loop will repeatedly prompt the user for
input until the entere