These notes are not a complete tutorial or reference. They are a useful collection of important topics for someone who has programmed in C but might be rusty. The idea is that the material here can get you started with a programming project quickly.
|
Note
|
If you want more comprehensive details about the C language then this is a good place to start. |
Compiling
C is a compiled language, meaning that its source code needs to be translated in its entirety into something the computer can understand before it is actually run. The "compiler" does this.
The typical way to compile a program looks like so:
gcc -o typical typical.c
If you don’t specify a -o option (for output) your executable program will be named a.out which is not terrifically useful. It’s best to not do too much of that or you’ll have one a.out overwriting another.
In the old days compiling C programs on a Linux system was kind of a giant pain. These days things tend to work much more smoothly, but for reference I’ll include some notes on things to try when solving typical compile issues.
-
Use -D_GNU_SOURCE early and often. Modern Linux systems seem to come with a gcc that is aware that it’s a Linux system and does the right thing, but it wasn’t always so.
-
If include (.h) files are "lost" try an option like -I/usr/X11R6/include/X11/magick/ which can provide hints where to find include files.
-
Math not working even though you added a #include <math.h>? Maybe only some of math (undefined reference to "floor")? Try -lm, which often fixes that. The header only declares the functions; on many systems the code itself lives in a separate math library (libm) that the linker has to be told about.
-
Are you nuts and compiling something against Xlib? You might need something like this: -L/usr/X11R6/lib -lX11
Preprocessor
Including Libraries
#include "file_in_this_directory.h"
#include "/an/explicit/path.h"
#include <look_in_the_normal_place.h>
There are many useful functions in standard libraries. It looks like Wikipedia has a pretty good list of Posix C libraries. This is a specification of libraries a sane system should provide. Here are some of the classic ones with some of the defined functions listed.
- stdio.h
-
Includes the super important printf. And the rather important macro NULL. Also fwrite, fread, fputc, putc, putchar, ungetc, fflush, fopen, freopen, fclose, remove, rename, rewind
- math.h
-
Pretty much anything involving the eponymous topic of math. Here are some useful ones: ceil (nearest whole above), exp, floor (nearest whole below), pow, sqrt. And most of the others: acos, asin, atan, atan2, cos, cosh, fabs, fmod, frexp, ldexp, log, log10, modf, sin, sinh, tan, tanh.
- stdlib.h
-
exit, abort, atexit, getenv, system, malloc, calloc, realloc, free, atoi, atol, atof, strtod, strtol, strtoul, rand, srand, qsort, bsearch (the related assert actually lives in assert.h and perror in stdio.h). Here’s a notable use: char *u; u= getenv("USER");
- ctype.h
-
isalnum, isalpha, isdigit, isxdigit, isgraph (visible character), isprint (printable character), isupper (case), islower, iscntrl, ispunct, isspace, tolower, toupper
- string.h
-
strlen, strcpy, strncpy, memcpy, memmove, strcat, strncat, strcmp, strncmp, memcmp, strchr, strrchr, memchr, strcspn, strpbrk, strspn, strstr, strtok, strerror, memset
- unistd.h
-
Includes the getopt function.
- locale.h
-
setlocale, localeconv
- time.h
-
asctime, ctime, clock, difftime, gmtime, localtime, mktime, time, strftime
- signal.h
-
Defines functions and MACROS for handling signals. signal, raise also SIGABRT, SIGFPE, SIGILL, SIGINT, SIGSEGV, SIGTERM
Preprocessor Macros
All kinds of mischief can be had with preprocessor tricks. Generally it seems that this should only be used to manage the software development aspect of the program and not the program’s actual functionality. Here’s an example of a preprocessor macro in use:
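#include <stdio.h>
/* Declare variable V of type T, then print the type name and its size in bytes. */
/* sizeof yields a size_t, so it is cast to int to match the %d format. */
#define TYPE(T,V) T V;printf(#T"= %d\n", (int) sizeof V);
int main(void){
    TYPE(char,a_char)
    TYPE(short int,a_short_int)
    TYPE(int,an_int)
    TYPE(unsigned int,an_unsigned_int)
    TYPE(long,a_long)
    TYPE(unsigned long,an_unsigned_long)
    TYPE(float,a_float)
    TYPE(double, a_double)
    TYPE(long double,a_long_double)
    return 0;}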
To see what this preprocessor macro does, see below.
Other preprocessor tricks would be stuff like general constants that should be flexible depending on how someone might want to compile the program:
#define PRECISION .0001
Often there is a big maze of include files and it’s easy to have one place include a library and then another place try to do it too resulting in some kind of clash. The following checks to see if the special library has been loaded and if not, it loads it. Subsequent uses of this will be ignored.
#ifndef SPECIAL
#include "special.h"
#define SPECIAL
#endif
Preprocessor macros can be useful for debugging messages too.
#define VERBOSE 4
...
if (VERBOSE > 2) {printf("A level 2 message.");}
Just set the value to 0 to turn off verbose messages. This allows the programmer to set up a bunch of diagnostic print statements that can be turned on or off easily.
Idiosyncratic C Operators
- x++
-
Increments variable x by 1 after using it in this spot.
- ++x
-
Increments variable x by 1 before using it in this spot.
- x--
-
Decrements variable x by 1 after using it in this spot.
- --x
-
Decrements variable x by 1 before using it in this spot.
- {test}?{true}:{false}
-
An "if" statement for saving punch card chad.
- x,y
-
Evaluate and discard x, evaluate and retain y.
Main Structure Of A C Program
A C program is a collection of functions, routines that possibly take some input and possibly return some output. All C programs that run must have one and only one function called main.
Here is a typical structure showing how the main function can be passed the command line arguments. This program is useful for diagnosing exactly what the C program is receiving from the executing shell. It also shows the polite return code (0 is usually success and 1 is usually failure while other numbers can signify fancy modes of failure or other things).
/* Comments look like this! */
#include <stdio.h>
int main(int argc, char *argv[]) {
    int i;
    /* Count down from the last argument to argv[0], the program name. */
    for (i = argc - 1; i >= 0; i--) {
        printf("Argument #%d:%s\n", i, argv[i]);
    }
    return 0;
}
Or here’s a more bad ass version:
#include <stdio.h>
int main(int argc, char *argv[]){
    while (argc--) printf("%s\n", *argv++);
    return 0;
}
|
Note
|
To only show the arguments and not the program name (element 0), just make both of the postfix modifiers (-- and ++) into prefix modifiers. |
If you don’t care about the command line arguments use something like:
int main(void) { /*code goes here*/; return 0; }
Types
C is a statically typed language, meaning that every variable has a type fixed at compile time and that memory is carved out for various purposes based on very explicit definitions of the resources which will be used. Important types:
- int
-
Integer, i.e. non fraction whole numbers.
- float
-
Numbers that can represent a continuous value (to the accuracy a binary representation ultimately provides).
- char
-
A character.
- enum
-
An enumeration. Used to create a type with a constrained set of possible values. enum lightswitch {Off, On}; Here lightswitch can be either "Off" or "On" which is the same as 0 and 1. If you wanted different values, use something like enum lightswitch {Off=-1, On=1};
- union
-
Define with something like: union Lights { int Switch; float Dimmer;}; This allows one thing (Lights) to have either an int value if it’s just a switch or a float value if it’s a dimmer. It’d be good to keep a separate variable around to store which you’re using at any given time or confusion will result.
- struct
-
A structure. Used to create custom types that hold collections of things. struct point { int x; int y; int z; }; To declare a variable of this type you need to do struct point LastKnown;
- Custom
-
Sometimes you want to make some complex named type have a simple name. To do that use the typedef statement: typedef short int twobyter; Now declaring something as twobyter X= 0 is the same as saying short int X=0; This can be a handy trick when setting up arbitrary data structures that may find utility handling different payloads. Just typedef the data component of the complex structure and tailor that to your needs at the beginning of the program.
The various types require different amounts of storage in memory. It is best to choose the most economical type which satisfies requirements. This is a nice feature to be able to optimize in this way, however, since it is not optional it is also one of those pains that makes C programming a bit tedious. Here is the output of the preprocessor example above which shows the size in bytes of various storage types on my machine.
char= 1
short int= 2
int= 4
unsigned int= 4
long= 4
unsigned long= 4
float= 4
double= 8
long double= 12
Pointers
Objects in C can be handled by their names (which imply their contents), but a far more powerful and flexible technique is to work with them by only specifying the address where the data of interest is. The reason for this is that it’s computationally expensive to shuffle things around in memory if you don’t really need to. It’s better to leave the bulk of the thing alone and just refer to it where it is needed. It’s a bit like money. You could trade gold specie for the things you want, but for most transactions, it’s easier to leave the gold in a vault somewhere and just trade promissory notes referring to it. (Assuming a gold standard) writing a check is like referring to a reference (bank notes) to actual money (the gold). This is like a C pointer’s ability to point to a pointer.
So if you have a variable called big_thing with a lot of data in it, you can do things with that variable by name, but sometimes it is more effective to just refer to the location where that thing lives. It’s quite like addresses in real life: you don’t have to specify the exact nature of a house at a particular address or if it’s a strip mall or whatever, just the address is sufficient to deal with it for many purposes.
Important ideas with pointers:
-
Pointers are a data type that holds exactly one memory address. What that address actually is should seldom ever be of concern.
-
Pointers can point to other pointers.
- int x;
-
Defines an integer variable called x.
- int *ptr2x;
-
Reads "Define the thing ptr2x points to as an integer." This (*) is technically called the "indirection operator".
- ptr2x= &x;
-
Reads "Set ptr2x to the address of the object defined by x."
- p->n= 0;
-
Sets to zero the subcomponent n in the structure that pointer p points to. This is technically called the "indirect member access operator".
Arrays
A[i] is the same as (*((A)+(i)))
float origin[3]= {0,0,0};
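Brackets are actually a postfix operator applied to the array (or pointer) named in front of them; A[i] is just shorthand for the pointer arithmetic shown above.
Elements of arrays are stored in successive memory locations, so pointers to neighboring elements differ by exactly one element: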
&origin[1]-&origin[0] == 1
Find the length of an array:
int length= sizeof origin / sizeof origin[0];
Chars and Strings
An array of objects of the char type has some special syntactical properties in C. This is to facilitate the handling of "strings".
char alphabet[26];
char theFword[4] = {'f', 'u', 'n', '\0'};
char string[6] = "twine";
char gray[] = {'g', 'r', 'a', 'y', '\0'};
char salmon[] = "salmon";
|
Note
|
Hmmm. Looks like *argv[] is the same as **argv. |
Branching
Basically computers compute by making logical decisions. In C, the main decision making feature is the if statement:
if ({test_expression}) {statement_block} else {statement_block}
For if statements, the test expression can be anything that reduces to an integer which equals 0 (which is false) or something else (which is true).
A fancier form of branching can be done with the switch and case statements. Here’s how it works:
#include <stdio.h>
#include <unistd.h>
int main (int argc, char **argv ){
    static char optstring[]="a:b:c";
    int o;
    while ( (o = getopt(argc, argv, optstring)) != -1)
        switch(o) {
            case 'a': { printf("Option argument for `a` is: %s\n",optarg); break; }
            case 'b': { printf("Option argument for `b` is: %s\n",optarg); break; }
            case 'c': { printf("Option `c` has no argument.\n"); break; }
            /* getopt returns '?' for anything unexpected; optopt holds the offending character. */
            default: { printf("Option `%c` is unknown.\n", optopt); }
        }
    return 0;}
Looping
Interesting software is a result of many logical decisions being repeatedly performed in interesting ways. The main way to achieve multiple iterations of an action in the C idiom is with the for loop:
for ({initial};{test_before_each_iteration};{eval_after_each}){thing_to_do}
Here’s a more interesting example:
for (hi=100,lo=60;hi>=lo;hi--,lo++){converge(hi,lo);}
|
Note
|
You need to define the variables that appear in the for statement prior to using it. If that really bugs you, C99 lets you declare them inside the for statement itself; with an older gcc you may need to compile with -std=c99 to get that. The less compiler magic, the better IMO. |
The other two important loop structures are similar with a subtle difference. These are the while loops. The most basic works like so:
while ({test_expression}) {do_this_stuff}
If before any attempt to execute the body of the loop the test expression is 0 or NULL then the loop is skipped and control is passed on.
If you want the test evaluated after the loop body code is run (which implies the loop body will always run at least once) use this form:
do {do_this_stuff} while ({test_expression});
Exiting loops
- continue
-
This statement jumps control to the end of the current loop body statement as if it had completed an iteration and was now ready for more. It allows for short circuiting some code that might otherwise be performed on every iteration.
- break
-
This statement jumps control just past the end of the current loop body statement as if the last iteration had just occurred and finished. This statement basically says that this looping structure is completely finished, not just this iteration.
- return [expression]
-
This is the way to break out of a function. The optional expression is passed back to the calling function by value (so use pointers where that’d be unpleasant). If the function was defined as void then don’t include an expression. A function can have several return points depending on the situation.
Dynamic Memory
Anytime you are working with an amount of data that you can not explicitly define an upper bound on ahead of time, you probably need to use dynamic memory. The main mechanism of dynamic memory is the malloc() function which runs around looking for enough contiguous memory to reserve for some run time defined purpose. Once malloc() finds the memory you’ve requested, it returns a pointer to that location so you can start doing stuff with it. The format for using malloc() is a bit fussy:
p= (struct Thing *) malloc (sizeof (struct Thing));
Here the sizeof operator yields exactly how many bytes an instance of struct Thing needs. That memory is reserved, and the pointer that is returned is cast (forced) by the first parentheses to point to memory that is treated as a struct Thing.
When your program is finished with some memory that has been allocated, it’s polite (or maybe even critical) that it be returned to the system for use. The way to do that is with the free() function, which takes a pointer to the memory you want recycled.
Simple Stack Implementation
Before C can be made into anything useful, you really need to create some tools to make certain tasks easier to implement. One theme that comes up over and over in more substantial programming tasks is the need to hold an arbitrary bunch of data somewhere. Since C requires very explicit declarations of all memory used, this can be challenging to always attend to it. It is therefore useful to create some templates that can get you into more interesting parts of the problem quickly.
Here is an implementation of a simple stack system. The stack is fed data with a Push() function; that is, data is added to the top of the stack (a FILO arrangement). Data is retrieved (and removed) from the stack with a Pop() function. Note that the type definition cargo_type allows the stack to carry whatever kinds of data types you want simply by redefining it.
#include <stdlib.h>
#include <stdio.h>
#include <time.h>

typedef int cargo_type;
struct linkbox { cargo_type cargo; struct linkbox* next;};
typedef struct linkbox lbox;

void Push(cargo_type v, lbox** p2mylist);
cargo_type Pop(lbox** p2mylist);
cargo_type Iter(lbox** current);
int dice(int sides);

int main(void){
    int i,m;
    srand(time(NULL));
    m= dice(20);
    lbox *mylist=NULL;
    for (i=0;i<m;i++){ Push( dice(6), &mylist); }
    /* Walk the list without disturbing it. */
    lbox *index= mylist;
    int sum=0, n=0;
    while (index) {
        sum += Iter(&index);
        n++;
        /*printf("Iter:%d\n",Iter(&index));*/
    }
    printf("Average:%f\n",(float)sum/n);
    /* Empty the list, freeing each box. */
    while (mylist) { printf("Popping:%d\n",Pop(&mylist)); }
    return 0;}

int dice(int sides){ return rand() % sides + 1;}

/* Returns the cargo at *c and advances *c to the next box. */
cargo_type Iter(lbox** c){
    cargo_type t= (*c)->cargo;
    *c= (*c)->next;
    return t;}

/* Puts a new box holding v at the head of the list. */
void Push(cargo_type v, lbox** p2mylist){
    lbox* latestbox;
    latestbox= (lbox *) malloc(sizeof(lbox));
    if (latestbox == NULL) { perror("malloc"); exit(EXIT_FAILURE); }
    latestbox->cargo= v;
    printf("Pushing:%d\n",v);
    latestbox->next= *p2mylist;
    *p2mylist= latestbox;
    return;}

/* Removes the head box, frees it, and returns its cargo. */
cargo_type Pop(lbox** p2mylist){
    cargo_type t= (*p2mylist)->cargo;
    lbox *dead= *p2mylist;
    *p2mylist= (*p2mylist)->next;
    free(dead);
    return t;}
This program is also an example of passing function arguments by reference. It needs a pointer, so the pointer is pointed to by another pointer which gets sent to the function. When the transporter pointer is dereferenced, the original pointer that was supposed to show up at the function is ready to go. The reason this is necessary is that C function arguments are copied over and if you copy a pointer, it’s a different pointer (even if it points to the same place). If you inserted a new node between a function copy of the pointer to the list and the list, then you’d lose track of the (complete) list when the function variable’s memory was freed on function exit.
Useful Tricks
Random Numbers
To get a random number between 1 and 100 do something like this:
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
int main (void) {
    srand(time(NULL));
    int mystery= rand() % 100 + 1;
    printf ("Random number from 1 to 100: %d\n", mystery);
    return (0);
}
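You need the srand() to seed the random number generator. The rand() function returns random numbers between 0 and RAND_MAX. If you need a random number between 0 and 1, another way to do that would be rand()/((double)RAND_MAX+1). The conversion to double matters: with plain integers the division always comes out as 0, and RAND_MAX+1 can overflow.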
|
Warning
|
The method of seeding srand() with a time(NULL) function is ok in many situations, but remember that this can be reverse engineered. This means you don’t want to write a real-money gambling game that is randomized in this way. Also, if you run the program several times in quick succession, the time may be the same to within a second and this will cause the "random" output to repeat itself. |
If you are using a proper operating system (like Linux or a fruit-based computer) there is a managed resource that collects entropy for use by various processes in establishing randomness. This source of randomness is presented as a file by the kernel and automagically filled with pretty high quality random numbers (see man random for gory details). Here is a way to get random numbers using a seed pulled from this source:
#include <math.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
int main (int argc, char *argv[]) {
    FILE *urandom;
    unsigned int seed;
    urandom = fopen ("/dev/urandom", "r");
    if (urandom == NULL) {
        fprintf (stderr, "Cannot open /dev/urandom!\n");
        exit (EXIT_FAILURE);
    }
    /* Read one seed-sized chunk of kernel randomness; fread returns the number of items read. */
    if (fread (&seed, sizeof (seed), 1, urandom) != 1) {
        fprintf (stderr, "Cannot read /dev/urandom!\n");
        exit (EXIT_FAILURE);
    }
    fclose (urandom);
    srand (seed);
    printf ("Random number from 1 to 100: %d\n",
            (int) floor(rand() * 100.0 / ((double) RAND_MAX + 1) )+ 1);
    exit (EXIT_SUCCESS);
}
A good illustration of the difference can be seen by running these numerous times very quickly. If run 10,000 times, a random number between (and including) 1 and 100 should pop up roughly 100 times. You can see that producing random numbers from the OS’s seed does roughly that. The time based one, however, does a terrible job. Most of the time it will produce zero results with a particular preselected number ("88" in the following example).
$ for x in `seq 10000`;do ./rand_from_os | grep ' 88$' ; done | wc -l
97
$ for x in `seq 10000`;do ./rand_from_os | grep ' 88$' ; done | wc -l
94
$ for x in `seq 10000`;do ./rand_from_time | grep ' 88$' ; done | wc -l
0
$ for x in `seq 10000`;do ./rand_from_time | grep ' 88$' ; done | wc -l
512
This is because over the course of a few seconds to run, the time only changes a few times and most of the values will be from only a handful of seeds. Ironically, this problem is worse on higher performance machines.
Print Error Messages
Something like this:
fprintf(stderr,"Prints to standard error.\n");
Core Dump Analysis
What if you get the dreaded Segmentation fault? This means something bad happened at run time. Most errors are caught at compile time, but sometimes your program looks fine to the compiler and does a silly thing once you actually fire it up. Besides mystical intuition, the best methodical way to analyze the problem is to have the system create a memory dump at the time of the error and then use a special tool to look through this memory file to figure out what went wrong. For the dump to be readable, compile the misbehaving program with debugging symbols like this:
gcc -g -o sketchy sketchy.c
Or if you’re definitely going to use gdb:
gcc -ggdb -o sketchy sketchy.c
If it still has a seg fault and you’re not getting a (core dumped) message appended to it, try changing your environment with:
ulimit -c unlimited
This removes any restriction on the size of core files allowed by the shell.
|
Note
|
When you’re done playing with core files, you might want to do ulimit -c 0 so that segmentation faults don’t generally produce core files. Normally, it’s a pain to have these files mysteriously lying around every time something crashes. |
Then if you get a core dump called core, run:
gdb sketchy core
The core should load and allow you to investigate it. It might just tell you about the error and where it occurred.
Or if you don’t need a core dump, you can just run gdb sketchy and type run to run the program and see if your error happens in a more interesting and verbose way. If not, you can step to it by starting the program with start to start running it step by step. To go to the next step, use next. Doing this should enable you to creep up on the error. To print out a variable’s value while stepping through, try something like print foo. To get out of gdb just type quit.
Keywords
These words are all reserved for C. Don’t name things with the same name:
auto, break, case, char, const, continue, default, do, double, else, enum, extern, float, for, goto, if, int, long, register, return, short, signed, sizeof, static, struct, switch, typedef, union, unsigned, void, volatile, while