learnxinyminutes-docs/c.html.markdown
2013-08-31 15:57:54 -05:00

---
language: c
filename: learnc.c
contributors:
- ["Adam Bard", "http://adambard.com/"]
- ["Árpád Goretity", "http://twitter.com/H2CO3_iOS"]
---
Ah, C. Still **the** language of modern high-performance computing.
C is the lowest-level language most programmers will ever use, but
it more than makes up for it with raw speed. Just be aware of its manual
memory management and C will take you as far as you need to go.
```c
// Single-line comments start with // - only available in C99 and later.
/*
Multi-line comments look like this. They work in C89 as well.
*/
//Special characters:
'\n' // newline character
'\t' // tab character (left justifies text)
'\v' // vertical tab
'\f' // form feed (new page)
'\r' // carriage return
'\b' // backspace character
'\0' // null character
// Print formatting:
"%d" // integer
"%3d" // minimum width of 3 characters for integer (right justifies text)
"%s" // string
"%f" // float or double
"%ld" // long
"%3.2f" // minimum total width of 3 characters, 2 digits after the decimal point
"%7.4s" // (can do with strings too: width 7, at most 4 characters printed)
"%c" // char
"%p" // pointer
"%x" // hexadecimal
"%o" // octal
"%%" // prints %
// Constants: use the #define preprocessor directive, with no = and no semicolon at the end.
#define DAYS_IN_YEAR 365
// Import headers with #include
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
// (File names between <angle brackets> are headers from the C standard library.)
// For your own headers, use double quotes instead of angle brackets:
#include "my_header.h"
// Declare function signatures in advance in a .h file, or at the top of
// your .c file.
void function_1();
void function_2();
// Your program's entry point is a function called
// main with an integer return type.
int main() {
// print output using printf, for "print formatted"
// %d is an integer, \n is a newline
printf("%d\n", 0); // => Prints 0
// All statements must end with a semicolon
///////////////////////////////////////
// Types
///////////////////////////////////////
// ints are usually 4 bytes
int x_int = 0;
// shorts are usually 2 bytes
short x_short = 0;
// chars are guaranteed to be 1 byte
char x_char = 0;
char y_char = 'y'; // Char literals are quoted with ''
// longs are often 4 to 8 bytes; long longs are guaranteed to be at least
// 64 bits
long x_long = 0;
long long x_long_long = 0;
// floats are usually 32-bit floating point numbers
float x_float = 0.0;
// doubles are usually 64-bit floating-point numbers
double x_double = 0.0;
// Integral types may be unsigned.
unsigned short ux_short;
unsigned int ux_int;
unsigned long long ux_long_long;
// sizeof(T) gives you the size of a variable with type T in bytes
// sizeof(obj) yields the size of the expression (variable, literal, etc.).
printf("%zu\n", sizeof(int)); // => 4 (on most machines with 4-byte words)
// If the argument of the `sizeof` operator is an expression, then its argument
// is not evaluated (except VLAs (see below)).
// The value it yields in this case is a compile-time constant.
int a = 1;
size_t size = sizeof(a++); // a++ is not evaluated
printf("sizeof(a++) = %zu where a = %d\n", size, a);
// prints "sizeof(a++) = 4 where a = 1" (where int is 4 bytes)
// Arrays must be declared with a concrete size.
char my_char_array[20]; // This array occupies 1 * 20 = 20 bytes
int my_int_array[20]; // This array occupies 4 * 20 = 80 bytes
// (assuming 4-byte words)
// You can initialize an array to 0 thusly:
char my_array[20] = {0};
// Indexing an array is like other languages -- or,
// rather, other languages are like C
my_array[0]; // => 0
// Arrays are mutable; it's just memory!
my_array[1] = 2;
printf("%d\n", my_array[1]); // => 2
// In C99 (and as an optional feature in C11), variable-length arrays (VLAs)
// can be declared as well. The size of such an array need not be a compile
// time constant:
printf("Enter the array size: "); // ask the user for an array size
char buf[0x100];
fgets(buf, sizeof buf, stdin);
// strtoul parses a string to an unsigned long
size_t array_size = strtoul(buf, NULL, 10);
int var_length_array[array_size]; // declare the VLA
printf("sizeof array = %zu\n", sizeof var_length_array);
// A possible outcome of this program may be:
// > Enter the array size: 10
// > sizeof array = 40
// Strings are just arrays of chars terminated by a NUL (0x00) byte,
// represented in strings as the special character '\0'.
// (We don't have to include the NUL byte in string literals; the compiler
// inserts it at the end of the array for us.)
char a_string[20] = "This is a string";
printf("%s\n", a_string); // %s formats a string
printf("%d\n", a_string[16]); // => 0
// i.e., byte #17 is 0 (as are 18, 19, and 20)
// If we have characters between single quotes, that's a character literal.
// It's of type `int`, and *not* `char` (for historical reasons).
int cha = 'a'; // fine
char chb = 'a'; // fine too (implicit conversion from int to char)
///////////////////////////////////////
// Operators
///////////////////////////////////////
int i1 = 1, i2 = 2; // Shorthand for multiple declaration
float f1 = 1.0, f2 = 2.0;
// Arithmetic is straightforward
i1 + i2; // => 3
i2 - i1; // => 1
i2 * i1; // => 2
i1 / i2; // => 0 (0.5, but truncated towards 0)
f1 / f2; // => 0.5, plus or minus epsilon
// Floating-point numbers and calculations are not exact
// Modulo is there as well
11 % 3; // => 2
// Comparison operators are probably familiar, but
// there is no boolean type in C89. We use ints instead.
// (C99 adds _Bool, and bool via <stdbool.h>.)
// 0 is false, anything else is true. (The comparison
// operators always yield 0 or 1.)
3 == 2; // => 0 (false)
3 != 2; // => 1 (true)
3 > 2; // => 1
3 < 2; // => 0
2 <= 2; // => 1
2 >= 2; // => 1
// C is not Python - comparisons don't chain.
a = 1; // reuse the int a declared above
// WRONG: parsed as (0 < a) < 2, which is 1 for every value of a
// int between_0_and_2 = 0 < a < 2;
// Correct:
int between_0_and_2 = 0 < a && a < 2;
// Logic works on ints
!3; // => 0 (Logical not)
!0; // => 1
1 && 1; // => 1 (Logical and)
0 && 1; // => 0
0 || 1; // => 1 (Logical or)
0 || 0; // => 0
// Bitwise operators!
~0x0F; // => 0xFFFFFFF0 (bitwise negation, "1's complement"; result shown for a 32-bit int)
0x0F & 0xF0; // => 0x00 (bitwise AND)
0x0F | 0xF0; // => 0xFF (bitwise OR)
0x04 ^ 0x0F; // => 0x0B (bitwise XOR)
0x01 << 1; // => 0x02 (bitwise left shift (by 1))
0x02 >> 1; // => 0x01 (bitwise right shift (by 1))
// Be careful when shifting signed integers - the following are undefined:
// - shifting into the sign bit of a signed integer (int a = 1 << 31, if int is 32 bits wide)
// - left-shifting a negative number (int a = -1 << 2)
// - shifting by an offset which is >= the width of the type of the LHS:
// int a = 1 << 32; // UB if int is 32 bits wide
///////////////////////////////////////
// Control Structures
///////////////////////////////////////
if (0) {
printf("I am never run\n");
} else if (0) {
printf("I am also never run\n");
} else {
printf("I print\n");
}
// A loop must always have a body. If no work is needed, use a null statement:
int i;
for (i = 0; i <= 5; i++) {
; // a lone semicolon is the null statement
}
// While loops exist
int ii = 0;
while (ii < 10) {
printf("%d, ", ii++); // ii++ increments ii in-place
// after yielding its value ("postincrement").
} // => prints "0, 1, 2, 3, 4, 5, 6, 7, 8, 9, "
printf("\n");
int kk = 0;
do {
printf("%d, ", kk);
} while (++kk < 10); // ++kk increments kk in-place, and yields
// the already incremented value ("preincrement")
// => prints "0, 1, 2, 3, 4, 5, 6, 7, 8, 9, "
printf("\n");
// For loops too
int jj;
for (jj=0; jj < 10; jj++) {
printf("%d, ", jj);
} // => prints "0, 1, 2, 3, 4, 5, 6, 7, 8, 9, "
printf("\n");
// branching with multiple choices: switch()
switch (some_integral_expression) {
case 0: // labels need to be integral *constant* expressions
do_stuff();
break; // if you don't break, control flow falls through to the next label
case 1:
do_something_else();
break;
default:
// if `some_integral_expression` didn't match any of the labels
fputs("error!\n", stderr);
exit(-1);
break;
}
///////////////////////////////////////
// Typecasting
///////////////////////////////////////
// Every value in C has a type, but you can cast one value into another type
// if you want (with some constraints).
int x_hex = 0x01; // You can assign vars with hex literals
// Casting between types will attempt to preserve their numeric values
printf("%d\n", x_hex); // => Prints 1
printf("%d\n", (short) x_hex); // => Prints 1
printf("%d\n", (char) x_hex); // => Prints 1
// Types will overflow without warning
printf("%d\n", (unsigned char) 257); // => 1 (max unsigned char = 255 if char is 8 bits long)
// For determining the max value of a `char`, a `signed char` and an `unsigned char`,
// respectively, use the CHAR_MAX, SCHAR_MAX and UCHAR_MAX macros from <limits.h>
// Integral types can be cast to floating-point types, and vice-versa.
printf("%f\n", (float)100); // %f formats a double; a float argument is promoted to double
printf("%lf\n", (double)100); // in printf, %lf means the same as %f (since C99)
printf("%d\n", (char)100.0); // => Prints 100
///////////////////////////////////////
// Pointers
///////////////////////////////////////
// A pointer is a variable declared to store a memory address. Its declaration will
// also tell you the type of data it points to. You can retrieve the memory address
// of your variables, then mess with them.
int x = 0;
printf("%p\n", (void *)&x); // Use & to retrieve the address of a variable
// (%p formats an object pointer of type void *)
// => Prints some address in memory;
// Pointers start with * in their declaration
int *px, not_a_pointer; // px is a pointer to an int
px = &x; // Stores the address of x in px
printf("%p\n", (void *)px); // => Prints some address in memory
printf("%zu, %zu\n", sizeof(px), sizeof(not_a_pointer));
// => Prints "8, 4" on a typical 64-bit system
// To retrieve the value at the address a pointer is pointing to,
// put * in front to de-reference it.
// Note: yes, it may be confusing that '*' is used for _both_ declaring a
// pointer and dereferencing it.
printf("%d\n", *px); // => Prints 0, the value of x
// You can also change the value the pointer is pointing to.
// We'll have to wrap the dereference in parentheses because
// ++ has a higher precedence than *.
(*px)++; // Increment the value px is pointing to by 1
printf("%d\n", *px); // => Prints 1
printf("%d\n", x); // => Prints 1
// Arrays are a good way to allocate a contiguous block of memory
int x_array[20];
int xx;
for (xx = 0; xx < 20; xx++) {
x_array[xx] = 20 - xx;
} // Initialize x_array to 20, 19, 18,... 2, 1
// Declare a pointer to int and initialize it to point to x_array
int* x_ptr = x_array;
// x_ptr now points to the first element in the array (the integer 20).
// This works because arrays often decay into pointers to their first element.
// For example, when an array is passed to a function or is assigned to a pointer,
// it decays into (implicitly converted to) a pointer.
// Exceptions: when the array is the argument of the `&` (address-of) operator:
int arr[10];
int (*ptr_to_arr)[10] = &arr; // &arr is NOT of type `int *`!
// It's of type "pointer to array" (of ten `int`s).
// or when the array is a string literal used for initializing a char array:
char otherarr[] = "foobarbazquirk";
// or when it's the argument of the `sizeof` or `alignof` operator:
int *ptr = arr; // equivalent to int *ptr = &arr[0];
printf("%zu %zu\n", sizeof arr, sizeof ptr); // probably prints "40 4" or "40 8"
// Pointers are incremented and decremented based on their type
// (this is called pointer arithmetic)
printf("%d\n", *(x_ptr + 1)); // => Prints 19
printf("%d\n", x_array[1]); // => Prints 19
// You can also dynamically allocate contiguous blocks of memory with the
// standard library function malloc, which takes one argument of type size_t
// representing the number of bytes to allocate (usually from the heap, although this
// may not be true on e. g. embedded systems - the C standard says nothing about it).
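int *my_ptr = malloc(sizeof(*my_ptr) * 20);
if (my_ptr == NULL) { // malloc returns NULL when the allocation fails
fputs("out of memory\n", stderr);
exit(EXIT_FAILURE);
}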
for (xx = 0; xx < 20; xx++) {
*(my_ptr + xx) = 20 - xx; // my_ptr[xx] = 20-xx
} // Initialize memory to 20, 19, 18, 17... 2, 1 (as ints)
// Dereferencing memory that you haven't allocated gives
// "unpredictable results" - the program is said to invoke "undefined behavior"
printf("%d\n", *(my_ptr + 21)); // => Prints who-knows-what? It may even crash.
// When you're done with a malloc'd block of memory, you need to free it,
// or else no one else can use it until your program terminates
// (this is called a "memory leak"):
free(my_ptr);
// Strings are arrays of char, but they are usually represented as a
// pointer-to-char (which is a pointer to the first element of the array).
// It's good practice to use `const char *` when referring to a string literal,
// since string literals shall not be modified (i.e. "foo"[0] = 'a' is ILLEGAL.)
const char *my_str = "This is my very own string literal";
printf("%c\n", *my_str); // => 'T'
// This is not the case if the string is an array
// (potentially initialized with a string literal)
// that resides in writable memory, as in:
char foo[] = "foo";
foo[0] = 'a'; // this is legal, foo now contains "aoo"
function_1();
} // end main function
///////////////////////////////////////
// Functions
///////////////////////////////////////
// Function declaration syntax:
// <return type> <function name>(<args>)
int add_two_ints(int x1, int x2)
{
return x1 + x2; // Use return to return a value
}
/*
Functions are pass-by-value, but you can make your own references
with pointers so functions can mutate their values.
Example: in-place string reversal
*/
// A void function returns no value
void str_reverse(char *str_in)
{
char tmp;
size_t ii; // same unsigned type as len, so the loop comparison is not signed/unsigned
size_t len = strlen(str_in); // `strlen()` is part of the C standard library
for (ii = 0; ii < len / 2; ii++) {
tmp = str_in[ii];
str_in[ii] = str_in[len - ii - 1]; // ii-th char from end
str_in[len - ii - 1] = tmp;
}
}
// Built-in functions:
// from stdio.h:
// (echo_input is a hypothetical wrapper name; statements must live inside a function)
void echo_input(void)
{
int c; // getchar() returns an int, because char is not guaranteed
// to be able to hold EOF, which is used below.
// getchar() reads one character from input. If input = hi, only h is read.
while ((c = getchar()) != EOF) { // EOF is the "end of file" constant.
// Linux: CTRL+D, Windows: CTRL+Z
// The parentheses around c = getchar() are required because
// != has higher precedence than =.
putchar(c); // prints a character (without newline at end)
}
}
/*
char c[] = "This is a test.";
str_reverse(c);
printf("%s\n", c); // => ".tset a si sihT"
*/
///////////////////////////////////////
// User-defined types and structs
///////////////////////////////////////
// Typedefs can be used to create type aliases
typedef int my_type;
my_type my_type_var = 0;
// Structs are just collections of data, the members are allocated sequentially,
// in the order they are written:
struct rectangle {
int width;
int height;
};
// It's not generally true that
// sizeof(struct rectangle) == sizeof(int) + sizeof(int)
// due to potential padding between the structure members (this is for alignment
// reasons). [1]
void function_1()
{
struct rectangle my_rec;
// Access struct members with .
my_rec.width = 10;
my_rec.height = 20;
// You can declare pointers to structs
struct rectangle *my_rec_ptr = &my_rec;
// Use dereferencing to set struct pointer members...
(*my_rec_ptr).width = 30;
// ... or even better: prefer the -> shorthand for the sake of readability
my_rec_ptr->height = 10; // Same as (*my_rec_ptr).height = 10;
}
// You can apply a typedef to a struct for convenience
typedef struct rectangle rect;
int area(rect r)
{
return r.width * r.height;
}
// If you have large structs, you can pass them "by pointer" to avoid copying
// the whole struct. (C has no function overloading, so this version needs
// its own name.)
int area_ptr(const rect *r)
{
return r->width * r->height;
}
///////////////////////////////////////
// Function pointers
///////////////////////////////////////
/*
At runtime, functions are located at known memory addresses. Function pointers are
much like any other pointer (they just store a memory address), but can be used
to invoke functions directly, and to pass handlers (or callback functions) around.
However, definition syntax may be initially confusing.
Example: use str_reverse from a pointer
*/
void str_reverse_through_pointer(char *str_in) {
// Define a function pointer variable, named f.
void (*f)(char *); // Signature should exactly match the target function.
f = &str_reverse; // Assign the address for the actual function (determined at runtime)
// f = str_reverse; would work as well - functions decay into pointers, similar to arrays
(*f)(str_in); // Just calling the function through the pointer
// f(str_in); // That's an alternative but equally valid syntax for calling it.
}
/*
As long as function signatures match, you can assign any function to the same pointer.
Function pointers are usually typedef'd for simplicity and readability, as follows:
*/
typedef void (*my_fnp_type)(char *);
// Then used when declaring the actual pointer variable:
// ...
// my_fnp_type f;
```
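
The function pointer section mentions passing handlers (or callback functions) around. The sketch below shows one way that can look, using the standard library's `qsort`, which takes a comparison callback. The names `compare_ints`, `int_transform`, `apply_to_all`, `twice` and `sort_then_double` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical comparator for this sketch: orders two ints ascending.
// qsort passes the callback pointers to two elements as const void *.
int compare_ints(const void *lhs, const void *rhs)
{
    int a = *(const int *)lhs;
    int b = *(const int *)rhs;
    return (a > b) - (a < b); // -1, 0 or 1; avoids the overflow a - b can cause
}

// Hypothetical callback type: any function taking an int and returning an int.
typedef int (*int_transform)(int);

// Calls fn on every element and stores the result back in place.
void apply_to_all(int *values, size_t count, int_transform fn)
{
    size_t i;
    assert(values != NULL && fn != NULL);
    for (i = 0; i < count; i++) {
        values[i] = fn(values[i]);
    }
}

// A simple function whose signature matches int_transform.
int twice(int n)
{
    return n * 2;
}

// Sorts the array ascending via qsort, then doubles each element.
void sort_then_double(int *values, size_t count)
{
    qsort(values, count, sizeof values[0], compare_ints);
    apply_to_all(values, count, twice);
}
```

Calling `sort_then_double` on an array holding `{4, 1, 3, 2}` with a count of 4 leaves it as `{2, 4, 6, 8}`. Neither `qsort` nor `apply_to_all` knows which function it will be handed; each only relies on the signature of the pointer it receives.
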
## Further Reading
Best to find yourself a copy of [K&R, aka "The C Programming Language"](https://en.wikipedia.org/wiki/The_C_Programming_Language).
It is *the* book about C, co-written by Dennis Ritchie, the creator of C, and Brian Kernighan. Be careful, though - it's ancient and it contains some
inaccuracies (well, ideas that are not considered good anymore) or now-changed practices.
Another good resource is [Learn C the hard way](http://c.learncodethehardway.org/book/).
If you have a question, read the [comp.lang.c Frequently Asked Questions](http://c-faq.com).
It's very important to use proper spacing, indentation and to be consistent with your coding style in general.
Readable code is better than clever code and fast code. For a good, sane coding style to adopt, see the
[Linux kernel coding style](https://www.kernel.org/doc/Documentation/CodingStyle).
Other than that, Google is your friend.
[1] http://stackoverflow.com/questions/119123/why-isnt-sizeof-for-a-struct-equal-to-the-sum-of-sizeof-of-each-member