The Tools and Libraries

Before we start learning the C language, let’s first take a look at the tools we need. We already know that C is a ‘Compile to Run’ language: after we write our program, or rather the source code for the program, we need a ‘compiler’ to turn the source into a runnable program. On Linux, we’ll be using gcc, the GNU Compiler Collection. gcc does more than compile code, but for now it is good enough to remember that gcc is the magic tool that turns our C source code into a real program.

For example, if we have already written our program source in a file called “hello.c”, we can use gcc to ‘compile’ it,

gcc hello.c -o hey

This will create a runnable program called ‘hey’. If ‘hey’ has problems, we can edit the ‘hello.c’ file, then run “gcc hello.c -o hey” again to generate a newer version of the ‘hey’ program.

Sounds straightforward?

There’s more.

Invoking gcc manually each time like the above is ok if our program is simple and only has one or a few source files. Now imagine a complex program with thousands of different source files; we’d need some other tool to manage the way ‘gcc’ is called. One of the most popular tools is called ‘make’. We’ll not learn ‘make’ here, but it is something to keep in mind: once our program grows beyond a few files, we’ll need tools like ‘make’ to manage how ‘gcc’ is called. For even more complex programs, there are tools that work on top of ‘make’ to further simplify the ‘compile’ process, such as automake and cmake.

Other than ‘gcc’, there’s also the library part.

A programming language is designed so that we can give working instructions to the CPU. However, it would be a problem if every programmer needed to write everything from scratch. For example, in order to create a file on disk, we need to be able to access the disk itself, find an available directory, open a file, write to it, save it, then close it. Each programmer would probably write the same thing over and over to handle the disk access and the file open/write/save/close part. Ideally the code can be shared, so once it is written by one programmer, other programmers can just reuse it.

One of the mechanisms for reusable code is called a ‘library’. Think of a ‘library’ as a collection of shared code that can be used by any programmer.

‘gcc’ plus the ‘C library’ give us the base for C programming. To put it all together: when we program, we write our program following the C language grammar, and we use shared code from the library.

Ok, enough talking, let’s see how to use ‘gcc’ and how to use code from the library. Open your favourite editor (I am impressed and pleased to see that Frank has actually learned to use emacs already!) and create a file named “hello.c” with the following lines,

Now, type the following,

gcc hello.c -o hey

If you see something printed on your terminal, something went wrong; double check to make sure there’s no typo. If nothing happens, that’s good. Remember that “no news is good news”? It means our compiling succeeded. You should now see that a file named ‘hey’ has been generated. ‘ls -l hey’ even shows that the file is runnable!

Let’s run it,

./hey

You see ‘hello’ printed on the terminal. It works! We just compiled our first C program!

We used the ‘gcc’ tool, but where is the library?

In this simple example, we only used the standard input/output library, for ‘printf’ support; that’s what the “#include <stdio.h>” line is for. Including it tells the compiler what ‘printf’ looks like, and ‘gcc’ then automatically links in the shared code from the standard library. There is a lot more to learn about working with libraries once we have learned the language. For now, it is good enough to know that including <stdio.h> is sufficient to bring in the standard shared code.

Now that we have learned all the programming basics, we are finally ready to learn the programming language itself …

The Language

It was exciting to see our first ‘hello’ program running. That wasn’t very hard either; programming seems quite easy after all.

Maybe we can write a game with that technique?

The answer is yes and no. Yes, we can write a simple game with what we have learned so far. No, it can’t be a fancy game if we use a programming language that basically does what we can do in a terminal.

We already learned that in order for the computer to understand our program, we need to use a certain language. The computer can understand many different types of languages, so we need to pick one for ourselves.

So what options do we have?

There are many programming languages available, and it is nearly impossible to master all of them. The good news is that most programming languages are conceptually the same: if you master one, it will be easy to learn the next one.

We already learned that the CPU understands certain instructions. Various programming languages are designed to supply the CPU with those ‘instructions’, as it would otherwise be difficult to work directly with CPU instructions. To make talking to the CPU easier, there are two different approaches,

  • Write to Run Language

With this type of programming language, we can write the program to a file, then give the file executable permission, and the file becomes a runnable program!

This is what we did with ‘hello’ program.

There are many different languages that work like this; the one the ‘hello’ program is written in is called ‘shell script’. A shell script can do complicated things, but basically it is just a way of grouping commands that we could type on a terminal. It is a good choice when we want to automate things.

Some of the languages in this group can do fancier stuff, like drawing on the screen, plotting curves, etc. ‘Python’ and ‘Perl’ are both good and powerful ones.

  • Compile to Run Language

With this type of programming language, the files we write are called source files. But source files themselves cannot run. We have to use special tools to ‘compile’ them and generate different files as the runnable program. C, C++ and Java are popular ones in this group.

What is the difference between the two? Why does the “compile to run” kind even exist? Why do we add the extra ‘compile’ step?

There can be many differences, depending on which languages we compare. The primary difference is performance. The ‘write to run’ ones rely on an interpreter to understand the instructions in the program and run them. The interpreter itself is a program, which runs on the CPU. So basically the interpreter is a middle man between the program and the CPU. The ‘compile to run’ ones, though, remove the middle man: the ‘compile’ step turns the program into instructions that can be understood and run directly by the CPU.

Ok, the above is an overly simplified description. Java, for example, even after being compiled, does not run directly on the CPU; instead it runs on a virtual machine. We’ll learn all that when we get to Java programming. Some languages may not fall into either of the above groups. But it is good for now to remember these two, and their primary differences.

Programming languages are designed to handle different types of tasks. For example, when trying to automate some commands that we have to type on the terminal over and over, we may choose shell script. If we need to work directly with hardware, we may use the C language.

Some programming languages are also designed to be ‘portable’. For example, if we need to write a program that can run on more than one operating system (recall that our program has to run in a certain OS?), we might choose a language that promises ‘write once, run anywhere’. Java would be a good fit.

So what will we be learning here? We’ll start with the very basic, yet powerful one: the C language.