Saturday, January 26, 2008

Where are your tools?

A new year brings new hope and a revived enthusiasm for putting things in your life in order. That's exactly the purpose of this post: I intend to help students who want to strengthen their programming skills this year.

I have observed that many college students who like to program in C and C++ know little or nothing about the tools that professional programmers use daily. Tools like make, gdb, prof/gprof, valgrind/Electric Fence and CVS/Subversion help a great deal with programming projects; in fact, programmers can't imagine doing their work without them. You seriously need them for large-scale projects, and even on smaller projects you will notice how helpful they are.

My plan in this post is to give such students a short intro to these tools and point to places where they can start learning them. As you may have noticed, there is a bias towards the Linux/UNIX environment here. That's because it's really a much more powerful environment for programming than Windows. You need to get good in the Linux/UNIX environment if you really want to learn more about C and C++ programming.

Note: The book The Linux Programmer's Toolbox, by John Fusco, is a must-have for anyone looking to do serious programming in the Linux environment, and it covers all the tools I mentioned. It's also very well written.

So let's start with 'make'. When you are writing an application, you will have many source files that roughly reflect how you have broken your app into modules. Almost always, you will need to change some module of your app to get things working or to add a new feature. Once you have made the changes, you need to recompile. If you don't know make yet, this usually means recompiling all your source files, which is a waste of time and CPU resources because you changed only one or a small number of modules. Ideally, you should not recompile the ones you haven't changed. This is where make is so helpful: it compares file modification times and rebuilds only what is out of date, and its importance grows with the size of your application's code. The book C in a Nutshell, by Prinz and Crawford, is also a good resource for learning about gcc, make and gdb. You should try to use GNU make, as it is the one used in almost all open-source projects. The GNU make page has documentation and other relevant information. Even if the documentation looks a bit heavy, make sure you read it every once in a while to learn more about the capabilities and intricacies of make; it will help you in the long run.
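
To make the idea concrete, here is a rough sketch in C of the decision make takes for a single target: rebuild only when the target is missing or older than the file it is built from. The helper name needs_rebuild is my own invention for illustration and is not part of make; real make also follows chains of dependencies, pattern rules and much more.

```c
#include <sys/stat.h>

/* Hypothetical helper, not part of make.
 * Returns  1 if `target` must be rebuilt from `source`,
 *          0 if `target` is up to date,
 *         -1 if `source` itself cannot be examined. */
int needs_rebuild(const char *target, const char *source)
{
    struct stat t, s;

    if (stat(source, &s) != 0)
        return -1;      /* no source: nothing to build from */
    if (stat(target, &t) != 0)
        return 1;       /* target missing: it has to be built */

    /* Rebuild only when the source was modified after the target. */
    return s.st_mtime > t.st_mtime ? 1 : 0;
}
```

For example, needs_rebuild("main.o", "main.c") would return 1 right after you edit main.c (both file names are just placeholders), and 0 again once main.o has been recompiled.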

Now for another essential tool, 'gdb'. gdb is the GNU debugger. If you still debug by adding print statements, it's important to learn how to use a debugger, because it can save you a lot of time and labour. A debugger can display the value of any variable at any point in the code, so you don't have to guess which values to print, or keep removing and rewriting print statements again and again. You can also put breakpoints anywhere and let a particular section of the program run to get an idea of what's going on. Once again, the book C in a Nutshell is a good place to learn gdb, and once again the GNU manual for gdb is the authoritative source.
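
If you want something small to practise on, a function like the one below will do. It is only a sketch of mine (sum_to is a made-up name), and the gdb commands in the comments are the basic ones I would try first, assuming you compiled with gcc -g so that debugging information is available.

```c
/* Sums the integers 1..n; returns 0 when n <= 0.
 *
 * A possible gdb session once this is called from your program:
 *   break sum_to     stop whenever sum_to is entered
 *   run              start the program
 *   next             execute one source line
 *   print total      show the current value of total
 *   continue         resume until the next breakpoint or the end
 */
long sum_to(int n)
{
    long total = 0;

    for (int i = 1; i <= n; i++)
        total += i;     /* step through here and watch total grow */

    return total;
}
```

Call it from a small main of your own, for instance printing sum_to(10), and compare what gdb shows at each step with what you expected.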

Profilers become important when you want to improve the efficiency of your code. You need to identify the hotspots in your code, that is, the portions that take up most of the execution time. Then you can think about why they take that much time and try to substitute better algorithms. prof and gprof are common profilers available on Linux/UNIX systems. You can learn about gprof from this great doc, GNU gprof, by Jay Fenlason and Richard Stallman. This Sun doc is also a very good one.
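
Here is a small sketch of the kind of hotspot a profiler helps you find, and of the "better algorithm" you might replace it with. Both function names are made up for this example. If you call both from a program compiled with gcc -pg and run it, a gmon.out file is written, and running gprof on the executable and gmon.out should show the slow version dominating the run time for large inputs.

```c
#include <stddef.h>

/* Slow version: recomputes every prefix sum from scratch, O(n^2). */
long sum_of_prefix_sums_slow(const int *a, size_t n)
{
    long result = 0;

    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j <= i; j++)
            result += a[j];

    return result;
}

/* Faster version: keeps a running prefix sum, O(n). */
long sum_of_prefix_sums_fast(const int *a, size_t n)
{
    long result = 0;
    long prefix = 0;

    for (size_t i = 0; i < n; i++) {
        prefix += a[i];     /* sum of a[0..i] */
        result += prefix;
    }

    return result;
}
```

Both functions return the same value; only the amount of work differs, which is exactly what the profile makes visible.
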

Let's move on to memory debuggers. Electric Fence is a memory debugger that makes the program crash at the moment a memory error occurs, so a debugger like gdb can be used to inspect the code that caused the error. You can learn about eFence (as it is called) on this page by the author of the software, Bruce Perens. He also points to some tools that are better than eFence; however, eFence is simpler and a good starting point.

'valgrind' is actually a suite of tools that can automatically detect many memory management and threading bugs, and profile your code in detail. valgrind emulates a CPU and so really slows down your program. You can do some tuning to improve the runtime a bit. It's heavy-weight compared to less memory-demanding tools like Purify. valgrind consists of five tools: memcheck (a memory debugging tool), cachegrind (a profiling tool which provides information on cache hits and misses and branch prediction events), callgrind (a tool that shows the call relationships and costs), massif (a space profiling tool which provides information on the parts of your code that allocate memory) and helgrind (a debugging tool for threaded programs). You can download valgrind from the official site, where you will also find the very detailed and good manual. Again, the manual may look heavy, but you should read it.
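
As an illustration of what these tools catch, here is a sketch of a string-copy helper (copy_string is a name I made up). The code as written is correct; the comment describes the classic off-by-one mistake that memcheck (valgrind --tool=memcheck ./yourprogram) or a program linked against Electric Fence (-lefence) would flag. The program name is a placeholder.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of s, or NULL if
 * the allocation fails. The caller must free() the result. */
char *copy_string(const char *s)
{
    size_t len = strlen(s);

    /* Classic bug: writing malloc(len) here forgets the terminating
     * '\0', so the memcpy below would write one byte past the end of
     * the block. That write is the kind of error a memory debugger
     * reports. */
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, s, len + 1);   /* len characters plus the '\0' */
    return copy;
}
```

Forgetting the free() in the caller is the other common mistake; memcheck reports such leaks when the program exits.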

Well, all the above tools help you with the programming logic and builds. You also need a good tool to manage your code when you are working on a project of considerable size, since you will be revising your code and you need to keep versions of it so that you can fall back to one that worked. Also, when you are working in a team, you need a tool that gives all the programmers safe access to the code, that is, one person's edits are not overwritten by another's and everyone gets the latest version of the code. In a nutshell, you need a version control system like RCS, CVS or Subversion. RCS is probably the oldest, CVS is the most tried and tested, and Subversion is the newer one, with a set of exciting new functionality not in CVS. In the beginning stages it really won't make a difference which one you pick, but eventually you should use CVS or a newer alternative like Subversion. The official Subversion book is a good place for learning Subversion. Some good places to learn CVS are the cvshome.org archive and the Utah University page.

Ok, so with that I end my intro to these tools. I hope I have sparked some interest in some of you and that you will go and try using them. And believe me, you should learn and use them, because they are really necessary when you work on bigger projects. You will save a lot of time and improve your skills. Plus you will feel more like a professional programmer, and that's cool, ain't it? ;-)