Sunday 13 February 2011

[YAGE] Memory Management - Introduction

It seems to me that memory management is a subject that has to be dealt with in the YAGE project. I've been doing some research on the subject whenever I could spare the time.
This is, maybe, an introduction to memory management techniques.


1. Risks of reckless code

Every C programmer knows how error-prone dynamic memory management is. C++ inherits that critical feature of the C language and comes with its own memory troubles. Pointers, references and polymorphism are a double-edged sword. They are powerful assets for a C++ programmer, especially when used to implement object-oriented design patterns and C++ idioms. But misusing that power is, in the best case, a source of bugs.
Sometimes, the errors caused by clumsy use of memory and object-oriented features are hard to detect, and even when we do detect them, e.g. through tests, the root cause can be rather difficult to find. This kind of problem is not specific to large-scale applications: even small pieces of code can suffer from major memory problems, as in the wikileaks function:

Example 1 - Memory leaks
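 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>

 /* Appends n dashes to str, one per recursive call. Leaks on purpose. */
 char* wikileaks(const char* str, int n)
 {
    char* tmp;
    size_t len;
    if(n > 0)
    {
       len = strlen(str);
       /* room for the copy, the extra '-' and the terminating '\0' */
       tmp = (char*)(malloc(sizeof(char) * (len + 2)));
       strcpy(tmp, str);
       tmp[len++] = '-';
       tmp[len] = '\0';
       /* str (the previous tmp) is never freed: this is the leak */
       return wikileaks(tmp, --n);
    }
    return (char*)(str);
 }

 int main()
 {
    printf("The result is : %s\n", wikileaks("Huh", 10));
    return 0;
 }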

In this example, there are n malloc calls without a corresponding free. Moreover, after the i-th recursive call, we lose the only pointer to the (i-1)th char* argument, so the intermediate buffers can never be freed, and main never frees the buffer that is finally returned. The greater n is, the more memory is lost.
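For comparison, here is a minimal leak-free sketch of the same idea in C++. It assumes the caller only needs the final string, and it lets std::string own the buffer instead of raw malloc calls. The name append_dashes is hypothetical and is not part of YAGE.

```cpp
#include <string>

// Hypothetical leak-free counterpart of wikileaks: std::string owns the
// buffer, so there is no raw allocation for anyone to forget to free.
std::string append_dashes(const std::string& str, int n)
{
   std::string result = str;
   if (n > 0)
   {
      // Append all n dashes at once instead of one per recursive call.
      result.append(static_cast<std::string::size_type>(n), '-');
   }
   return result;
}
```

Calling append_dashes("Huh", 10) builds the same text as wikileaks("Huh", 10), and the memory is released when the returned string goes out of scope.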


2. Now what's the deal?

The purpose of this article is to present some efficient ways to cure the problems related to dynamic memory allocation, such as the one seen in the first example. Therefore, errors related to object-oriented features, like object slicing, won't be covered here. Anyway, I recommend this series of articles, which is pretty exhaustive:
Part 1
Part 2
Part 3

Since the primary goal of this article is to find the most efficient memory management mechanism for YAGE, we won't go through programming best practices and will focus on code only.
There are a couple of concepts that you should know about:
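  • Wild pointers: a wild pointer points to an arbitrary memory block, either because it was never initialized or because the block it pointed to has already been freed.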
    Example 2 - Wild Pointers
     int main()
     {
        int* pa;
        int* pb;
        {
           int* pc = new int;
           pb = pc;
           delete pc;
        }
        return 0;
     }
    
    Here, by the end of the main function, both pa and pb are wild pointers: pa was never initialized, and pb still holds the address of the block that was freed through pc.
  • Pointer ownership: whoever (function or method) allocates memory should be responsible for freeing it.
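The ownership rule can be made visible in code. The sketch below assumes a C++11 compiler and shows two variants: a function that frees what it allocates, and a function that hands ownership to its caller through std::unique_ptr. The names sum_squares and make_counter are hypothetical, chosen only for illustration.

```cpp
#include <cstddef>
#include <memory>

// The function allocates the scratch buffer, so the function frees it.
long sum_squares(std::size_t count)
{
   long* squares = new long[count];
   long total = 0;
   for (std::size_t i = 0; i < count; ++i)
   {
      squares[i] = static_cast<long>(i * i);
      total += squares[i];
   }
   delete[] squares;   // the same owner releases the memory
   return total;
}

// Ownership is handed to the caller explicitly through the return type;
// the int is deleted when the returned unique_ptr is destroyed.
std::unique_ptr<int> make_counter(int start)
{
   return std::unique_ptr<int>(new int(start));
}
```

In the second variant nobody has to remember to call delete, which is one way to keep the rule from being broken by accident.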

3. Prevention or detection, the trade-off

The problems we can run into in a C++ program are mainly memory leaks and wild pointers. These errors are unpredictable and can lurk in the darkness for some time before showing up at the worst moment to cripple the performance of your program.
Hate it or love it, manual tracing of dynamic memory is nearly impossible. However, with strict coding practices and flawless respect of pointer ownership, one wouldn't need to worry about the use of dynamic memory. Well, if you are that kind of programmer, you have no need to read this article.

Needless to say, in computer science, there are several problems that aren't easy to get a grip on. In this kind of situation, the solution is either to prevent the problem from happening or to detect its occurrence and solve the issues on a case-by-case basis. Note that this is a very common way to solve problems related to computer networks.
Prevention or detection, neither class of solutions has the upper hand over the other, as shown in the following comparison:

Prevention
  + the source of the errors is avoided: automatic management of ownership, plus proper initialization and disposal of memory resources
  + ensures the quality of the program's memory management
  - overhead in the release version
  - it's quite subtle to deal with a problem before it occurs

Detection
  + it's easy to detect a problem once it occurs (or just before it does)
  + the detection system can be deactivated in the final release and only activated during development: no overhead
  - debug-only mechanism
  - once the error is detected, the programmer has to deal with it manually
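To make the detection side more concrete, here is a minimal sketch of a debug-only tracker. It assumes the program routes its allocations through two hypothetical wrappers, tracked_alloc and tracked_free, and it relies on the standard NDEBUG macro to compile the bookkeeping out of release builds. It only counts live blocks; a real detector would also record where each block was allocated.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical debug-only bookkeeping: compiled out when NDEBUG is defined.
#ifndef NDEBUG
static std::size_t g_live_blocks = 0;
#define TRACK_ALLOC() (++g_live_blocks)
#define TRACK_FREE()  (--g_live_blocks)
#else
#define TRACK_ALLOC() ((void)0)
#define TRACK_FREE()  ((void)0)
#endif

// Allocates a block and, in debug builds, counts it as live.
void* tracked_alloc(std::size_t size)
{
   void* block = std::malloc(size);
   if (block != NULL)
   {
      TRACK_ALLOC();
   }
   return block;
}

// Frees a block and, in debug builds, removes it from the live count.
void tracked_free(void* block)
{
   if (block != NULL)
   {
      TRACK_FREE();
      std::free(block);
   }
}

// Number of blocks still alive; always 0 when tracking is compiled out.
std::size_t live_blocks()
{
#ifndef NDEBUG
   return g_live_blocks;
#else
   return 0;
#endif
}
```

Checking live_blocks() at program exit in a debug build reports whether something leaked, but, as noted above, finding and fixing the culprit is still up to the programmer.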

4. Acknowledged Techniques

Alright! Let's get down to business. In the next articles, we'll see some of the most widely implemented techniques to control the use of dynamic memory. They will be presented in the following order:
1. RAII
2. Smart pointers
3. Parent-children management hierarchy
4. Memory pools
5. Placement new
6. Allocators
7. Garbage collection

This list may be modified for one reason or another, though that isn't likely.
TO BE CONTINUED ...
