Layout of a program in memory

December 9th, 2009 by Kevin | No Comments | Filed in C Programming, programming

The image below shows how a program's memory is mapped when it is loaded by the operating system. On a 32-bit system, each program loaded in memory gets its own 4GB virtual address space. The mapping between the actual (physical) memory and the virtual memory is done using a paging system as shown below.

Copyright © en:User:Dysprosia, all rights reserved.

The physical address space is the actual memory into which the program is brought for execution (the RAM). The virtual address space is the range of addresses that the program itself sees; the operating system backs it partly with RAM and partly with swap space created on the secondary storage device (the hard disk). Generally the main memory is smaller than the secondary storage due to cost constraints. To execute a program, it needs to be brought from the secondary storage into the main memory. But what happens if the program is larger than the main memory? To overcome this problem, a paging system is used.

The entire main memory is divided into pages (say 1KB each; 4KB is a common size in practice). When the program is loaded, its pages are brought from the secondary storage into the main memory as they are needed for execution. When the main memory has exhausted all of its pages, the operating system checks which of them are not currently required and swaps those out for new ones from the secondary storage. This is even more useful in multi-tasking systems, where several programs are running concurrently. The swapping is transparent to the program: from its point of view, the entire large program appears to be loaded in main memory. Hence it is known as virtual memory.

Since it is a 32-bit system, the available virtual address space is 2³² bytes = 4GB. This is why it is often said that RAM beyond 4GB makes no difference on a 32-bit system: a program can only address that much (though extensions such as PAE on x86 allow the operating system to use more physical memory, both on Windows and Linux).

The layout of a program in this virtual memory is as shown below:

The kernel space is the area of memory allocated to the kernel. This space cannot be accessed by the user level programs.

The stack is the memory in which the automatic variables and function parameters of the program exist. It is a LIFO (Last In First Out) structure, accessed through a stack pointer that always points to the top of the stack. Each function call gets its own stack frame to hold all of its variables. The stack has a size limit, and growing beyond it (for example through very deep recursion) results in a stack overflow.

The memory mapping segment is a region in which the contents of files can be mapped directly into the program’s address space. Memory mapped regions allow fast file I/O, and they are also used to load dynamic libraries. Functions like mmap() in Linux exist to create and manipulate these mappings.

The heap is the memory space that the program can use for allocating memory dynamically as it is required. Memory allocated from the heap remains valid even after the function in which it was allocated has returned. The program has to take care to release this memory, otherwise it results in a memory leak. Some programming languages like Java have a garbage collector that reclaims memory which is no longer reachable, which avoids most leaks of this kind.

The BSS (Block Started by Symbol) segment is used to store all the global and static variables which are uninitialized. This is a read/write area of memory.

The data segment is used to store all the global and static variables which have been initialized. This is a read/write area of memory.

The text segment stores the program’s binary code (the machine instructions). Constants and string literals are typically stored alongside it in read-only memory. This area is a read only area of memory.

You can learn more details about the anatomy of program memory, here.


The C Compilation model

December 3rd, 2009 by Kevin | No Comments | Filed in C Programming

I had mentioned that the “Hello world” is the first program written in any programming language. Well, below is the program written using the C language:

/* Hello World */

#include <stdio.h>

int main(void)
{
   printf("hello, world\n");
   return 0;
}

And the output is:

hello, world

C Compilation Process

The preprocessor first removes the comments written between /* … */. It then inserts the contents of the stdio.h file in place of the #include <stdio.h> line.

We can write either #include <stdio.h> or #include "stdio.h". The first form tells the preprocessor to search in the standard include directories. The second form tells the preprocessor to search in the local directory first, before falling back to the standard directories.

The compiler translates the C code into assembly language code, which contains instructions specific to the platform (x86, ARM, etc.) for which the program is being built.

The assembler then takes the assembly code and converts it into binary machine code, which is stored in an object file such as hello.o.

To create the executable file, the linker combines the program’s object file with the other object files and libraries whose functions the program uses. In the program above, printf() is a library function whose code the object file doesn’t contain. So the linker links the library containing printf() with this main object file.

(For those who are wondering about a.out: it is an executable file format that is no longer used on most systems. The name has stuck as the default output file name, though, and it stands for assembler output.)
