C Code Optimization Techniques Pdf

28 May 2012 (CPOL)

Optimization of code is done by applying code transformations to improve performance characteristics such as execution time, code size, and resource utilization. These transformations can be made either at a high level or at a low level. In computing, optimization is the process of modifying a system to improve its efficiency.

Introduction

This article is intended to introduce software developers to the topic of optimization techniques. To do this, several optimization techniques will be explored.

As a first step, I have chosen an easy-to-understand algorithm to which I have applied various optimization techniques:

The problem we will solve is the 3n + 1 problem: for every number n between 1 and 1000000, apply the following function:

f(n) = n / 2      if n is even
f(n) = 3n + 1     if n is odd

until the number becomes 1, counting the number of times the function was applied.

This algorithm will be executed for all the numbers between 1 and 1000000. No input will be read from the keyboard, and the program will print the result, followed by the execution time (in milliseconds) needed to compute it.

The test machine is a laptop with the following specs: AMD Athlon 2 P340 Dual Core 2.20 GHz, 4 GB of RAM, Windows 7 Ultimate x64.

Languages used for implementation: C# and C++ (Visual Studio 2010).

Prerequisite

N/A

Different implementations for the same problem

The initial version of the implementation: for each number between 1 and 1000000, the above-mentioned algorithm is applied, generating a sequence of numbers until n becomes 1. The steps needed to reach 1 are counted, and the maximum number of steps is determined.

C++ code:
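As a minimal sketch of the baseline described above (this is not the article's exact listing; the function names are hypothetical, and it assumes the cycle length counts both the starting number and the final 1):

```cpp
#include <cstdint>

// Number of terms in the 3n + 1 sequence that starts at n and ends at 1.
// Assumption: the count includes both n and the final 1.
int cycleLength(std::uint64_t n) {
    int length = 1;
    while (n != 1) {
        if (n % 2 == 1) {
            n = 3 * n + 1;   // odd number
        } else {
            n = n / 2;       // even number
        }
        ++length;
    }
    return length;
}

// Largest cycle length over all starting values in [first, last].
int maxCycleLength(std::uint64_t first, std::uint64_t last) {
    int best = 0;
    for (std::uint64_t n = first; n <= last; ++n) {
        const int length = cycleLength(n);
        if (length > best) {
            best = length;
        }
    }
    return best;
}
```

Calling maxCycleLength(1, 1000000) would correspond to the computation that is timed in the tables below. A 64-bit unsigned type is used here because intermediate values can exceed the 32-bit range.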

C# code:

I compiled the code as both Debug and Release builds, in both 32-bit and 64-bit versions. I then ran every executable 100 times and computed the average time (in ms) it takes to do the calculations.

Here are the results:

              C++ Debug   C++ Release   C# Debug   C# Release
x86 version   6882.91     6374.50       6358.41    5109.90
x64 version   1020.78     812.71        1890.36    742.28

The first thing to observe in the table is that the 32-bit builds are roughly 3 to 8 times slower than the 64-bit builds. This is because on x64 a single register can hold a long long value, while on x86 two registers are needed, so operations on long long values are slow on x86. For this reason, the 32-bit builds will not be examined further in this article.

The second thing to notice is the difference between the Release and Debug builds, and that this difference is bigger for C# than for C++.

Another observation is the difference between the C# Release and C++ Release versions. Together with the previous observation, this makes me believe that the C# compiler optimizes better than the C++ compiler (perhaps even employing some of the optimization techniques we are going to talk about later).

The first optimizations I will apply are related to performing the mathematical operations faster, by replacing the conventional way of doing them with an unconventional one.

If we look at the above code, we see that it contains only 3 complex mathematical operations: modulo 2 (%), multiplication by 3 (*), and division by 2 (/).

The first operation I will optimize is the modulo 2. We know that all numbers are represented in memory as a sequence of bits. We also know that the representation of an odd number always has its last bit set to 1 (5 = 101, 13 = 1101, etc.) and the representation of an even number always has its last bit set to 0 (6 = 110, 22 = 10110). So if we can get the last bit of a number and test it against 0, we know whether the number is odd or even. To get the last bit of a number, I use the bitwise AND operator (&).

In C++, replace:

with:

In C#, replace:

with:
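A sketch of the substitution in C++, assuming the parity test is wrapped in small helper functions (both names are hypothetical and only serve to show the two forms side by side):

```cpp
#include <cstdint>

// Conventional test: remainder of the division by 2.
bool isOddModulo(std::uint64_t n) {
    return n % 2 == 1;
}

// Bitwise test: the lowest bit of an odd number is always 1.
bool isOddBitwise(std::uint64_t n) {
    return (n & 1) == 1;
}
```

Both functions return the same answer for every unsigned input; whether the second form is actually faster depends on the compiler and build settings, as the measurements below suggest.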

Here are the results:

C++ Debug   C++ Release   C# Debug   C# Release
922.46      560.86        1641.41    714.10

The C++ Release version benefits most from this optimization. The difference in improvement between the C++ Release and Debug versions leads me to believe that the compiler is able to remove more instructions in the Release build with the new code.

C# does not seem to benefit much from this optimization.

The next operation I will try to optimize is the division by 2. If we look again at the binary representation of the numbers, we can observe that dividing by 2 discards the last bit of the number and adds a 0 bit in front of the remaining bits. So 5 (= 101) / 2 = 2 (= 010), 13 (= 1101) / 2 = 6 (= 0110), 6 (= 110) / 2 = 3 (= 011), etc. I will replace this operation with the bitwise right shift operation, which produces the same result.

In C++, replace:

with:

In C#, replace:

with:
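A sketch of this replacement in C++, again with hypothetical helper names. It assumes unsigned operands; for negative signed values, division and right shift can round differently, so the two forms are only interchangeable for non-negative numbers.

```cpp
#include <cstdint>

// Conventional halving with integer division.
std::uint64_t halveDivide(std::uint64_t n) {
    return n / 2;
}

// Halving with a right shift by one bit: the last bit is discarded.
std::uint64_t halveShift(std::uint64_t n) {
    return n >> 1;
}
```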

Here are the results:

C++ Debug   C++ Release   C# Debug   C# Release
821.58      555.96        1432.01    652.11

The C++ Debug, C# Debug, and C# Release versions gain roughly 60 to 210 milliseconds from this optimization.

C++ Release gains almost nothing from this replacement, probably because the compiler was already performing this optimization.

The last mathematical operation that consumes time is the multiplication by 3. The only thing we can do with this operation is to replace it with additions.

In C++ replace:

with:

In C# replace:

with:
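A sketch of the replacement in C++ (hypothetical helper names; the article's exact expression is not shown, so the additive form below is one plausible way to write it):

```cpp
#include <cstdint>

// Conventional form: one multiplication and one addition.
std::uint64_t nextOddMultiply(std::uint64_t n) {
    return 3 * n + 1;
}

// Additive form: the multiplication by 3 is replaced by repeated addition.
std::uint64_t nextOddAdd(std::uint64_t n) {
    return n + n + n + 1;
}
```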

Here are the results:

C++ Debug   C++ Release   C# Debug   C# Release
820.84      548.93        1535.28    629.89

The biggest performance gain can be observed in the C# Release version, followed by the C++ Release version.

The C# Debug version shows decreased performance because the current version of the code executes more instructions than the previous one and the compiler cannot optimize them (it cannot replace them with anything else, because we might need to set a breakpoint on any of them).

There is one last mathematical optimization we can perform, based on some special instructions that the processor implements. These are the so-called conditional move instructions. To get the compiler to generate a conditional move instruction, I will replace the IF statement (which checks whether the number is odd or even) with the ternary operator (?:).

To be able to implement the optimization mentioned above, we need to restate the problem. If the number is even, it is divided by 2 (as the problem requires). If the number is odd, it can be expressed as 2 * n + 1. Applying this to the initial form of the function, we obtain:

f(2n + 1) = 3(2n + 1) + 1 = 6n + 4 = 2(3n + 2)

From the above equation we can see that the result for an odd number is always even, so 2 steps of the algorithm can be performed as 1. We will rewrite the algorithm so that we first compute the next value of the number to test, assuming the current value is even. Then we save the value of the last bit of the current number. If this value is true, we increment the current cycle count and add the current number + 1 to the next value of the number to test. (Note: this optimization will become really important in one of the next articles, when I will talk about SSE.)

In C++ replace:

with:

In C# replace:

with:
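A sketch of the merged step in C++, assuming the same counting convention as before (the starting number and the final 1 are both counted). The function name is hypothetical, and whether the compiler actually emits conditional move instructions for the ternary expressions is not guaranteed.

```cpp
#include <cstdint>

// Cycle length using the merged odd step and no IF statement in the loop body.
int cycleLengthFused(std::uint64_t n) {
    int length = 1;
    while (n != 1) {
        const std::uint64_t odd = n & 1;     // 1 if n is odd, 0 if n is even
        const std::uint64_t half = n >> 1;   // next value, assuming n is even
        // For an odd number, (3n + 1) / 2 equals (n >> 1) + n + 1.
        n = odd ? half + n + 1 : half;
        length += odd ? 2 : 1;               // the odd case covers two steps
    }
    return length;
}
```

For example, starting from 3 the loop visits 5, 8, 4, 2, 1 instead of 10, 5, 16, 8, 4, 2, 1, but still reports a cycle length of 8.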

Here are the results:

C++ Debug   C++ Release   C# Debug   C# Release
1195.38     462.21        1565.01    752.92

Both Debug builds show a slowdown, because we are now executing more instructions compared to the previous versions of the code and the compilers cannot optimize them.

The C# Release version shows a slowdown because there are no conditional move instructions in C#.

The power of this category of instructions is shown by the increased speed of the C++ Release version.

It can be noticed that I did not solve the problem using recursion. For this problem, a recursive algorithm would be extremely slow: the maximum cycle length is 525, so assuming that most of the numbers have a cycle length of around 150 (just a guess, not actually verified), 150 recursive calls for every number between 1 and 1000000 would add up to 150000000 calls. This is clearly not a small number and, because calling a function takes a lot of time, recursion is definitely not a good solution for this problem.
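For comparison, a recursive formulation might look like the sketch below (the function name is hypothetical, and this version was not benchmarked in the article). Every step of the sequence costs one function call.

```cpp
#include <cstdint>

// Recursive cycle length: one call per term of the sequence.
int cycleLengthRecursive(std::uint64_t n) {
    if (n == 1) {
        return 1;
    }
    const std::uint64_t next = (n & 1) ? 3 * n + 1 : n >> 1;
    return 1 + cycleLengthRecursive(next);
}
```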

Points of Interest

It's time to draw the conclusions:

  1. Modulo and division operations take a lot of time and should be replaced by something else.
  2. Try to analyze the problem and obtain an alternate representation of the problem.
  3. Try to eliminate the IF statements from your code when their only purpose is to set some values based on a condition.

The next topic will be about how to make our program faster using threading in C# and C++.

History

  • 27 May 2012 - Initial release.
  • 28 May 2012 - I would like to thank anlarke for pointing out things that could be improved in the article and for submitting his code (C++ Debug time: 546.76 ms, C++ Release time: 386.35 ms). Also I would like to thank Reonekot for his clarification on the WoW topic. He is right and the performance problems are caused by the fact that the registers are 32 bits (for x86) and 64 bits (for x64).

Optimization is a program transformation technique that tries to improve the code by making it consume fewer resources (i.e., CPU, memory) and run faster.

In optimization, high-level general programming constructs are replaced by very efficient low-level code. A code-optimizing process must follow the three rules given below:

  • The output code must not, in any way, change the meaning of the program.

  • Optimization should increase the speed of the program and, if possible, the program should demand fewer resources.

  • Optimization should itself be fast and should not delay the overall compiling process.

Efforts to optimize code can be made at various levels of the compiling process.

  • At the beginning, users can change/rearrange the code or use better algorithms to write the code.

  • After generating intermediate code, the compiler can modify the intermediate code by improving address calculations and loops.

  • While producing the target machine code, the compiler can make use of memory hierarchy and CPU registers.

Optimization can be categorized broadly into two types : machine independent and machine dependent.

Machine-independent Optimization

Definition

In this optimization, the compiler takes in the intermediate code and transforms a part of the code that does not involve any CPU registers and/or absolute memory locations. For example:

This code involves repeated assignment of the identifier item, which if we put this way:

should not only save the CPU cycles, but can be used on any processor.
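A sketch of the kind of loop this passage describes, written as two C++ functions. The function names and the loop bound of 100 are assumptions chosen for illustration; only the identifier item comes from the text.

```cpp
// Before: item is reassigned on every iteration although its value never changes.
int accumulateBefore(int value) {
    int item;
    do {
        item = 10;
        value = value + item;
    } while (value < 100);
    return value;
}

// After: the repeated assignment is moved in front of the loop.
int accumulateAfter(int value) {
    int item = 10;
    do {
        value = value + item;
    } while (value < 100);
    return value;
}
```

The transformation only touches source-level statements, so it does not depend on registers or memory addresses of any particular processor.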

Machine-dependent Optimization

Machine-dependent optimization is done after the target code has been generated, when the code is transformed according to the target machine architecture. It involves CPU registers and may use absolute memory references rather than relative references. Machine-dependent optimizers try to take maximum advantage of the memory hierarchy.

Basic Blocks

Source code generally contains sequences of instructions that are always executed one after another; these sequences are the basic blocks of the code. A basic block does not have any jump statements inside it, i.e., when the first instruction is executed, all the instructions in the same basic block are executed in their order of appearance without losing the flow control of the program.

Constructs such as IF-THEN-ELSE and SWITCH-CASE conditional statements and loops such as DO-WHILE, FOR, and REPEAT-UNTIL introduce branches, and therefore determine where basic blocks begin and end.

Basic block identification

We may use the following algorithm to find the basic blocks in a program:

  • Find the header statement of every basic block, i.e., the statement where a basic block starts:

    • First statement of a program.
    • Statements that are target of any branch (conditional/unconditional).
    • Statements that follow any branch statement.
  • Header statements and the statements following them form a basic block.

  • A basic block does not include any header statement of any other basic block.
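The header-finding rules above can be sketched in C++ as follows. The Instr structure is a hypothetical, stripped-down instruction representation that records only whether an instruction is a branch and where it jumps; a real compiler's intermediate code carries much more information.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical instruction record: only the control-flow facts matter here.
struct Instr {
    bool isBranch;   // true for conditional or unconditional jumps
    int target;      // index of the jump target, or -1 if there is none
};

// Returns the indices of all header statements in ascending order.
std::vector<int> findHeaders(const std::vector<Instr>& code) {
    std::set<int> headers;
    if (code.empty()) {
        return {};
    }
    headers.insert(0);                                  // first statement of the program
    const int size = static_cast<int>(code.size());
    for (int i = 0; i < size; ++i) {
        if (!code[i].isBranch) {
            continue;
        }
        if (code[i].target >= 0 && code[i].target < size) {
            headers.insert(code[i].target);             // target of a branch
        }
        if (i + 1 < size) {
            headers.insert(i + 1);                      // statement following a branch
        }
    }
    return std::vector<int>(headers.begin(), headers.end());
}
```

Each basic block then runs from one header up to, but not including, the next header.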

Basic blocks are important concepts from both the code generation and the optimization point of view.

Basic blocks play an important role in identifying variables that are used more than once in a single basic block. If a variable is used more than once, the register allocated to that variable need not be released until the block finishes execution.

Control Flow Graph

Basic blocks in a program can be represented by means of control flow graphs. A control flow graph depicts how program control is passed among the blocks. It is a useful tool that helps in optimization by locating any unwanted loops in the program.

Loop Optimization

Most programs run as a loop in the system. It becomes necessary to optimize the loops in order to save CPU cycles and memory. Loops can be optimized by the following techniques:

  • Invariant code : A fragment of code that resides in the loop and computes the same value at each iteration is called loop-invariant code. This code can be moved out of the loop so that it is computed only once, rather than with each iteration.

  • Induction analysis : A variable is called an induction variable if its value is altered within the loop by a loop-invariant value.

  • Strength reduction : There are expressions that consume more CPU cycles, time, and memory. These expressions should be replaced with cheaper expressions without compromising the output of the expression. For example, multiplication (x * 2) is more expensive in terms of CPU cycles than (x << 1) and yields the same result.
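The three techniques above can be illustrated together in a small C++ sketch. The function and parameter names are hypothetical; the second function shows the transformations applied by hand, which an optimizing compiler may well perform on its own.

```cpp
#include <vector>

// Straightforward version: recomputes base * scale and multiplies on every iteration.
std::vector<int> fillNaive(int count, int stride, int base, int scale) {
    std::vector<int> out;
    for (int i = 0; i < count; ++i) {
        out.push_back(i * stride + base * scale);
    }
    return out;
}

// Hand-optimized version of the same loop.
std::vector<int> fillOptimized(int count, int stride, int base, int scale) {
    std::vector<int> out;
    const int offset = base * scale;   // loop-invariant code, computed once
    int scaled = 0;                    // induction variable that tracks i * stride
    for (int i = 0; i < count; ++i) {
        out.push_back(scaled + offset);
        scaled += stride;              // strength reduction: addition replaces multiplication
    }
    return out;
}
```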

Dead-code Elimination

Dead code consists of one or more code statements that are:

  • Either never executed or unreachable,
  • Or if executed, their output is never used.

Thus, dead code plays no role in any program operation and therefore it can simply be eliminated.
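A small C++ sketch of both kinds of dead code, with hypothetical function names. The first function contains statements whose results are never used as well as statements that can never be reached; the second is what remains after they are eliminated.

```cpp
// Contains both kinds of dead code described above.
int doubleWithDeadCode(int x) {
    int scratch = x * 42;     // executed, but the result is never used
    scratch = scratch + 1;    // likewise never used afterwards
    return x * 2;
    x = x + 100;              // unreachable: follows an unconditional return
    return x;
}

// Same observable behavior after dead-code elimination.
int doubleValue(int x) {
    return x * 2;
}
```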

Partially dead code

There are some code statements whose computed values are used only under certain circumstances, i.e., sometimes the values are used and sometimes they are not. Such codes are known as partially dead-code.

The above control flow graph depicts a chunk of program where variable ‘a’ is used to assign the output of expression ‘x * y’. Let us assume that the value assigned to ‘a’ is never used inside the loop. Immediately after the control leaves the loop, ‘a’ is assigned the value of variable ‘z’, which would be used later in the program. We conclude here that the value assigned to ‘a’ inside the loop is never used anywhere, therefore that assignment is eligible to be eliminated.

Likewise, the picture above depicts that the conditional statement is always false, implying that the code, written in true case, will never be executed, hence it can be removed.

Partial Redundancy

Redundant expressions are computed more than once in parallel paths, without any change in operands, whereas partially redundant expressions are computed more than once in a path, without any change in operands. For example,

[redundant expression]

[partially redundant expression]

Loop-invariant code is partially redundant and can be eliminated by using a code-motion technique.

Another example of a partially redundant code can be:

We assume that the values of operands (y and z) are not changed from assignment of variable a to variable c. Here, if the condition statement is true, then y OP z is computed twice, otherwise once. Code motion can be used to eliminate this redundancy, as shown below:

Here, whether the condition is true or false, y OP z is computed only once.
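The effect of this code motion can be sketched in C++. Addition stands in for the unspecified OP, and the evaluations counter and all function names are hypothetical additions that make the number of computations observable.

```cpp
// Stand-in for "y OP z"; counts how often the expression is evaluated.
int op(int y, int z, int& evaluations) {
    ++evaluations;
    return y + z;
}

// Before: when the condition is true, y OP z is evaluated twice.
int beforeMotion(bool condition, int y, int z, int& evaluations) {
    int a = 0;
    if (condition) {
        a = op(y, z, evaluations);
    }
    const int c = op(y, z, evaluations);
    return a + c;
}

// After: each path evaluates y OP z exactly once and reuses the result.
int afterMotion(bool condition, int y, int z, int& evaluations) {
    int a = 0;
    int tmp;
    if (condition) {
        tmp = op(y, z, evaluations);
        a = tmp;
    } else {
        tmp = op(y, z, evaluations);
    }
    const int c = tmp;
    return a + c;
}
```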