Friday, September 26, 2008

Floating-point: Bit Value of INF, NAN, DEN

I previously wrote a post that touched on the binary format of IEEE 754, Comparison of Float and Double Precision. For this post I made a table showing the bit values of the IEEE 754 32-bit single-precision float for some special floating-point values: Zero, One, Minus One, Smallest denormalized number, "Middle" denormalized number, Largest denormalized number, Smallest normalized number, Largest normalized number, Positive infinity, Negative infinity, and Not a number (NaN).

[Image: table of bit values for the special single-precision values (Capture 2008-09-26 13.12.57)]

In the column "Watch in Windows", I put what I can see from the watch window in Visual Studio 2008 (Visual C++ Environment).

Sunday, September 14, 2008

Show Formatted Source Code with Syntax Highlight

LibG Code Viewer is a free web tool for viewing source code with syntax highlighting.

After I finished my post How to format source code in your blog, I decided to make a small tool for anyone who posts source code files on the Internet. I called this tool "LibG Code Viewer"; it uses the method I described in that post. Here is how to use it:

http://libg.org/code.php?lang=your source language&url=link to your source code

Below is the list of supported languages, with the parameter value and a sample link for each:

Language      Parameter   Sample
C++           cpp         http://libg.org/code.php?lang=cpp&url=http://libg.org/code/sample.cpp
C#            csharp      http://libg.org/code.php?lang=csharp&url=http://libg.org/code/sample.cs
CSS           css         http://libg.org/code.php?lang=css&url=http://libg.org/code/sample.css
Delphi        delphi      http://libg.org/code.php?lang=delphi&url=http://libg.org/code/sample.pas
Java          java        http://libg.org/code.php?lang=java&url=http://libg.org/code/sample.java
Java Script   js          http://libg.org/code.php?lang=js&url=http://libg.org/code/sample.js
PHP           php         http://libg.org/code.php?lang=php&url=http://libg.org/code/sample.phpcode
Python        py          http://libg.org/code.php?lang=py&url=http://libg.org/code/sample.py
Ruby          ruby        http://libg.org/code.php?lang=ruby&url=http://libg.org/code/sample.rb
Sql           sql         http://libg.org/code.php?lang=sql&url=http://libg.org/code/sample.sql
VB            vb          http://libg.org/code.php?lang=vb&url=http://libg.org/code/sample.vb
XML/HTML      xml         http://libg.org/code.php?lang=xml&url=http://libg.org/code/sample.xml



Friday, September 12, 2008

How to format source code in your blog

When I was writing my post Include Guard: #pragma once vs. #ifndef #define #endif, I searched the Internet and tried at least 5 WordPress plug-ins for inserting formatted source code, but none of them worked well for me. Some do not support C++ source code; some insert complicated HTML into the post. Finally I found SyntaxHighlighter, which is a simple and easy way to prettify my code. There is a WordPress plug-in, Google Syntax Highlighter for WordPress, that makes it convenient for WordPress users to integrate it into a WordPress blog. The plug-in's name may be misleading, though, because the true Google Code highlighting tool is Google Code Prettify.
SyntaxHighlighter is a pure JavaScript based tool and can parse a lot of programming languages:
  • C++ (cpp, c, c++)
  • C# (c#, c-sharp, csharp)
  • CSS (css)
  • Delphi (delphi, pascal)
  • Java (java)
  • Java Script (js, jscript, javascript)
  • PHP (php)
  • Python (py, python)
  • Ruby (rb, ruby, rails, ror)
  • Sql (sql)
  • VB (vb, vb.net)
  • XML/HTML (xml, html, xhtml, xslt)
To learn how to format your source code with this tool, please refer to the usage page at http://code.google.com/p/syntaxhighlighter/wiki/Usage. This post demonstrates the output of SyntaxHighlighter so you can make a quick decision. I used the code from Microsoft MSDN http://msdn.microsoft.com/en-us/library/w5c4hyx3(VS.80).aspx as the sample code for testing.
1. Default usage:
<pre name="code" class="cpp"> ... code here ... </pre>
Result:
// Exhibits polymorphism/virtual functions.

#include <iostream>
#include <string>
#define TRUE 1
using namespace std;

class dog
{
public:
 // Source Code Omitted...

private:
   string _dogSize, _earType;
   int _legs;
   bool _bark;

};

class breed : public dog
{
public:
 /*
 Source Code Omitted...
 */
};

int main()
{
   dog mongrel;
   breed labrador("yellow", "large");
   mongrel.setEars("pointy");
   labrador.setEars("long", "floppy");
   cout << "Cody is a " << labrador.getColor() << " labrador" << endl;
}


2. Display no gutter:


<pre name="code" class="cpp:nogutter"> ... code here ... </pre>


Result:


// Exhibits polymorphism/virtual functions.

#include <iostream>
#include <string>
#define TRUE 1
using namespace std;

class dog
{
public:
 // Source Code Omitted...

private:
   string _dogSize, _earType;
   int _legs;
   bool _bark;

};

class breed : public dog
{
public:
 /*
 Source Code Omitted...
 */
};

int main()
{
   dog mongrel;
   breed labrador("yellow", "large");
   mongrel.setEars("pointy");
   labrador.setEars("long", "floppy");
   cout << "Cody is a " << labrador.getColor() << " labrador" << endl;
}


3. Display no controls at the top


<pre name="code" class="cpp:nocontrols"> ... code here ... </pre>


Result


// Exhibits polymorphism/virtual functions.

#include <iostream>
#include <string>
#define TRUE 1
using namespace std;

class dog
{
public:
 // Source Code Omitted...

private:
   string _dogSize, _earType;
   int _legs;
   bool _bark;

};

class breed : public dog
{
public:
 /*
 Source Code Omitted...
 */
};

int main()
{
   dog mongrel;
   breed labrador("yellow", "large");
   mongrel.setEars("pointy");
   labrador.setEars("long", "floppy");
   cout << "Cody is a " << labrador.getColor() << " labrador" << endl;
}


4. Collapse the block by default


<pre name="code" class="cpp:collapse"> ... code here ... </pre>


Result


// Exhibits polymorphism/virtual functions.

#include <iostream>
#include <string>
#define TRUE 1
using namespace std;

class dog
{
public:
 // Source Code Omitted...

private:
   string _dogSize, _earType;
   int _legs;
   bool _bark;

};

class breed : public dog
{
public:
 /*
 Source Code Omitted...
 */
};

int main()
{
   dog mongrel;
   breed labrador("yellow", "large");
   mongrel.setEars("pointy");
   labrador.setEars("long", "floppy");
   cout << "Cody is a " << labrador.getColor() << " labrador" << endl;
}


5. Begin line count at value. Default value is 1


<pre name="code" class="cpp:firstline[123]"> ... code here ... </pre>


Result


// Exhibits polymorphism/virtual functions.

#include <iostream>
#include <string>
#define TRUE 1
using namespace std;

class dog
{
public:
 // Source Code Omitted...

private:
   string _dogSize, _earType;
   int _legs;
   bool _bark;

};

class breed : public dog
{
public:
 /*
 Source Code Omitted...
 */
};

int main()
{
   dog mongrel;
   breed labrador("yellow", "large");
   mongrel.setEars("pointy");
   labrador.setEars("long", "floppy");
   cout << "Cody is a " << labrador.getColor() << " labrador" << endl;
}


6. Show column numbers in the first line.


<pre name="code" class="cpp:showcolumns"> ... code here ... </pre>


Result


// Exhibits polymorphism/virtual functions.

#include <iostream>
#include <string>
#define TRUE 1
using namespace std;

class dog
{
public:
 // Source Code Omitted...

private:
   string _dogSize, _earType;
   int _legs;
   bool _bark;

};

class breed : public dog
{
public:
 /*
 Source Code Omitted...
 */
};

int main()
{
   dog mongrel;
   breed labrador("yellow", "large");
   mongrel.setEars("pointy");
   labrador.setEars("long", "floppy");
   cout << "Cody is a " << labrador.getColor() << " labrador" << endl;
}





If you are using the WordPress plug-in, be sure to remove the redundant JavaScript files from the plug-in source code to speed up page loading. Go to the WordPress dashboard, Plugins -> Plugin Editor, select "Google Syntax Highlighter for WordPress", and go to the bottom of the source code. I removed the lines similar to the one below and kept only this line, which highlights C++ code:


<script class="javascript" src="<?php echo $current_path; ?>Scripts/shBrushCpp.js"></script>

Thursday, September 11, 2008

Performance: encapsulate floating point and built-in tolerance

When it comes to floating-point comparison with tolerance, two approaches come to mind.
The first approach is to write inline functions and call them whenever floating-point numbers are compared. It is easy to understand and widely used. It is really simple, and I like it. The only concern is that I have to call these functions explicitly whenever I compare two float or double variables, so I have to know every place in the code where a tolerant comparison is needed.
The second approach is to encapsulate the floating-point value in a class. A float variable is the only member variable, and I have to overload a lot of operators: the assignment operator, the relational and equality operators, and the arithmetic, unary, prefix and postfix operators. Implementing the class is not a big deal; what worries me is the performance cost of using a class instead of a plain machine data type.
So I wrote a first version of the class in my open source library LibG and unit-tested its performance. The result looks good to me.
Testing Environment
Hardware1: CPU: AMD Sempron 2600+ (1.6 GHz). Memory: 1 GB, DDR400
Hardware2: CPU: Intel Core2 6300 (1.86 GHz). Memory: 4 GB
Software: Windows XP Professional with SP2
Testing Method
Repeat a set of operations 15 million times. Each loop iteration also costs one additional assignment, one comparison and one pre-decrement. Run the test 10 times and take the average time.
Source Code
Open source portal of LibG: http://code.google.com/p/libg/
The classes giFloat and giDouble: http://code.google.com/p/libg/source/browse/trunk/inc/gifloat.h
The testing code: http://code.google.com/p/libg/source/browse/trunk/test/libg_unit_test/test_gifloat.cpp
Testing Result
1. Arithmetic Operations
Operations Set: +, -, *, /, +=, -=, *=, /=
Testing Result (ms)   float    giFloat   double   giDouble
Hardware1             856.3    837.3     336.1    337.6
Performance                    +2.22%             -0.45%
Hardware2             517.2    517.3     528.1    504.7
Performance                    -0.02%             +4.43%
The performance of the class is very close to that of the machine data type.
2. Relational and Equality Operations
Operations Set: ==, !=, >, >=, <, <=
Result        float            float                giFloat   double           double               giDouble
              (no tolerance)   (inline functions)             (no tolerance)   (inline functions)
Hardware1     378.1            667.2                734.4     270.3            435.9                482.8
Performance                                         -10.07%                                         -9.71%
Hardware2     165.6            282.8                312.7     151.4            237.7                237.3
Performance                                         -9.56%                                          +0.12%
Compared with the inline functions, the class is about 10% slower, which is acceptable to me. The double data type is the most frequently used type in geometry computation, and the result on the Intel Core2 CPU is exciting.

Monday, September 8, 2008

Comparison of Float and Double Precision

C++'s two most commonly used primitive floating point types are float and double. On most platforms these follow the IEEE 754 standard, which defines binary formats for 32-bit single-precision and 64-bit double-precision floating point numbers. IEEE 754 represents floating point numbers as base 2 fractions in scientific notation. A single-precision number dedicates 1 bit to the sign of the number, 8 bits to the exponent, and 23 bits to the mantissa, or fractional part. The exponent is stored with a bias (127 for single precision), which allows negative as well as positive exponents. The fraction is represented in binary (base 2), meaning the highest-order bit corresponds to a value of ½ (2^-1), the second bit ¼ (2^-2), and so on. For double-precision floating point, 11 bits are dedicated to the exponent and 52 bits to the mantissa. The layout of IEEE floating point values is shown in Figure 1.

[Figure 1: layout of IEEE floating point values]

Because any given number can be represented in scientific notation in multiple ways, floating point numbers are normalized so that they are represented as a base 2 number with a 1 to the left of the binary point, adjusting the exponent as necessary to make this requirement hold. So, for example, the number 1.25 would be represented with a mantissa of 1.01 (binary) and an exponent of 0:
(-1)^0 * 2^0 * 1.01 (binary) = 1.25

The number 10.0 would be represented with a mantissa of 1.01 (binary) and an exponent of 3:
(-1)^0 * 2^3 * 1.01 (binary) = 10.0








Here are some sample floating point representations:
0      0x00000000
1.0    0x3f800000
0.5    0x3f000000
3      0x40400000
+inf   0x7f800000
-inf   0xff800000
+NaN   0x7fc00000 or 0x7ff00000
In general, for normalized numbers: number = (sign ? -1 : 1) * 2^(exponent - 127) * 1.(mantissa bits), where exponent is the stored 8-bit exponent field.

As a programmer, it is important to know certain characteristics of your FP representation. These are listed below, with example values for both single- and double-precision IEEE floating point numbers:





































Property                                    Value for float     Value for double
Largest representable number                3.402823466e+38     1.7976931348623157e+308
Smallest number without losing precision    1.175494351e-38     2.2250738585072014e-308
Smallest representable number(*)            1.401298464e-45     5e-324
Mantissa bits                               23                  52
Exponent bits                               8                   11
Epsilon(**)                                 1.1920929e-7        2.220446049250313e-16

Note that all numbers in the text of this article assume single-precision floats; doubles are included above for comparison and reference purposes.

(*)

Just to make life interesting, here we have yet another special case. It turns out that if you set the exponent bits to zero, you can represent numbers other than zero by setting mantissa bits. As long as we have an implied leading 1, the smallest number we can get is clearly 2^-126, so to get these lower values we make an exception. The "1.m" interpretation disappears, and the number's magnitude is determined only by bit positions; if you shift the mantissa to the right, the apparent exponent will change (try it!). It may help clarify matters to point out that 1.401298464e-45 = 2^(-126-23), in other words the smallest exponent minus the number of mantissa bits.

However, as I have implied in the above table, when using these extra-small numbers you sacrifice precision. When there is no implied 1, all bits to the left of the lowest set bit are leading zeros, which add no information to a number (as you know, you can write zeros to the left of any number all day long if you want). Therefore the absolute smallest representable number (1.401298464e-45, with only the lowest bit of the FP word set) has an appalling mere single bit of precision!

(**)

Epsilon is the difference between 1 and the next representable number greater than 1. It is the place value of the least significant mantissa bit when the exponent is zero (i.e., stored as 0x7f).

Saturday, September 6, 2008

Include Guard: #pragma once vs. #ifndef #define #endif

In the C and C++ programming languages, an include guard, sometimes called a macro guard, is a construct used to avoid the problem of double inclusion when dealing with the #include directive. Adding include guards to a header file is one way to achieve this. #pragma once is a non-standard but widely supported preprocessor directive designed to cause the current source file to be included only once in a single compilation. Both approaches ensure that the file is included only once when the compiler compiles a source code file.

The table below compares the details of the #pragma once and #ifndef #define #endif approaches:






















Sample code

#pragma once:
// header file
#pragma once
class foo { };

#ifndef #define #endif:
// header file
#ifndef FOO_HEADER
#define FOO_HEADER
class foo { };
#endif // FOO_HEADER

C++ Standard

#pragma once: No. But both GCC and Microsoft Visual C++ support it.
#ifndef #define #endif: Yes.

Compiling Performance

#pragma once: Better. This can reduce build times, as the compiler will not open and read the file again after the first #include of the module.
#ifndef #define #endif: The compiler still has to open the file multiple times and discard the guarded part when it finds the macro guard. In a large project this could increase compile times. But you can also optimize the compiling performance of the #ifndef #define #endif approach by this way.