Archive

Author Archive

Average distance between two fields

December 2, 2008 No comments

If I randomly pick two fields from an aggregate type definition containing N fields what will be the average distance between them (adjacent fields have distance 1, if separated by one field they have distance 2, separated by two fields they have distance 3 and so on)?

For example, a struct containing five fields has four field pairs having distance 1 from each other, three distance 2, two distance 2, and one field pair having distance 4; the average is 2.

The surprising answer, to me at least, is (N+1)/3.

Proof: The average distance can be obtained by summing the distances between all possible field pairs and dividing this value by the number of possible different pairs.

                  Distance 1  2  3  4  5  6
Number of fields
            4              3  2  1
            5              4  3  2  1
            6              5  4  3  2  1
            7              6  5  4  3  2  1

The above table shows the pattern that occurs as the number of fields in a definition increases.

In the case of a definition containing five fields the sum of the distances of all field pairs is: (4*1 + 3*2 + 2*3 + 1*4) and the number of different pairs is: (4+3+2+1). Dividing these two values gives the average distance between two randomly chosen fields, e.g., 2.

Summing the distance over every field pair for a definition containing 3, 4, 5, 6, 7, 8, … fields gives the sequence: 1, 4, 10, 20, 35, 56, … This is sequence A000292 in the On-Line Encyclopedia of Integer sequences and is given by the formula n*(n+1)*(n+2)/6 (where n = N − 1, i.e., the number of fields minus 1).

Summing the number of different field pairs for definitions containing increasing numbers of fields gives the sequence: 1, 3, 6, 10, 15, 21, 28, … This is sequence A000217 and is given by the formula n*(n + 1)/2.

Dividing these two formula and simplifying yields (N + 1)/3.

Categories: Datatypes Tags: , , ,

Will IEEE 754 become a fringe representation?

December 1, 2008 No comments

Many people believe that with a few historical exceptions the IEEE 754 standard has won the floating-point value bit-representation battle.  What these people have forgotten is that money rules; customers are willing to ditch standards if it increases profit. FPGA devices can be configured to perform float-point operations faster and more cheaply than commodity cpus.

Making optimal use of a FPGA may require using a radix of 4 and for the time being automatically convert back and forth between an external 754 radix-2 representation.  In those cases where multiplication/division operations are more common than addition/subtraction use of a logarithmic number system has performance benefits.  For specialist scientific calculations (where cpu time is measured in days) purpose built FPGA devices are the path to significant performance improvements.  In many mass market applications the full power of a 32-bit representation is not needed and a representation using fewer bits does an acceptable job using less powerful (ie, cheaper) hardware.

Customer demand for higher performance and lower cost will push vendors to deliver purpose designed products.  IEEE 754 may be the floating-point representation that people without spending power use because it was once  designed into cpus and vendors are forced to continue to support it for backwards compatibility.