120

What is the language hack you are simultaneously the most ashamed of and the proudest of?

Mine is (C++) using

#define private public

to gain access to the internals of a class to work around a bug in a third-party library (IBM Callcenter) where the third party was unwilling to give us a patch.

It works because C++ only enforces the access protection in the compiler and not in the compiled object code.

I feel sullied for resorting to this sleazy hack, but feel vindicated by the number of people who ate their hat after telling me it was impossible to subvert the C++ access control mechanism.

65 accepted

Didn't use it personally, but I think the fast inverse square root function in the Quake 3 source code is an awesome hack:

float InvSqrt(float x) {
    float xhalf = 0.5f*x;
    int i = *(int*)&x;
    i = 0x5f3759df - (i>>1);
    x = *(float*)&i;
    x = x*(1.5f - xhalf*x*x);
    return x;
}

The article explains how the function works and how it originated.
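For anyone who wants to poke at the bit trick outside C, here is a rough Python sketch of the same steps. It is only an illustration, assuming IEEE 754 single-precision floats; the name `inv_sqrt` and the use of the `struct` module to reinterpret the bits are illustrative choices, not part of the Quake 3 source.

```python
import struct


def inv_sqrt(x: float) -> float:
    """Approximate 1/sqrt(x) for positive x, mirroring the C function above."""
    half = 0.5 * x
    # Reinterpret the 32-bit float's bits as an unsigned integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # The magic constant minus half the bit pattern gives a first guess.
    i = (0x5F3759DF - (i >> 1)) & 0xFFFFFFFF
    # Reinterpret the integer bits as a float again.
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson step refines the guess.
    return y * (1.5 - half * y * y)
```

After the single refinement step the result should land within a fraction of a percent of `x ** -0.5`.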

50

I don't know if the comment switch trick can count as a hack but sometimes it's very convenient for testing 2 pieces of code:

/*  <---
line1
/*/
line2
//*/ 

//* <---
line1
/*/
line2
//*/
20

Not sure I want to own up to this, but I implemented the goto and comefrom commands for Python: http://entrian.com/goto/

19

From http://thedailywtf.com/Comments/Illogical-Logic-Flow.aspx#266508

int x = 1;

/**++++++++++++++++++++++++++++++++++++++++++++++++++++
 ++ This is a long comment of a lot of text that     ++
 ++ had absolutely no redeeming value except that    ++
 ++ it could easily */x++/** mislead you into making ++
 ++ your eye skip right past the relevant part in    ++
 ++ the middle of the block of unending comments.    ++
 ++++++++++++++++++++++++++++++++++++++++++++++++++++*/;
16

Apparently an argument about whether or not to use Hungarian notation went on far too long, and someone decided to make a compromise. The following hack was implemented for most members in the object hierarchy and allowed for "the best of both worlds":

class Foo { 
public:
  union {
    const SomeType* cstNamePtr;
    const SomeType* Name;
  };
};

This allows both Hungarian and non-Hungarian usage:

Foo* pFoo = GetSomeFoo();
pFoo->cstNamePtr;
pFoo->Name;

Note: I am not the originator of this hack, just the presenter :)

11

From swright's blog on the old dotnetjunkies:

void SomeDumbRoutine(...)
{
  switch (a)
  {
    ...
    case 2:
      myvar = 34;
      // do some other stuff
      goto SKIP_MIDDLE_PART;
    ...
  }

  switch (b)
  {
    ...
    case "BLANK":
      myvar = -2;
      SKIP_MIDDLE_PART: myvar += x;
      ...
      break;      
    ...
  }
}

He explains:

Here is the deal. The C compiler on that system was implementing the "switch" statement as though it was a subroutine. When it hit the beginning of the switch, it pushed the address for the bottom of the switch onto the stack. Then, when it hit a "break" statement, it just popped the address off the stack. This resulted in the following sequence of events.

  1. The routine begins.
  2. The first switch is reached and the address for the bottom of that switch is pushed.
  3. The "goto" statement is hit and the program jumps into the second switch statement.
  4. The "break" statement is hit. This pops the address off of the stack. (OOPS! That's for the other switch statement!)
  5. The program jumps to the point just AFTER THE FIRST SWITCH.
  6. The program then continues on into the second switch AGAIN.

At first I thought, "Oh! They just got confused because this routine is so long. That must be causing the error." Nope. We traced through it and eventually figured out (by reading assembly code) that the person who wrote this was INTENTIONALLY using this strange behavior of the compiler!!!!
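The six steps he lists can be modelled with a toy simulation. This is only a sketch of the behaviour as described, not of the actual compiler: the function name and the label strings are hypothetical, and it assumes the second pass through the second switch ends in an ordinary `break`.

```python
def simulate_switch_goto():
    """Toy model: `switch` pushes its end address, `break` pops and jumps."""
    stack = []   # addresses pushed on entry to a switch
    trace = []   # what the program does, in order

    # Step 2: entering the first switch pushes the address of its end.
    stack.append("end of first switch")
    trace.append("enter first switch")

    # Step 3: the goto lands inside the second switch; nothing is pushed.
    trace.append("goto SKIP_MIDDLE_PART")

    # Step 4: break pops the top of the stack, which belongs to the first switch.
    trace.append("break, resume at " + stack.pop())

    # Steps 5-6: execution continues after the first switch and enters the
    # second switch normally, so this time its own address is pushed.
    stack.append("end of second switch")
    trace.append("enter second switch")
    trace.append("break, resume at " + stack.pop())
    return trace, stack
```

The stack ends up balanced, which is presumably why the trick went unnoticed until someone read the assembly.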

10

The worst one I ever saw was coming across a colleague's code that has something like this:

unsigned char palette[259]; // 255 + 4, just in case

I asked him about it and he said that he was having weird memory corruption issues, so by trial and error he just added 4 bytes onto the end of the array until it all 'magically worked'.

I also once worked with a guy who'd been at the company for 5 years writing C/C++, and it transpired he didn't realise that C arrays were zero-indexed!

And a few more from the games industry

A well known game on PlayStation 1 had completely run out of main memory so they started storing their 3d-models compressed into audio memory. Never sure whether this was hacking or a stroke of genius

I also saw the code to a very, very well known game that had a helicopter fly-over path hard-coded at the top of one of their files in a huge array of magic numbers

I also heard (but can't vouch for this) of a company storing extra data in the top two bits of code instructions.

8

Checkboxes in Excel and .Net ReportViewer are exceedingly complex to implement, so I just change the font of a cell or TextBox to "Wingdings 2", in which the capital letter P is a checkmark. I've yet to have problems with it....

6

I still remember Simon Tatham's method of implementing coroutines (like C#'s yield return) in C:

#define crBegin static int state=0; switch(state) { case 0:
#define crReturn(i,x) do { state=i; return x; case i:; } while (0)
#define crFinish }

int function(void) {
    static int i;
    crBegin;
    for (i = 0; i < 10; i++)
        crReturn(1, i);
    crFinish;
}
5

The C++ hack which caused me the most personal pain to undo was the following

#undef new

It took me roughly 2 months to figure out first why this was done and secondly to undo all of the weird code that was added to work around this issue.

5

On NetBurst-era Intel hardware, the x87 floating-point unit behaved incredibly badly when presented with a double/float NaN as an operand, literally costing hundreds of cycles to deal with it.

We made use of NaNs extensively as 'safe' guard default values (to prevent a default of zero forcing a usable but 'incorrect' zero output, for example), and the pre-2.0 versions of the CLR did not emit SSE for all floating-point operations, so we came up with the following:

unsafe static bool IsNaN(double d)
{
    const long Mantissa = 0x000fffffffffffffL;
    const long Exponent = 0x7ff0000000000000L;
    long l = *((long*)(void*)&d);
    return (l & Exponent) == Exponent && (l & Mantissa) != 0; 
}

Nowadays this is unneeded but it was a massive performance win for us at the time.

For additional amusement the current double.IsNaN function in the framework (and many others) is:

public static bool IsNaN(double d)
{
    return d != d;
}

So we were hardly alone in making use of somewhat unintuitive behaviour.
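For comparison, here is a rough Python sketch of both checks. It assumes IEEE 754 doubles and uses the `struct` module to get at the raw bits; the function names are illustrative, and the masks are the ones from the C# version above.

```python
import struct

MANTISSA = 0x000FFFFFFFFFFFFF
EXPONENT = 0x7FF0000000000000


def is_nan_bits(d: float) -> bool:
    """NaN test on the raw bit pattern: exponent all ones, mantissa non-zero."""
    bits = struct.unpack("<Q", struct.pack("<d", d))[0]
    return (bits & EXPONENT) == EXPONENT and (bits & MANTISSA) != 0


def is_nan_compare(d: float) -> bool:
    """NaN is the only value that compares unequal to itself."""
    return d != d
```

Infinity has the same all-ones exponent but a zero mantissa, which is why the mantissa test is needed.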

5

What about GOTO for Java? This goto implementation allows you to write something equivalent to goto(xx) where xx is a line number in the source code. So the following example:

 1 public class GotoDemo {
 2     public static void main(String[] args) {
 3         int i = 3;
 4         System.out.println(i);
 5         i = i - 1;
 6         if (i >= 0) {
 7             GotoFactory.getSharedInstance().getGoto().go(4);
 8         }
 9         
10         try {
11             System.out.print("Hell");
12             if (Math.random() > 0) throw new Exception();            
13             System.out.println("World!");
14         } catch (Exception e) {
15             System.out.print("o ");
16             GotoFactory.getSharedInstance().getGoto().go(13);            
17         }
18     }
19 }

Will output when running:

$ java -cp bin:asm-3.1.jar GotoClassLoader GotoDemo           
   3
   2
   1
   0
   Hello World!

Isn't that nasty?

4

I was working on fitting a query engine in a resource-constrained embedded environment. The expression evaluation function was essentially a highly recursive switch-case block: each expression type was a separate case and subexpressions were evaluated recursively. The problem was that the function consumed about 600 bytes of stack for each level of recursion. The embedded environment defaulted to 8kB of stack so any non-trivial expression would easily lead to crashes of all sorts.

I pulled all the variable declarations out from case blocks and crammed them in a tight union, carefully making sure the variable use patterns would not have any overlaps. At the end of this exercise, the function needed just some 24 bytes of stack. Nice optimization for a constrained environment, but certainly did not make the code easier to understand.

(I think this is not the most shameful nor awesome thing ever, but combines both aspects nicely.)

3

We had a large Fortran77 code base that had used one large globally allocated array as scratch space in builds released in the 70's and 80's. Functions would return indexes into this large memory buffer as requested by various routines. I modified the code to be able to use dynamic memory by tying an "empty" Fortran array to a memory address returned from malloc (via a function written in C).

So the Fortran half looked more or less like this:

  integer iscratch(*)
  pointer (ptriscratch, iscratch)

  ptriscratch = my_malloc_wrapper_code()

This syntax (using pretty widely adopted language extensions) causes the address returned from my_malloc_wrapper_code() to be assigned to the base of the iscratch Fortran array.

Obviously there was a bit more to this where I had to write a library of C code that managed blocks of memory allocated in C and returned those addresses to the Fortran client code.

This bought us dynamic memory in Fortran77 compatible compilers (well, compilers that recognized this "integer pointer" extension -- which was all the compilers we cared about). The company was not ready to adopt Fortran90 compilers at the time, plus the legacy code base that used various offsets into the large preallocated array needed to be supported moving forward.

3

As part of my bachelor's degree we (a group of four) spent half a semester studying a program called Fhourstones, which is an integer benchmark that solves positions in the game of Connect 4, as played on a vertical 7x6 board.

By default, it uses a 64Mb transposition table with the twobig replacement strategy. Positions are represented as 64-bit bitboards, and the hash function is computed using a single 64-bit modulo operation, giving 64-bit machines a slight edge. The alpha-beta searcher sorts moves dynamically based on the history heuristic. A move causing a cutoff is rewarded as many points as moves previously tried, each of which gets a -1 penalty, thus preserving total weight and avoiding renormalization (uniform penalties were found to work much better than depth dependent ones).

Although the initial assignment was only to study the techniques used and present this to our classmates, we set out to see if we could extend the program to solve 8x8 boards as well (just over the current limit). This proved troublesome as the code was highly optimized, storing a single board position inside a (Java) long.

// bitmask corresponds to board as follows in 7x6 case:
//  .  .  .  .  .  .  .  TOP
//  5 12 19 26 33 40 47
//  4 11 18 25 32 39 46
//  3 10 17 24 31 38 45
//  2  9 16 23 30 37 44
//  1  8 15 22 29 36 43
//  0  7 14 21 28 35 42  BOTTOM

For red stones on the board the numbered bit would be set to 1, for black stones to 0. By then taking the 'skyline' of the board and setting each bit above it to 1 (like a layer of snow over rooftops) the entire board position for boards up to 8x7 can be stored using 64 bits. The entire program relied on this encoding to efficiently test for a win using 8 shift/ands and 4 comparisons.
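The shift-and-AND win test mentioned above can be sketched roughly as follows, in Python for brevity. This is a reconstruction from the bit layout shown, not the Fhourstones source: it assumes a bitboard that holds only one player's stones, and the names `HEIGHT`, `H1`, `H2`, `has_won` and `bits` are illustrative.

```python
HEIGHT = 6        # playable rows in the 7x6 layout
H1 = HEIGHT + 1   # bits per column, including the empty row on top
H2 = HEIGHT + 2


def has_won(board: int) -> bool:
    """Return True if the stones in `board` contain four in a row."""
    # Steps: vertical, falling diagonal, horizontal, rising diagonal.
    for step in (1, HEIGHT, H1, H2):
        pairs = board & (board >> step)      # stones with a neighbour `step` bits up
        if pairs & (pairs >> (2 * step)):    # two such pairs, two steps apart
            return True
    return False


def bits(*positions: int) -> int:
    """Helper: build a bitboard from bit numbers in the diagram."""
    return sum(1 << p for p in positions)
```

The always-empty row at the top of each column is what keeps a run from wrapping around from one column into the next. For example, `has_won(bits(0, 7, 14, 21))` detects the bottom-row horizontal line.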

Obviously we needed something bigger if we wanted to store larger boards, but because Java does not provide a primitive 128 bit data type, we had to look at other ways to go about storing the board positions. However, any solution we tried proved to be anywhere from 25 to 100 times slower than the original, the 'best' one being BigInteger.

As the projected runtime for the original code for the larger board was already well over a day, this increase in time required was unacceptable. Determined not to let the project fail, and somehow beat the odds at finding a faster solution, I set out to create my own number monstrosity, called

IntLong

As the name implies, it was an int concatenated with a long, designed to provide us with just enough bits to store the 8 x 8 + 8 = 72 bits needed. Next came the problem of making these two numbers work together to represent one larger number, and implementing all the bit operations used throughout the code base. We used the int to store the higher bits and the long to store the lower bits, with an overflow from long to int on the 60th bit. This made it trivially easy to implement the and, or and xor operations. The shift left and shift right operations were easy as long as you were careful to move the right bits over using ands and shifts. We avoided having to implement addition, subtraction, multiplication and division by replacing them with <<, >>, & and |, which worked surprisingly well as these were only few and far between.
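To make the carry handling concrete, here is a minimal Python sketch of shifting a number that is split across two words. It is not the original IntLong code: it assumes a 32-bit high word and a 64-bit unsigned low word, and the function names are hypothetical.

```python
LOW_BITS = 64                      # assumed width of the low word
LOW_MASK = (1 << LOW_BITS) - 1
HIGH_MASK = (1 << 32) - 1          # assumed width of the high word


def shift_left(high: int, low: int, n: int) -> tuple[int, int]:
    """Shift the (high, low) pair left by n bits, 0 <= n < LOW_BITS."""
    if not 0 <= n < LOW_BITS:
        raise ValueError("shift amount out of range")
    carry = low >> (LOW_BITS - n)              # bits that leave the low word
    return ((high << n) | carry) & HIGH_MASK, (low << n) & LOW_MASK


def shift_right(high: int, low: int, n: int) -> tuple[int, int]:
    """Shift the (high, low) pair right by n bits, 0 <= n < LOW_BITS."""
    if not 0 <= n < LOW_BITS:
        raise ValueError("shift amount out of range")
    carry = high & ((1 << n) - 1)              # bits that leave the high word
    return high >> n, (low >> n) | (carry << (LOW_BITS - n))
```

And, or and xor need no such care, since they can be applied to each word independently.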

The real problem came when I was trying to figure out how to implement the modulo operation in a fast and accurate way, as this was used for hashing the board positions, which would happen a few million times in our application.

After two weeks of trying every possible algorithm and finding them all too slow, I was damn near ready to give up. Then it hit me that we were only ever calculating the modulo by a known large primitive number, as part of our transposition table hashing mechanism. So with that in mind I implemented the following function:

public int modulo(int divider, int shifts) {
 IntLong temp = new IntLong(this.high, 0).shiftRight(shifts);
 if (0 < temp.high)
  throw new IllegalArgumentException("IntLong.modulo(int divider): argument too small, high still " + Integer.toString(temp.high, 2));
 temp = new IntLong(0, temp.low % divider).shiftLeft(shifts);
 temp = temp.or(new IntLong(0, this.low % divider));
 return (int) (temp.low % divider);
}

It takes the divider and the number of bits in the divider as arguments, and performs the following:

  • take the higher bits in the int, and shift them shifts times to the right, ignoring the lower bits in the long
  • divide that by the divider and shift the remainder back to the left shifts times
  • take the lower bits from the long, divide them by the divider and keep the remainder
  • overlay the bits from two and three (which by now should not overlap) into one long
  • return the remainder from one final division by divider and you should have the correct result
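The steps above can be checked with a small Python sketch. This is not the author's Java: Python integers are unbounded, so the overflow check is dropped, and it assumes the low word holds 64 bits and that `divider < 2**shifts`. The name `split_modulo` is hypothetical.

```python
LOW_BITS = 64  # assumed width of the low word


def split_modulo(high: int, low: int, divider: int, shifts: int) -> int:
    """Remainder of (high * 2**LOW_BITS + low) modulo divider, done piecewise."""
    if not 0 < divider < (1 << shifts) or shifts > LOW_BITS:
        raise ValueError("need 0 < divider < 2**shifts and shifts <= LOW_BITS")
    # Step 1: the high word on its own, shifted right by `shifts`.
    top = (high << LOW_BITS) >> shifts
    # Step 2: reduce it, then shift the remainder back to the left.
    rem_high = (top % divider) << shifts
    # Step 3: reduce the low word.
    rem_low = low % divider
    # Step 4: rem_low < 2**shifts, so it fits in the bits the shift cleared.
    combined = rem_high | rem_low
    # Step 5: one final reduction.
    return combined % divider
```

It works because `top * 2**shifts` is exactly the high part of the number, so reducing `top` first and shifting the remainder back leaves the overall remainder unchanged.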

It took me a very long time before I could finally convince myself that the end result was indeed correct, even after all my unit tests passed with flying colors. In the end it turned out to be 2.5 times faster than the BigInteger implementation (still 10 times slower than the original), but it was just enough to run our calculations in a realistic time frame.

It's the ugliest hack I've ever had to implement, but I'm also still proud about having somehow beaten the odds and saved the project. :)

3

Something I just did in Java: an inner class that subclasses its outer class:

public class XmlOutputHandler
implements OutputHandler
{
    private class ListOutputHandler
    extends XmlOutputHandler
    {
        public ListOutputHandler(Element container)
        {
            super(container, _options);
        }
    }
}

This was done because the outer class is used recursively, but in the case of a list (and several other types) I wanted to add additional code to the append operation. An alternative that might be cleaner (arguable) is to extract that operation into its own class, and create decorator instances.

The only truly ugly part of this is that it's an inner class, and the _options variable comes from the outer class. That's unnecessary, and will be changed.

2

In a templated collection class, I once ran across the following.

template <typename T>
class SomeCollection { 
  void Add(T* pValue) {
    ...
    memcpy(&someInternalNode, pValue, sizeof(T));
  }
};

I found this when I attempted to use a type which depended on value semantics in conjunction with this collection. My best guess is they ran into issues when doing a direct assignment (which is an indication BTW that there is a bug in the usage)

2

In Fortran one did not need to declare variables. They were automagically REAL unless the name began with I, J, K, L, M or N, in which case they were INTEGER. To force all variables to be declared, one programming standard used the IMPLICIT statement to change that default (I-N for INTEGER and the rest REAL):

IMPLICIT COMPLEX

This would cause most uses of undeclared variables to produce compiler errors. Later IMPLICIT NONE was a more direct method.

1

I moved from Fortran IV on an RSX-11M system to VAX Fortran77 on VAX/VMS, so I read the Fortran manual to find out about the differences. While reading it I came across the alternate return feature of Fortran, which turned out to be a standard feature, not a VAX extension. I even showed it around, ridiculing it and anyone who would use such a thing.

I was assigned a project to modify an in-house product MenuGraph which prompted the user for info to define the layout and contents for displaying graphs using the ISSCO DISPLAY Fortran graphics library. One of the design features allowed the user to back up to previous questions and change the answers. This was a terminal text based user interface.

In order to support the ability to back up from anywhere, the code was very messy even for 1980s Fortran. Somehow the alternate return popped into my mind. Beware of what you read.

    real gpa
    integer units
    character*10 name
10  type*, 'Computes grade points from GPA and units.'
20  call sdsu_ask_real (
   .    '$Enter GPA: ', !Prompt user with this.
   .    0.0,            !Minimum value user can enter.
   .    4.0,            !Maximum value user can enter.
   .    'N',            !Default not allowed (input required).
   .    gpa,            !Variable to receive user's valid input.
   .    *10,            !Previous question label.
   .    *999)           !End of input label.
30  call sdsu_ask_integer ('$Enter units attempted: ',
   .    1,
   .    500,
   .    'N', units, *20, *999)
    call sdsu_ask_string ('$Enter your name: ',
   .    1,
   .    10,
   .    'N', name, 10, *30, *999)
    type*, name, ', your grade points =', units * gpa
999 end

The starred arguments are the alternate return labels. The routines return to the first one (10, 20, or 30) if the user enters "^" to back up to the previous question. If the user enters "^Z" the routines return to the second alternate return (999). In the normal flow, the user enters valid data and execution advances to the next line just like a regular routine call, ignoring any alternate return values.

The nasty alternate return is used in this case to make the code very clean and easy to follow.

1

I can't claim credit for this one, but I was looking for how to declare const variables in Python and ran across this page: http://code.activestate.com/recipes/65207/

It revolves around this: "In Python 2.1 and up, no check is made any more to force entries in sys.modules to be actually module objects."
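A minimal sketch of the idea, in the spirit of that recipe rather than a copy of it: an ordinary object is placed in `sys.modules`, so `import` hands it back, and its `__setattr__` refuses to rebind names. The module name `const_demo` and the class names are hypothetical.

```python
import sys


class ConstError(TypeError):
    """Raised when an existing constant is assigned again."""


class _Const:
    def __setattr__(self, name, value):
        if name in self.__dict__:
            raise ConstError(f"can't rebind const {name!r}")
        self.__dict__[name] = value


# Register an instance (not a module object) under a made-up module name.
sys.modules["const_demo"] = _Const()

import const_demo  # the import system returns the instance from sys.modules

const_demo.PI = 3.14159      # first binding is allowed
try:
    const_demo.PI = 3        # rebinding is rejected
except ConstError as exc:
    print(exc)
```

Nothing here stops someone from reaching into `const_demo.__dict__` directly, so this is a convention enforced by accident-prevention, not a real guarantee.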

1

I was working with an embedded device that incorporated a simple webserver for configuration. It didn't have a filesystem of any sort, so rather than writing and saving static HTML pages, you had to write the webpages using special tags for form elements etc.

When you built the downloadable image it had a build step which took the webpages you'd written and pre-processed them into large static C string arrays, and replaced the custom tags with dynamic code snippets.

To serve up the webpage it re-generated the HTML from the static string arrays and the code snippets.

All fine and good; the problem came when I needed to present a page with some enormous number of checkboxes on it, but depending on other configuration options some of the checkboxes would sometimes be disabled.

The normal way to do this would be to add both enabled and disabled checkboxes to the form using the custom tags, and then use the dynamic code to serve up the correct variety, but I hit the limit on the number of custom tags the pre-processor could handle.

My solution was to include just the enabled checkbox on the page as a custom tag element, but also to include two images of a disabled checkbox (one checked, one not checked) on the page as well in each location where I needed a checkbox. I then wrapped everything in HTML end-comments "-->" and used the dynamic code snippets to insert HTML open-comments "<!--" in the appropriate places so that the right stuff got displayed.

Yuk, but it did do the job.

1

A very neat hack in VBA and VB6 that tends to clean up complex If-Then-Else constructs is to use the Select Case statement in this way.

Caveat: this is pseudocode, and a trivial example at best. It really shines when unravelling deeply nested If-Then-Else blocks.

Public Sub VitalProcess(Test1 As Boolean, Test2 As Boolean, Test3 As Boolean)

  Select Case True
    Case Test1
      'do some processing for this test'
    Case (Not Test1) And Test2
      'Typical special case'
    Case Test2, Test3
      'do some alternate processing'
    Case Else
      Debug.Print "No Processing was done - check VitalProcess ", Test1, Test2, Test3
    End Select
End Sub
0

Defining two almost identical constructors in a Java class:

public class Class
{
    final private double[] data;

    public Class(double[] data)
    {
     this.data = data.clone();
    }

    private Class(double[] data, int i)
    {
     this.data = data; // no cloning
    }
}

Only later did it occur to me that I could do it in a nicer way.

0
jQuery.noConflict( extreme );
0

In C, using fscanf instead of fgets:

fscanf(fileptr, "%[^\n]%*c", string);

Why? Stupidity.

0

Not sure if anyone answered this, but I see people add the signature of a function at the site where the function is called instead of in a header file. This works, but now the signature is in two places, with nothing to check that they agree. In C++ at least the name mangling would prevent it from linking. In C, it'll link, and then crash at runtime -- so you just hope the crash is in an area of code that gets tested before the software ships. :)

0

A long time ago when I was still learning C I was having trouble getting a shell to work (I was also writing my own 16-bit OS. Yes, dumb idea while learning C). As I was looking for possible sources I just thought I would try changing char buffer[100]; to char *buffer[100]... but here is the more amazing thing. It worked. I have no idea how, but somehow the way my code was written strlen and everything just worked after that. That doesn't make it any less shameful though (note: buffer stored a string, not an array of strings).

Also, in a C++ project I was having severe problems with things needing to be declared before they could be (dependencies on each other), so I worked up the skill of including a header file. Twice. It took some definite macro magic.

Also, the same project was a library, so the user of the library needed to implement things before it would actually link. In order to avoid distributing a huge header file with the library (over 20k iirc) with all of the internal methods and such, I split things into x86Lib.h and x86Lib_internal.h. If a certain macro was defined, then it knew it was being built internally and included the _internal file; otherwise it didn't include it and changed any references to it to void * or similar.

-4

I don't like having to use Java reflection in order to access private members. I also don't like having to work with the Java class loader. Finally, I hate using reflection to access private members of the class loader (for example, to add additional classes to Tomcat's WebappClassLoader).