
14.2 Comparing newer versions with older ones

Q: I switched to v2 and my programs now run slower than when compiled with v1.x....

Q: I timed a test program, and it seems that GCC 2.8.1 produces slower executables than GCC 2.7.2.1 did, which in turn was slower than DJGPP v1.x. Why are we giving up so much speed as we move to newer versions?

Q: I installed Binutils 2.8.1, and my programs are now much slower than when they were linked with Binutils 2.7!

A: In general, newer versions of GCC generate tighter, faster code than older versions. Comparing different versions of GCC shows that they all optimize reasonably well, but each version needs a different combination of optimization-related options to achieve the greatest speed. The default optimization options can also change between versions; for example, -fforce-mem is switched on by -O2 in 2.7.2.1, whereas it wasn't in earlier versions. GCC offers a plethora of optimization options which might make your code faster or slower (see the GCC docs for a complete list); the best way to find the right combination for a given program is to profile it and experiment. Here are some tips:

I'm told that the PGCC variant of GCC has bugs in its optimizer which show up when you use optimization level 7 or higher. Until that is fixed in some future version, you are advised to stick to -O6. Some programs actually run faster at -O2 or -O3, even under PGCC, so try those levels as well. Several users have reported that PGCC v2.95.1 tends to crash a lot during compilation, especially with the -O5, -O6 and -mpentium options. (In general, PGCC version 2.95 is deemed buggy; you are advised not to use it.)

Programs which manipulate multi-dimensional arrays inside their innermost loops can sometimes gain speed by switching from dynamically allocated arrays to static ones. The dimensions of a static array are known to GCC at compile time, so it doesn't need to dedicate a CPU register to computing array offsets at run time; that register then becomes available for general-purpose use.
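
For example, here is a minimal sketch (the array and function names are made up for illustration) contrasting the two kinds of arrays: with the static array the row stride is a compile-time constant, while the dynamic version has to carry the column count around and multiply by it inside the inner loop:

     #define ROWS 512
     #define COLS 512

     static double table[ROWS][COLS];   /* dimensions known at compile time */

     double sum_static (void)
     {
       double sum = 0.0;
       int i, j;

       for (i = 0; i < ROWS; i++)
         for (j = 0; j < COLS; j++)
           sum += table[i][j];          /* row stride is a compile-time constant */
       return sum;
     }

     /* The dynamically allocated version has to compute i*cols at run
        time, which keeps an extra register busy inside the inner loop.  */
     double sum_dynamic (const double *m, int rows, int cols)
     {
       double sum = 0.0;
       int i, j;

       for (i = 0; i < rows; i++)
         for (j = 0; j < cols; j++)
           sum += m[i * cols + j];
       return sum;
     }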

A related problem in C++ programs which manipulate arrays arises when you fail to declare the methods used for array access as inline. Any method or function that isn't declared inline will not be inlined by GCC, and every call to it incurs function-call overhead at run time.

However, inlining only helps with small functions and methods; inlining large functions bloats the code, which can overflow the CPU's instruction cache and typically slows the program down instead of speeding it up.
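
As a minimal sketch of the last two points (the class and its members are invented for illustration), a small accessor defined inside the class body is implicitly inline, so GCC can expand it directly into the loop when optimization is enabled:

     class Matrix
     {
       double data[64][64];
     public:
       // Defined inside the class body, so it is implicitly inline;
       // an out-of-class definition would need an explicit `inline'.
       double& at (int i, int j) { return data[i][j]; }
     };

     double trace (Matrix& m)
     {
       double sum = 0.0;
       for (int i = 0; i < 64; i++)
         sum += m.at (i, i);    // expanded in place when optimizing
       return sum;
     }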

If your CPU is AMD's K6, try upgrading to GCC 2.96 or later and compiling with the -mcpu=k6 switch; I'm told that K6-specific optimizations are much better in those versions of GCC.

A bug in the startup code distributed with DJGPP versions before v2.02 can also be a cause of slow-downs. The problem is that the runtime stack of DJGPP programs was not guaranteed to be properly aligned. This usually shows up only on Windows (CWSDPMI aligns the stack on its own, so the problem doesn't appear under plain DOS), and even then only sometimes. It has been reported that switching to Binutils 2.8.1 sometimes triggers this slow-down, and switching to PGCC can reveal the problem as well. In some cases, restarting Windows makes programs run at normal speed again. If you experience such slow-downs often, upgrade to v2.02.