New Benchmark Tests For Variadic Template and Initializer List Versions of min

A correction of the results of N2722 and N2772

Author: Niels Dekker
Created: Saturday Oct 31, 2009
Last revision: Sunday Nov 1, 2009

There is a serious error in the benchmark test presented by N2722, Variadic functions: Variadic templates or initializer lists? by Loïc Joly, which is still there in N2772, the revision by Robert Klarer. Loïc has already informed the LWG about the error at c++std-lib-25210, Oct 30, 2009. The paper presented the results of benchmark tests of two min implementations: a version that has an initializer_list<T> as parameter, and a variadic template version. They were tested for 3, 13 and 24 arguments, and both int and std::string as argument type. The section Performance considerations concluded: We can see that in both cases, and perhaps surprisingly considering the theoretical performance considerations, the initializer_list version consistently outperformed the variadic template version.

Unfortunately, because of the error, the presented results of the min of std::string tests (Figure 1) are of no value at all, and the conclusion is wrong. After fixing the error, I reran the tests, and found that the variadic template version often outperforms the initializer list version.

Figure 1. Incorrect min of string graph, as presented by both N2772 and N2772:
Figure 1. Incorrect min-of-string graph from N2772

New tests

The new test results were from three little commandline test programs. The test programs were compiled using gcc-4.4.1-tdm-2 (TDM's GCC/mingw32) and ran on an Intel Core2 Quad Q8200 at 2.33GHz, having Windows XP SP3 (32 bits edition). The source files that are used for the tests are at source/

InitializerListMinTest.cpp and VariadicTemplateMinTest.cpp are fixed and extended versions of the test source code by Loïc (used for N2722 and N2772). The main extension is that they test more arities (argument numbers): 3, 6, 9, 12, 15, 18, 21, and 24 arguments. InitializerListMinTest.cpp uses the std::min(initializer_list<T>) implementation provided by GCC. Both InitializerListMinTest and VariadicTemplateMinTest repeat the min calls 50000 times in order to get a significant number of clock ticks.

TestMinOfThreeLists.cpp is entirely new. It tests the performance of the min of three std::list<char> objects of the same size. Their size is varied from small to large, and for each size, the performance of min is measured.

The performance is estimated by calling clock().

Results

The data that came out of the tests are stored in Excel files (using Microsoft Excel 2003), and available at data/.

MinTestDebug.xls is the result of running InitializerListMinTest.cpp and VariadicTemplateMinTest.cpp three times, after debug compilation (no compiler flag other than -std=c++0x). See also Figure 2 and 3.

MinTestOptimized.xls shows the results of running InitializerListMinTest.cpp and VariadicTemplateMinTest.cpp three times, after optimizing compilation (extra GCC compiler flag: -O3). See also Figure 4 and 5.

TestMinOfThreeLists.xls shows the results of running TestMinOfThreeLists: the first Excel sheet shows the result for an optimizing compilation; the second sheet shows the result for a debug compilation. In both cases, the variadic template version was so fast that zero clock ticks were measured. On the other hand, there appeared a clear linear relation between the performance of the initializer list version and the std::list size. As shown at Figure 6 and 7.

The time (along the vertical axes) is expressed by the number of clock ticks.


Figure 2. min of int (debug compilation):
min-of-int (debug)


Figure 3. min of string (debug compilation):
min-of-string (debug)


Figure 4. min of int (optimizing compilation):
min-of-int (optimized)


Figure 5. min of string (optimizing compilation):
min-of-string (optimized)


Figure 6. min of three std::list objects (debug compilation):
min of three std::list objects (debug)


Figure 7. min of three std::list objects (optimizing compilation):
min of three std::list objects (optimized)

Conclusion

The performance of std::min(initializer_list<T>) is potentially significantly worse than a similar variadic template, min(const T&, const Args&...), as shown by the new results for min of string. Moreover, the performance of std::min(initializer_list<T>) is potentially linear to the size of the objects themselves, as was shown by TestMinOfThreeLists. Such a linearity is not found for the variadic template version. Instead, it was found that the variadic template version can be extremely fast, even when its arguments are std::list objects of a very large size.

I regret that I didn't get myself involved with the LWG discussions about N2722 and N2772, last year. At that time I hadn't yet studied initializer lists and variadic templates well enough.

Thanks to Loïc for providing the original test code, admitting the error, and providing me very helpful feedback.