Pages

Friday, May 3, 2013

Report: Comparative Benchmark of Nginx, Lighttpd, & Node.js (via C++11 & Python)

This comparative benchmark was conducted in the form of an experiment with a steady control against various enterprise cloud configurations. The purpose was to determine their upper bounds and to find an acceptable trade-off between them. Minimalist HTTP and SCGI multi-threaded servers were written in C++11 for reference. This was an independent test for business/internal research and not sponsored by any organization or individual.

UPDATE (May 5th): Added GoLang, and high-performance Python application server configurations.


Motivation

There is significant hyperbole surrounding many popular web application services. IT managers and project leads need no nonsense assessments of these technologies in order to make the best decisions for their projects. These services are important to measure and understand, as they form the backbone of modern enterprise systems running on the cloud, and inefficiencies here can easily mean thousands in unnecessary costs. Trade-offs have to be made between the available language facilities for the target platform; the ease of maintaining and updating; the relative complexity of configuration and administration; and, the relative stability and maturity of the implementation.

Thursday, February 21, 2013

Machine Ethics & The Rise of AI Eschatology

A scene from the last phase of Ragnarök,
after Surtr has engulfed the world with fire. (Wikipedia)

Abstract


Listen on YouTube

There is a growing narrative of fear surrounding the existential risk of strong artificial intelligence. One that thrives on the proselytizing of extreme scenarios to a popular audience, and exploits an asymmetry in expertise between that audience and the scientists and engineers working in the field. It can be identified by its ornamental decree for the creation of safe and benign artificial intelligence. A sentiment that is impossible to refute on its surface, but which, in this specific campaign of fear, belies a supremacist moral stance and a scant technical foundation with little to no substance. It is also a sentiment that has been so obvious as to be invisible to security experts and machine intelligence researchers; putting the cart before the horse would be an understatement. Proponents of AI eschatology are so subservient to their ideals that they can not even fathom why experts find their case instantly dismissible. The case for a priori benevolence in recursive self-improving artificial intelligence as a mathematical formalism, even if solvable in theory, dissolves completely when faced with the external factors of real-world environments. Thus, this "golden path" becomes a farce. Modular, a posteriori architectures will be safer, less hazardous, and more robust. This not only vindicates those working to make actual progress in this field, but serves as an important message for the industries that will emerge; the moral hazards of assuming that "strong guarantees" can be made at the algorithmic level are an unacceptable risk. One that consumers, and, as a consequence, industries and societies, will take on as cost. The finality culminates in what should be rightly considered the decision of the century: it is the question of whether and when to release strong artificial intelligence.

Saturday, February 9, 2013

Report: Comparative Benchmark of Optimized C++ Data Structures

This is a review of four associative data structures in C++. Extensive benchmarks have been performed, and a comparative regression done on the resulting tabulated data. The purpose of this investigation was to determine the fastest associative data structure implementations, as opposed to theoretic time complexity.

Motivation

For time-sensitive; real-time; and CPU-bound applications, the underlying memory management and associative structures that presuppose application development have a dramatic impact on the performance of the resulting program. This analysis is meant to provide detailed information for C++ developers who are shopping for data structures to provide the foundation for their time-sensitive applications, be they servers; compilers; game engines, modeling; databases; or other uses.

shash: Micro-optimized Hash Map (C++)

Hash map. (Source: Wikipedia)
The static hash map is a micro-optimized, unordered associative data structure suitable for real-time and CPU-bound applications. It is significantly faster than binary search trees and bitwise tries, capable of fractional millisecond timing under even large capacities. It utilizes a non-linear function for entropic mixing with a static allocator behind a chaining list to alleviate collisions and increase robustness against collision attacks. It does not suffer from significant slowdown under full load factors, and performs in amortized O(1) constant time.

Recommendation

If ordering is not required, shash is recommended over all other alternatives, including bitwise / radix tries. It is capable of extreme performance; zero-fragmentation; and operates "in-place" on the heap in amortized O(1) constant time.

Usage

It must be intialized at construction with a maximum capacity that is a power of two. Once constructed, it can not be resized; however, it can be purged, resetting it to its initial state. It returns references to inserted items to allow the caller to have higher locality of reference. Exceptions are used to signal error conditions, including capacity and key location errors.
try {
    shash<double> m(8);
    double &x = m.insert(1);
    x = 1.2;
    //t will equal x
    double &t = m.find(1);
}//try
catch (exception &e) {
    cout << "error: " << e.what() << endl;
}//catch

Source

shash — Micro-optimized Static Hash Map

This source is available under the New BSD license, and can be used in both proprietary and free software projects.

High-performance "In-place" AVL Tree

BST. (Source: Wikipedia)
The AVL tree is a self-balancing binary search tree. It was invented in 1962 by G. M. Adelson-Velskii and E. M. Landis. It uses O(n) space and O(log n) time complexity on average. There is often debate between the use of the AVL tree and red-black trees, which perform the same functions and have comparable performance. One of the main supporting arguments for red-black trees is their use as a persistent data structure; however, AVL trees can be made persistent. AVL trees also have more strict balancing requirements. This can make insertions potentially slower, but with the trade-off that they can out perform red-black trees in read intensive applications (Pfaff, 2004).

What follows is a permissive free license implementation of the AVL tree written in C++ utilizing the smem Static Memory Allocator. This "in-place" BST provides significant performance over the built-in STL binary search tree when the application is using generic types of fixed-length.

Recommendations

This class should provide significant advantage in performance over the built-in std::map when using the STL default allocator. However, both are significantly slower than the shash static hash map, which should be used if there is no requirement for ordering on the input.

Usage

The fixed_map template class must be given its capacity upon construction. Once created, it can not be re-sized; however, it can be reinitialized. Insertion returns a reference to the value so that it can be quickly initialized by the caller, rather than invoking potentially unneeded copy-construction. This is by design to increase locality of reference for the calling application.

Exceptions are used to signal error conditions, such as a failure to initialize; key not found; or if the maximum capacity has been reached.
try {
    fixed_map<double> m(10);
    double &x = m.insert(1);
    x = 1.2;
    //t will equal x
    double &t = m.find(1);
}//try
catch (exception &e) {
    cout << "error: " << e.what() << endl;
}//catch

Source

fixed_map — In-place AVL tree (binary search tree)

This source is available under the New BSD license, and can be used in both proprietary and free software projects.

References

Pfaff, B. (2004). Performance Analysis of BSTs in System Software. Stanford University, 32(1), 410-411.

Friday, February 8, 2013

unicodex — High-performance Unicode Library (C++)

The following is a micro-optimized Unicode encoder/decoder for C++ that is capable of significant performance, sustaining 6 GiB/s for UTF-8 to UTF-16/32 on an AMD A8-3870 running in a single thread, and 8 GiB/s for UTF-16 to UTF-32. That would allow it to encode nearly the full English Wikipedia in approximately 6 seconds.

It maps between UTF-8, UTF-16, and UTF-32, and properly detects UTF-8 BOM and the UTF-16 BOMs. It has been unit tested with gigabytes of data and verified with binary analysis tools. Presently, only little-endian is supported, which should not pose any significant limitations on use. It is released under the BSD license, and can be used in both proprietary and free software projects.

The decoder is aware of malformed input and will raise an exception if the input sequence would cause a buffer overflow or is otherwise fatally incorrect. It does not, however, ensure that exact codepoints correspond to the specific Unicode planes; this is by design. The implementation has been designed to be robust against garbage input and specifically avoid encoding attacks.

Usage

Simply pass in the appropriate array and the size, in bytes (not characters), of that array, to the desired mapping member function. It will then decode it and store the result on the heap in separate locations. The calls to the functions are indempotent, but not thread-safe; the caller must handle threading issues.

Source

High-performance Unicode Library for C++


Cross-platform Nanosecond Timer (C/C++)

Nanosecond timing is a slight inconvenience for cross-platform application programmers in C and C++. The following simple header allows nanosecond timing operations on Mac OS X, Windows, and Linux operating systems using an extremely simple interface. Usage on Windows and Mac OS X need only call initnano(), then compare differences in calls to nano. On Linux, the call to initnano() can be skipped entirely. Tested on all three platforms, 32-bit and 64-bit. Released under permissive free license, available for proprietary or free software projects.

Source

Cross-platform Nanosecond Timer