Posts Tagged ‘open source’

Open Source Aligns Developer and User Interests

Wednesday, November 30th, 2005

A big advantage of open source software is that it aligns the interests of developer and end user.

To see why that’s true, notice how it is not true in commercial software. The competitive dynamics of commercial software push developers to make decisions that are “piecewise competitive” with rivals, even if such decisions create inflexibility, incompatibility or increased maintenance costs in the long run.

For example, and most commonly, a commercial developer may optimize for performance by writing code in ways (e.g. choice of language or data type) that cause long-term inflexibility with respect to future hardware platforms or customer needs. This is in the developer’s best interest, because it’s the only way his firm can survive short-term competitive pressures. However, it is against the user’s interest, because it forces upgrades or incompatibility in the future.

Another example is that a developer may add features that are useful to only a small fraction of customers; this creates a temporary competitive advantage for the developer, because it adds a bullet point to the box in the store, influencing a customer’s purchase. It also raises the barriers to entry in that software segment, which benefits all leading developers. However, this decision often works against the average user’s best long-term interest, because it makes software unnecessarily complex for the vast majority of users, and raises the long-term cost of the software, as barriers to entry rise and competitive pressures fall in the segment.

By contrast, many GPL projects make design and implementation decisions that seek to maximize coder productivity and device independence, at the expense of performance and features. This is in the project developer’s interest, because it is the only way such a project can exist: without high productivity, maintainability and succinctness, even a successful project would eventually drown in its own bug reports. The coding team is too small to support such problems, and generally cannot expand over time.

This is generally in the end user’s best interest as well, because it allows simpler, more transparent upgrades to future hardware platforms, reduces confusion over features, and increases modularity and stability of behavior over time.

You can extend the argument to cover other noncommercial projects like the original Unix, which survived into modern times because of developer-centric design choices (performance-inefficient in the short run but maintainable in the long run) that happened to be compatible with the end user’s best interest.

Thanks to Jeff Cohen of Genezzo for sparking this thought with the following email:

From: Jeff Cohen
Date: Nov 29, 2005 10:47 PM
Subject: Re: Genezzo version 0.53 is now available

If you asked them, they’d probably say it’s impractical, and nobody could design such a schema. Of course, given the precedent of computer-generated queries of insane complexity past human comprehension, I can easily envision computer-generated schemas.

Generally, the issue is that languages like C/C++ tend to encourage certain types of design rigidity and premature optimization. For example, to encode the number of tables in a join, they could use an int, or maybe an unsigned int, or even a byte if they want to shave some space. And they might pre-allocate some fixed data structures inside the join row source for efficiency. Oracle was literally riddled with these little “optimizations”, and it caused all kinds of headaches when we needed to expand an existing data structure. I believe I told you about the COUNT(*) overflow at WalMart, where ancient coders couldn’t conceive of a table with over 2^32 rows.

With languages like Perl, Ruby, or Python, the native data structures are dynamic, flexible and extensible, so coders are less inclined to construct fixed-size structures, for practical and philosophical reasons. And the native numeric type is a generic number – the languages automatically switch representations internally for proper efficiency and precision. For example, Python might start using an int internally, then switch to float, and finally to bignum for arbitrary precision.

And finally, since I’ve been bitten by these issues so many times, I am very vigilant in order to avoid getting trapped. For example, even the basic block encoding on disk uses variable-length BER encoding for lengths, so I can use a single-byte length for strings under 128 bytes, and up to 128 bytes of length for BLOBs over 2^1022 bytes. BER is more expensive than a fixed-length encoding, but in the future, if someone wants to shove a 3-D, holographic movie into a column in Genezzo, we should be able to handle it.
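Jeff’s COUNT(*) anecdote is easy to demonstrate. Here is a minimal sketch (in Python, for brevity; the original systems were in C) contrasting a fixed-width 32-bit counter, which silently wraps around, with Python’s native int, which promotes to arbitrary precision automatically. The variable names are illustrative, not from any real codebase.

```python
import ctypes

# More rows than an "ancient coder's" 32-bit counter can hold.
row_count = 2**32 + 5

# A fixed-width unsigned 32-bit counter, as one might declare it in C,
# silently wraps around modulo 2^32:
wrapped = ctypes.c_uint32(row_count).value
print(wrapped)      # 5 -- the overflow is invisible until it bites

# Python's native int switches to arbitrary precision internally,
# so the count is simply correct:
print(row_count)    # 4294967301
```

The point is not that C is unusable, but that fixed-size representations bake an assumption about data scale into the design, and that assumption eventually fails.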
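The variable-length length encoding Jeff describes can be sketched with standard ASN.1 BER definite-length rules: lengths under 128 fit in a single byte (short form), while larger lengths use a prefix byte giving the count of length bytes that follow (long form). This is a generic illustration of the technique, not Genezzo’s actual on-disk code.

```python
def ber_encode_length(n):
    """Encode a non-negative length using BER definite-length rules."""
    if n < 0:
        raise ValueError("length must be non-negative")
    if n < 128:
        # Short form: one byte holds the length directly.
        return bytes([n])
    # Long form: first byte is 0x80 | (number of length bytes),
    # followed by the length itself, big-endian.
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body

def ber_decode_length(buf):
    """Decode a BER length; returns (length, bytes_consumed)."""
    first = buf[0]
    if first < 0x80:
        return first, 1
    nbytes = first & 0x7F
    return int.from_bytes(buf[1:1 + nbytes], "big"), 1 + nbytes

# A short string pays only one byte of overhead...
print(ber_encode_length(5).hex())    # 05
# ...while a huge BLOB length still encodes cleanly.
print(ber_encode_length(300).hex())  # 82012c
```

The trade-off is exactly as stated: decoding a BER length costs a branch and a variable read, but no column size is ever ruled out by the format.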