Coding For Clarity


A Programmer's Style Guide

Version 1.0
Boyne Cutting
Copyright © 2006 Alden Systems Pty Ltd

About this guide

This guide is not intended to describe a revolutionary methodology for writing infallible software. I doubt any such methodology will ever exist. The purpose of this guide is to collect together a set of programming principles in a useful and practical form that will maximise the chances of producing great software.

This guide is also not intended to proclaim laws to be followed. There are bound to be many circumstances in which following these principles is not the best course, or in which they may not suit a particular programmer. I have not followed these rules all the time myself. However, I would suggest understanding the reasons behind these principles before violating them.

There are four parts to the guide. In the first part I discuss the general reasons for, and benefits of, coding principles. In the second part I discuss attitudes that assist in maximising quality. The third part covers good code styling techniques, as distinct from code structural techniques, though there is some blurring of the lines. The fourth part covers principles to assist in developing a good application architecture. Additionally, an appendix attached at the end of the guide gives more detailed technical descriptions of coding styles. Please be aware that due to limited space and for the sake of brevity, some of the code examples provided herein do not conform to the defined styles.

Finally, while the principles outlined here are generally applicable to all languages, this guide has been written with C# in mind, though C# is not necessarily the only target. Hopefully developers in all languages can derive some use from it.

Why code for clarity?

Programmers make hundreds of mental queries and decisions every day: what to name things, what types of data to use, how and where elements are defined, how to structure code, where to look when a problem manifests itself, and what the consequences of modifying existing code will be. The list goes on and on.

Many of these decisions are only related to producing the end software in an ancillary way. For example, creating a widget for a product directly relates to its development, but thinking about what to name that widget only relates to the programmer's maintenance of the product.

Every instance where a programmer has to think about things that aren't directly related to the end product slows development. That's not to say those decisions aren't important. They are fundamentally important if the project hopes to survive to version one and beyond without an expensive rewrite or wholesale refactoring.

What invariably happens to projects under a tight schedule (aren't they all?) is that the decisions required for the maintenance of the project are discarded in favour of making everything happen as quickly as possible. Programmers use poor names, create rigid designs, layer bug fixes over bug fixes without fixing core problems and generally produce absurdly confusing code.

In the short term skipping these maintenance decisions appears highly productive, but in the longer term all those decisions that were not taken earlier will come back to haunt the project, and will in all probability be permanently crippling.

A consistent set of development principles reduces decision making overhead, yet retains the benefits of well thought out, maintainable code.

Programming Attitudes

There are many small decisions, made below the level at which we typically plan applications, that nevertheless have a profound impact on an application's architecture. These are manifestations of the attitude we take towards the application codebase and its development. The following points describe attitudes we can adopt towards our programming to maximise coherence and flexibility.

Justify the code

You must adopt the attitude that every line of code matters and must be justified. Consider every line of code you produce. Is it redundant? Is it efficient? Will it constrain the application in some way? Nothing can be perfect, but at the very least you must be able to justify the code you write, otherwise you're just being lazy.

Keep it readable

Refine your appreciation for the aesthetic of code. It's not enough that it works; it must be easy to read and understand so that when it doesn't work it can be fixed more easily. The process of making the code more readable will help you find problems earlier rather than later.

Don't use obscure constructs when a more obvious implementation is possible. Simple and obvious code is easier for others less experienced with the codebase to learn and understand.

Keep it clean

Dead or bad code is infectious: if it looks like nobody cares about the quality of the code, nobody will care and the code will fall into an increasingly poor condition. To reduce the potential for confusion and to ensure future code quality, dead and bad code must be removed as soon as possible.

This principle also applies to code automatically generated by many tools and wizards. Leaving obviously automatically generated code gives a project the same 'nobody cares' feeling as bad code. Take a few minutes to examine what the wizard has done and learn to cut the code yourself. In the end you'll do a much better job than the wizard.

Warnings are errors

Resist the temptation to persist in the face of compiler warnings.  Warnings should be given the same consideration as errors.  They tell you of dead code and lurking problems.

Clarity in Code

The following principles describe methods for ensuring the code is as comprehensible as possible.

Consistency

The consistency principle or 'principle of least surprise' has been a core part of good systems development for decades.  That is, you design aspects of the system in a similar or consistent fashion. This allows developers reading the code to recognize the pattern and recognize when things don't look quite right.

Performance vs Maintenance

Performance and maintainability are not necessarily competing concerns.   If your application architecture is correct from a maintenance perspective, then your application will perform at or near optimal in any case.  

Where performance fails expectations, a well designed project will allow the seamless addition of caching or other performance improvements at appropriate points without degrading maintainability.

The rule I follow in this regard is that performance is critical, up to where it impacts on maintainability.
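
As a minimal sketch of adding caching at an appropriate point, assume callers depend only on a hypothetical IWidgetPriceSource interface. A caching wrapper can then be slipped in without changing the callers or the original implementation. All of the names below are illustrative and do not come from a real library.

```csharp
using System.Collections.Generic;

// Hypothetical interface; callers depend only on this.
public interface IWidgetPriceSource
{
   decimal GetPrice(int widgetID);
}

// Stand-in for an expensive lookup. It counts calls so the effect of caching is visible.
public class SlowWidgetPriceSource : IWidgetPriceSource
{
   private int lookupCountC;

   public int LookupCount
   {
      get { return lookupCountC; }
   }

   public decimal GetPrice(int widgetID)
   {
      lookupCountC++;

      return widgetID * 1.5m;
   }
}

// Caching added as a wrapper around any other price source.
public class CachedWidgetPriceSource : IWidgetPriceSource
{
   private readonly IWidgetPriceSource innerSourceC;
   private readonly Dictionary<int, decimal> priceCacheC = new Dictionary<int, decimal>();

   public CachedWidgetPriceSource(IWidgetPriceSource innerSource)
   {
      innerSourceC = innerSource;
   }

   public decimal GetPrice(int widgetID)
   {
      decimal priceL;

      // Only fall through to the slow source the first time a widget is requested.
      if (!priceCacheC.TryGetValue(widgetID, out priceL))
      {
         priceL = innerSourceC.GetPrice(widgetID);
         priceCacheC.Add(widgetID, priceL);
      }

      return priceL;
   }
}
```

Code written against IWidgetPriceSource does not need to know whether it has been handed the slow source or the cached one.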

Naming Conventions

Names are incredibly important in software development. Good names allow us to better grasp the totality of a process, implicitly describe functionality, how components work and how they should be used.  Every programmer should have a bookmark to an online dictionary and thesaurus in their browser of choice.

There are many different aspects of software for which to consider various naming conventions.  What follows is a consideration of naming styles for each area.

Development tools and names

IntelliSense is an enormously productive tool. We are relieved of the burden of remembering the exact names or even casing of various code entities. Using tools without IntelliSense-like features is like stepping back into the dark ages of programming.

It makes sense to use the full capability of the tools provided by the modern IDE. When we use prefixes for names in our code we are forcing ourselves to make at least another keystroke to advance the IntelliSense members list to the appropriate alphabetical position. We also restrict the ability of other developers dependent on our code to navigate and understand the components we provide.

Publicly visible names

One general rule I follow is that any publicly visible attributes of a project should be proper cased and should not expose any internal naming convention. There are two exceptions to this rule, interfaces and function parameters, which are discussed later.

Not only does this principle provide a less confusing interface for other developers who may build upon your codebase, but it also ensures we are making full use of IntelliSense and are maximizing our own understanding of the code.

Avoid abbreviations and acronyms

Do not abbreviate entity names. What might seem perfectly obvious to you today will be obscure and meaningless tomorrow.  Never use abbreviations except where the meaning of that abbreviation is commonly accepted. For example, UI is widely recognised within development as referring to the 'user interface'.

Namespaces

Namespace conventions are fairly well established.  The typical convention of 'CompanyName.ApplicationName.LibraryName' is logical and easy to follow.

For consistency namespaces should be proper cased.  It is also a good idea to permit only a single namespace per library or application, and constrain a library to one area of responsibility.  This results in a larger number of libraries which has a minimal impact on performance, but is an aid to maintainability as it is easier to immediately identify in which library a given class can be located.

Classes

Class names should be proper cased and reflect only one purpose or entity. If you find the name of a class is ambiguous as to its functionality or overly broad in meaning, then it's either poorly named or includes functionality that rightly belongs elsewhere.

Variables

The first rule of variables is they should always be private to their declaration scope.  This is important for many reasons discussed further on, but in relation to naming conventions it allows us to conform to our public names rule and our naming rules below. There is one exception to this rule both in scope and naming, and that is event delegates.  Event delegates can be public and should be named using proper casing.
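
A minimal sketch of this rule, using a hypothetical WidgetCounter class: the variable stays private, its value is exposed through a proper cased property, and the event delegate is the one variable-like member declared public. The 'C' suffix follows the scope convention described under Suffixes.

```csharp
using System;

public class WidgetCounter
{
   // Class-level variable: private, camel cased, 'C' suffix for class scope.
   private int widgetCountC;

   // Event delegate: public and proper cased.
   public event EventHandler WidgetCountChanged;

   // Public access to the value goes through a proper cased property.
   public int WidgetCount
   {
      get { return widgetCountC; }
   }

   public void AddWidget()
   {
      widgetCountC++;

      // Raise the event only if somebody has subscribed.
      if (WidgetCountChanged != null)
      {
         WidgetCountChanged(this, EventArgs.Empty);
      }
   }
}
```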

Prefixes

At some point early in my programming life I fell into the habit of prefixing my variables with various type or meta information. This was very useful to a point, because it allowed me to quickly identify problems without having to refer to the point where a variable was declared. However, it became painfully obvious this was not going to be a sustainable approach into the future. In an object oriented world it is impractical to think of and remember an appropriate prefix for each and every type used within an application.

Ultimately I bit the bullet and did away with my innumerable prefixes.   I discovered that without them I created names that better described purpose and meaning. While I do miss some of the utility of the prefixes, I think the display of this information is a task for IDE technology to overcome.

I don't think displaying type in a tooltip is adequate; we really need something to allow us to completely customise and toggle the display of particular types, such as colorization or underlining.

Suffixes

Another part of my variable naming convention was a suffix of scope, a truly useful habit I acquired early in my career.  There are three scopes of a variable in the context of a function.  It can be declared externally but visible internally, it can be declared locally within the function or block, or it can be supplied as a parameter.

At the class level I give variables a 'C' suffix to indicate class level scope. Remember that no variable should be visible outside the class so this does not violate the general public naming rule above.  Variables defined within a function are given an 'L' suffix to indicate local scope.  Variables supplied as parameters to a function should include no suffix but are still camel cased.

Not only does this make it clear where a variable is coming from and its scope, it also provides an additional benefit of a 'namespace' to the variable.  We can reuse the same name for different variables that essentially represent the same information.

private int widgetInstanceIDC;

public Widget(int widgetInstanceID)
{
   widgetInstanceIDC = widgetInstanceID;
}

In the above code fragment, it is not necessary to invent a different name for the parameter or to reference the private class member explicitly with the 'this' keyword.

This approach improves the general readability and reliability of the code and does not expose internal naming conventions to other developers using your code base.

Class Members

All class members that are not directly declared variables, public and private, should be named in proper case style.  This will minimize the impact of changing the accessibility of a member, especially if that member is used internally to the class.

User Interface Controls

Many programmers feel a pervasive need to give user interface controls special names, like txtField and btnCommand. Controls do not need a special naming convention. Treat them for what they are: class members.

Interfaces

Interfaces should always be named with a leading 'I'. This is a well-worn convention that allows developers to quickly identify that an entity is an interface as distinct from a class, and also which interfaces a class implements.

Interface names should effectively describe what they do: their behaviour.
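
For illustration only, here is a hypothetical interface named for the behaviour it describes, together with a class that implements it. Neither type comes from the original guide or from a real library.

```csharp
// The leading 'I' marks this as an interface; the rest of the name describes behaviour.
public interface IResizable
{
   void Resize(int width, int height);
}

// The declaration shows at a glance which interfaces the class implements.
public class ResizableWidget : IResizable
{
   private int widthC;
   private int heightC;

   public int Width
   {
      get { return widthC; }
   }

   public int Height
   {
      get { return heightC; }
   }

   public void Resize(int width, int height)
   {
      widthC = width;
      heightC = height;
   }
}
```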

Enums

Enums should be named with regard to the same standard as classes.  However, if an enum represents a bit field then it should include a 'Flags' suffix.
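
A short sketch of both cases, using made-up enum names. The [Flags] attribute is the standard .NET marker for bit field enums; the guide itself only asks for the suffix, so treat the attribute as an addition.

```csharp
using System;

// Ordinary enum: named to the same standard as a class.
public enum WidgetShape
{
   Round,
   Square
}

// Bit field enum: 'Flags' suffix, each value a distinct power of two.
[Flags]
public enum WidgetOptionFlags
{
   None = 0,
   Visible = 1,
   Resizable = 2,
   Locked = 4
}

public static class WidgetOptionExample
{
   // Tests a single bit within a combined value.
   public static bool IsResizable(WidgetOptionFlags options)
   {
      return (options & WidgetOptionFlags.Resizable) == WidgetOptionFlags.Resizable;
   }
}
```

The suffix tells a caller that values may be combined, for example WidgetOptionFlags.Visible | WidgetOptionFlags.Resizable, without having to look up the declaration.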

Database Entities

I follow a strict set of naming conventions for database tables, views, stored procedures, procedure parameters and procedure variables. The reason is simple: it is easier to follow and maintain code, both in the database and in the application, when you can be sure what type of entity the code is referring to.

There are also additional benefits in that you can reuse names for different entity types and you can clearly distinguish between system objects and application objects.

All database entity names should be singular unless the name refers to something that represents or contains something unambiguously plural. The point of this is to remove the guesswork from naming. If developers can confidently know the table is 'tblCustomer' and not 'tblCustomers', you spare them the overhead of switching to the database and checking the table name. You also reduce the potential for errors to creep into your code.

Do not prefix field names with table names. We know we are looking at the customer; we don't need to know it's the customer's customername. That is a redundant naming scheme and simply forces the programmer to mentally parse or type more information.

Code comments

Another area in which I have changed my habits over time is code commenting. I liked to have lots of comments within my code describing what elements are, what they're supposed to do, how they should be used and so on. Over time, however, these comments became increasingly confusing and burdensome to maintain.

All programmers have encountered comments that do not describe the code correctly. We then have to decide if the comment is correct and the code is wrong, or the comment itself is wrong. If we cannot ensure that comments are always correct then they really do nothing but confuse the programmer.

Another problem with comments I have found is that I simply don't see them.   There have been times I have agonized over the purpose and intent of a code segment when a thorough description was merely a few lines away.   No doubt I have learned to ignore comments thanks to the situation described previously.

Those points aside, I don't advocate removing all comments.  Comments can be extremely useful when correct, especially when describing methods and parameters.

By using verbose descriptive names for the various entities in the codebase, you will discover that the code is relatively self documenting.  It describes itself without the need of additional comments.  Furthermore, self describing code can always be trusted to represent the actual behaviour of the application.
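
The contrast can be sketched with a hypothetical discount calculation. Both methods below do the same thing; the first depends on a comment to be understood, the second on its names alone.

```csharp
public static class InvoiceCalculator
{
   // Terse version: a comment has to explain every name.
   // c = order total, d = discount rate, returns the amount payable.
   public static decimal Calc(decimal c, decimal d)
   {
      return c - (c * d);
   }

   // Descriptive version: the names carry the explanation.
   public static decimal CalculateAmountPayable(decimal orderTotal, decimal discountRate)
   {
      decimal discountAmount = orderTotal * discountRate;

      return orderTotal - discountAmount;
   }
}
```

If the calculation changes, the comment on the first version can silently fall out of date, while the names in the second version are far more likely to be updated along with the code.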

Ideally, whenever a function was modified, the compiler would give us a warning on the first occasion if the comments for that function were not also modified or approved. This may soon be practical given the move to encapsulating comments in XML.

Good comments

Limit comments to code that actually needs commenting. Good places to comment code might include:

Bad comments

Bad comments provide no worthwhile information and simply burden the programmer with additional tasks.

One particularly annoying commenting style is the forty-line comment header on every function. These comments often describe the programmer making a modification and what they changed, as well as their height, weight, telephone number and date of birth. This is an unreasonable obstacle to a programmer who needs to make a minor modification to a function: instead of a five second fix they are forced into a ten minute essay on what they did, why they did it and so on.

If it is critical for an organisation to maintain an audit trail of code changes, then this is something that would be much better handled by the version control system or some other automated process.

Overly lengthy descriptions are symptomatic of other problems.  If you find you're writing a huge explanatory description of some code, then the chances are the code is poorly designed. Break it up, make it more logical and easier to follow and the need for extensive comments may well evaporate.

Finally, make good use of commenting features provided by the IDE.  Do not dump all your comments inside the method, or in a comment block incompatible with documentation tools.
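
In C#, the commenting feature that documentation tools and the IDE understand is the XML documentation comment, written with three slashes directly above the member. A brief sketch on a hypothetical method:

```csharp
public static class TemperatureConverter
{
   /// <summary>
   /// Converts a temperature from degrees Celsius to degrees Fahrenheit.
   /// </summary>
   /// <param name="degreesCelsius">The temperature in degrees Celsius.</param>
   /// <returns>The equivalent temperature in degrees Fahrenheit.</returns>
   public static double ConvertCelsiusToFahrenheit(double degreesCelsius)
   {
      return (degreesCelsius * 9.0 / 5.0) + 32.0;
   }
}
```

The comment describes the method and its parameter, which is where the guide suggests comments are most useful, and leaves the single line of implementation to speak for itself.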

Code formatting

Most of these quasi-religious formatting issues can be handled automatically by the development environment.  That said, I still think it sensible to follow a consistent approach.  I often find myself browsing source code files in notepad and if the formatting was inconsistent with the IDE it would be unnecessarily confusing.

Indentation

All code should be indented to indicate execution flow and scope.  To my ceaseless amazement, this fundamental principle is still ignored in many places today.  Working on code that hasn't been indented correctly will waste valuable hours and cause needless frustration.

Braces positioning

I was once writing some code that needed to emit a JavaScript function. In order to force the opening brace of the function onto the next line I needed to include an additional newline character in the output. Given that this was automatically generated JavaScript that was never intended to be viewed or edited, I decided to forgo the extravagance. This was the only time I've ever encountered a valid reason for including the brace on the preceding line.

This issue has been done to death, and I doubt any experienced programmer will ever change their position. However, if you're a new, still untarnished programmer, please follow common sense and align braces on the following line - it will make your work and that of other programmers much easier.

Tabs

Using spaces in place of tabs is needlessly obstructive.  It can restrict the flexibility of the IDE in applying alternate preferences and also affects the layout when viewing the code outside of the IDE.

Columns

There is a style of coding in which elements on a line are spaced or tabbed into columns, typically when variables are declared and initialised. Consider the declaration below. Each 'word' has been tabbed so they are aligned in columns.

int   widgetCountL   =   101;

If we want to add a second declaration whereby the column width is exceeded, we are forced to increase the tab for the first line to make it appear correctly.

int      widgetCountL     =   101;
string   secondExampleL   =   "cat";

In practical terms, it is impossible to apply this formatting style consistently across an application, even if we choose to constrain it to grouped variable declarations. Do not attempt to align code into columns.

Parameters

Function parameters should all be defined on the same line.  Breaking parameters onto separate lines is unnecessarily confusing because it is an inherently inconsistent practice.  At first glance it may even appear as though parameters are missing or a given parameter is actually internal to the function.

private void CalculateExtremeWidget(int widgetA,
                                    int widgetB,
                                    int widgetC,
                                    double widgetX,
                                    double widgetY,
                                    double widgetZ)

Typically breaking parameters onto multiple lines occurs when too many parameters are defined for a function.  Passing a large number of parameters into a function is also suggestive of structural problems.
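
One possible restructuring, which the guide does not prescribe, is to group parameters that belong together into a type of their own. The sketch below assumes the three double parameters describe a single offset; WidgetOffset and WidgetGeometry are hypothetical names.

```csharp
using System;

// Hypothetical type grouping three related values into one concept.
public class WidgetOffset
{
   private readonly double xC;
   private readonly double yC;
   private readonly double zC;

   public WidgetOffset(double x, double y, double z)
   {
      xC = x;
      yC = y;
      zC = z;
   }

   public double X { get { return xC; } }
   public double Y { get { return yC; } }
   public double Z { get { return zC; } }
}

public static class WidgetGeometry
{
   // One cohesive parameter instead of three loose ones; the signature fits on a single line.
   public static double CalculateLength(WidgetOffset offset)
   {
      return Math.Sqrt((offset.X * offset.X) + (offset.Y * offset.Y) + (offset.Z * offset.Z));
   }
}
```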

Clarity in Architecture

Using good code styling conventions will improve the ability to manage a codebase enormously, but it's not enough.  The best coding style in the world won't save the project if the architecture is hopelessly confusing.

Creating a solid architecture is all about making good choices of what to abstract and how to relate various entities.   As programmers we should aim to build systems that minimize redundancy and that allow modifications to be applied without rewriting existing code.

It's not within the scope of this guide to outline a method for developing an application architecture.  However, by adhering to the following points in designing and developing your application, you will develop a more effective model that will be easier to maintain and be more adaptable.

Isolate Responsibility

Good application architecture is essentially about isolation, requiring that there be only a single point of responsibility for an aspect of the system. Isolating parts of the application into distinct areas of responsibility makes it easier to modify and maintain, and helps improve reliability and performance.

The principle of isolation is something we should apply throughout all levels of abstraction of the entire application, from a single line of code to the application as a whole and all associated resources.

Isolating data

All data related to the application, contained within a database or elsewhere, should be appropriately normalized. Normalization is a process that helps to identify and relate the discrete items of information within the problem domain being modelled.

There are many levels of normalization, the elaboration of which is beyond the scope of this guide. However, the following principles will assist in designing or maintaining a model that fulfils the objective of isolating responsibility.

Decompose data

Ensure the data is separated into discrete fields as much as possible or practicable for the application. Often there are ways in which data is composed in fields that are not immediately obvious.

Recipe Name    Ingredients    Instructions
White Sauce    20g butter     Melt butter into small saucepan ...
               15g flour
               250 ml milk
               ...

The above table contains recipes. It may be fairly clear that a separation of the ingredients information into another table would be suitable. It is less obvious that the name of the recipe, "White Sauce", also contains a composite of information - that this is a recipe for sauce.

Avoid data redundancy

There are many forms in which redundancy may appear in the database. It may be the existence of many null values for a given field, the existence of similar data for a field repeated across many records, or redundancy in the definitions of the fields themselves.

Where many null values exist for a field, remove the null field to another table and relate it back to the source table. Fields that contain a lot of similar data should be removed to a lookup table and referenced from the original record.

Avoid field redundancy

There should exist only one field, in one table, for a given type of information. Multiple definitions of a field of the same type indicate the need for a new table. Define a new table with a relation to the source table data. This allows for the unlimited expansion of related data, without requiring a new field definition for each instance.

Define appropriate keys

Each record should have a single discrete identifier that does not reflect a value used in the problem domain. Consider primary keys as identifiers of records, rather than as identifiers of data. This will reduce the potential for confusion and for dependence upon processes external to the application that may be liable to change.

Avoid composite keys

Composite keys should only be applied to join tables that directly relate two other entities and contain no other associated data. When introducing a new data field to such a join table, implement a new discrete primary key rather than depend upon a composite of two foreign keys. This approach maintains a specific identifier for the data, ensuring it is easier to manage, and allows for the addition of new relationships to that data more easily.

Avoid union queries

Using union queries is suggestive of a flawed data design. The implication of a union query is that you're compiling similar data from multiple sources within a single database. This raises the question: why have identical types of data in separate tables? If you have like but separate data, store it in the same table and add an additional field to identify the record type.

Isolating code

The following principles outline processes for effectively isolating code within the application.

Break apart functions

Examine each function within the code.  Does it perform one and only one task?  Perhaps it's performing some inner calculation that could be removed elsewhere?  A good rule of thumb here is that if a function is larger than a page on your monitor, then it's probably trying to do too much.  Break it up into smaller units of functionality.

public void SendMessage(string host, string address, string name)
{
   SmtpClient messageClient = new SmtpClient(host);

   MailMessage contactMessage = new MailMessage();

   contactMessage.Sender = new MailAddress(address, name);
   contactMessage.To.Add(address);
   contactMessage.From = new MailAddress(address, name);
   
   contactMessage.Subject = "this is the message subject";

   /* begin build message content */
   StreamReader template = File.OpenText("messagecontent.txt");

   string messageContent = template.ReadToEnd();

   template.Close();

   messageContent = messageContent.Replace("[name]", name);
   /* end build message content */

   contactMessage.Body = messageContent;
   contactMessage.IsBodyHtml = false;

   messageClient.Send(contactMessage);
}

In the above example, everything between 'begin' and 'end build message content' would be better placed in a separate function. That would allow for greater flexibility in using alternate content and also allow for other messaging options using common content.
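
A sketch of how that separation might look. The class name ContactMessenger and the templatePath parameter are assumptions made for illustration, and the using block is an addition that closes the file even if reading fails; the rest follows the original SendMessage example.

```csharp
using System.IO;
using System.Net.Mail;

public class ContactMessenger
{
   public void SendMessage(string host, string address, string name)
   {
      SmtpClient messageClient = new SmtpClient(host);

      MailMessage contactMessage = new MailMessage();

      contactMessage.Sender = new MailAddress(address, name);
      contactMessage.To.Add(address);
      contactMessage.From = new MailAddress(address, name);

      contactMessage.Subject = "this is the message subject";

      // Building the content is now a single, clearly named step.
      contactMessage.Body = BuildMessageContent("messagecontent.txt", name);
      contactMessage.IsBodyHtml = false;

      messageClient.Send(contactMessage);
   }

   // Extracted function: loads the template and fills in the recipient's name.
   public string BuildMessageContent(string templatePath, string name)
   {
      string messageContent;

      using (StreamReader template = File.OpenText(templatePath))
      {
         messageContent = template.ReadToEnd();
      }

      return messageContent.Replace("[name]", name);
   }
}
```

BuildMessageContent can now be called with a different template, reused by other kinds of message, and tested without a mail server.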

Break apart lines

For each line of code, consider breaking code across many lines instead of pushing everything into a single line, even if it means using more variables.

While more verbose, separating the code makes it easier to modify and debug, and improves its accessibility to programmers who may not be familiar with the codebase.

string data = "abc:123:xyz:321";

Console.WriteLine(data.Substring(data.IndexOf(":") + 1, data.IndexOf(":", data.IndexOf(":") + 1) - data.IndexOf(":") - 1));

By breaking that code into several lines we improve readability and reduce the debugging overhead.

string data = "abc:123:xyz:321";

int startIndex = data.IndexOf(":") + 1;
int endIndex = data.IndexOf(":", startIndex);

Console.WriteLine(data.Substring(startIndex, endIndex - startIndex));

Above all with regards to this principle, use common sense.  Don't pointlessly refactor all your code simply to isolate each line of code if it is already reasonably understandable.

Avoid multiple exits

Multiple function exit points are symptomatic of code that is unnecessarily complex. Given a superficial analysis, it may appear that multiple exit points, or returns, are an aid in reducing complexity and improving maintainability, but they merely disguise deeper structural problems with the code.

Requiring a single exit point forces programmers to refactor the code in a manner that promotes better distillation of purpose.
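
A small sketch of the difference, using a made-up shipping rule. Both methods return the same results; the second has exactly one exit.

```csharp
public static class ShippingCalculator
{
   // Multiple exits: the result can leave the function from three different places.
   public static decimal GetShippingCostWithMultipleExits(decimal orderTotal, bool isExpress)
   {
      if (orderTotal >= 100m)
      {
         return 0m;
      }

      if (isExpress)
      {
         return 15m;
      }

      return 5m;
   }

   // Single exit: every path assigns the result, and it is returned in one place.
   public static decimal GetShippingCost(decimal orderTotal, bool isExpress)
   {
      decimal shippingCost;

      if (orderTotal >= 100m)
      {
         shippingCost = 0m;
      }
      else if (isExpress)
      {
         shippingCost = 15m;
      }
      else
      {
         shippingCost = 5m;
      }

      return shippingCost;
   }
}
```

In the single exit version the compiler also checks that every branch assigns shippingCost before it is returned.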

Avoid mixing code

Mixing different code into the same environment creates unnecessary confusion and constricts the flexibility of the application. If the target code is embedded in the structure of another language, you cannot see or properly test the flow of logic; you have to guess what is happening.

Different languages often don't fit within the same conventions, so styling and formatting will either be inconsistent or not optimal for a given language.

Further, you require that the person responsible for the code has to be equally skilled in both languages.

Separate SQL from the application

SQL should not be dynamically constructed or even contained within your application layer code.  There are many reasons why you should avoid this practice.

Firstly there are security issues. If you have any SQL within your code that isn't simply a call to a stored procedure, chances are you're leaving your database open to attack. Not only are you implicitly allowing direct access to tables and views but you're even including a nice description of your database structure for any malevolent person with access to your application. In addition, you are prevented from applying security restrictions on the SQL and database entities.

Secondly it makes it more difficult to maintain your application because whenever a database change is required, all user applications must also be updated to support that change. This isn't such an issue if it is a small single user standalone application and database, but if you have distributed users accessing a centralized database it is a logistical nightmare. You'll have to synchronize the update with everyone, most likely forcing people offline while the update is in progress.

The converse of this principle is that programming code should not be resident in the database. Microsoft SQL Server allows developers to use .NET code within the database itself. This is a bad idea as it tightly binds the application to the database, making any future translation to an alternative database a much more difficult exercise.

Break apart classes

Each class within the application should represent only one entity or exist for only one purpose.  If you find a class contains a lot of functionality that doesn't directly relate to its purpose, separate that functionality into another class.

In the example below, the functions relate only weakly to the defined EntertainmentSystem class. A better design would separate the code into more appropriate abstractions.

public class EntertainmentSystem
{
   public void TurnOnTelevision(){}

   public void ChangeTelevisionChannel(){}

   public void TurnOnMusicSystem(){}

   public void ChangeRadioStation(){}

   public Food CookPopcorn(){}

   public Drink FetchBeverage(){}
}

Many developers irrationally agonize over adding additional classes to a project.   There is a pervasive belief that this introduces greater complexity into the application.  In these situations, ever increasing functionality is packed into a class which barely relates to its purpose.

Ironically, by tying all that code into a single class, there is a real risk of entangling unrelated areas of functionality.  This entanglement introduces much greater complexity than another class.  Separating that code into distinct classes ensures the code is easier to manage and actually helps minimize complexity.

Each class should be contained within an individual file, unless the class is private to another class.
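
One way the EntertainmentSystem example might be separated is sketched below. The class names, the parameters and the simple state they record are assumptions made for illustration, and each class would normally live in its own file.

```csharp
public class Television
{
   private bool isOnC;
   private int channelC;

   public bool IsOn { get { return isOnC; } }
   public int Channel { get { return channelC; } }

   public void TurnOn() { isOnC = true; }
   public void ChangeChannel(int channel) { channelC = channel; }
}

public class MusicSystem
{
   private bool isOnC;
   private string radioStationC = "";

   public bool IsOn { get { return isOnC; } }
   public string RadioStation { get { return radioStationC; } }

   public void TurnOn() { isOnC = true; }
   public void ChangeRadioStation(string radioStation) { radioStationC = radioStation; }
}

// Food and drink have nothing to do with entertainment equipment.
public class Food { }
public class Drink { }

public class Kitchen
{
   public Food CookPopcorn() { return new Food(); }
   public Drink FetchBeverage() { return new Drink(); }
}

// The original class shrinks to what it actually is: a grouping of equipment.
public class EntertainmentSystem
{
   private readonly Television televisionC = new Television();
   private readonly MusicSystem musicSystemC = new MusicSystem();

   public Television Television { get { return televisionC; } }
   public MusicSystem MusicSystem { get { return musicSystemC; } }
}
```

There are more classes than before, but each name now says exactly what its members are about.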

Abstract common functionality

Where common core functionality is discovered among classes, and that functionality is inherent to an aspect of those types, that code should be abstracted to a base class. Otherwise the code logically relates to some other conceptual function and should reside elsewhere as an independent class, utilised from the other classes as required.

public class DoorKnocker
{
   public void AlertResident(){}
}

public class DoorBell
{
   public void AlertResident(){}
}

Code like that above should be restructured to remove redundancy by introducing a base class and inheriting from it.

public class DoorKnocker:DoorAlert
{

}

public class DoorBell:DoorAlert
{

}

public abstract class DoorAlert
{
   public void AlertResident(){}
}
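
The other case described at the start of this section, where the shared code is not inherent to the types, might look like the following sketch.  VisitLog is a hypothetical class introduced only for illustration: recording a visit is not part of what makes something a door alert, so it lives in its own class and is used by the classes that need it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical independent class: it is not a kind of door alert.
public class VisitLog
{
   private readonly List<string> entries = new List<string>();

   public int Count { get { return entries.Count; } }

   public void Record(string source) { entries.Add(source); }
}

public abstract class DoorAlert
{
   private readonly VisitLog log;

   protected DoorAlert(VisitLog log) { this.log = log; }

   // Inherent to every door alert, so it belongs in the base class.
   public void AlertResident()
   {
      // Logging is a separate concern, delegated to VisitLog.
      log.Record(GetType().Name);
   }
}

public class DoorKnocker : DoorAlert
{
   public DoorKnocker(VisitLog log) : base(log) {}
}

public class DoorBell : DoorAlert
{
   public DoorBell(VisitLog log) : base(log) {}
}
```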

Avoid partial classes

This unfortunate construct made it into Microsoft's .Net 2.0 framework.  Partial classes allow you to split the contents of a class across several files.  This will invariably lead to a maintenance nightmare as naive programmers spread their code across multiple files in the mistaken belief that this is the way things are done.   Unfortunately the design of Asp.Net 2.0 makes it prohibitively difficult to avoid using partial classes for web forms.

Avoid generic classes

Sometimes programmers create generic utility classes: classes that contain methods that don't 'fit' anywhere else.   This is symptomatic of lazy analysis.  All code should be encapsulated in meaningful classes.

Separate layers

Build the application in layers that cover broad areas of responsibility and minimize their interactions.  This is important as it allows us to more easily adapt code for different environments and uses.  A design built from such layers is usually referred to as an n-tier architecture.

Tiered architectures apply the principle of isolation to the broad view of an application, including concepts such as application presentation, processing and persistence.

While n-tier architecture is an oft used buzzword of development, most purportedly n-tier applications fail to deliver on the real objective of separating areas of concern. At the very least, ensure that the code relating to providing an interface to the user is not entangled with code concerned with application logic.  Such code is extremely difficult to maintain and improve.

Typical layers of an n-tier application

The level of separation between the tiers of an application depends on the problem domain.  N-tier applications typically consist of three layers: the data layer, the business layer and the presentation layer.  More or fewer may be suitable depending upon the problem domain.  However, at the very least you should separate display or presentation logic from the code that does the hard work.

The data layer

What is the data layer?  A data layer is not a data layer just because you designate a library of code a 'datalayer'.  The data layer is not the application code that talks to the database; the data layer is the database.  Your database is where the buck stops for data integrity.

Excepting some circumstances, access to data should be controlled through a single point, your database API.  What is the database API? The database API consists of stored procedures through which you can apply rules to ensure the integrity of the database is maintained.  I typically create separate stored procedures to perform CRUD (create, read, update and delete) operations on data.  With these procedures, there is absolutely no need to access a table directly.

There are several implications of this approach.  Firstly, it is easier to maintain a system where data access is controlled through a single point.  Knowing exactly where database entities are being referenced gives developers greater confidence in making database alterations.  It also minimizes the amount of work involved when the structure changes.  Secondly, it means we can apply a higher degree of security to our database in two ways.  We can greatly reduce the potential for malevolent SQL injection attacks or malformed queries being executed, provided the procedures are called with parameters and do not themselves assemble SQL from their inputs.  We can also explicitly prevent direct access to tables, or even limit access to procedures, thus providing an extra barrier to overcome should application security fail.  Thirdly, though this has been a matter of some contention, stored procedures can also provide performance benefits.  Following from the earlier principle that we should always care about performance up until it impacts on maintenance, this is another good reason to build a stored procedure API for your application database.

The data interface layer

Some applications require an additional level of separation between the data layer and the business layer. The purpose of this layer is to provide an interface against which operations on the data layer can be performed, without regard for the specific type of database that forms the data layer.

It may be dangerous to assume that the application is automatically portable to a different database product because of the existence of such a layer.
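
A minimal sketch of such an interface follows.  The names IPersonStore, Person and InMemoryPersonStore are hypothetical, and the in-memory implementation merely stands in for a real database so the example is self-contained.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
   public int Id;
   public string Name;

   public Person(int id, string name)
   {
      Id = id;
      Name = name;
   }
}

// The business layer depends only on this interface,
// never on a particular database product.
public interface IPersonStore
{
   void Save(Person person);

   // Returns null when no person has the given id.
   Person Find(int id);
}

// Stand-in implementation.  A version for a specific database would
// implement the same interface by calling that database's procedures.
public class InMemoryPersonStore : IPersonStore
{
   private readonly Dictionary<int, Person> people =
      new Dictionary<int, Person>();

   public void Save(Person person) { people[person.Id] = person; }

   public Person Find(int id)
   {
      Person found;
      return people.TryGetValue(id, out found) ? found : null;
   }
}
```

The interface only hides which product is in use.  Differences in behaviour between products, such as transaction handling, can still leak through, which is why portability should not be assumed.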

The business or application layer

What I prefer to call the application layer, but which is more commonly referred to as the business layer, contains all the application logic concerned with turning input into valuable output.

The presentation layer

The presentation layer exposes the information provided by the business layer in some usable or user friendly format.  The advantage of separating the presentation code from the rest of the application should be obvious. It allows you to make extensive changes to the application without being locked into the structure of some other part of the system.

Separate Libraries

Depending on the scale of the application, it is a good idea to separate classes of related responsibility into distinct libraries that can be utilized by other applications or easily replaced.

Interaction

The principle of isolation also applies to the way in which the code we develop interacts.  Interactions are the 'moving parts' of the application.  The more moving parts, the greater the chance something will break.

The interactions or relationships between functionality can have a more profound effect on the maintainability of the application than the functionality itself.  If we don't clearly define and properly manage the relationships between elements, it is harder for programmers to understand and manage the application.  There is also a greater potential for synchronisation and entanglement problems.

Disparate parts of an application should be inaccessible to each other except when they unequivocally need to communicate.  Every effort should be made to encapsulate members or functions in a way that minimizes their visibility and the potential for unintended relationships.

Isolation in interactions is typically described through the principles of cohesion and coupling.

Cohesion

Cohesion is essentially about simplifying the way functionality is accessed.  Cohesive code removes ambiguity, ensuring the process is as reliable, easy to understand and debug as possible.

Encapsulate function calls

A given process should not require more than one invocation to complete.   A single unified function should be provided for undertaking a process.

Where a process is dependent upon two or more functions to complete, a single unified function should be provided for managing the process to completion.  Otherwise, if we attempt to include every dependent function wherever we wish to use that process in our application, we run the risk of failing to include all the necessary steps.

public class Calculator
{
   public void Initialise(int first, int second){}

   public void DoSum(){}

   public int GetResult(){}
}

The code above requires the calling code to perform three steps in order to complete a logical process.  The code below only requires a single step.

public class Calculator
{
   public int PerformCalculation(int first, int second)
   {
      Initialise(first, second);
      DoSum();
      return GetResult();
   }

   private void Initialise(int first, int second){}

   private void DoSum(){}

   private int GetResult(){}
}

Encapsulate critical resources

Critical resources are those required by the internal operations of a process, that typically need to be managed or released efficiently.  Such resources should not be returned in a function call where control over the resource is lost to the calling function.

In the following example an object is returned from a function that requires an additional call on that object in order to release application resources.

public SqlDataReader CreateReader(string connectionString, string commandText)
{
   SqlConnection dbConnection = new SqlConnection(connectionString);
   SqlCommand selectCommand = new SqlCommand(commandText);

   selectCommand.Connection = dbConnection;

   dbConnection.Open();

   SqlDataReader dataReader = selectCommand.ExecuteReader();

   return dataReader;
}

This approach is prone to error as the assumption is made that calling code will always make the call that releases the resource.  It is not obvious to the programmer that this call is required, and even when it is known, it is easy for even an experienced programmer to omit it accidentally.
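
One way to keep control of the resource is to have the function that opens it also be the function that releases it, handing each item to the caller through a callback.  The sketch below shows the shape with a TextReader rather than a database so that it is self-contained; LineProcessor and ForEachLine are hypothetical names.

```csharp
using System;
using System.IO;

public static class LineProcessor
{
   // Opens the reader, passes each line to the caller's callback, and
   // disposes the reader before returning, even if handleLine throws.
   // Returns the number of lines processed.
   public static int ForEachLine(
      Func<TextReader> openReader, Action<string> handleLine)
   {
      int count = 0;

      // The using block guarantees Dispose is called on the reader.
      using (TextReader reader = openReader())
      {
         string line;
         while ((line = reader.ReadLine()) != null)
         {
            handleLine(line);
            count++;
         }
      }

      return count;
   }
}
```

Applied to the CreateReader example, the same shape would wrap the SqlConnection, SqlCommand and SqlDataReader in using blocks and pass each row to a callback, so that nothing needing release ever leaves the function.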

Avoid meta flags

An invention of evil genius, the meta or modal flag alters the meaning or behaviour of other code depending on the 'mode'. Not only is this construct extremely likely to cause problems, but it is impossible to debug without considerable frustration and wasted time.

Modes perniciously find their way into all aspects of software development. No matter how illogical or annoying they are, they seem to be a favorite among many programmers.  For example, Quirksmode is a mode defined in many browsers that changes the way markup is rendered.  There are a number of possible states for this mode, triggered by the existence of a doctype and the semantic validity of the document.  A huge problem with Quirksmode is that the browser can silently fall back into another mode without explicit command from the developer.  Given the highly dynamic nature of contemporary web development, it can be extremely difficult to ensure that a document will retain the correct mode and will be rendered as intended.

Modal or meta flags are probably the worst violation of the principle of isolation possible.

Coupling

Coupling refers to the dependency between elements in code.  Ideally the dependency or coupling should be reduced to a minimum.  This allows programmers to more easily add, remove and replace code in the application without introducing problems into other areas by accident.

Minimize scope

By default, give every variable, attribute, method or class the smallest possible scope that allows it to function.  If code is not properly encapsulated, then it's quite likely that the exposed code will be used in a way it was not designed to support, or be altered unexpectedly.   This makes the application more prone to errors and more difficult to refactor and debug.

Use minimal interfaces

There are degrees of coupling between components. In general, where components need to communicate externally or across layers of the application, implementing well defined, minimal interfaces can reduce coupling and maximize flexibility.

Use appropriate reflection

Reflection is an enormously useful tool that allows us to create loosely coupled components.   Reflection can be used to dynamically access information about an object of which there is no prior knowledge.   As the following example demonstrates, reflection can allow us to easily build in support for components that don't yet exist.

Note: ↓ - denotes code is wrapped to following line.

public class Zoo
{
   public void MakeAnimalNoises()
   {
      XmlDocument zooAnimalData = new XmlDocument();

      zooAnimalData.Load("animals.xml");

      XmlNodeList zooAnimalsList = ↓
      zooAnimalData.SelectNodes("animals/animal");

      foreach(XmlNode animalObjectNode in zooAnimalsList)
      {
         Type animalObjectType = ↓
         Type.GetType(animalObjectNode. ↓
         SelectSingleNode("reference").InnerText);

         IAnimal zooAnimal = ↓
         (IAnimal)animalObjectType.GetConstructor ↓
         (Type.EmptyTypes).Invoke(null);

         Console.WriteLine(zooAnimal.GetNoise());
      }
   }
}

The MakeAnimalNoises function iterates through an XML file containing references to classes implementing the IAnimal interface, and calls the GetNoise method defined by that interface.  This approach allows us to create zoo animals at will and simply add another reference to the zoo data file.

Care must be taken to utilize reflection in an efficient manner, as it has significant performance overhead. In the scenario above, you would typically design the code so the necessary object references only need be created once.
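
One possible way of creating each object only once is to cache it by type name, as in the sketch below.  AnimalFactory and Lion are hypothetical names, and the sketch assumes animals hold no per-caller state, so a single shared instance is acceptable; it is also not thread safe as written.

```csharp
using System;
using System.Collections.Generic;

public interface IAnimal
{
   string GetNoise();
}

// Example implementation, assumed for illustration.
public class Lion : IAnimal
{
   public string GetNoise() { return "Roar"; }
}

public class AnimalFactory
{
   // Instances already created through reflection, keyed by type name.
   private readonly Dictionary<string, IAnimal> cache =
      new Dictionary<string, IAnimal>();

   public IAnimal GetAnimal(string typeName)
   {
      IAnimal animal;

      if (!cache.TryGetValue(typeName, out animal))
      {
         // The reflection cost is paid on the first request only.
         // Passing true makes GetType throw if the type is not found.
         Type animalType = Type.GetType(typeName, true);
         animal = (IAnimal)Activator.CreateInstance(animalType);
         cache[typeName] = animal;
      }

      return animal;
   }
}
```

Repeated calls with the same type name return the cached object without touching reflection again.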

Use wildcards

SQL wildcards (such as SELECT *) have a slight performance penalty, but for the overwhelming majority of applications this penalty is not meaningful or discernible.  Unless you have very strict performance requirements, use SQL wildcards when querying data.  Wildcards are easier to maintain as we do not have to update every associated segment of SQL when a table design is modified.

Thinking ahead

When writing code, continually ask yourself what would happen if a related requirement changed.  Try to write the code in a way that will make it easier to support future alterations without requiring the existing code be modified.

This does not mean writing code you think might be useful at some point in the future.  Indeed, code should never be written until there is a clear, well defined need for that code.  Thinking ahead merely means making allowances in the code you have to write today, so it can be more easily adapted.

Avoid premature abstraction

One of the most common causes of poor design is premature abstraction.  Premature abstraction is where the programmer develops code in anticipation of required functionality.  Often such code does not directly relate to the problem for which the software was commissioned.

The problem with such code is that not only is it essentially a waste of time, it also takes the focus away from the model of the problem.  Programmer efforts are expended fitting a solution to the problem into the structure defined by this premature code.

Consider alternate environments

Consider what might happen if the application was moved to a different environment.  For example, suppose you have written a dll intended for a single threaded context in a Windows based application.  What will happen if that dll is used in a different context, such as a multithreaded web application hosted in IIS?

In such a scenario any static code employed throughout the dll may cause errors, because static members are shared by every thread in the process.  Through simple consideration of possible environmental changes, we can avoid future obstacles.
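
The sketch below illustrates the kind of static code that behaves on a single thread but can lose updates when several threads call it at once, alongside one way of making it safe.  RequestCounter and its members are hypothetical names.

```csharp
using System;
using System.Threading;

public static class RequestCounter
{
   private static int unsafeCount;
   private static int safeCount;

   public static int UnsafeCount { get { return unsafeCount; } }
   public static int SafeCount { get { return safeCount; } }

   // Fine on a single thread.  With concurrent callers increments can
   // be lost, because ++ is a separate read, add and write.
   public static void CountUnsafe() { unsafeCount++; }

   // Interlocked.Increment performs the whole update atomically.
   public static void CountSafe() { Interlocked.Increment(ref safeCount); }
}
```

Had the possibility of a multithreaded host been considered when the dll was first written, the second form costs almost nothing extra.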

Avoid one to one relationships

We all know one of the prime reasons projects fail or suffer delays is because of changing requirements.  One of the most frequently changing requirements that strikes all projects is the relationship of one item of data, entity or component changing with respect to another.  This can be very difficult to overcome.

For this reason you should always carefully consider the relationships between everything within the application.  If there is any doubt about the nature of a relationship between two entities, implement it as a one to many.

You can avoid building restrictive relationships into your application by being wary of assertions that a given entity only ever has a one-to-one relationship to another. A typical case might be mandating that a person may only ever have one passport, which makes sense given a superficial analysis, but is certainly not always true.

It's trivially easy to enforce a one-to-one relationship when your application already supports many-to-many, but it can be extremely expensive to implement many-to-many when your application only supports one-to-one.  If there is any doubt, and even though it may seem like unnecessary additional development overhead, implement relationships in a way that allows them to be expanded beyond one-to-one as insurance against the day they will change.
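
Using the passport case, a sketch of this insurance might store the relationship as a collection and enforce today's limit as a rule on top of it.  The Person and Passport classes and the maxPassports parameter are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

public class Passport
{
   public string Country;
   public string Number;

   public Passport(string country, string number)
   {
      Country = country;
      Number = number;
   }
}

public class Person
{
   // Stored as one-to-many even if today's rule is one-to-one.
   private readonly List<Passport> passports = new List<Passport>();
   private readonly int maxPassports;

   public Person(int maxPassports) { this.maxPassports = maxPassports; }

   public IList<Passport> Passports
   {
      get { return passports.AsReadOnly(); }
   }

   public void AddPassport(Passport passport)
   {
      // The current restriction lives in this one check.  Relaxing it
      // later does not require restructuring the data.
      if (passports.Count >= maxPassports)
      {
         throw new InvalidOperationException("Passport limit reached.");
      }

      passports.Add(passport);
   }
}
```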

Booleans are suspect

In the same vein as avoiding one to one relationships, boolean attributes mean that the information represented by that attribute can only ever be true or false.  By marking an attribute a boolean we remove the flexibility of adding greys to that true/false situation.  Consider implementing such fields using enums instead. With enums you can easily expand on the possible number of states for that attribute without too much difficulty.
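
As a small sketch, a hypothetical Account class that might have begun life with a boolean IsActive field can instead carry an enum.  AccountStatus and its values are assumptions for illustration.

```csharp
using System;

// A third state can be added later without changing the field's type.
public enum AccountStatus
{
   Inactive,
   Active,
   Suspended
}

public class Account
{
   public AccountStatus Status = AccountStatus.Inactive;

   // Callers ask a question instead of reading a raw true/false flag.
   public bool CanLogIn()
   {
      return Status == AccountStatus.Active;
   }
}
```

Adding Suspended here required no change to the code that calls CanLogIn.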

Conclusion

The benefits of coding for clarity are many: improved productivity, reliability and reusability.  However, the greatest benefit of coding for clarity is the future.

Any application may fulfil the requirements of the day, but almost invariably a successful project will require modification, or application to a different problem domain.  This is where many limitations, and the real cost of the failure to plan and the failure to implement proper standards of development, become apparent.

The nature of software development is that it is much easier to take shortcuts to save money today, disregarding the disproportionate costs incurred tomorrow.  Overcoming the limitations of poor code can require an effort several orders of magnitude greater than would otherwise be required.  All too often, the cost of modification becomes so high as to warrant a complete rewrite of the application.

Through the application of the principles outlined in this guide, the risks of software development can be greatly reduced.