January 02, 2007

DocProject for Sandcastle

Addendum [1/4/07]: DocProject on CodePlex
DocProject is now available on CodePlex. From the home page you can access the How To... wiki, which provides updated guidance and information written after this blog entry. I also recommend that you download the installer from the CodePlex website [3] instead of from my Downloads page, since CodePlex will always have the most up-to-date version.

What is DocProject?

DocProject integrates the Sandcastle compiled help builder into the Visual Studio 2005 IDE. A new project template is provided that builds a compiled help file (.chm) for all of its project references. This facilitates the administration and development of project documentation by allowing developers to use a single solution for their projects and related code documentation. Optionally, the compiled help output can be included into the documentation project for further development. The DocProject installer only installs support for building compiled help from a specialized VS 2005 project template. No other features, add-ins or tools are installed. DocProject completely automates the process by examining the project for references to other projects and building the documentation from their assembly output. The user does not have to perform any manual steps for the build to work. Dependency information is taken from the project’s references automatically. The documentation is built by simply building the project.

In order to use DocProject you must have installed the following components:

  • .NET Framework 2.0
  • Visual Studio 2005 (Express Edition is not supported)
  • HTML Help Workshop 1.4 SDK [1]
  • Sandcastle December 2006 CTP [2]
  • DocProject for Sandcastle (Installer) [3]

You can download DocProject for Sandcastle from the link in the references section at the end of this blog entry [3].

Known Issues

  • Although the installer allows the installation path to be changed, doing so breaks the Add-In, which requires the installed path to be: C:\Program Files\Dave Sexton\DocProject for Sandcastle.
  • Uninstallation does not remove the C# Project Documentation template or the Add-In.
  • Only one comment file can be used per solution since Sandcastle expects a file named comments.xml by default. This means that if multiple projects with comments.xml files are referenced only one comments.xml will be used, indeterminately, when the help is compiled.

License

This software is licensed under the CC-GNU GPL.

Copyright © Dave Sexton 2006-2007

Installation and Source Code

A Windows Installer package (.msi) has been created to install both the Add-In and "C# Project Documentation" template along with the complete source code and C# project file (.csproj) [3]. The installer uses a Community Components installer file (.vsi) to install the Add-In and Template. Currently, the uninstallation of the Windows Installer package does not remove the Add-In or DocProject template; however, they can be removed easily by opening the configured Add-In and Template directories and deleting the DocProject files manually.

To compile the source code, locate the .csproj file in the C:\Program Files\Dave Sexton\DocProject for Sandcastle\Source folder. The existing build event (on successful build) requires another component for zipping the output in preparation for installation. The component is called DSZip and can be downloaded with the complete source code at [4]. I'll blog about this simple console application in the future if there are enough requests for that.

DocProject Components

Templates

Currently, the only DocProject template is the "C# Project Documentation" template, based on a C# class library. In the future there may be support for a web project (ASP.NET documentation website). Also, there may be support for the execution of custom code from within the project itself.

Add-In

The build process for projects created from the DocProject template is controlled by an Add-In named "DocProject for Sandcastle". Without the Add-In, projects created from the DocProject template are just plain C# class libraries.

Usage Instructions

The following steps can be taken to use DocProject after it has been successfully installed:

  1. Create or open an existing solution.
  2. Add a project whose assembly output you want DocProject to document.
  3. Add a new project to the solution and select the C# Project Documentation template.
    1. Select Visual C#.
    2. In the right pane look under My Templates.
  4. Name the new project anything that you'd like and create it.
  5. Add project references using the dialog that appears or cancel the operation and add the references as you normally would by using the Solution Explorer.
  6. Build the solution or project to see DocProject in action.
  7. Optionally, but recommended, open up the solution configuration and remove your documentation project from the build process.

Note that the build output (the compiled help file (.chm) and its supporting files) is replaced each time a DocProject is built.

Configuration

DocProject installs a custom tools options page named DocProject. It appears under Dave Sexton's Tools in the Options dialog, which is accessible via the Tools menu. The dialog configures settings on a per-project basis.

Code Plex

I've submitted a request to Code Plex for the addition of this project. I'll create a new blog entry when I hear back from them :)

--
References

[1] HTML Help 1.4 SDK
Microsoft HTML Help Downloads
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/htmlhelp/html/hwMicrosoftHTMLHelpDownloads.asp

[2] Sandcastle - December 2006 Community Technology Preview (CTP)
http://www.microsoft.com/downloads/details.aspx?FamilyID=E82EA71D-DA89-42EE-A715-696E3A4873B2&displaylang=en

[3] DocProject for Sandcastle Installer
http://www.codeplex.com/DocProject/Release/ProjectReleases.aspx

[4] DSZip Installer
http://www.codeplex.com/DocProject/Release/ProjectReleases.aspx

As always, I'd love to hear from anyone that has something to say about this post.  Drop me a comment.

December 15, 2006

MSDN Wiki

Have you noticed the new Community Content sections on msdn2 documentation?

These new sections are part of the MSDN Wiki project, live on MSDN2 documentation. [1]

Community Content

Community Content is not intended to be a forum or discussion area, nor is it meant to be a place to report bugs or issues. Its purpose is for developers to collaborate through the extension of MSDN documents. Anyone can add their own tips and tricks, or simply provide extra information about a given topic if it will add value, which will then be available to anyone reading the online documentation. [2]

Participate!

To participate, simply find any MSDN2 document that has a Community Content section at the bottom and click the Add new Community Content link. Read and agree to the terms and conditions including the Code of Conduct [2], enter your display name and you're all set. (Note: You need a Windows Live ID to sign up.)

A Contributor's Rights to Their Content

The following FAQs are excerpts from [3]:

Can I reuse the content I contribute to the wiki in other publications (for example, a book or a magazine article)?  Can I reuse the code I contribute in my commercial applications?

Yes, and yes!  As described above, we do not ask for ownership of contributions of content or code.  Instead, the Contribution Agreement gives us a non-exclusive license to contributions.  As a result, you are free to reuse your own content or code however you like.

Who owns the rights to content I add to MSDN Wiki?  What does the contribution agreement mean?  

Although many collaborative development projects ask for assignment of ownership by contributors, we decided that you should own your own contributions.  The Contribution Agreement gives us a non-exclusive license to your contributions.

My Contributions So Far...

I recently added Designing Thread-safe Events in C# to the Event Design guidelines documentation since it didn't mention thread-safety anywhere. I also added a chart (a poorly-formatted one since MSDN doesn't allow table elements) to the UriComponents Enumeration document to illustrate the behavior of each flag. You can review all of my posts and edits at my public user profile on MSDN [4].

Feedback

Submit feedback about MSDN Wiki or view the feedback of others, including bug reports and suggestions at Microsoft Connect [5].

--
References

[1] MSDN Wiki RC0 Details
https://connect.microsoft.com/content/content.aspx?ContentID=996&SiteID=112

[2] MSDN Wiki -- Code of Conduct
http://msdn2.microsoft.com/en-us/library/communitycodeofconduct.aspx

[3] MSDN Community Content FAQ
§ Legal Framework
http://msdn2.microsoft.com/en-us/library/communitymsdnwikifaq.aspx

[4] Public User Profile: Dave Sexton
http://msdn2.microsoft.com/en-us/library/user-dave%20sexton.aspx

[5] Feedback, MSDN Wiki
https://connect.microsoft.com/feedback/default.aspx?SiteID=112


.NET | C# | msdn | wiki

December 15, 2006

Custom C# Code Snippets

Introduction

Code Snippets in Visual Studio 2005 [1] allow developers to create templates for common C# programming constructs, XML documentation, or any other text that one might want to quickly insert into C# code with only a couple of keystrokes. Snippets even provide the refactoring capabilities found in VS's Refactor menu. Snippets support the addition of fields and can be designed for either explicit, user-defined values or for automatic replacement, allowing a snippet to be easily customized for the particular context in which it's being used.

Visual Studio 2005 provides some out-of-the-box snippets for C# [2]. Pressing Ctrl+Space brings up the IntelliSense list of local variables, members, types and namespaces that are currently in scope relative to the position of the cursor. That same list displays all of the built-in and custom C# code snippets too.

The Benefits of Snipping

I use some of the built-in snippets from time to time, such as try, tryf, #region, #if and maybe a few others, but I believe that the real value in Code Snippets lies in the ability to create your own.

Snippet for Better Code

The quickest way to produce software is to reuse canonical code samples that work. Because there are so many different coding patterns and practices, it can be difficult to remember them all. Code Snippets facilitate the reuse of the expression of programming concepts so that they don't have to be re-analyzed each time they are needed. Of course, not all design patterns may be expressed in simple code snippets. For example, patterns that should span multiple source files cannot be developed into a single snippet; however, multiple snippets might be useful to accommodate patterns of diffusion. Although it might not be easy or effective to handle some of the more robust or complex design patterns using snippets, they are certainly helpful for the simpler, more frequent patterns.

Snippet for Consistency

It's common for personal styles and techniques to be developed or acquired over time. Snippets can be used to easily ensure that all code conforms to personal preference for style and technique. Organizational standards may be expressed and encapsulated using code snippets as well. By effectively utilizing snippets, organizations can maintain consistency across assemblies, projects and teams. Developers can ensure that trusted and proven techniques will be reused without having to remember all of the ins and outs as to why a particular snippet was designed a certain way as long as they know what it's for.

Snippet for Distribution

Snippets provide a means for encapsulating code in a manner that's easy to distribute since a snippet consists only of a single XML file. Snippets can be installed easily into Visual Studio by simply copying a .snippet file to a user's "Code Snippets\Visual C#\" directory (usually located under "My Documents\Visual Studio 2005\").

Snippet for Education

I believe that the best way to learn code is to write code. Code Snippets can encapsulate canonical representations of design patterns so they can be used as a basis for learning in the same way that code samples are used in books, websites and newsgroups to teach programmers the correct way to do something. Snippets are, in that respect, like little, bundled examples. Because of their distributable nature, snippets can be supplied easily to developers as examples of canonical, working code.

There is also the integration factor to consider. Since Visual Studio understands snippets, they can be accessed easily using IntelliSense from within the development environment itself. Visual Studio aids in the customization of snippets as well, providing a means for snippet authors to add contextual fields using replacement tokens, the replacement of which may be performed by Visual Studio or simply edited by the programmer using the snippet. This automated guidance allows developers to actually place working examples into their real code much easier than having to copy, paste and modify, or to type samples from scratch, both of which may be subject to typographical and logical errors.

Unfortunately, I haven't seen many code snippets being passed around on the web. I haven't really tried to figure out why that is, but I'll assume that it's because it just hasn't caught on yet or, more likely, it's not as useful for educational purposes as I'd like to believe.

Snippet for Sanity

Code Snippets can save you precious development time and save you from the frustrating, tedious reproduction of well-known code constructs that can make development really boring and annoying at times.

Snippets and Me

For the remainder of this blog entry I'm going to provide the source for my own personal code snippets and explain why I wrote them in the first place. Afterwards, I'll briefly explain how they can be used.

You can download all of my snippets at once or each may be downloaded separately by clicking on any of the shortcuts below.

My C# Code Snippets
Shortcut | Title | Summary
event | Event Member | Creates the canonical event member declaration and corresponding method for invocation. EventHandler is used as the delegate. The code produced by this snippet is thread-safe.
eventc | Component Event Member | Creates the canonical event member declaration and corresponding method for invocation for a class that derives from System.ComponentModel.Component. EventHandler is used as the delegate. The code produced by this snippet is thread-safe.
eventg | Generic Event Member | Creates the canonical event member declaration and corresponding method for invocation. EventHandler<TEventArgs> is used as the delegate and TEventArgs has been tokenized. The code produced by this snippet is thread-safe.
eventcg | Generic Component Event Member | Creates the canonical event member declaration and corresponding method for invocation for a class that derives from System.ComponentModel.Component. EventHandler<TEventArgs> is used as the delegate and TEventArgs has been tokenized. The code produced by this snippet is thread-safe.
eventargs | Custom Event Args | Creates the definition for a class that inherits from EventArgs.
layout | Layout Regions | Creates several #region blocks for the placement of specific code elements from within the body of a class definition.
layoutd | Designer Layout Regions | Creates several #region blocks for the placement of specific code elements from within the body of a class definition created by a designer.



Event Members
(event, eventg, eventc, eventcg)

The standardized event design pattern [3] for .NET includes a type member using the event keyword with a corresponding protected method to raise the event. Adding an event to a type is therefore a two-step process, at least:

  • Add event member to type
  • Add method to raise event (prefixed with On)

When designing an object to be used by multiple threads it's desirable and recommended [4] to synchronize the event accessors with the method that raises the event to ensure thread-safety. This counts for even more code that, in essence, doesn't change between events:

  • Declare a synchronization object
  • Add locking mechanism to the add and remove event accessors
  • Add locking mechanism to the protected method that will raise the event
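The bullets above can be sketched roughly as follows. This is a minimal illustration of the pattern and not the literal output of the event snippet; the Worker class, the Completed event and the Run method are hypothetical names chosen for the example.

```csharp
using System;

public class Worker
{
    // Synchronization object dedicated to this event (hypothetical name).
    private readonly object completedLock = new object();

    // Backing delegate field; only touched while holding completedLock.
    private EventHandler completedHandlers;

    public event EventHandler Completed
    {
        add { lock (completedLock) { completedHandlers += value; } }
        remove { lock (completedLock) { completedHandlers -= value; } }
    }

    // Protected method that raises the event, prefixed with On.
    protected virtual void OnCompleted(EventArgs e)
    {
        EventHandler handler;

        // Copy the delegate under the lock, then invoke the copy outside of it
        // so that subscriber code never runs while the lock is held.
        lock (completedLock) { handler = completedHandlers; }

        if (handler != null)
            handler(this, e);
    }

    // Hypothetical public entry point so the event can be raised in a demo.
    public void Run()
    {
        OnCompleted(EventArgs.Empty);
    }
}
```

A caller would subscribe with `worker.Completed += SomeHandler;` and the handler runs when `worker.Run()` is called.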

When designing types that derive from System.ComponentModel.Component, including classes that inherit from Windows Forms Controls or Web Controls, the protected Events property provides an EventHandlerList that can be used to reduce the memory footprint of the class. To use the Events property there must be some more code added:

  • Declare an object to key the event in the EventHandlerList
  • Use the Events property in the add and remove event accessors
  • Use the Events property in the method that will raise the event
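For a component, the same idea might look something like the sketch below. It relies on the protected Events property (an EventHandlerList) that Component provides; the Gauge class, the ValueChanged event and the RaiseValueChanged method are hypothetical names, and the code is an illustration under those assumptions, not the literal output of the eventc snippet.

```csharp
using System;
using System.ComponentModel;

public class Gauge : Component
{
    // Key object that identifies this event in the EventHandlerList.
    private static readonly object ValueChangedEvent = new object();

    // Synchronization object shared by the accessors and the raising method.
    private readonly object eventLock = new object();

    public event EventHandler ValueChanged
    {
        add { lock (eventLock) { Events.AddHandler(ValueChangedEvent, value); } }
        remove { lock (eventLock) { Events.RemoveHandler(ValueChangedEvent, value); } }
    }

    protected virtual void OnValueChanged(EventArgs e)
    {
        EventHandler handler;

        // Look up the delegate by key under the lock; invoke it afterwards.
        lock (eventLock) { handler = (EventHandler)Events[ValueChangedEvent]; }

        if (handler != null)
            handler(this, e);
    }

    // Hypothetical public entry point so the event can be raised in a demo.
    public void RaiseValueChanged()
    {
        OnValueChanged(EventArgs.Empty);
    }
}
```

No delegate field is declared per event here; storage for a handler is only taken up in the list once something actually subscribes.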

My code snippets for event members handle all of the above scenarios, by default. These snippets can be invaluable to developers that frequently write custom controls or business objects that use events by reducing the amount of time it takes to adorn classes while ensuring that the code adheres to standards and has been tested.

The Different Event Member Snippets

The differences between the four event member snippets can be understood by analyzing the different suffixes on each shortcut:

The event snippet, without any suffix, is the most basic of them all. It provides the standardized, thread-safe event member paradigm in C# and adds some automatic XML documentation as well.

The c suffix indicates that a component-based event will be serialized.  It will use the Events property inherited from System.ComponentModel.Component (last three bullets above).

The g suffix indicates that the serialized event's delegate will be the generic EventHandler<TEventArgs> type. In the case of g, TEventArgs is tokenized so that its value only needs to be specified once when using the snippet. You should use this snippet if you want event arguments other than the standard EventArgs class. However, I don't provide any snippet that tokenizes the delegate itself. If your code targets .NET Framework 1.x, which lacks generics, then you may want to consider writing your own snippet to replace this one.

The cg suffix is simply a combination of c and g.  (component + generics)
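To make the g variant concrete, here is a rough sketch of a custom EventArgs class paired with an EventHandler<TEventArgs> event. ProgressEventArgs, Downloader, Progress and Report are hypothetical names used only for illustration; the code is not the literal output of the eventg or eventargs snippets.

```csharp
using System;

// Custom event arguments: a class that inherits from EventArgs.
public class ProgressEventArgs : EventArgs
{
    private readonly int percent;

    public ProgressEventArgs(int percent)
    {
        this.percent = percent;
    }

    public int Percent
    {
        get { return percent; }
    }
}

public class Downloader
{
    private readonly object progressLock = new object();
    private EventHandler<ProgressEventArgs> progressHandlers;

    // The type argument (ProgressEventArgs) is what the snippet tokenizes as TEventArgs.
    public event EventHandler<ProgressEventArgs> Progress
    {
        add { lock (progressLock) { progressHandlers += value; } }
        remove { lock (progressLock) { progressHandlers -= value; } }
    }

    protected virtual void OnProgress(ProgressEventArgs e)
    {
        EventHandler<ProgressEventArgs> handler;
        lock (progressLock) { handler = progressHandlers; }

        if (handler != null)
            handler(this, e);
    }

    // Hypothetical public entry point so the event can be raised in a demo.
    public void Report(int percent)
    {
        OnProgress(new ProgressEventArgs(percent));
    }
}
```

Note that the event arguments type appears in the field, the event declaration and the raising method, which is why having it tokenized once is convenient.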

Layout Regions
(layout, layoutd)

I invented the layout region snippets to keep my code really neat. That's basically it. If either layout snippet is used while the cursor is within a class or struct definition, it infers the name of the type and adds a constructor with some documentation. If it's used within an interface definition then the constructor region will be serialized without a name and must be deleted manually. Here's what the layout snippet would serialize if it were to be used within a class named Tidy:

Note: The descriptive prose inside each #region is not serialized by the layout snippets. It has been added to describe the purpose of each #region.

#region Public Properties
This region is intended for public properties, but I use it for static fields and static properties that are public as well. When I'm developing custom controls I sometimes also add nested regions to separate the different categories of properties based on the value of the CategoryAttribute. As a rule of thumb, I always group the static members before the instance members within this region.
#endregion

#region Private / Protected
All private, protected and internal instance and static fields are declared here. I also use this region for protected or internal (or both) properties. I always group properties before fields within this region.
#endregion

#region Constructors
/// <summary>
/// Constructs a new instance of the <see cref="Tidy" /> class.
/// </summary>
public Tidy(|)  (Cursor goes here automatically)
{
}
#endregion

#region Methods
All methods that do not raise or handle events are added to this region. Those methods that are prefixed with On raise events and those with signatures of specific delegates handle them (they are placed in the Event Handlers region). I try to keep all static or utility methods grouped at the top, followed by all initialization-based methods and then finally, arbitrary methods. Depending upon the class, I may also try to group public methods first.
#endregion

#region Events
This region encapsulates all events defined on the type. Since my event snippets serialize the event declaration, invocation method and supporting objects all in one place, each of those components is included here. This region is the only exception to the rule that private fields must be placed in the Private / Protected region. I lay out this region based on the position of the cursor in my event snippets, where all supporting fields are grouped first, and then each event/invoker pair follows one after another.
#endregion

#region Event Handlers
All methods that handle events are added to this region. An event-handling method is one that takes the form of: void MethodName(object sender, SomeEventArgs e) { ... } Before I added the Nested region, Event Handlers was the last one in the list because Visual Studio .NET designers were actually smart enough to serialize event handlers into the last region in the code base (or that might have been a bug but I used it to my advantage). VS 2005 doesn't exhibit the same behavior so I just cut and paste designer-serialized code into the proper regions.
#endregion

#region Nested
Nested types are placed here regardless of their accessibility.
#endregion

All interface implementation regions serialized by Visual Studio appear after the last region used by the type.



Not all regions are used in every type that I define. For instance, I usually remove the Nested region immediately after I use the layout snippet. For interfaces, I remove everything except Public Properties, Methods and Events. For structs I usually remove everything below Methods.  If I predict, for any type, that a particular region will never be used then I'll remove it immediately.

I do not distinguish between members marked as unsafe, fixed, abstract, virtual, override or sealed. However, extern methods are placed in the Methods region below every other method or within a nested region if there are many of them.

When overriding derived methods that raise events (methods beginning with an On prefix), I place them in the Event Handlers region since that is usually the purpose for overriding them in the first place.

The layoutd alternative provides one subtle difference and one obvious difference. The obvious difference is that it serializes a call to InitializeComponent(); within the constructor. When I use layoutd I select the existing constructor that was serialized by the designer and then execute the snippet so that it's completely replaced.  The subtle difference that layoutd provides is that the cursor is no longer placed where the constructor parameters could be entered, but is instead placed within the Private / Protected region on an empty line.

Since I invented the layout snippets I've made it a habit when I open up source code to immediately right-mouse click and select Outlining → Collapse To Definitions from the context menu, making it much easier to navigate my code. This allows me to visualize where code elements are placed instead of blindly using the drop-down lists at the top of the document, which I avoid completely.

Using Snippets

I learned how to write snippets using the documentation on MSDN [1] and I found it to be quite easy, so I'm not going to elaborate here. I'll just provide a synopsis to get snippet-newbs started.

Create a .snippet File

I suggest that you download one of my files as a starting template. You may find the layout template to be useful at first because it's fairly barren. This way you can get an idea about some of the structural xml elements that are supported without the bloat of some of the more advanced concepts such as replaceable tokens. For a more advanced example see eventcg.

I recommend naming the file with the same name as the shortcut that will execute it, with ".snippet" appended to the end (the extension is required by Visual Studio but the name portion seems to be completely ignored). Specify the shortcut in the <Shortcut> element under the <Header> element within the snippet's XML.

Save the snippet file to your "Code Snippets\Visual C#\" folder usually located under "My Documents\Visual Studio 2005\". You don't have to restart Visual Studio. As a matter of fact, you don't even have to close and reopen the current document. Simply save your snippet, go back into VS and press Ctrl+Space to open the IntelliSense list from within a C# code file. Start typing the shortcut for your snippet and you should see it in the list. Pretty cool ;)

Types of Snippets

There are three ways that snippets can be typed: SurroundsWith, Expansion and Refactoring. A custom snippet may be any combination of the first two but not Refactoring. The type of a snippet must be specified by including one <SnippetType> element for each type. They must be placed within the <SnippetTypes> element, which is located in the snippet's <Header> element.

Executing Snippets

Expansion snippets are executed easily by typing their shortcut and pressing tab.

SurroundsWith may be executed using a special key combination (CTRL+K, CTRL+S), by selecting the Edit → IntelliSense → Surround With menu item, or from the context menu by selecting Surround With.

Refactoring snippets are executed by selecting the corresponding menu item from the Refactor menu or context menu. Custom snippets do not support this type.

Using Fields

To fill out a snippet's fields when it's being used you can continue to press tab after the snippet is started, and for each green box that receives focus when you press tab you can type in a new value.  Pressing tab repeatedly will eventually cycle through all of the replaceable tokens in the snippet and start over again from the first token. Note, however, that only the first appearance of a field within the snippet is editable. The remaining instances will take on the same value entered in the first instance.  When you are finished customizing the snippet, make sure that the cursor is within one of the green areas and press Enter or Esc once.

Snippets For Refactoring

Browse to your "Program Files\Microsoft Visual Studio 8\VC#\Snippets\language id\Refactoring" directory and you'll find the snippets used to perform the refactoring techniques in the Refactor menu. Notice that these snippets have a <SnippetType> element with a value of Refactoring, which cannot be specified in custom snippets. If you're going to try modifying these snippets then don't forget to back them up first, otherwise you can take a look at [5] if you mess them up.

Modifying these snippets can be quite useful. For instance, have you noticed that when Visual Studio stubs out an interface implementation, whether implicit or explicit, all methods contain a single line of code that throws a new Exception with the text, "The method or operation is not implemented."? Great, but why not just throw NotImplementedException instead? Well, you can easily adjust MethodStub.snippet to do that by simply replacing global::System.Exception with global::System.NotImplementedException for the Exception field, which uses the SimpleTypeName function.

--
References

[1] Visual C# Application Development, Code Snippets
http://msdn2.microsoft.com/en-gb/library/f7d3wz0k(VS.80).aspx

[2] Visual C# Application Development, Default Code Snippets
http://msdn2.microsoft.com/en-gb/library/z41h7fat(VS.80).aspx

[3] .NET Framework Developer's Guide, Event Design
http://msdn2.microsoft.com/en-us/library/ms229011.aspx

[4] J. Skeet, Delegates and events
http://www.yoda.arachsys.com/csharp/events.html

[5] Visual C# Language Concepts, How to: Restore C# Refactoring Snippets
http://msdn2.microsoft.com/en-gb/library/ms236401(VS.80).aspx


December 09, 2006

Seekers, Solvers and Science

Where there's a problem there's a payment!

My sister, Andrea, pointed out this website to me. The idea behind it is very interesting, but I wonder how trustworthy it is:

www.innocentive.com

It seems to only cater to biology and chemistry, so far. I guess software development just doesn't fall under the category of "tough R&D problems".

I'd love to see a problem submitted like, "How to convert string to array of bytes in C#?". Although, someone would probably beat me to it :)

 


December 05, 2006

Handling Exceptions for Contingency

Quite often developers (myself included) will take notice of certain inconspicuous issues with code before they even cause any real problems during testing. Being the cautious developers that we are, it's not uncommon to take action ahead of time by adding try..catch blocks. For instance, adding local exception handling into methods in preparation for an exception that might occur at runtime is quite common. Sometimes there is good cause for such code to be in place, but local exception handling should always be implemented with caution.

When local try..catch blocks are added during development to catch possible, but unlikely exceptions, they may inadvertently catch exceptions that shouldn't be handled in the same way, causing another exception to be thrown or, even worse, suppression of an exception that should propagate to the AppDomain because it indicates an actual bug. This makes debugging very difficult, or even impossible if these unexpected exceptions are occurring in the production environment but being suppressed. Exceptions that you truly didn't expect shouldn't be caught in local try..catch blocks - that's the point of exceptions when they are unexpected!

Cautious But Fallible

Say, for example, that you're writing a logging function that relies on a System.IO.FileStream constructor overload, which can throw several different types of exceptions [1]. The example method below creates a new log file, the name of which is based on a static variable that is incremented each time the method is invoked (it doesn't matter how the variable is first initialized, so assume that it's initialized with an appropriate value). The code has a local catch block for a particular exception that you realized may be thrown at runtime, but not necessarily. You have added the catch block for contingency:

static string GetNextLogName()
{
    return "log" + ++lastLogSeqNo + ".txt";
}

static void CreateLog(string path, string data)
{
    string fullPath = Path.Combine(path, GetNextLogName());

    try
    {
        FileStream stream = new FileStream(fullPath, FileMode.CreateNew, FileAccess.Write);
        ...
    }
    catch (IOException)
    {
        // try again with the next log name in the sequence
        CreateLog(path, data);
    }
}

(File.Exists is not used here because it won't help at all.   There is an interprocess synchronization issue that needs to be dealt with - the same issue that you're attempting to solve through recursion.)

The actual name of the log file isn't important to you - only that it's unique.  While authoring this code you realized that a file with the current sequence number might already exist, probably resulting in some exception being thrown, so you pull open the docs to see what Type of exception may be thrown and you notice the following for the FileStream constructor that you're using [1]:

IOException

An I/O error occurs, such as specifying FileMode.CreateNew and the file specified by path already exists.

-or-

The stream has been closed.

So you added a catch block to catch that specific exception, as recommended by exception-handling standards [2]. You're not worried about the stream being closed since you're trying to create it, so you can safely assume that the latter cause of the exception doesn't apply.  You're aware that if some threading bug exists or some other process creates a file with a name that collides with your internal naming convention, you'll want to try again with the next available log name, so you call the method recursively. We'll assume that you're not worried at all about a StackOverflowException since this code will be executed only once a week and the log files are cleaned periodically anyway by the same process, which will fail if the logs cannot be removed.

So, what's the problem?

DirectoryNotFoundException! Well, you didn't think of that when you wrote the method and tried to be slick by catching IOException, but there's an even bigger problem now.  You might think to yourself, "No there isn't! That exception won't be caught so it will be handled by my global exception handling routine".   But DirectoryNotFoundException derives from IOException, so it will be caught too.  Because the directory is still missing on every retry, the method calls itself without end until the stack is exhausted, causing a StackOverflowException to be thrown even though you weren't expecting it. So you might think, "Well that's fine because an anomaly like this will be easy to debug later - just take the exception log written by the global error-handling code and examine the stack trace." Okay, what happens if there is no log and stack trace? Starting with the .NET Framework version 2.0, a StackOverflowException object cannot be caught by a try-catch block and the corresponding process is terminated by default [3]. A reliable application cannot count on recovering from a StackOverflowException, so it should be avoided at all costs.  Now, imagine that the code above was running in production when a DirectoryNotFoundException occurs.  How could you possibly repro that error without any diagnostic information?
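The inheritance relationship that causes the trouble can be checked directly with reflection. The following is a minimal sketch; the class and method names (ExceptionHierarchyCheck, WouldBeCaughtBy, DescribeChain) are hypothetical helpers invented for illustration and are not part of the framework:

```csharp
using System;
using System.IO;

static class ExceptionHierarchyCheck
{
    // Returns true when a catch block declared for TBase would also
    // catch an exception of type TDerived.
    public static bool WouldBeCaughtBy<TBase, TDerived>()
        where TBase : Exception
        where TDerived : Exception
    {
        return typeof(TBase).IsAssignableFrom(typeof(TDerived));
    }

    // Walks the inheritance chain of a type, from the type itself up to Object.
    public static string DescribeChain(Type exceptionType)
    {
        string chain = exceptionType.Name;
        for (Type t = exceptionType.BaseType; t != null; t = t.BaseType)
        {
            chain += " -> " + t.Name;
        }
        return chain;
    }
}
```

Calling WouldBeCaughtBy<IOException, DirectoryNotFoundException>() returns true, which is exactly why the catch block above swallows the missing-directory case. A quick check like this, or a look at the hierarchy pages in the documentation, is cheap compared to debugging a terminated process.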

I admit that the code above might not be optimal, but surely you don't think that things like this are easily avoided all of the time. I've seen some strange anomalies in developer code. Granted, most of them were due to issues with multi-threaded synchronization (actually, a lack thereof); however, I'm quite certain that cases such as this are found throughout production code more often than expected.

The problem in this example is simply that you didn't realize DirectoryNotFoundException derives from IOException. Actually, there are a lot of exceptions in the framework that derive from base exceptions that are somewhat unintuitive.  For examples, take a look at the list of exceptions that derive from these commonly caught exceptions: IOException [4], ArgumentException [5] and ApplicationException [6]. I'm sure some will surprise you ;)

(Note: Microsoft no longer recommends that you derive custom exceptions from ApplicationException.  System.Exception is recommended as a base type, generally [2])

This particular situation can be resolved easily by catching DirectoryNotFoundException before IOException, and re-throwing it:

    catch (DirectoryNotFoundException)
    {
        throw;
    }
    catch (IOException)
    {
        // try again with the next log name in the sequence
        CreateLog(path, data);
    }
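As a complement to the re-throw above, here is a sketch of a variation that also bounds the number of retries, so that no unforeseen IOException subtype can recurse without end. This is not the original method: the loop, the maxAttempts parameter, the class name BoundedLogCreator and the body that writes data to the stream are all assumptions added for illustration (the original elides what is done with the stream).

```csharp
using System;
using System.IO;

static class BoundedLogCreator
{
    // Assumed starting value; the original leaves initialization unspecified.
    static int lastLogSeqNo = 0;

    static string GetNextLogName()
    {
        return "log" + ++lastLogSeqNo + ".txt";
    }

    // Creates a new log file and returns its full path.
    // Gives up after maxAttempts name collisions instead of recursing.
    public static string CreateLog(string path, string data, int maxAttempts)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            string fullPath = Path.Combine(path, GetNextLogName());

            try
            {
                using (FileStream stream = new FileStream(fullPath, FileMode.CreateNew, FileAccess.Write))
                using (StreamWriter writer = new StreamWriter(stream))
                {
                    writer.Write(data);
                }
                return fullPath;
            }
            catch (DirectoryNotFoundException)
            {
                // Trying another file name cannot fix a missing directory.
                throw;
            }
            catch (IOException)
            {
                // Out of attempts: let the caller (or the global handler) see the error.
                if (attempt == maxAttempts) throw;
                // Otherwise fall through and try the next log name in the sequence.
            }
        }

        // Only reached when maxAttempts is less than one.
        throw new ArgumentOutOfRangeException("maxAttempts", "At least one attempt is required.");
    }
}
```

The trade-off is that a genuinely long run of name collisions now surfaces as an IOException after maxAttempts tries, which leaves a stack trace to examine rather than a terminated process.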

Unless you can absolutely guarantee that all of your exception-handling designs will be perfect, or that all unit tests will provide complete coverage before code is released, there is no good excuse of which I'm aware for adding local try..catch blocks to handle exceptions for contingency without first attempting to discover all of the possible exceptions and their hierarchies. Even that takes time and may require you to redesign the structure of your outer code as well.  Discovering all of the exceptions that may be thrown, and figuring out all of the ways that you might recover without inadvertently causing other problems, isn't always easy, and recovery isn't always possible by simply adding a try..catch block.

For instance, it's quite possible that one method may throw ArgumentException while a method that it calls throws ArgumentNullException, which derives from ArgumentException [5].  If you are aware that the method your code will be calling throws ArgumentException, and you catch it, then you'd be catching ArgumentNullException as well.  It's not likely that you'll be able to write code that will handle this circumstance given that the null argument was the fault of the method into which you called.  This particular example might be considered a bug in the invoked method, but it's quite possible for something like this to occur and not be a bug (with exceptions that don't depend on arguments, for instance, but on internal state instead).  In that case, you'd be catching an exception that your code couldn't handle gracefully.  If it's being suppressed or handled inappropriately, you could easily end up with problems similar to those I've illustrated above.
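The ArgumentException case can be sketched in a few lines. Everything here is hypothetical and exists only to show the catch semantics: Callee stands in for a method you call into, and the two Classify methods differ only in which catch blocks they declare.

```csharp
using System;

static class ArgumentCatchDemo
{
    // Hypothetical callee that throws both the base and the derived exception type.
    static void Callee(string value)
    {
        if (value == null) throw new ArgumentNullException("value");
        if (value.Length == 0) throw new ArgumentException("Value must not be empty.", "value");
    }

    // Catches only the base type, so the two failures become indistinguishable.
    public static string ClassifyBroad(string value)
    {
        try
        {
            Callee(value);
            return "ok";
        }
        catch (ArgumentException)
        {
            return "invalid argument";
        }
    }

    // Lists the derived type first, so each failure gets its own handler.
    public static string ClassifySpecific(string value)
    {
        try
        {
            Callee(value);
            return "ok";
        }
        catch (ArgumentNullException)
        {
            return "null argument";
        }
        catch (ArgumentException)
        {
            return "invalid argument";
        }
    }
}
```

ClassifyBroad(null) reports "invalid argument" even though the real cause was a null reference, which is the kind of silent conflation described above. ClassifySpecific only helps if you knew about the derived type in advance.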

Solutions

One approach to designing exception handling for situations where you predict that an error might occur is to code without exception handling first, and then add it later only after the predicted exception actually occurs during testing. In cases where you may expect an exception to be unlikely, although still possible, it's best to architect a solution that will allow your code to gracefully handle the exception.  However, deferring local, contingent exception handling code until exceptions are thrown in unit testing, staging or production environments is a better way to produce reliable applications and helps to minimize the amount of time it takes to debug code (Yes, not catching exceptions helps you to debug your program!).  For those exceptions that are known to be possible, yet rare, you should take care in your local exception handling code to ensure that you aren't inadvertently causing other problems.

Another approach is to provide exception handling in the global exception handler for unlikely exceptions (a catch-all exception handler such as the AppDomain.UnhandledException event [7]). This way, you've ensured that the operation has backed out entirely and the application can continue at some point before the failure, considering whether or not state has been corrupted, of course. Even in the case that the application cannot continue due to corruption, the error will be logged with details of the exception and the application can prompt the user so they are aware of the error. The user can choose whether to try the operation again (this may require you to code some sort of reset routine that can clean up invalid state without having to restart the entire program). If it's an automated system, the system can automatically try the operation again.
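A minimal sketch of wiring up such a handler follows. The GlobalHandler class, its Install and Format methods and the message layout are assumptions made for illustration; only AppDomain.CurrentDomain.UnhandledException and UnhandledExceptionEventArgs come from the framework [7]. Note that this event is a notification: whether the application can actually keep running afterwards depends on the host and on where the exception was thrown, so treat the handler primarily as the place where diagnostics get written.

```csharp
using System;
using System.IO;

static class GlobalHandler
{
    // Registers a last-chance handler that writes exception details to the given log.
    public static void Install(TextWriter log)
    {
        AppDomain.CurrentDomain.UnhandledException +=
            delegate(object sender, UnhandledExceptionEventArgs e)
            {
                log.WriteLine(Format(e.ExceptionObject, e.IsTerminating));
                log.Flush();
            };
    }

    // Builds the log entry; ExceptionObject is typed as object, so it is
    // converted defensively in case it is not an Exception.
    public static string Format(object exceptionObject, bool isTerminating)
    {
        Exception ex = exceptionObject as Exception;
        string details = ex != null ? ex.ToString() : Convert.ToString(exceptionObject);
        return "Unhandled exception (terminating=" + isTerminating + "): " + details;
    }
}
```

Calling GlobalHandler.Install(Console.Error) once at startup would be enough for a console program; a real application would more likely pass a file-backed writer.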

A third approach, although subject to the same issues, is to add exception handling code in a higher level, but not quite in a global exception handler. Basically, if you can determine transactional points in your code where in the case of an exception you can assure that no state will be corrupt, or else it can be fixed before returning to the caller, then you can catch base exceptions such as IOException without a fear of causing any major problems. The key is to ensure that you don't do anything that depends on the semantics behind the exception. Instead, assume the exception was fatal and roll back the state to immediately before the point of failure.
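One way to picture a transactional point is a method that snapshots its state, attempts the work, and restores the snapshot if any IOException occurs, without caring which one. The Ledger class below is entirely hypothetical and only illustrates the shape of the idea:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

class Ledger
{
    private List<string> entries = new List<string>();

    public IList<string> Entries
    {
        get { return entries.AsReadOnly(); }
    }

    // Transactional point: either every line is recorded and persisted,
    // or the in-memory state is exactly what it was before the call.
    public bool TryAppendAll(IEnumerable<string> lines, string filePath)
    {
        List<string> snapshot = new List<string>(entries);

        try
        {
            foreach (string line in lines)
            {
                entries.Add(line);
            }
            File.WriteAllLines(filePath, entries.ToArray());
            return true;
        }
        catch (IOException)
        {
            // The specific subtype doesn't matter here: treat the failure
            // as fatal for this operation and roll back.
            entries = snapshot;
            return false;
        }
    }
}
```

Catching the base IOException is safe in this sketch because the handler never branches on what the exception means. It only rolls back the in-memory list; a file that was partially written would need its own cleanup, which is left out here.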

The best approach is to avoid exceptions in the first place.  This doesn't mean that you should avoid throwing them in code; I just mean that you should look for other ways around having to use try..catch blocks when you expect that an exception may be thrown.  For example, instead of catching ArgumentException, verify the arguments before passing them to a method.  Another example would be to check CanSeek before attempting to Seek on a Stream.  However, when working with shared resources and multiple threads this approach isn't always possible, because another thread or process can change the resource between your check and your operation.  For instance, calling File.Exists may return true but another process might slip in and delete the file before your next line of code gets a chance to open it. Your only option is to just try the operation again. Therefore, the second or third approach is better suited for this type of scenario.
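The CanSeek example can be sketched as a small helper. StreamHelpers and TryRewind are hypothetical names; Stream.CanSeek and Stream.Seek are the framework members being illustrated:

```csharp
using System;
using System.IO;

static class StreamHelpers
{
    // Moves the stream back to its beginning when the stream supports seeking.
    // Returns false, instead of letting Seek throw NotSupportedException, when it doesn't.
    public static bool TryRewind(Stream stream)
    {
        if (stream == null) throw new ArgumentNullException("stream");

        if (!stream.CanSeek)
        {
            return false;
        }

        stream.Seek(0, SeekOrigin.Begin);
        return true;
    }
}
```

This works because CanSeek is a property of the stream itself and doesn't change underneath you. The File.Exists case is different precisely because the answer can be stale by the time you act on it.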

--
References

[1] FileStream.FileStream(String, FileMode, FileAccess) Constructor
http://msdn2.microsoft.com/en-gb/library/tyhc0kft.aspx

[2] .NET Framework Developer's Guide, Best Practices for Handling Exceptions
http://msdn2.microsoft.com/en-us/library/seyhszts.aspx

[3] StackOverflowException Class
http://msdn2.microsoft.com/en-us/library/system.stackoverflowexception.aspx

[4] IOException Hierarchy
http://msdn2.microsoft.com/en-us/library/bb57z14b.aspx

[5] ArgumentException Hierarchy
http://msdn2.microsoft.com/en-us/library/hat7a83s.aspx

[6] ApplicationException Hierarchy
http://msdn2.microsoft.com/en-us/library/kw9wwk34.aspx

[7] AppDomain.UnhandledException Event
http://msdn2.microsoft.com/en-us/library/system.appdomain.unhandledexception.aspx

As always, I'd love to hear from anyone that has something to say about this post.  Drop me a comment.

December 04, 2006

The Thick and Thin of a Paradigm Shift

If I had a time machine :-p, the first thing I'd do would be to go check out the Dinosaurs, alive and in all their spectacular glory. Then, certainly, I'd jump right into the future to see where we are heading with our development tools, platforms and standards :)

A change has already begun. Actually, change is continual in technology; however, change that shifts entire UI development paradigms is rarer. Microsoft's .NET initiative [1] is part of a change that is occurring in current software development paradigms. This change, I believe, supplements the slow, progressive changes that occur to software development paradigms every few years, but the one to come may have a more profound impact than some of the major shifts of the past.  We are on the threshold of the next large step forward in software development, perhaps.

Lines are continually being blurred:

Blurred line, followed by how it is being blurred:

Windows OS ↔ Other OS
  • Microsoft's .NET initiative (see Appendix, note 1)
    (like Java before it, but with support for multiple languages [2])
  • XAML (assuming eventual standardization)

Information ↔ Hypertext markup
  • XML Islands in HTML
  • Stand-alone XML

Data storage ↔ Middleware
  • SQL Server 2005 hosts the CLR
  • SQL Server 2005 Express (see Appendix, note 2)
  • Conceptual programming techniques such as LINQ and ADO.NET Entities [3]
Addendum: Recently discovered article

Client/Server ↔ Peer-2-Peer
  • WCF PeerChannel [4]
  • Windows Vista support for PNRP and PNM [4]

Rich-client applications ↔ Web applications
  • XAML
  • ClickOnce (with CAS and Isolated Storage as protection measures)

Rich-client applications ↔ Services
  • Distributed programming with .NET Remoting and Web Services
  • More recently, general SOA and WCF
  • ASP.NET (e.g., Cassini [citation desired])

.NET applications
  • can execute on heterogeneous systems that support CLI standards [1]
  • will be able to execute with a GUI on heterogeneous systems (assuming XAML will be standardized)
Information
  • can be formatted and packaged easily
  • can be structured; structural information can be shared easily
  • can be aggregated with heterogeneous data easily
  • can be distributed between heterogeneous systems easily and in a standardized manner

Database management systems

  • can handle the workloads of application servers using the same tools and languages
  • are becoming more portable and efficient as the ability of hardware scales
  • can be programmed against by thinking conceptually, the way their entities were designed
  • can import and export data from heterogeneous systems based on XML standards

Rich-client applications

  • can consume remote services
  • can provide public and local web services, and remoting services for distributed communication and collaboration
  • can host ASP.NET and a WebBrowser
  • can be deployed easily over the web
  • can be hosted in an RDBMS, adjacent to the data being used.  Data itself can be pure XML markup, which is natively supported by the RDBMS
  • can be presented using markup (WPF, XAML)

I predict that as the ability of hardware scales, rich-client applications, in their entirety, will be service-oriented, data-driven, data-encapsulating, portable, secure, and interactive components that are part of a distributed framework of other rich-client component applications found on LANs and WANs, forming a network of peer-to-peer business intelligence systems.  They'll fit nicely into Microsoft's OS and possibly other third-party OSs (Vista, with its built-in support for Peer Name Resolution Protocol (PNRP) and People Near Me (PNM), PeerChannel in WCF [4], and .NET FCL features such as system networking [5] and ClickOnce [6], is only the beginning).

From Thin to Thick

I believe that Microsoft may be trying to secure their place as providers of a rich-client OS and software development platform by slowly spinning the current trends in web development back into rich-client development. This is a good thing for developers and end-users, IMO.  But maybe I'm giving too much credit to Microsoft by assuming that this progression was intentional :)

Web development and standards are hard to create and enforce for a few reasons:

  • Hypertext markup, especially dynamic markup that includes scripting support, is complex
  • There are many browsers and each has its own implementation of complex presentation standards, which have problems of their own
  • Website developers tend to value aesthetics over ease of use and intuitive behavior and functionality
  • Website design is largely proprietary since standards only go as far as compatibility, not visual appeal
  • Because of the point above, visual designs frequently change; design trends pass under the guise of standardization

Websites are not ideal for business intelligence applications because

  • they have limited runtime ability due to non-existent scripting security standards
  • they have limited runtime ability due to inefficient languages and platforms (e.g., scripting)
  • they do not provide a rich, real-time user interface
  • they require network connectivity

Standardization

Totalitarianism isn't something I'm too fond of in government, but in software development and practice I'd prefer if the company creating the tools that I use would also enforce standards for using them. Standards ease interoperability, increasing the value of my applications. Interoperability relates to the new trend in software development, SOA (Service Oriented Architecture), since services are commonly distributed among heterogeneous systems (e.g., Web Services).  So long as the standards are good ones, I prefer them to no standards at all.

Microsoft produces and enforces standards in several ways:

  • The release of a new OS
  • The construction of new APIs
  • The authoring of documentation for standards and guidance
  • The creation of new training material and certifications for developers
  • Submission for standardization to ISO, ANSI and ECMA
  • Microsoft Open Specification Promise (OSP) [7]
  • Deprecation of legacy tools and support

And the community accepts shifts in software development paradigms, tools and support if Microsoft (and usually when Microsoft) proves that the new technology and standards will make development easier and provide a better overall experience for end-users.  Out pour new books, seminars, user groups, newsgroups, training kits, on-line training, on-site training, certifications and standardization - and we're off to a new era in software development.

These shifts are common; a natural progression as new technology becomes available as a response to the demands of end-users and developers, but it's worth noting too that some technology doesn't survive long (DirectAnimation comes to mind).

So it seems that sometime in the future, hardware permitting, there will be a conversion from web programming trends into a standardized, distributed, service-oriented, conceptually-architected, "packaged", rich-client programming paradigm that spans operating systems from different vendors, with Microsoft in the driver's seat.

Buckle your seat belts :)

--
Appendix

1. [M]ultiple high-level languages may be executed in different system environments [1]
2. SQL Server 2005 Express is at the forefront of data-encapsulation within rich-client programs by providing a pluggable data model that can execute .NET Framework code. If standardized, or if some SQL Server Express edition in the future becomes standard with a Microsoft OS installation, fully encapsulated rich-client applications will be distributable without the need for the classic separation of data tier, middleware and presentation (disregarding OOP design patterns and techniques, of course).

--
References

[1] The Common Language Runtime (CLR)
http://msdn2.microsoft.com/en-us/library/aa497266.aspx

[2] Understanding Enterprise Platforms
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnpag/html/jdni_ch02.asp

[3] The ADO.NET Entity Framework Overview
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnvs05/html/ADONETEnFrmOvw.asp

[4] Peer To Peer, Harness The Power Of P2P Communication In Windows Vista And WCF
http://msdn.microsoft.com/msdnmag/issues/06/10/PeerToPeer/default.aspx

[5] Windows Vista Networking for Developers (September 1, 2006)
http://msdn.microsoft.com/chats/transcripts/windows/06_0901_msdn_vista.aspx

[6] Windows Vista § Development technologies
http://en.wikipedia.org/wiki/Windows_Vista

[7] Microsoft Open Specification Promise
http://www.microsoft.com/interop/osp/default.mspx


December 01, 2006

References, Citations, Paraphrasing and Plagiarism

There is a fair amount of research and work that can go into obtaining information when a poster responds to a newsgroup question.  Information is commonly obtained through someone else's testing or research and is simply paraphrased or even quoted.  I believe that's understood by the general population of newsgroup readers.  So at what point does a poster's information become plagiaristic, if ever, when their sources aren't referenced in a newsgroup post? 

It's no secret that a large majority of information found in newsgroups is based on the poster's memory of the subject matter.  However, much of it can be attributed to actual fact-based sources such as books, articles and other newsgroup posts.  Anything stated in a newsgroup post can easily be interpreted as fact even if it's not.  Information posted without fundamental tests being performed by the respondent themselves is sometimes qualified with "AFAIK" (as far as I know) or "IIRC" (if I remember correctly).

If it's common knowledge that unreferenced information found on newsgroups generally comes from the effort of unreferenced sources rather than from the respondent's own testing, then what happens when the information does derive purely from the respondent's own testing and experience?  Posters would only get credit for having derived their own knowledge if they explicitly stated that the information they provided was acquired solely through personal testing.

Maybe posting based on personal testing is uncommon enough that it's fair to just assume that posters have retrieved their information from other sources or just have so much experience with the subject matter that testing isn't required.  Since much or all of the information of the subject matter is coming off the top of their head, to enable quick responses, it might not be fair for OPs to expect references without explicitly asking for them.  In that respect, newsgroups are treated more like personal conversations.

Syndication

So this brings me to the point of syndication of USENET content, and how what was once a simple, off-the-top-of-your-head opinion expressed in a conversation between two or more thread correspondents may become reference material, much like books and articles, to the possibly millions of users that search groups.google.com every day [1].

In school we learn about plagiarism and how bad it is, but we do it all the time in computer science.  In a way, there's just so much information out there that it's used on-demand, like children with our hands in a jar full of jelly beans. It might not make sense to reference the sources of all information provided in a newsgroup, web article, or even books.  Paraphrasing, to that point, is commonplace in newsgroups because it's just too difficult to locate all sources of information when speedy responses are preferred and expected (although not all OPs expect speedy responses, especially the ones that are familiar with newsgroups).

Usefulness in references

For one thing, quotations and fact-based opinions found in posts are usually not the entire story anyway.  References provide to readers a more complete description and reasoning, in its original context.  Without these references, the information being provided out-of-context may be easily misinterpreted, and then even worse, assumed to be accurate and complete.  People who prefer newsgroup posts to be short can't have it both ways.  Either you need to provide a reference to the complete information, or provide the complete information within the newsgroup post itself.  This reduces ambiguity in replies, providing more accurate information to readers.  I prefer a reference link over a long post where the source itself is easy to understand or interpret.  If that's not the case, I believe a fuller description in the post may be in order.  There's also the idea that any missing references or incomplete information will be supplemented by another respondent, but I doubt that possibly ignorant expectation disqualifies plagiarism.  I'm sure in many cases that OPs have read the references already but simply wanted clarification.  I'm not sure how or if that situation particularly applies to this topic, however.

References also enable readers to perform further research on their own, given a good place to start.  Without references, readers are forced to perform their own searches but in many cases they don't possess the necessary skills to perform Internet research without a helping hand, which is why they may have come to newsgroups in the first place.

In consideration of time and memory

To relax my arguments a bit, I must say that I don't expect everyone to always reference their sources in newsgroups.  If you're not performing research but instead answering from memory, then I think it's reasonable if you don't always reference your sources, but it's certainly desirable for the reasons I've listed above.  If you invest the time to find a good source that agrees with the information you are providing, I'm sure it will be much appreciated by readers.

Format and placement

Outdated links floating around in USENET posts are just clutter; however, a reference list at the bottom of a post, rather than in-line referencing, may reduce some of that clutter.  Long after reference hyperlinks become invalid, the information found in a post may still prove to be useful, but having an invalid link right between two informative paragraphs or sentences reduces the readability of the post in which it's referenced and serves no useful purpose since the link no longer works.

Proposal

Are newsgroup posts supposed to be small enough that citations and references (as opposed to in-line referencing, on occasion) aren't expected or don't need to be standardized?

I wonder if it's a lack of standardization or precedent that makes most respondents feel like it's unnecessary (or just not bother) to reference and cite the sources of their information, where applicable.  So here's my proposal for a simplified, standardized idiom for citing and referencing within newsgroup posts.

If anyone is aware that these standards or anything similar exist already, I'd love to see some links to your sources submitted as comments, please.

Newsgroup citation and referencing standardization based on [2,3]:

Considerations

  • Maximum length of lines in characters (based on common newsreader capabilities)
  • International character sets (e.g., reference to a book title published only in French)
  • Should we reference the author? publisher? dates?
  • How do we reference sections in web articles?  pages in books?

Rules

  1. Citations in posts should use the same standards as specified by [2].  For example, the [2] that I've just used to cite [2] :)
  2. Respondents should post a list of references after their entire signature.  Here is an example reference list, appearing after my signature, that could be included in newsgroup posts that cite these resources (also serves as the reference list for this blog entry):

--
Dave Sexton

[1] Search Engine Watch, Searches Per Day
http://searchenginewatch.com/showPage.html?page=2156461

[2] IEEE style documentation
http://www.ecf.toronto.edu/~writing/handbook-docum1b.html

[3] IEEE style edition § Web Page
http://www.ecf.toronto.edu/~writing/bbieee-help.html#wp

[4] IEEE style edition § Individual Author
http://www.ecf.toronto.edu/~writing/bbieee-help.html#indiv-auth

And here's an example book reference based on [4]:

[5] J. Writer, Computer Science, Reading-Material Press, 1996.
pp. 78-96: Artificial Intelligence
