BusinessRx Community

Dedicated to the advancement of software, technology and the people who devote their lives to it.


BusinessRx Reading List

These blog entries are written by industry experts and leaders. We consider this content to be a good read for any software developer or web technologist.

June 2009 - Posts

  • July's Toolbox Column Now Online

    My Toolbox column in the July 2009 issue of MSDN Magazine is available online and includes the following reviews:

    • ApexSQL Enforce - I've reviewed source code static analysis tools in previous issues, such as Microsoft's FxCop and StyleCop programs. ApexSQL Enforce is a static analysis tool for your database. In a nutshell, it runs a series of rules against a specified database, ranging from security concerns to data modeling.
    • Podcasts of Note: Hanselminutes - in addition to a great blog, Scott Hanselman also hosts a weekly podcast show, Hanselminutes. These roughly 30-minute long shows usually include a guest and focus on a specific development-related topic, like ADO.NET Services, getting started with Domain Driven Design, and an overview of jQuery. These high-quality and educational productions are great material for your daily commute or time at the gym.
    • Terminals - Terminals is a free, open source project that provides a multi-tab UI for connecting to remote computers. Terminals supports Terminal Services, Microsoft RDP, VNC, RAS, Telnet, and other protocols and helps consolidate various remote connections into one window. Also included are a number of networking tools, tools for taking and managing screen shots from remote desktops, and one-click access to common administration and network configuration utilities. If you routinely connect to remote computers, be sure to check out Terminals.

    This issue reviewed Programming Entity Framework, by Julia Lerman. An excerpt from my review follows:

    At nearly 800 pages, Programming Entity Framework is fairly hefty, but it offers a solid grounding in using the Entity Framework. The book assumes its readers are intermediate to advanced .NET developers who are already familiar with database concepts, ADO.NET, LINQ, and other core .NET features and spends no time introducing these topics. Instead, the book is packed with walkthroughs that illustrate the use of Entity Framework in various scenarios. It also does a great job pointing out what this first version of the Entity Framework can and cannot do and what use cases are difficult or tricky to implement, along with workarounds where appropriate.

    Enjoy! - http://msdn.microsoft.com/en-us/magazine/dd943051.aspx

    As always, if you have any suggestions for products, blogs, or books to review for the Toolbox column, please send them to toolsmm@microsoft.com.

  • O/R Mapping (Object/Relational Mapping)

    Who hasn't yet heard of NHibernate, LLBLGen Pro, among others? As you know, these "tools"

  • Principles, Code-Behind, & View Engines

    The July issue of MSDN Magazine is available online with my article “Guiding Principles For Your ASP.NET MVC Applications”. Another MVC article in this issue is Justin Etheredge’s “Building Testable ASP.NET MVC Applications”. Justin’s article is a good one as he shows you how to design for testability, and includes specific examples with xUnit.net, moq, and Ninject.

    Something we both touched on was the topic of code-behind files.

    Code-Behind and ASP.NET MVC

    The conversation between two developers (let’s call them Pushy and Principled) goes like this:

    Pushy: Is it OK to use code-behind files with aspx views?

    Principled: No.

    Pushy: But, I have something that’s really, really specific to this one particular view. That’s OK, right?

    Principled: No.

    Pushy: Oh, come on! You aren’t being pragmatic here. I don’t want to add a Page_Load, I just need a teeny tiny little instance method. I’ll add a code-behind file and stick it inside. It’s tiny! That’s OK, right?

    Principled: No.

    Pushy: Really now, what do you expect me to do? Build one of those forsakenly awful HTML helper methods with more overloads than the California power grid in August? Don’t you think it’s better to put the code close to the view that uses it?

    Principled: No.

    This is one scenario where I’d side with Principled. Sure, the code-behind could be simple. Sure, if you are careful it might even be unit-testable. But, the mere fact that code-behind is possible is a fluke and a byproduct of building on an existing framework. Someone writing an MVC view engine from scratch wouldn’t need to provide a feature that allows you to put anything resembling intelligence near a view.

    All of the problems I’ve seen described where code-behind is a solution could easily be solved with an HTML helper, or by using a more robust presentation model that is passed to the view. Some people worry about a proliferation of HTML helpers, but if you absolutely need a helper scoped to a specific view, you can always put the helper in a different namespace that only that particular view will use.

    I know your team is disciplined. I know you wouldn’t do anything wrong. I know you want to be pragmatic and do the simplest thing that works. But, think of the first code-behind file in a project as the first broken window. It’s another degree of freedom where entropy can wiggle into your software and undermine maintainability. It’s making a view smarter, which your principles should suggest is wrong.

    Thoughts?

  • Final Six ASP.NET Hosting Tutorials Now Online

    The final six tutorials in my Hosting Tutorials on www.asp.net have been published. These tutorials walk readers through hosting an ASP.NET website with a web host provider and are aimed at beginning to intermediate ASP.NET developers interested in getting a small- to medium-sized ASP.NET application online. The first six hosting tutorials served as an introduction to the series and provided an overview of core concepts. The following four tutorials explored common challenges in deploying data-driven web applications. These final six tutorials look at handling and logging runtime errors in the production environment as well as site administration and advanced deployment options.

    • Displaying a Custom Error Page [VB | C#] - What does the user see when a runtime error occurs in an ASP.NET web application? The answer depends on the website's <customErrors> configuration. By default, users are shown an unsightly yellow screen proclaiming that a runtime error has occurred. This tutorial shows how to customize these settings to display an aesthetically pleasing custom error page that matches your site's look and feel.
    • Processing Unhandled Exceptions [VB | C#] - When a runtime error occurs on a web application in production it is important to notify a developer and to log the error so that it may be diagnosed at a later point in time. This tutorial provides an overview of how ASP.NET processes runtime errors and looks at one way to have custom code execute whenever an unhandled exception bubbles up to the ASP.NET runtime.
    • Logging Error Details with ASP.NET Health Monitoring [VB | C#] - Microsoft's health monitoring system provides an easy and customizable way to log various web events, including unhandled exceptions. This tutorial walks through setting up the health monitoring system to log unhandled exceptions to a database and to notify developers via an e-mail message.
    • Logging Error Details with ELMAH [VB | C#] - Error Logging Modules And Handlers (ELMAH) offers another approach to logging runtime errors in a production environment. ELMAH is a free, open source error logging library that includes features like error filtering and the ability to view the error log from a web page, as an RSS feed, or to download it as a comma-delimited file. This tutorial walks through downloading and configuring ELMAH.
    • Precompiling Your Website [VB | C#] - Visual Studio offers ASP.NET developers two types of projects: Web Application Projects (WAPs) and Web Site Projects (WSPs). One of the key differences between the two project types is that WAPs must have the code explicitly compiled prior to deployment whereas the code in a WSP can be automatically compiled on the web server. However, it is possible to precompile a WSP prior to deployment. This tutorial explores the benefits of precompilation and shows how to precompile a website from within Visual Studio and from the command line.
    • Managing Users and Roles On The Production Website [VB | C#] - The ASP.NET Website Administration Tool (WSAT) provides a web-based user interface for configuring Membership and Roles settings and for creating, editing, and deleting users and roles. Unfortunately, the WSAT only works when visited from localhost, meaning that you cannot reach the production website's Administration Tool through your browser. The good news is that there are workarounds that make it possible to manage users and roles on production. This tutorial looks at these workarounds and others.
    As with the previous 10 tutorials in this series, each tutorial can be downloaded as a PDF for offline viewing or printing and each tutorial includes a working website that illustrates the concepts discussed in the tutorial (and can be downloaded as a ZIP file).

    Enjoy!

    -- Hosting Tutorials Homepage: http://www.asp.net/learn/hosting/
    -- ASP.NET Videos, Tutorials, and Other Learning Material - http://www.asp.net/learn/

  • Installing the FTP Service for IIS 6.0 On An Alternate Port Number and Configuring Windows Firewall

    The Scenario
    You need to set up the Windows FTP service that is part of IIS 6.0 on a port number other than the standard port 21 and are using Windows Firewall on the server.

    The Challenge
    Getting an FTP server and the firewall to play nicely can be a bit of an adventure because FTP uses two ports for communicating - one for establishing the connection (typically port 21) and another one for transferring the data. I believe that the FTP service for IIS 6.0 uses port 20 as the data port when using port 21 as the connection port, but I am not 100% certain. What I am certain of is that the data port may be randomly selected from a wide range of ports, and that is certainly the case when you select an alternate port number for the FTP service. The challenge lies in configuring the firewall to accept incoming requests to both the alternate port and the randomly selected data port.

    If neither port is opened in the firewall then when you attempt to connect to the FTP server the client will hang and eventually report that it cannot connect. If only the connection port is open then the FTP client will connect successfully but will report the error: Failed to retrieve directory listing. The reason is that the connection can be established over the connection port, but the request made to the data port to get the directory listing was blocked by the firewall. Long story short, you need to make sure that the firewall will allow traffic for the randomly selected data port.

    The Solution

    1. Create an FTP site on the port number of choice, such as 12345. Typically you'll want to use a high port number to minimize the likelihood of a conflict. See Creating Multiple FTP Sites (IIS 6.0) for more information on how to create multiple FTP sites using multiple ports.
    2. If you are using Windows Vista or Windows Server 2008 as the server, run the following command from the command line: sc sidtype MSFTPSVC unrestricted
    3. Stop and restart the FTP service.
      • Note: I got an error when restarting the service that complained that an unrestricted service could not be started in the same host as restricted services. I'm sure there is a more elegant workaround, but what worked for me was to reboot the server.
    4. Configure Windows Firewall to allow all TCP traffic in for the MSFTPSVC service. This will allow traffic on the randomly selected data port to penetrate the firewall. You can do this from the Windows Firewall configuration GUI or from the command line via: netsh advfirewall firewall add rule name="FTP" service=MSFTPSVC action=allow protocol=TCP dir=in
    5. Ensure that the FTP filter for Windows Firewall is disabled. You can do this via the command line, as well: netsh advfirewall set global Statefulftp disable

    For more on Steps 2-5 see Windows Firewall Setup for Microsoft FTP Publishing Service for IIS 7.0. Some of the command line arguments are different in the article than what I have posted above because my commands are for the FTP Service that's part of IIS 6.0 whereas the article linked to looks at using the FTP Publishing Service for IIS 7.0.

    I spent the better part of an afternoon figuring out these five steps. I hope this blog entry saves someone else a few precious hours.

  • Functional Programming Battles GOTOzilla

    Steve Wellens had a recent blog post arguing for the use of a goto in C# (see: Why goto Still Exists in C#). Steve had a series of methods he wants to execute, but he wants to stop if any given method returns false. At the end of the post, Steve decided that the following code with goto was better than setting boolean variables:

    // DoProcess using goto
    void DoProcess3()
    {
        LOG("DoProcess Started...");
    
        if (Step1() == false)
            goto EXIT;
        if (Step2() == false)
            goto EXIT;
        if (Step3() == false)
            goto EXIT;
        if (Step4() == false)
            goto EXIT;
        if (Step5() == false)
            goto EXIT;
    
    EXIT:
        LOG("DoProcess Finished");
    }
    

    In the comments, a remark from a different Steve stood out. Steve suggested using an array of Func<bool>. The comment didn’t generate any further discussion, but it’s worth calling out.

    I don’t think the problem with the above code is with the goto per se. I think the problem is how the code conflates “what to do” with “how to do it”. In this scenario both are a little bit tricky. Let’s assume we might need to add, remove, or change the order of the method calls. But, the method calls are so intertwined with conditional checks and goto statements that it obscures the process. Using an array of Func<bool> is a simple approach, yet it still manages to create a data structure that isolates “what to do”.

    void Process()
    {
        Func<bool>[] steps =
        {
            Step1,
            Step2,
            Step3,
            Step4,
            Step5
        };
    
        ExecuteStepsUntilFirstFailure(steps);
    }
    

    You could argue that all this code does is push the problem of “how to do it” further down the stack. That’s true, but we’ve still managed to separate “what” from “how”, and that’s a big win for maintaining this code. The simplest thing that could possibly work for “how” would be:

    void ExecuteStepsUntilFirstFailure(IEnumerable<Func<bool>> steps)
    {
        steps.All(step => step() == true);
    }
    

    The All operator is documented as stopping as soon as a result can be determined, so the above code is equivalent to the following:

    void ExecuteStepsUntilFirstFailure(IEnumerable<Func<bool>> steps)
    {
        foreach (var step in steps)
        {
            if (step() == false)
            {
                break;
            }
        }
    }
    

    With this approach it’s easy to change the order of the steps, or to add steps and delete steps, without worrying about missing a goto or conditional check. The next step up in complexity (excuse the pun) would be to create a Step class and encapsulate the Func with other metadata and state. I’m sure you could also imagine the execution phase relying on an IStepExecutor interface as the base for executing steps under a transaction, or with step-level logging, or even in parallel – and all this without changing how the steps are arranged. Take this to an extreme, and you’ll have a technology like Windows Workflow Foundation. :)
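    To make that next step a little more concrete, here is a minimal sketch of what a Step class and an executor interface might look like. Everything in it is hypothetical: the Step and LoggingStepExecutor classes, the shape of the executor interface, and the logging callback are my own assumptions for illustration, not code from Steve's post or the comments.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical: pairs the Func<bool> with a name so an executor can report on it.
public sealed class Step
{
    public Step(string name, Func<bool> action)
    {
        Name = name;
        Action = action;
    }

    public string Name { get; private set; }
    public Func<bool> Action { get; private set; }
}

// Hypothetical: the "how" hides behind this interface.
public interface IStepExecutor
{
    // Returns true only if every step reported success.
    bool Execute(IEnumerable<Step> steps);
}

// One possible "how": run the steps in order, log each one,
// and stop at the first step that returns false.
public sealed class LoggingStepExecutor : IStepExecutor
{
    private readonly Action<string> _log;

    public LoggingStepExecutor(Action<string> log)
    {
        _log = log;
    }

    public bool Execute(IEnumerable<Step> steps)
    {
        foreach (var step in steps)
        {
            _log("Running " + step.Name);
            if (!step.Action())
            {
                _log(step.Name + " failed; stopping");
                return false;
            }
        }
        return true;
    }
}
```

    The list of steps stays a plain data structure, so swapping in a transactional or parallel executor would not touch the code that arranges the steps.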

    The ability of functional and declarative programming to separate the “what” and the “how” is powerful, but you don’t need a new language, and you can start simple. In this scenario it’s another tool you can use to save your city from the goto-zilla monster.

    What do you think?

  • Progressive .NET Event In Stockholm

    Øredev is putting together an exciting lineup of topics and speakers for Progressive .NET Days. The event is August 27-28 in Stockholm, Sweden.

    Progressive software development understands that tomorrow's better ideas for software development are likely here with us today and seeks them out now, building bridges that span paradigms through practice and experience.

    I’m excited to be a part of the event, and wish I could also sit in on every other session. I’ve also had a dream to ride in a hot air balloon, which I understand is quite popular during the summer months in Stockholm…

  • When Do I Use Interfaces?

    “Program to an interface, not an implementation” is a well-known mantra from the GoF book. Take this guidance to an extreme, though, and you generate POO instead of OOP. How do you know if you crossed the line?

    I think it’s useful to take a step back and think about the word “interface” in a general sense. There are interfaces everywhere in software.  There are interfaces between layers, between tiers, between applications, between objects, and between callers and their callees. Just about anything and everything in software, no matter how trivial, has an interface.

    The real question with interfaces is how many constraints you want in place for any given interface. Consider the following JavaScript code.

    function validate(creditService) {
        creditService.checkCreditForCustomer(this.id);
    }

    The only constraint on the creditService parameter is that the object needs a checkCreditForCustomer function that takes an ID parameter. The validation function doesn’t care how the creditService was built, who built it, where it came from, or what other capabilities might be in place. This code demonstrates the flexible, dynamic, and relatively unconstrained qualities of duck typing. If the parameter checks the credit of a customer like a credit service should, then it must be a credit service.

    Going Static

    Static languages generally have to crank up the constraints on an interface, although many have an escape hatch. C# 4.0, for example, introduces a dynamic type.

    public bool Validate(dynamic creditService)
    {
        return creditService.CheckCreditForCustomer(ID);
    }
    

    Again - all we need is an object with a CheckCreditForCustomer method that takes an int parameter. Because the object is typed as dynamic, the compiler won’t guarantee what the object can actually do – there is no type checking. At runtime, we may find out the object doesn’t actually support the method we are looking for, and an exception appears. This duck typing behavior is what keeps fans of static typing awake at night. They think the dynamic programmers are insane for throwing around objects in a willy-nilly manner. Meanwhile, the dynamic crowd thinks the fans of static typing are insane for spending all of their time obsessing over types instead of creating software.
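    Here is a small sketch of that runtime failure. The NotACreditService, GoodCreditService, and DynamicDemo types are hypothetical stand-ins I made up for illustration. The sketch assumes the C# 4.0 runtime binder, which throws a RuntimeBinderException when the member cannot be found.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

// Hypothetical: an object that does not have CheckCreditForCustomer at all.
public class NotACreditService
{
}

// Hypothetical: an object that happens to have the right method.
public class GoodCreditService
{
    public bool CheckCreditForCustomer(int id)
    {
        return id > 0;
    }
}

public class Customer
{
    public int ID { get; set; }

    public bool Validate(dynamic creditService)
    {
        // This compiles no matter what creditService turns out to be.
        return creditService.CheckCreditForCustomer(ID);
    }
}

public static class DynamicDemo
{
    // Returns true when the call failed at runtime because the member was missing.
    public static bool FailsAtRuntime(Customer customer, object creditService)
    {
        try
        {
            customer.Validate(creditService);
            return false;
        }
        catch (RuntimeBinderException)
        {
            return true;
        }
    }
}
```

    Both service classes compile happily as arguments to Validate. The difference only shows up when the call is actually made.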

    Regardless of where you fall in the static to dynamic spectrum, you can view a type definition as a constraint. In C# and Java, the interface keyword can constrain the type of an object without placing any constraints on the implementation.

    interface ICreditService
    {
        bool CheckCreditForCustomer(int id);
        bool CheckCreditForCompany(int id);
    }
    

    Now we can use this constraint to enforce type safety.

    public bool Validate(ICreditService creditService)
    {
        return creditService.CheckCreditForCustomer(ID);
    }
    

    An interface (in the interface keyword sense) allows fans of static typing to sleep at night while still leaving some flexibility behind. The object that arrives as an ICreditService on any given call might be one of 10 different credit service implementations. The 10 implementations may be from the same class inheritance hierarchy, or they may not. One might be a mock object or test double used only during testing (which I should point out is not, not, not the point of using interfaces), or it may not. The Validate method doesn’t care about the concrete implementation behind the interface.

    We still have some flexibility, but we also have additional constraints when compared to duck typing. The credit service has to implement two methods now, even if we just want to build an object for the Validate method which only uses the CheckCreditForCustomer method. These two methods may or may not be a good thing. Iterative design with tests and a dose of the interface segregation principle will take care of the matter.
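    As one possible outcome of applying the interface segregation principle here, the sketch below splits the two methods into separate role interfaces. The names ICustomerCreditCheck, ICompanyCreditCheck, CustomerValidator, and AlwaysApproveCustomers are hypothetical, and whether the split is worthwhile depends on how the callers actually use the service.

```csharp
using System;

// Hypothetical: one narrow role per capability.
public interface ICustomerCreditCheck
{
    bool CheckCreditForCustomer(int id);
}

public interface ICompanyCreditCheck
{
    bool CheckCreditForCompany(int id);
}

// The wider interface can remain as the combination of both roles.
public interface ICreditService : ICustomerCreditCheck, ICompanyCreditCheck
{
}

// Hypothetical caller: it asks only for the one capability it uses.
public class CustomerValidator
{
    private readonly int _customerId;

    public CustomerValidator(int customerId)
    {
        _customerId = customerId;
    }

    public bool Validate(ICustomerCreditCheck creditCheck)
    {
        return creditCheck.CheckCreditForCustomer(_customerId);
    }
}

// An object built just for Validate no longer has to implement a company check.
public class AlwaysApproveCustomers : ICustomerCreditCheck
{
    public bool CheckCreditForCustomer(int id)
    {
        return true;
    }
}
```

    Any full ICreditService implementation can still be passed to Validate, because it satisfies the narrower interface too.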

    Going Concrete

    Even more constraints come into play if we use a class definition instead of an interface.

    public class CreditService
    {
        public virtual bool CheckCreditForCustomer(int id)
        { 
            // ...
        }
        
        // ...
    }
    

    Now we’ve not only constrained the type, but we’ve constrained the implementation. Whoever provides our credit service functionality must be a CreditService object, or use CreditService as a base class. Building software is all about composing pieces of functionality together, and using a concrete class as the interface specification places hard restrictions on how the composition will work now, and in the future.

    Interfaces Everywhere?

    Sometimes, these hard restrictions make sense, or at least aren’t important. For example, classes that have no behavior (like DTOs) don’t need an interface abstraction. I’ve also never found it useful to specify entities using an interface, as they have pure business logic inside (logic dealing only with other business objects or abstractions).

    public interface ICustomer
    {
        int ID { get; set; }          
        string Name { get; set; }
        void UpdateAddress(/* ... */);
        // ...
    }
     

    In short, you don’t need interfaces everywhere; you need to anticipate where your software needs to be flexible, which isn’t always easy. Using interface definitions between two horizontal or vertical layers of an application is almost always a yes, but programming to an interface between two business objects inside the same context is a definite maybe.

    I like to use interface definitions when I want to turn a detail into a concept. For example, I’d feel more comfortable with a business object using an ISendMessage object than an SmtpServer object. The concept is closer to what the object needs to do (send a message), and it’s easier to change the business object’s behavior by giving the object a different ISendMessage implementation. As a special extra double bonus, the object using ISendMessage is much easier to test. List<T> is a detail. IList<T> is a concept.
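    A rough sketch of the ISendMessage idea follows. Only the interface name comes from the paragraph above. The Send signature, the OverdueInvoiceNotifier business object, and the RecordingSender test double are all assumptions made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// The concept the business object depends on (the signature is an assumption).
public interface ISendMessage
{
    void Send(string recipient, string body);
}

// Hypothetical business object: it knows it must send a message,
// but not whether SMTP, a queue, or something else delivers it.
public class OverdueInvoiceNotifier
{
    private readonly ISendMessage _sender;

    public OverdueInvoiceNotifier(ISendMessage sender)
    {
        _sender = sender;
    }

    public void Notify(string customerEmail, int invoiceNumber)
    {
        _sender.Send(customerEmail, "Invoice " + invoiceNumber + " is overdue.");
    }
}

// Hypothetical test double: records messages instead of delivering them.
public class RecordingSender : ISendMessage
{
    public readonly List<string> Sent = new List<string>();

    public void Send(string recipient, string body)
    {
        Sent.Add(recipient + ": " + body);
    }
}
```

    An SMTP-backed implementation would be just another class implementing ISendMessage, and the notifier would not need to change.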

    If you doubt the power of interface programming, then just look at COM. Really. In COM you could only program to an object’s interface, and this allowed objects from different runtimes (Visual Basic versus C), different threading models (objects with a thread affinity versus multi-threaded objects), and different processes (local versus remote) to all work together, plus a host of other features. Interface definitions are the ultimate abstraction (for a statically typed environment!).

  • June 7th Links: ASP.NET, AJAX, ASP.NET MVC, Visual Studio

    Here is the latest in my link-listing series.  Also check out my ASP.NET Tips, Tricks and Tutorials page and Silverlight Tutorials page for links to popular articles I've done myself in the past.

    You can also now follow me on twitter (@scottgu) where I also post links and small posts.

    ASP.NET

    • GridView Confirmation Box using jQuery: Mohammed Azam has a nice post that describes how to implement modal confirmation UI using jQuery.  This is particularly useful for scenarios like saving or deleting data.

    AJAX

    • ASP.NET 4.0 AJAX – Client Templates: Damien White has a great post that describes the new client templating support in ASP.NET AJAX.  This provides an easy and powerful way to dynamically create rich HTML UI on the client.

    • ASP.NET 4.0 AJAX - Data Binding: Damien White continues his great ASP.NET AJAX series with this article that describes the new client-side data binding features in the new version of ASP.NET AJAX. 

    ASP.NET MVC

    • DataAnnotations and ASP.NET MVC: Brad Wilson (a dev on the ASP.NET MVC team) has a nice post that describes how to use DataAnnotations to annotate model objects, and then use a model binder to automatically validate them when accepting form posted input.  DataAnnotation support will be built-in with the next version of ASP.NET MVC.

    Visual Studio

    Hope this helps,

    Scott

  • From LINQ to XPath and Back Again

    Let’s say you wanted to select the parts for a Lenovo X60 laptop from the following XML.

    <Root>
      <Manufacturer Name="Lenovo">
        <Model Name="X60">
          <Parts>
            <!-- ... -->
          </Parts>
        </Model>
        <Model Name="X200">
          <!-- ... -->
        </Model>
      </Manufacturer>
      <Manufacturer Name="..." />
      <Manufacturer Name="..." />
      <Manufacturer Name="..." />
      <Manufacturer Name="..." />
    </Root>

    If you know LINQ to XML, you might load up an XDocument and start the party with a brute force approach:

    var parts = xml.Root
                   .Elements("Manufacturer")
                       .Where(e => e.Attribute("Name").Value == "Lenovo")
                   .Elements("Model")
                       .Where(e => e.Attribute("Name").Value == "X60")
                   .Single()
                   .Element("Parts");
    

    But, the code is ugly and makes you long for the days when XPath ruled the planet. Fortunately, you can combine XPath with LINQ to XML. The System.Xml.XPath namespace includes some XPath specific extension methods, like XPathSelectElement:

    string xpath = "Manufacturer[@Name='Lenovo']/Model[@Name='X60']/Parts";
    var parts = xml.Root.XPathSelectElement(xpath);
    

    Now the query is a bit more readable (at least to some), but let’s see what we can do with extension methods.

    static class ComputerManufacturerXmlExtensions
    {
        public static XElement Manufacturer(this XElement element, string name)
        {
            return element.Elements("Manufacturer")
                          .Where(e => e.Attribute("Name").Value == name)
                          .Single();
        }
    
        public static XElement Model(this XElement element, string name)
        {
            return element.Elements("Model")
                          .Where(e => e.Attribute("Name").Value == name)
                          .Single();
        }
    
        public static XElement Parts(this XElement element)
        {
            return element.Element("Parts");
        }
    }
    

    Now, the query is short and succinct:

    var parts = xml.Root.Manufacturer("Lenovo").Model("X60").Parts();

    Combine an XSD file with T4 code generation and you’ll have all the extension methods you’ll ever need for pretty XML queries...

  • Catching Up On Lean

    I now have a number of lean software development books queued up. It started when I saw this single bullet point in a presentation:

    • Overproduction == Extra Features

    I’m enjoying the thinking behind lean, and I believe the techniques and vocabulary of lean make software development more tangible to the folks we work with who don’t write code – and that’s important.

    Overproduction in software development happens when you produce a feature that customers rarely use. This is one of lean’s seven deadly types of waste. The perfect technique to manage this waste is to never create a feature without first establishing a clear value for the feature, but perfection isn’t easy. In commercial software development you’ll inevitably ship some useless bits as you discover the market and the functionality your future customers will value.

    Even when you do ship successful bits, the outside world can reprioritize your software. The U.S. healthcare industry, for example, is ultra-sensitive to laws and regulations. A new piece of legislation can change last year’s “must have” feature into this year’s “meh”.

    Breaking Up Is Hard To Do

    The relationship between overproduction and extra features stuck out to me because I’ve wrestled with overproduction for many years on several different products. Software vendors are reluctant to remove features, no matter how rarely used the features may be. Sales people in particular object to cutting anything they think might possibly have the slightest potential to attract a single future sale.

    The basic problem is thinking of a software feature as an investment, something to protect moving forward. As Mark Lindell will tell you, code is not an investment but a liability. In lean thinking, features are inventory, and anyone who has come within spitting distance of a business school knows inventory can cut into the margin.

    Removing a working feature is never an easy decision, but the sooner a vendor sees obsolete features as a cost and waste, the sooner the vendor can jettison the unused inventory that adds no value to the customer or the company.

  • IIS Search Engine Optimization Toolkit

    SEO (search engine optimization) is one of the important considerations that any Internet web-site needs to be designed with in mind.  A non-trivial percentage of Internet traffic to sites is driven by search engines, and good SEO techniques can help increase site traffic even further.

    Likewise, small mistakes can significantly impact the search relevance of your site’s content and cause you to miss out on the traffic that you should be receiving.  Some of these mistakes include: multiple URLs on a site leading to the same content, broken links from a page, poorly chosen titles, descriptions, and keywords, large amounts of viewstate, invalid markup, etc.  These mistakes are often easy to fix - the challenge is how to discover and pinpoint them within a site.

    Introducing the IIS Search Engine Optimization Toolkit

    Today we are shipping the first beta of a new free tool - the IIS Search Engine Optimization Toolkit - that makes it easy to perform SEO analysis on your site and identify and fix issues within it.

    You can install the IIS Search Engine Optimization Toolkit using the Microsoft Web Platform Installer I blogged about earlier this week.  You can install it through WebPI using the “install now” link on the IIS SEO Toolkit home

    Once installed, you’ll find a new “Search Engine Optimization” section within the IIS 7 admin tool, and several SEO tools available within it:

    The Robots and SiteMap tools enable you to easily create and manage robots.txt and sitemap.xml files for your site that help guide search engines on what URLs they should and shouldn’t crawl and follow.

    The Site Analysis tool enables you to crawl a site like a search engine would, and then analyze the content using a variety of rules that help identify SEO, Accessibility, and Performance problems within it.

    Using the IIS SEO Toolkit’s Site Analysis Tool

    Let’s take a look at how we can use the Site Analysis tool to quickly review SEO issues with a site.  To avoid embarrassing anyone else by turning the tool loose on their site, I’ve decided to instead use the analysis tool on one of my own sites: www.scottgu.com.  This is a site I wrote many years ago (last update in 2005 I think).  If you install the IIS SEO Toolkit you can point it at my site and duplicate the steps below to drill into the SEO analysis of it.

    Open the Site Analysis Tool

    We’ll begin by launching the IIS Admin Tool (inetmgr) and clicking on the root node in the left-pane tree-view of the IIS7 admin tool (the machine name – in this case “Scottgu-PC”).  We’ll then select the “Site Analysis” icon within the Search Engine Optimization section on the right.  Opening the Site Analysis tool at the machine level like this will allow us to run the analysis tool against any remote server (if we had instead opened it with a site selected then we would only be able to run analysis against local sites on the box). 

    Opening the Site Analysis tool causes the below screen to display – it lists any site analysis reports that we have previously saved.  Since this is the first time we’ve opened the tool, it is an empty list.  We’ll click the “New Analysis…” action link on the right-hand side of the admin tool to create a new analysis report:

    Clicking the “New Analysis…” link brings up a dialog like below, which allows us to name the report as well as configure what site we want to crawl and how deep we want to examine it. 

    We’ll name our new report “scottgu.com” and configure it to start with the http://www.scottgu.com URL and then crawl up to 10,000 pages within the site (note: if you don’t see a “Start URL” textbox in the dialog it is because you didn’t select the root machine node in the left-hand pane of the admin tool and instead opened it at the site level – cancel out, select the root machine node, and then click the Site Analysis link).

    When we click the “Ok” button in the dialog above the Site Analysis tool will request the http://www.scottgu.com URL, examine the returned HTML content, and then crawl the site just like a search engine would.  My site has 407 different URLs on it, and it only took 13 seconds for the IIS SEO Toolkit to crawl all of them and perform analysis on the content that was downloaded. 
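
    To make the idea of crawling a site "like a search engine would" concrete, here is a minimal Python sketch of a breadth-first crawl.  It is only an approximation of the general technique, not how the Site Analysis tool is implemented.  To keep it runnable without a network, pages come from a small in-memory dictionary; the example.test host, the FAKE_SITE contents, and the crawl and LinkExtractor names are all hypothetical.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=10000):
    """Visit pages breadth-first from start_url, staying on the same host.

    `fetch` is any callable that returns the HTML for a URL (or None).
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        visited.append(url)
        if html is None:
            continue
        extractor = LinkExtractor()
        extractor.feed(html)
        for href in extractor.links:
            target = urljoin(url, href)
            # Only follow links that stay inside the site being analyzed.
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return visited


# Hypothetical three-page site standing in for real HTTP requests.
FAKE_SITE = {
    "http://example.test/": '<a href="/photos.aspx">Photos</a> <a href="/about.aspx">About</a>',
    "http://example.test/photos.aspx": '<a href="/">Home</a> <a href="http://other.test/">Elsewhere</a>',
    "http://example.test/about.aspx": '<a href="/photos.aspx">Photos</a>',
}

if __name__ == "__main__":
    for page in crawl("http://example.test/", FAKE_SITE.get):
        print(page)
```

    The max_pages argument plays the same role as the page limit in the dialog: it caps how much of the site is examined.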

    Once it is done it will open a report summary view detailing what it found.  Below you can see that it found 721 violations of various kinds within my site (ouch):

    We can click on any of the items within the violations summary view to drill into details about them.  We’ll look into a few of them below.

    Looking at the “description is missing” violations

    You’ll notice above that I have 137 “The description is missing” violations.  Let’s double click on the rule to learn more about it and see details about the individual violations.  Double clicking the description rule above will open up a new query tab that automatically provides a filtered view of just the description violations (note: you can customize the query, and optionally export it into Excel if you want to do even richer data analysis):

    Double clicking any of the violations in the list above will open up details about it.  Each violation has details about what exactly the problem is, and recommended action on how to fix it:

    Notice above that I forgot to add a <meta> description element to my photos page (along with all the other pages too).  Because my photos page just displays images right now, a search engine has no way of knowing what content is on it.  A 25 to 150 character long description would be able to explain that this URL is my photo album of pictures and provide much more context. 
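
    A check like this is simple to approximate.  The Python sketch below looks for a <meta> description in a page and tests it against the 25 to 150 character range mentioned above.  It is an illustration only; the check_description helper and the sample pages are made up for this example and are not part of the toolkit.

```python
from html.parser import HTMLParser


class MetaDescriptionFinder(HTMLParser):
    """Remember the content of <meta name="description"> if the page has one."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "description":
            self.description = (attributes.get("content") or "").strip()


def check_description(html, min_len=25, max_len=150):
    """Return "missing", "bad length", or "ok" for the page's description."""
    finder = MetaDescriptionFinder()
    finder.feed(html)
    if not finder.description:
        return "missing"
    if not (min_len <= len(finder.description) <= max_len):
        return "bad length"
    return "ok"


# Hypothetical pages used only to exercise the check.
NO_DESCRIPTION = "<html><head><title>Photos</title></head><body></body></html>"
WITH_DESCRIPTION = (
    '<html><head><title>Photos</title>'
    '<meta name="description" content="My photo album of family and travel pictures." />'
    "</head><body></body></html>"
)

if __name__ == "__main__":
    print(check_description(NO_DESCRIPTION))    # missing
    print(check_description(WITH_DESCRIPTION))  # ok
```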

    The “Word Analysis” tab is often useful when coming up with description text.  This tab shows details about the page (its title, keywords, etc) and displays a list of all words used in the HTML within it, as well as how many times each one is repeated.  It also shows all two-word and three-word phrases that are repeated on the page, and lists the <a> text used on other pages to link to this page.  All of this is useful when coming up with a description.
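
    The counting behind a view like this can be sketched in a few lines of Python.  This is an assumption about the general approach, not the tool's actual algorithm; the phrase_counts helper and the sample text are hypothetical, and the sketch works on already-extracted page text rather than raw HTML.

```python
import re
from collections import Counter


def phrase_counts(text, words_per_phrase=1):
    """Count how often each word (or run of N consecutive words) appears."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    phrases = [
        " ".join(words[i:i + words_per_phrase])
        for i in range(len(words) - words_per_phrase + 1)
    ]
    return Counter(phrases)


# Hypothetical page text.
SAMPLE_TEXT = "Photo album: family photo album and travel photo album"

if __name__ == "__main__":
    print(phrase_counts(SAMPLE_TEXT).most_common(2))
    print(phrase_counts(SAMPLE_TEXT, 2).most_common(1))  # [('photo album', 3)]
```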

    Looking at the “URL is linked using different casing” violations

    Let's now look at the “URL is linked using different casing” violations.  We can do this by going back to our summary report page and then clicking on this specific rule violation:

    Search engines count the number of pages on the Internet that link to a URL, and use that number as part of the weighting algorithm they use to determine the relevancy of the content the URL exposes.  What this means is that if 1000 pages link to a URL that talks about a topic, search engines will assume the content on that URL has much higher relevance than a URL with the same topic content that only has 10 pages linking to it.

    A lot of people don’t realize that search engines are case sensitive, though, and treat differently cased URLs as distinct URLs.  That means that a link to /Photos.aspx and /photos.aspx will often be treated not as one URL by a search engine – but instead as two different URLs.  So if half of the incoming links go to /Photos.aspx and the other half go to /photos.aspx, then search engines will not credit the photos page as being as relevant as it actually is (instead it will be half as relevant – since its links are split up amongst the two).  Finding and fixing any place where we use differently cased URLs within our site is therefore really important.
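
    Spotting this problem in a list of links is straightforward: group the URLs by their lower-cased form and report every group with more than one spelling.  The Python sketch below shows the idea; the find_casing_variants helper and the sample links are hypothetical and are not taken from the toolkit.

```python
from collections import defaultdict


def find_casing_variants(urls):
    """Return {lowercased_url: [spellings]} for URLs that differ only by case."""
    groups = defaultdict(set)
    for url in urls:
        groups[url.lower()].add(url)
    return {key: sorted(spellings) for key, spellings in groups.items() if len(spellings) > 1}


# Hypothetical links collected from a site.
LINKS = [
    "/Photos.aspx",
    "/photos.aspx",
    "/about.aspx",
    "/photos.aspx?AlbumId=3",
    "/photos.aspx?albumid=3",
]

if __name__ == "__main__":
    for key, spellings in find_casing_variants(LINKS).items():
        print(key, "->", spellings)
```

    Note that the query string is compared too, so a differently cased parameter name is reported the same way as a differently cased path.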

    If we click on the “URL is linked using different casing” violation above we’ll get a listing of all 104 URLs that are being used on the site with multiple capitalization casings:

    Clicking on any of the URLs will pull up details about that specific violation and the multiple ways it is being cased on the site.  Notice below how it details both of the URLs it found on the site that differ simply by capitalization casing. In this case I am linking to this URL using a querystring parameter named "AlbumId".  Elsewhere on the site I am also linking to the URL using a querystring parameter named "albumid" (lower-case “a” and “i”).  Search engines will as a result treat these URLs as different, and so I won’t maximize the page ranking for the content:

    Knowing there is a problem like this in a site is the first step. The second step is typically harder: trying to figure out all the different paths that have to be taken in order for this URL to be used like this.  Often you'll make a fix and assume that fixes everything - only to discover there was another path through the site that you weren't aware of that also causes the casing problem. To help with scenarios like this, you can click the "Actions" dropdown in the top-right of the violations dialog and select the "View Routes to this Page" link within it.

    This will pull up a dialog that displays all of the steps the crawler took that led to the particular URL in question being executed. Below it is showing that it found two ways to reach this particular URL:

    Being able to get details about the exact casing problems, as well as analyze the exact steps followed to reach a particular URL casing, makes it dramatically easier to fix these types of issues.
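
    Conceptually, finding the routes to a page is a path search over the site's link graph.  The Python sketch below enumerates every loop-free path from a start page to a target URL.  It is a simplified illustration under my own assumptions, not the tool's implementation; the routes_to helper and the SITE_LINKS graph are made up for the example.

```python
def routes_to(links, start, target, max_depth=5):
    """List every loop-free path of links that leads from start to target.

    `links` maps each URL to the list of URLs it links to.
    """
    routes = []

    def walk(url, path):
        if url == target:
            routes.append(path)
            return
        if len(path) > max_depth:
            return
        for next_url in links.get(url, []):
            if next_url not in path:  # do not revisit a page on the same route
                walk(next_url, path + [next_url])

    walk(start, [start])
    return routes


# Hypothetical link graph: two different pages link to the same album URL.
SITE_LINKS = {
    "/": ["/photos.aspx", "/albums.aspx"],
    "/photos.aspx": ["/photos.aspx?albumid=3"],
    "/albums.aspx": ["/photos.aspx?albumid=3", "/"],
}

if __name__ == "__main__":
    for route in routes_to(SITE_LINKS, "/", "/photos.aspx?albumid=3"):
        print(" -> ".join(route))
```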

    Looking at the “page contains multiple canonical format” violations

    Fixing the casing issues like we did above is a good first step to improving page counts.  We also want to fix scenarios where the same content can be retrieved using URLs that differ by more than casing.  To do this we’ll return to our summary page and pull up the “page contains multiple canonical format violations” report:

    Drilling into this report lists all of the URLs on our site that can be accessed in multiple “canonical” ways:

    Clicking on any of them will pull up details about the issue. Notice below how the analysis tool has detected that sometimes we refer to the home page of the site as "/" and sometimes as "/Default.aspx". While our web-server will interpret both as executing the same page, search engines will treat them as two separate URLs - which means the search relevancy is not as high as it should be (since the weighting gets split up across two URLs instead of being combined as one).
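
    One way to detect this kind of duplication is to reduce every URL to a single canonical form and see which URLs collapse together.  The Python sketch below treats a trailing default document as equivalent to its folder.  The list of default documents, the example.com URLs, and the canonicalize and group_by_canonical names are assumptions for illustration only.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit

# Assumption: the server serves this document when a folder is requested.
DEFAULT_DOCUMENTS = ("default.aspx",)


def canonicalize(url):
    """Map equivalent spellings of a URL onto one canonical form."""
    parts = urlsplit(url)
    path = parts.path or "/"
    folder, _, document = path.rpartition("/")
    if document.lower() in DEFAULT_DOCUMENTS:
        path = folder + "/"
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, parts.query, ""))


def group_by_canonical(urls):
    """Return only the canonical URLs that are reached by more than one spelling."""
    groups = defaultdict(set)
    for url in urls:
        groups[canonicalize(url)].add(url)
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}


if __name__ == "__main__":
    links = [
        "http://www.example.com/",
        "http://www.example.com/Default.aspx",
        "http://www.example.com/photos.aspx",
    ]
    print(group_by_canonical(links))
```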

    We can see all of the cases where the /Default.aspx URL is being used by clicking on the “Links” tab above.  This shows all of the pages that link to the /Default.aspx URL, as well as all URLs that it in turn links to:

    We can switch to see details about where and how the related “/” URL is being used by clicking the “Related URLs” drop-down above – this will show all other URLs that resolve to the same content, and allow us to quickly pull their details up as well:

    Like we did with the casing violations, we can use the “View Routes to this Page” option to figure out all of the paths within the site that lead to these different URLs, and use this to help us hunt down and change them so that we always use a common consistent URL to link to these pages.

    Note: Fixing the casing and canonicalization issues for all internal links within our site is a good first step.  External sites might also be linking to our URLs, though, and getting all of those links updated will be harder.  One way to fix our search ranking without requiring external sites to update their links is to download and install the IIS URL Rewrite module on our web server (it is available as a free download using the Microsoft Web Platform Installer).  We can then configure a URL Rewrite rule that automatically does a permanent redirect to the correct canonical URL – which will cause search engines to treat them as the same (read Carlos’ “IIS7 and URL Rewrite: Make your Site SEO” blog post to learn how to do this).
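
    The logic of such a rule can be sketched in Python as follows.  This is not URL Rewrite configuration, just an illustration of the decision a redirect rule makes.  It assumes that the lower-cased path is the canonical one and that /default.aspx should collapse to its folder; the redirect_for helper is hypothetical.

```python
def redirect_for(path):
    """Return (301, canonical_path) when path is not canonical, else None.

    Assumptions: the canonical form is the lower-cased path, and a trailing
    default.aspx is dropped in favor of the folder URL.
    """
    canonical = path.lower()
    if canonical.endswith("/default.aspx"):
        canonical = canonical[: -len("default.aspx")]
    if canonical != path:
        return 301, canonical  # 301 is the HTTP status for a permanent redirect
    return None


if __name__ == "__main__":
    for requested in ["/Photos.aspx", "/Default.aspx", "/photos.aspx", "/"]:
        print(requested, "->", redirect_for(requested))
```

    Because the redirect is permanent, every variant that an external site links to ends up pointing at the one canonical URL.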

    Looking up redirect violations

    As a last step let’s look at some redirect violations on the site:

    Drilling into this rule category reminded me of something I did a few years ago (when I transferred my blog to a different site) that, I just discovered, was apparently pretty dumb.

    When I first set up the site I had a simple blog page at www.scottgu.com/blog.aspx.  After a few weeks, I decided to move my blog to weblogs.asp.net/scottgu.  Rather than go through all my pages and change the link to the new address, I thought I’d be clever and just update the blog.aspx page to do a server-side redirect to the new weblogs.asp.net/scottgu URL.

    This works from an end-user perspective, but what I didn’t realize until I ran the analysis tool today was that search engines are not able to follow the link.  The reason is that my blog.aspx page is doing a server-side redirect to the weblogs.asp.net/scottgu URL.  But for SEO reasons of its own, the blog software (Community Server) on weblogs.asp.net is in turn doing a second redirect to fix the incoming weblogs.asp.net/scottgu URL to instead be http://weblogs.asp.net/scottgu/ (note the trailing slash is being added).

    According to the rule violation in the Site Analysis tool, search engines will give up when you perform two server redirects in a row. It detected that my blog.aspx redirect links to an external link that in turn does another redirect - at which point the search engine crawlers give up:

    I was able to confirm this was the problem without having to open up the server code of the blog.aspx page. All I needed to do was click the "Headers" tab within the violation dialog and see the redirect HTTP response that the blog.aspx page sent back. Notice it doesn't have a trailing slash (and so causes Community Server to do another redirect when it receives it):

    Fixing this issue is easy. I never would have realized I actually had an issue, though, without the Site Analysis tool pointing me to it.
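
    The double redirect described above is easy to model.  The Python sketch below follows redirects through a table of canned responses and reports the chain.  The URLs come from the scenario above, but the specific status codes (302 and 301), the RESPONSES table, and the follow_redirects helper are assumptions made for illustration; no real HTTP requests are sent.

```python
def follow_redirects(url, responses, max_hops=5):
    """Return the list of URLs visited while following redirects from url.

    `responses` maps a URL to a (status, location) pair.
    """
    chain = [url]
    while url in responses and len(chain) <= max_hops:
        status, location = responses[url]
        if status not in (301, 302):
            break
        url = location
        chain.append(url)
    return chain


# Canned responses; the status codes are assumed, not taken from the real servers.
RESPONSES = {
    "http://www.scottgu.com/blog.aspx": (302, "http://weblogs.asp.net/scottgu"),
    "http://weblogs.asp.net/scottgu": (301, "http://weblogs.asp.net/scottgu/"),
    "http://weblogs.asp.net/scottgu/": (200, None),
}

if __name__ == "__main__":
    chain = follow_redirects("http://www.scottgu.com/blog.aspx", RESPONSES)
    print(len(chain) - 1, "redirects:", " -> ".join(chain))
```

    Pointing the first redirect straight at the URL with the trailing slash reduces the chain to a single hop.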

    Future Automatic Correction Support

    There are a bunch of additional violations and content issues that the Site Analysis tool identified when doing its crawl of my web-site.  Identifying and fixing them is straight-forward and very similar to the above steps.  Each issue I fix makes my site cleaner, easier to crawl, and helps it have even higher search relevancy.  This in turn will generate an increase of traffic coming to my site from search engines – which is a very cost effective return on investment.  Once a report is generated and saved, it will show up in the list of previous reports within the IIS admin tool.  You can at any point right-click it and tell the IIS SEO Toolkit to re-run it – allowing you to periodically validate that no regressions have been introduced.

    The preview build of the Site Analysis tool today verifies about 50 rules when it crawls a site.  Over time we’ll add more rules that check for additional issues and scenarios.  In future preview releases you’ll also start to see even more intelligence built into the SEO Analysis tool that will allow it to also verify on the server-side that you have the URL Rewrite module installed with a good set of SEO-friendly rules configured.  The Site Analysis tool will also allow you to fix certain violations automatically by suggesting rewrite rules that you can add to your site from directly within the site analysis report tool (for example: to fix issues like the “/” and “/Default.aspx” canonicalization issue we looked at before).  This will make it even easier to help enforce good SEO on the site.  Until then, I’d recommend reading these links to learn more about manually configuring URL Rewrite for SEO:

    Summary

    The IIS Search Engine Optimization Toolkit makes it easy to analyze and assess how search engine friendly your web-site is.  It pinpoints SEO violations, and provides instructions on how to fix them.  You can learn more about the toolkit and how to best take advantage of it from these links:

    The IIS Search Engine Optimization Toolkit is free, takes less than a minute to install, and can be run against any existing web-server or web-site.  There is no need to install anything on a remote server to use it – just type in the URL of the site and you’ll get back a site analysis report with actionable items that you can use immediately to improve it.

    Today’s release is a beta release, so please use the IIS Search Engine Optimization Toolkit Forum to let us know if you run into any issues or have feature suggestions.

    Hope this helps,

    Scott 

     

  • Don't Ask Me If It's Possible

    Years ago, ex-Googler Doug Edwards wrote a blog post to explain the meaning behind a few favorite words in the software developer’s vocabulary: orthogonal, cruft, canonical, and the big one - non-trivial.

    Non-trivial
    It means impossible. Since no engineer is going to admit something is impossible, they use this word instead.

    I’ve spent the bulk of my career developing commercial software, and it’s amazing how many sales people, executives, and marketing managers ask if something is possible. Example:

    Is it possible to reverse the flow of time when we click this button?

    The requests are more realistic, of course, but the stock reply is:

    Given enough resources, anything is possible.

    The response doesn’t say the feature is impossible, or even non-trivial. You might think the response is the habitual reply of yet another passive-aggressive software developer (the canonical developer, to use Mr. Edwards’ lexicon), but I much prefer to think of the reply as a zen-like answer that points the questioner to the realities of software development. Despite what the pointy-haired boss hears from tool vendors and analysts, creating software still requires resources – both time and mental effort.

    What I’ve Learned

    Building commercial software for a vertical market is a … non-trivial endeavor. But, not non-trivial in the impossible sense. The problem is you don’t know precisely what features will provide enough value to attract new customers until you’ve done some work.

    I’ve generally found that if someone inside an ISV is asking if a feature is possible, it’s only a feature they want if it comes for free. They haven't done the homework to understand how the feature would work to provide value for the mainstream customer, and the idea is still in an incubation phase. It will be difficult to even estimate the amount of work required.

    When they turn the question into:

    What does it take to get this into our software?

    Then you know they are serious and passionate about the idea, and it’s time to start talking. 

  • Microsoft Web Platform Installer

    One of the cool new releases coming out this year is a small download manager - the Microsoft Web Platform Installer - that makes installing and configuring web server and web development stacks really easy.  It is a free tool that you can download from the www.microsoft.com/web site (here is the direct link to the installer – choose the 2.0 version).  It works with Windows XP, Vista, Windows 7, Windows Server 2003 and Windows Server 2008.

    The Web Platform Installer provides an easy way to quickly install and customize all the software you need to develop or deploy web sites and applications on a Windows machine.  The tool automatically analyses what your system currently has installed, allows you to easily mark additional components to be added, and then automates installing them all at once when you click the install button (saving you from having to manually install each one yourself). 

    For example, you can click the “Web Server” section above to customize the individual IIS web server modules installed on the box.  This includes both the built-in IIS modules that ship with Windows (like the directory browsing module), as well as additional modules available as separate downloads.  Below I’ve selected two additional modules – the Application Request Routing and URL Rewrite modules – to be installed:

    The URL Rewrite module is a free Microsoft module that enables you to publish custom URLs from your sites and optimize them for search engine optimization (SEO).  You can enforce SEO rules (consistent casing, embedded keywords, etc) and customize how your site looks from an external perspective however you want (the admin tool will even help guide you to write the regular expression rules):

    Application Request Routing is a free Microsoft module that supports forward-proxy style scenarios, and enables dynamic load-balancing of requests across multiple web-server machines (allowing you to scale out, move machines behind DMZ firewall scenarios, and bring machines in and out of a farm for maintenance without disruption).

    In addition to URL Rewrite and Application Request Routing, there are dozens of other web server modules you can select that enable WebDAV, Secure FTP, automated deployment, remote database management through the IIS admin tool for hosted scenarios, media server streaming scenarios, and more.  You can also install framework additions like ASP.NET MVC, .NET 3.5 SP1, SQL Express and associated SQL administration tools, Visual Web Developer 2008 Express, and more.

    Windows Web Application Gallery

    The web platform installer also integrates with the new Windows Web Application Gallery now online: www.microsoft.com/web/gallery 

    This gallery allows you to easily install existing web applications onto your server.  The gallery contains a variety of popular .NET open source applications (like DotNetNuke, ScrewTurn Wiki and Umbraco CMS) as well as PHP open source applications (including WordPress and Drupal).  You can easily browse and install them using the Web Platform Installer as well (just click the “Web Applications” tab and check the applications you want to install):


    In addition to downloading the application, the web platform installer will create a new site/application root and configure the appropriate site settings and optionally install the database.

    Summary

    If you haven’t downloaded the Web Platform Installer yet I’d recommend taking a look at it.  I think you’ll find it makes it much easier to configure and get a box up and running, to find and install the various components of the Windows web server stack, and to find and install applications to use on top of it.  Over time you’ll see us ship more and more functionality this way.

    You can download and start using the Web Platform Installer 2.0 Beta today.  We’ll ship the final release of it this summer.

    Hope this helps,

    Scott
