User's Guide - Chapter 1: Understanding Visual Patch

Patching software from one version to another is a sophisticated process.

Visual Patch is designed to make the patching process as simple to understand as possible. It takes care of most details for you, such as inspecting your versions to decide which files have changed, and analyzing the individual files in each version in order to extract the differences between them.

However, there are some important concepts that you will want to understand before you begin designing your first patch. Some of these, like the concept of key files, are unique to Visual Patch and are central to how the patching system works.

This chapter will discuss the nature of patching and explain some of the concepts you need to know in order to use Visual Patch effectively.

What is a Patch?

A patch is a file that, when run, modifies or replaces specific files on a computer system, usually to bring an already-installed software product up to date.

Patches are used every day to fix problems, add features, solve unforeseen compatibility issues, and fix security holes. They can be used to update all kinds of files: software executables, Word documents, satellite images, medical databases, ocean maps, game data files, even parts of the operating system itself.

Because they only contain the data that has changed, patches are also used to transmit changes to very large files as efficiently and securely as possible.

Benefits of Patching

The role of patches in the software deployment cycle is to get already-installed software up to date after it becomes outdated. Patching technology offers numerous benefits over simply redistributing new versions of the original software in whole form.

Smaller file size

Because they only contain the data that has changed from one version to another, patches can be much smaller than a full software installer needs to be. Especially in situations where large data files are involved, the savings are often dramatic: patches that are less than 1% of the original file sizes are possible.

Reduced traffic

Smaller file sizes translate into reduced bandwidth costs, and reducing the amount of traffic leaves more bandwidth for other services.

Faster transmission speeds

Having less data to transmit means that updates can be sent and received faster, which means less time is spent waiting for updates.

Security

The best way to protect information during transmission is to never transmit it in the first place. By only transmitting the data that has changed, patches reduce the risk of third-party interception. Even if some hypothetical future technology made it possible to "crack" the encryption methods used to package the changes, the unchanged data would remain safe.

Integrity

A patch can’t update something that isn’t there. If a user doesn’t already have your software installed, they won’t be able to apply the patch. And if someone is using a modified version of a file, that file won’t be updated unless you expressly permit it when you design your patch.

What Can a Patch Do?

The basic role of a patch is to modify or replace files so they match the files in a target version of your software. It also might need to back up or remove any legacy files, i.e. files from previous versions that no longer exist in the current version.

But Visual Patch isn’t limited to just updating files. With a built-in scripting engine containing over 250 high-level actions, it can also perform many other tasks involving everything from Registry changes to HTTP downloads.

With a little bit of scripting, your patch application can handle any advanced task you need it to: copying files, modifying INI files, starting and stopping services, even calling external DLL functions.

This ability to perform system changes in addition to the basic file updating is an important and valuable feature of Visual Patch.

Tip: If you use Setup Factory to build your software installer, you can use actions to add new files to the uninstall control file so they will be removed along with the original files when the user uninstalls your software via the control panel.

Patching Methods

There are two basic methods that can be used to update a file: binary patching, and whole-file patching.

Binary Patching

Binary patching or "delta compression" involves analyzing two versions of a file in order to extract only the data that has changed. The same changes can then be applied to any file that matches the old version, in order to "transform" it into the new version.

Creating a binary patch involves performing a byte-by-byte comparison between the original file and the new file, and then encoding the differences into a difference file. Each difference file contains the actual bytes that are different in the new file, along with a number of instructions that describe which bytes need to change, and which bytes are the same. This information is said to be encoded into the difference file.

Tip: The term "difference file" is often shortened to "diff file" or just "diff."

When the patch is applied, the difference file is decoded, and the instructions are used to build the new file by copying the "unchanged" data out of the old file, along with the "changed" data that was encoded into the difference file.
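The encode and decode steps described above can be illustrated with a small sketch. This is not the Visual Patch differencing engine; it is a toy example that assumes Python's standard difflib module as the comparison tool, and the function names make_diff and apply_diff are hypothetical. It shows the two kinds of instructions a difference file carries: "copy this range from the old file" and "insert these new bytes."

```python
from difflib import SequenceMatcher


def make_diff(old: bytes, new: bytes) -> list:
    """Encode `new` as a list of instructions relative to `old` (toy format)."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged data: store only where to find it in the old file.
            ops.append(("copy", i1, i2 - i1))
        elif j2 > j1:
            # Changed or added data: store the actual bytes from the new file.
            ops.append(("insert", new[j1:j2]))
        # Pure deletions need no instruction: those old bytes are never copied.
    return ops


def apply_diff(old: bytes, ops: list) -> bytes:
    """Rebuild the new file from the old file plus the encoded instructions."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, start, length = op
            out += old[start:start + length]
        else:
            out += op[1]
    return bytes(out)


old = b"The quick brown fox jumps over the lazy dog"
new = b"The quick red fox jumps over the very lazy dog"
diff = make_diff(old, new)
rebuilt = apply_diff(old, diff)
```

Only the "insert" instructions carry file data, which is why a difference file shrinks as the two versions become more alike. A real engine would also serialize and compress the instruction list.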

For example, given an old file "A" and a new file "B," a binary patching engine would compare A to B and then produce a difference file; let’s call it "AB.diff." Once the difference file is created, you can use it to create the B file from any file that matches the A file. In fact, the binary patching engine could recreate B using A and AB.diff.

Because binary patching only stores the parts that have changed, the difference files can be very small, often less than one percent of the new file’s size. The size of the difference file depends entirely on how much data has changed between the two versions.

Each difference file can update a single, specific version of a file to another single, specific version of that file. The encoded instructions in the difference file are only valid for a file that is a perfect match of the original source file. Note that binary patching cannot be used to update a file if it has been modified in any way.

For patches that need to update multiple files or multiple versions, the patch executable will need to contain a separate difference file for each file and each source version that needs to be updated. So, for example, to update a single file from version 1.0 or 1.1 to version 1.2 using a single patch executable, the executable would need to contain one difference file to go from 1.0 to 1.2, and another to go from 1.1 to 1.2.

In most cases, the difference files are so small that you can fit a lot of versions into a single patch executable and still use less space than you would by just including the whole file, as in whole-file patching (see below).

Note: Visual Patch will automatically switch from binary to whole-file patching on a file-by-file basis whenever the total size of all the difference files surpasses the size of the whole file.

In some cases, encoding the differences between the two files results in a binary patch that is larger than the result of simply compressing the new file, for example into a .zip archive. This generally means that there are very few similarities between the two files; in other words, the two files are so different that it is difficult to reuse any of the data in the old file. This can sometimes be the case for files that are already highly compressed, or files that are encrypted. Visual Patch is able to detect these cases and will choose whichever patching method provides the best results.
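The size comparison behind this choice can be sketched as follows. This is only an illustration of the idea, not the product's actual logic: the function name choose_method is hypothetical, and zlib stands in for whatever compression the patch executable really uses.

```python
import zlib


def choose_method(diff_sizes: list[int], new_file: bytes) -> str:
    """Pick the smaller option for one file (illustrative rule only).

    diff_sizes: size in bytes of each difference file needed for this file,
                one per source version that the patch supports.
    new_file:   contents of the target version of the file.
    """
    # Assumption: the whole file would be stored compressed inside the patch.
    whole_file_size = len(zlib.compress(new_file, 9))
    if sum(diff_sizes) < whole_file_size:
        return "binary"
    return "whole-file"
```

Note that the total grows with every additional source version, which is why a file that is patched with binary differences in an incremental patch might be shipped whole in a full-history patch.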

Whole-File Patching

Whole-file patching operates on a different principle. Instead of only containing the parts that have changed (as binary patches do), whole-file patches just copy the entire file. The "patch" is just a copy of the new version.

Whole-file patches can be faster to apply, because they don't have to search through the original file in order to copy the parts that haven’t changed to the new version. They just overwrite the old file with the new one. The downside, of course, is that whole-file patches tend to be much larger than binary patches.

There are, however, two situations where whole-file patches can actually be smaller: when creating a single patch file that is able to update many different versions, and when the files being patched are too dissimilar.

Visual Patch always chooses the patching method that produces the best results. It automatically switches between binary patching and whole-file patching on a file-by-file basis in order to produce the smallest patch possible for your project.

Patching Strategies

Although Visual Patch chooses the right patching method for every situation, it’s up to you to choose an overall patching strategy.

Visual Patch supports three general patching strategies, each corresponding to a different kind of patch you can create: incremental patching, multi-version patching, and full-history patching.

Incremental Patching

An incremental patch is a patch that is able to update a single, specific version to a single target version. For example, a patch that is able to update version 1.3 to 1.4, and only 1.3 to 1.4, is an incremental patch. Similarly, a patch that is able to update version 1.0 to 1.4, and only 1.0 to 1.4, is an incremental patch.

Incremental patches take full advantage of binary patching. Each patch only needs to contain a single difference file for each file that has changed. This eliminates any unnecessary data in the patch. For example, why bother sending the data needed to update 1.0 to 1.4 if the user has 1.3 installed? Because an incremental patch is targeted at a specific version, it only needs to contain the information needed to update that version, and nothing else.

This is especially true for incremental patches that update two consecutive versions. Although there may be many changes over the entire history of a software product, the changes between any two consecutive versions are typically very small. For example, if there are files that changed from version 1.2 to 1.3, but these files didn’t change from version 1.3 to 1.4, an incremental patch to go from 1.3 to 1.4 doesn’t need to contain any data for the files that changed from 1.2 to 1.3. This minimizes the amount of data that needs to be included in the patch.

Incremental patching generates the smallest and most secure patches possible.

Multi-Version Patching

A multi-version patch, as the name implies, is a patch that is able to update multiple installed versions to a single target version. For example, a patch that is able to update versions 1.2 and 1.3 to 1.4 is a multi-version patch.

Multi-version patches are larger than incremental patches. The more versions that a patch supports, the more information it needs to contain. This increases the amount of redundant data within the patch.

When a user runs the patch, they are only interested in updating a single version: the one they currently have installed on their system. All of the other versions that a multi-version patch supports are just excess baggage for that user.

The benefit of multi-version patches is that they are simpler to coordinate. A single patch file can be used to update multiple versions. Your users have fewer patches to choose from, and you have fewer patches to distribute. If there are 15 different versions of your software in the field, you would need 15 incremental patches to support them all. Using multi-version patches allows you to support all 15 potentially installed versions with fewer patches.

Full-History Patching

A full-history patch is able to update every previous release of your software up to a single target version. It is essentially a multi-version patch for every version of your software.

Full-history patches are the simplest patches to coordinate since the same patch file can update all versions of your software. Your users don’t have to know what version they currently have in order to choose the correct patch. You only have to provide a single download that will work for all of your users. It’s simple, straightforward, and uncomplicated.

However, full-history patches are the largest patches you can produce. In some cases they can even approach or surpass the size of a full install. For this reason, you will want to weigh the benefits of full-history patching vs. the other patching strategies.

Finding the Right Balance

Each patching strategy has different benefits and limitations. You will need to choose a combination of strategies that provides an appropriate balance between file size and logistical simplicity.

Since a lot depends on how many versions you need to support, and what methods you use to distribute your patches, there is no single "right way" to handle everything. The frequency of your updates is another factor that can determine how "up to date" your users are. For instance, if you release new versions often, and you don’t use an automatic updating technology like Indigo Rose’s TrueUpdate, the chances are higher that several of your users will be more than one version behind.

You will probably want to use a combination of incremental and multi-version patching in order to get the benefits of both. One strategy that works well is to use two separate patches:

• An incremental patch to go from the previous version to the current version

• A multi-version patch for all of the other versions

The assumption is that if most of your users always stay up to date, you will save a lot of bandwidth by providing the incremental patch. This is especially true if you use technology like TrueUpdate to keep your users up to date automatically.

Even if your patch distribution isn’t automated (for example, if the users just click on a download link and run the patch themselves), this approach provides a good balance between minimizing patch size and making things less complicated for the user (by giving them fewer patches to choose from).

If you are using a tool like TrueUpdate, you might want to provide even more patch files, and let TrueUpdate decide which one to download and run.

For example, if you’re about to release version 1.29 of your software, and you know that most of your users are using 1.28 and 1.27, it would make sense to create two incremental patches: one for the users of 1.28 and one for the users of 1.27.

You might also want to create a few different multi-version patches, for example one for versions 1.20 through 1.26, and another for the really old versions 1.0 to 1.19.
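To make the 1.29 example concrete, here is a minimal sketch of how an installed version could be mapped to one of those four patches. The file names, the PATCHES table, and the pick_patch function are all hypothetical; this is not how TrueUpdate is configured, only an illustration of the selection logic.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn "1.27" into (1, 27) so versions compare part by part."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical catalogue for the 1.29 release: (lowest, highest, patch file).
PATCHES = [
    ("1.28", "1.28", "patch_1.28_to_1.29.exe"),       # incremental
    ("1.27", "1.27", "patch_1.27_to_1.29.exe"),       # incremental
    ("1.20", "1.26", "patch_1.20-1.26_to_1.29.exe"),  # multi-version
    ("1.0", "1.19", "patch_1.0-1.19_to_1.29.exe"),    # multi-version
]


def pick_patch(installed: str):
    """Return the patch file that covers the installed version, or None."""
    version = parse_version(installed)
    for lowest, highest, patch_file in PATCHES:
        if parse_version(lowest) <= version <= parse_version(highest):
            return patch_file
    return None
```

Users on the two most common versions get the smallest downloads, while everyone else is still covered by one of the two larger multi-version patches.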

Ultimately, the patching strategy you choose depends on the number and sizes of files you need to update. The key point is to consider which method makes the most sense for you, given the distribution methods available to you, and how comfortable your users are with the patching process.

Versions

A version is the collection of files and folders that makes up a single release of your software.

If your original release is version 1.0, then all of the files in that release (everything that gets installed onto the user’s system) constitute one version.

Each time you modify your software and release it to the public, you create a new version. For example, if your next release is version 1.1, all of the files in that release constitute another version.

Note that a version isn’t just the files that have changed from one release to the next. Each version contains all of the files in your software from a specific point in time. It even includes all of the files that remain the same.

In essence, a version is a complete copy of your software from a specific point in its life cycle.

Version Tabs

Each version is represented in Visual Patch by a version tab on the project window. The version tabs are listed in increasing order, with the oldest supported version on the left and the newest supported version on the right.

Your project should have one version tab for every version of your software that you want to build patches for. Whenever you build the project, you will be able to select which of these versions you want that particular patch to support.

Each version tab contains a file list where you can add the files that belong to that version. Adding a new version to a project involves adding a new version tab, and then adding all of the files from that version into that tab’s file list.

You should put all of the files from each version onto a separate version tab. So for example if you have version 1.1 and version 1.2, put all the files from 1.1 onto a "1.1" tab, and put all the files from version 1.2 onto a "1.2" tab.

Note: Make sure you don’t have any "empty" tabs, or Visual Patch will report an error when you build your project.

For more information on version tabs, see chapter 4, Versions and Files.

Version Management

Since each version tab needs to reference the files that belong to that version, you need to keep a copy of each release of your software.

Each version of your software should be stored in a separate location on your system. A good way to organize your versions is to keep them in separate folders, with each folder named according to the version it contains. Each folder should contain a complete copy of your software, with all internal subfolders intact.

Visual Patch uses delta compression to create amazingly small patch files. This binary differencing engine needs to analyze the entire contents of each file in each version in order to build a binary patch that contains only the data that has changed from one version to another.

The Installed Version

The version that the patch application detects on the user’s system is known as the installed version. It’s the version of the software that was installed by the original software installer.

For example, if the user has version 5.8 of your software installed on their system, and the patch application successfully locates it, "5.8" is the installed version.

In some cases, there may be more than one version of a software program installed on a user’s system. In these cases, the installed version is the one that the patch application identifies as the version to update. Usually this will be based on some kind of information that was recorded on the system by the installer, for example a "current version" entry in the Registry.

The Target Version

The version that a patch application is designed to update an installed version to is known as the target version.

For example, if you create a patch to update version 5.8 to version 5.9, "5.9" is the target version.

Version Detection

Before a patch can begin updating, it needs to determine whether a compatible version is installed on the user’s system. In other words, the first task that the patch must perform is to locate and identify the installed version of your software.

The location where your software is installed is referred to as the application folder.

The Application Folder (%AppFolder%)

The application folder is the folder where your software is installed on the user’s system. Finding the application folder is very important: without it, the patch has nothing to update.

In Visual Patch, the search for the application folder is implemented using actions in the project’s On Startup event. When you use the project wizard to start a new project, it automatically configures the On Startup script to handle this for you.

In screens and actions throughout the project, the application folder is represented by a session variable named %AppFolder%. Storing the application folder path in this session variable is the ultimate goal of any version detection method.

In the default action scripts, this is handled by the VisualPatch.CheckFolderVersion action. This action inspects the folder to determine whether it contains all of the key files for a compatible version.

If the folder meets all of the requirements and is recognized as a compatible version, the VisualPatch.CheckFolderVersion action stores the folder path into the %AppFolder% session variable.

Tip: For more information on session variables, see chapter 7.

Detection Methods

In addition to any "custom" methods you might implement, there are three standard detection methods that are used in Visual Patch. Each of these will be implemented for you by the project wizard in the form of an action script in the On Startup event.

By default these detection methods are designed to follow a specific sequence. Assuming all three methods are enabled, the sequence is to check the current folder, check a specific Registry key, and then perform a file search on the user’s system.
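The sequence can be pictured as a simple fall-through loop. The sketch below is written in Python purely for illustration; it is not the action script that the project wizard generates, and every name in it (detect_app_folder and the three stand-in functions) is hypothetical.

```python
def detect_app_folder(methods, is_compatible):
    """Try each detection method in order; return the first compatible folder.

    methods:       callables that each return a list of candidate folders.
    is_compatible: callable that checks a folder's key files.
    """
    for method in methods:
        for folder in method():
            if is_compatible(folder):
                return folder
    return None  # no compatible version found anywhere


# Stand-ins for the three standard methods (hypothetical paths).
def current_folder():
    return ["C:/Downloads"]


def registry_key():
    return ["C:/Program Files/MyApp"]


def file_search():
    return ["D:/Backup/MyApp", "E:/Apps/MyApp"]


app_folder = detect_app_folder(
    [current_folder, registry_key, file_search],
    is_compatible=lambda folder: folder == "C:/Program Files/MyApp",
)
```

Because the loop stops at the first match, a method that appears earlier in the sequence always takes priority over the ones that follow it.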

Current Folder

The current folder method checks the folder that the patch is running in to see if it contains a recognizable version of the software. This is done by checking for specific key files in the folder.

This method is useful when your software is installed in more than one location on the user’s system, and the user wants to control which instance of the software is patched. Since the current folder check is performed before the other detection methods, it allows the user to override the other two detection methods by copying the patch file into the folder where the installation they want to patch is installed.

If the current folder doesn’t contain a recognizable version, the patch moves on to the next detection method.

Registry Key

The Registry key detection method attempts to retrieve the application folder path from a specific Registry key. This is the recommended detection method, since it is the fastest and most reliable way to locate the application folder.

In order to use this method, your software’s installer needs to have written the application folder path into a Registry key so that it can be retrieved by the patch.

If an application path is found in the specified Registry key, the patch will verify that it points to a valid version of your software. As in the current folder method, it does this by confirming the MD5 signatures of specific key files in the folder.

If no path is found, or if the key files don’t match, the patch will proceed with the next detection method (assuming it is enabled).

File Search

The file search method searches the user’s system for a folder that contains a version of your software, by checking every folder for the existence of key files. The search ends when it finds a folder that contains all of the key files for a compatible version and the MD5 signatures prove that the key files are a perfect match.

Custom Actions

Since it’s ultimately all done with actions, it’s possible to use a completely different method to determine the folder where your application is installed. In fact, you could even write a script that just set %AppFolder% to a hard-coded path if you were absolutely certain that your software was installed at the exact same place on every system, and was never modified or installed incorrectly.

In the vast majority of cases, though, you will want to use the standard methods described above.

Key Files

Each version of your software usually includes one or more files that are unique to that version. For instance, as new features are added, your software’s main executable might change, along with a help file and perhaps a few data files. Visual Patch refers to these "identifiable" files as key files.

Key files are used to locate and identify your software on the user’s system.

Designating a file as a key file means that you want Visual Patch to verify its existence and its MD5 signature in order to fully identify the version it belongs to. If the key file doesn’t exist, or its MD5 signature doesn’t match, Visual Patch will consider that version not found.

Each release of your software must have at least one key file, but you can specify as many as you want. It’s important to remember that every key file in a version must be found in order for that version to be identified. In other words, a user must have all of the key files from a given release installed in order for their version to "qualify." If you have four key files for a particular version and only three of them are found, the version on the user’s system won’t be considered legitimate. The same goes for a user with three key files from one version, and one from another. All the key files must match the original files from a single release absolutely.

It’s also important to remember that each key file will be verified by its MD5 signature. Care should be taken to avoid selecting key files that are likely to change for legitimate reasons once they’re installed. For example, if your software uses a database file that is constantly updated, that file wouldn’t be a good choice for a key file because its MD5 signature will change. The key files on a user’s system must all be present, and their MD5 signatures must all match the original values determined for those files at design time.
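The "all key files must be present and must match" rule can be sketched like this. The example uses Python's standard hashlib module and hypothetical names (md5_of_file, folder_matches_version); it illustrates the rule only and is not the code behind the VisualPatch.CheckFolderVersion action.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path) -> str:
    """Compute the MD5 signature of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def folder_matches_version(folder: Path, key_files: dict[str, str]) -> bool:
    """Return True only if every key file exists and its MD5 matches.

    key_files maps a relative file path to the MD5 signature that was
    recorded for that file at design time.
    """
    for relative_path, expected_md5 in key_files.items():
        candidate = folder / relative_path
        if not candidate.is_file():
            return False  # a missing key file disqualifies the version
        if md5_of_file(candidate) != expected_md5:
            return False  # so does a key file that has been modified
    return True
```

A single failure is enough to reject the folder, which is why a file that changes during normal use makes a poor key file.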

If the patch doesn’t find a valid release anywhere on the user’s system, the user won’t be allowed to update their software.

Choosing Appropriate Key Files

Key files are usually files whose contents are unique to a single version. Visual Patch requires at least one unique key file per version in order to uniquely identify that version. The file names and paths don’t need to be unique, but their contents do. In other words, you must designate at least one key file per version whose contents are different from every other version. Otherwise, the patch won’t be able to tell that version apart from the others.
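As a rough illustration of the uniqueness requirement, the following sketch checks a set of versions for any version that has no key file of its own. The function name and the data layout are hypothetical; the digests in the usage note are shortened placeholders rather than real MD5 values.

```python
def versions_without_unique_key(key_md5s: dict[str, set[str]]) -> list[str]:
    """List the versions that cannot be told apart from the others.

    key_md5s maps a version name to the set of MD5 signatures of its
    key files. A version is identifiable only if at least one of its
    signatures appears in no other version.
    """
    problems = []
    for version, digests in key_md5s.items():
        others = set()
        for other_version, other_digests in key_md5s.items():
            if other_version != version:
                others |= other_digests
        if not (digests - others):
            problems.append(version)
    return problems
```

For example, given {"1.0": {"aaa", "shared"}, "1.1": {"bbb", "shared"}, "1.2": {"shared"}}, only version 1.2 is reported, because its single key file is identical to one found in the other two versions.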

Key files must also not change after being installed or patched because Visual Patch relies on their MD5 signature for validation. Files that are normally modified after they are installed (for example, .ini files) should never be designated as key files. If a key file has been changed in any way from the original file that is referenced in your project, it will prevent the version from being identified.

Each version can contain as many key files as you like. In fact, it’s a good idea to designate additional key files to help ensure a positive identification.

Tip: Good candidates for key files are executables, images, help files, PDF docs, readme files, or anything else that can be used to tell one release from another but isn’t expected to change once it’s installed.

The best key files are files that change from one version to the next, such as the main executable for the software. As a matter of fact, having a main .exe file that is different from one version to another is the perfect example of a key file. It’s a file that must exist (if the .exe isn’t there, the software isn’t properly installed). It’s also usually different from one version to the next, even if all that changes is the version number or the text on the "Help > About" window.

Mission-Critical Files

Although at least one of the key files in a version needs to be unique to that version, they don’t all have to be. You can also use the key file feature to perform validation on mission-critical files. For example, if your software contains a really important file whose existence and integrity must be checked, making it a key file will prevent the patch from proceeding if the file has been removed or modified.

Remember: a version will be identified only if all of its key files are there and they match the originals exactly.

MD5 Fingerprinting

Visual Patch calculates the MD5 fingerprint of each file in order to identify files that are the same and in order to detect whether two versions of a file are different.

The MD5 algorithm is a standard algorithm that is widely used to generate cryptographic signatures. It was developed by Professor Ronald L. Rivest of MIT.

To quote RFC 1321, which describes the MD5 standard:

[The MD5 algorithm] takes as input a message of arbitrary length and produces as output a 128-bit "fingerprint" or "message digest" of the input. It is conjectured that it is computationally infeasible to produce two messages having the same message digest, or to produce any message having a given prespecified target message digest.

An MD5 fingerprint is essentially a large number that is calculated directly from the entire contents of a file. If even a single byte within the file changes, the MD5 fingerprint changes as well. For all practical purposes, no two files will share an MD5 fingerprint by accident unless their internal data matches exactly.
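The effect of a one-byte change is easy to see with any MD5 implementation. The short sketch below uses Python's standard hashlib module; the function name md5_fingerprint and the sample data are made up for the illustration.

```python
import hashlib


def md5_fingerprint(data: bytes) -> str:
    """Return the 128-bit MD5 fingerprint as 32 hexadecimal digits."""
    return hashlib.md5(data).hexdigest()


original = b"Visual Patch key file contents"
modified = b"Visual Patch key file contentz"  # only the last byte differs

same_length = len(original) == len(modified)
fingerprints_differ = md5_fingerprint(original) != md5_fingerprint(modified)
```

The fingerprint is always the same length no matter how large the file is, so comparing two fingerprints is much cheaper than comparing two files byte by byte.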

Unrecognized Files

Once the patch application has detected a valid version, it will update each file in the release on an individual basis. By default, Visual Patch will only update files that it recognizes. If a file that is part of the installed version doesn’t match the original source file in your project, it will not be updated. Visual Patch will skip over the unrecognized file and continue with the rest of the patching process.

This behavior is both a security feature and a requirement:

• If whole-file patching is being used, preventing the file from being installed unless the user has a recognizably valid version is a security feature.

• If binary patching is being used, it is impossible to update the file unless its contents exactly match the original file.

You can override this behavior for individual files by enabling the "Force install" option in your project.
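The default behavior and its override can be summarized in a few lines. This sketch is an interpretation, not product code: the function name is hypothetical, and it assumes that forcing an install means overwriting the unrecognized file with a complete copy of the new one, since a binary difference cannot be applied to a file that doesn't match.

```python
def plan_file_update(installed_md5: str, original_md5: str,
                     force_install: bool = False) -> str:
    """Decide what to do with one file of the installed version."""
    if installed_md5 == original_md5:
        return "update"     # recognized file: patch it normally
    if force_install:
        return "overwrite"  # assumption: replace it with the whole new file
    return "skip"           # unrecognized file: leave it alone and move on
```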

Version Numbering

The whole point of Visual Patch is to make it easier for you and your users to update your software. One part of this process is deciding on the version numbering scheme you will use to identify each release of your software.

Visual Patch allows you complete freedom in naming your versions. (You could call one version "George" and the next version "Henry" if you wanted to.) We recommend using an industry-standard version numbering scheme that will be more readily understood by your users. Whatever you decide on, here are some guidelines you might want to consider.

Give each release a number

Using numbers makes it easy for your users to identify the hierarchy between different versions of your software. If you name one version 1.0.3.2 and the next version 1.0.3.3, it’s readily apparent which version is the newer one.

Make the numbers mean something

Version numbering schemes like "version.revision.sub-revision" are popular because they allow you to make the magnitude of an update readily apparent in the version number itself. Going from 1.5.2 to 1.5.3 normally indicates a small change, like a bug fix. Going from 1.5.3 to 1.6.0 would indicate moderate changes, such as new features being added, or improvements to the program code. Going from 1.6.0 to 2.0.0 would be reserved for sweeping changes, like complete rewrites or a completely new interface design.

Aim for clarity

Try to avoid version numbers that might confuse your users. A prime example is a number like 1.10. Is this version newer or older than 1.9? That depends on the numbering scheme being used. The standard "version.revision" scheme makes 1.10 newer than 1.9, since it marks "the tenth revision of the first version" of the product. But many users mistake the version number for a fraction. Even savvier users who are aware of the "version.revision" standard might wonder whether you were as savvy as they are; enough software has been released using fractional notation to make it a difficult guess. One solution is to use double digits for each part, so that 1.9 becomes 1.09 and the ordering becomes readily apparent.
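The two readings of "1.10" can be demonstrated directly. In this small Python sketch (the function name version_key is hypothetical), comparing the parts as whole numbers follows the "version.revision" convention, while treating the string as a fraction gives the opposite answer.

```python
def version_key(text: str) -> tuple[int, ...]:
    """Split "1.10" into (1, 10) so each part is compared as a whole number."""
    return tuple(int(part) for part in text.split("."))


# "version.revision" reading: the tenth revision comes after the ninth.
newer_by_convention = version_key("1.10") > version_key("1.9")

# Fractional reading: 1.10 is the same as 1.1, which is less than 1.9.
newer_as_fraction = float("1.10") > float("1.9")

releases = ["1.10", "1.2", "1.9"]
by_convention = sorted(releases, key=version_key)
by_plain_text = sorted(releases)
```

Here newer_by_convention is True and newer_as_fraction is False. Sorting with version_key gives 1.2, 1.9, 1.10, whereas a plain text sort puts 1.10 first. Padding each part to two digits, as suggested above, makes all of these orderings agree.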

Don’t rely on file sizes or date stamps

Make it a habit to issue a new version number whenever you release a new version of your software. Don’t expect your users to identify versions based on changes in the file size or date stamp alone. Not all users will be able to determine this information easily, and date stamps may be subject to change as files are downloaded or copied.

Simpler is better (within reason)

Avoid using a numbering scheme that involves long awkward version names like "49823.B345.14231-A." At the same time, avoid overly simple schemes that might limit your ability to release updates often.

Don’t go overboard

Your external version numbers don’t need to reflect the number of compilations your software has been through since the last version. They also don’t need to reflect how many failed attempts there were before a new feature started working. The version numbers your users see should only reflect the changes that are visible to them. Keep external version numbers "tight" between consecutive releases. (Releasing version 2.0.29 right after version 2.0.21 could have your users searching for nonexistent versions like 2.0.28 if they run into any problems with 2.0.29. If you must track recompilations internally, consider keeping your internal version numbers separate from the version numbers that your users will see.)

Be consistent

If your numbering scheme is "MajorUpdate.MinorUpdate.BugFix," don’t start incrementing your "major update" number when you’ve only put out a bug fix.

Avoid unnecessary changes

Once you decide on a numbering scheme, stick to it. Switching from one scheme to another could be confusing for your users, especially if the new version numbers look similar to the old ones. If you really must switch, make sure you take steps to explain the changes to your users.

Tell your users what has changed

It’s always good to tell your users what new features or bug fixes a new version brings. A rich and detailed version history makes a good impression on your users, because it shows how much time and effort you’ve spent improving your product.
