Welcome




WhsDbDataDump - 1.0.0 Build 3 BETA

Posted by Alex at 12:16 AM

Notes:

This fixes a problem in the cluster selection algorithm that can cause incorrect data to be output under certain circumstances.

This build also writes out Alternate Data Streams (http://msdn.microsoft.com/en-us/library/aa364056%28VS.85%29.aspx) in ntfsdump mode. If you’ve ever wondered how Windows knows that you downloaded a file with Internet Explorer, so that it can prompt you for permission every time you run it, this is how. Yes, this was before UAC.
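To make the download marker concrete: Windows records it in an alternate stream named Zone.Identifier, whose content is a small INI-style text block. The sketch below only parses such text. The function name parse_zone_id is made up for illustration and is not part of WhsDbDataDump.

```python
import configparser
from typing import Optional


def parse_zone_id(stream_text: str) -> Optional[int]:
    """Return the ZoneId found in Zone.Identifier stream text, or None if absent."""
    # Interpolation is disabled so that '%' characters in other values
    # (for example URLs) cannot confuse the parser.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(stream_text)
    if parser.has_option("ZoneTransfer", "ZoneId"):
        return parser.getint("ZoneTransfer", "ZoneId")
    return None


# Typical stream content for a file downloaded from the Internet zone.
sample = "[ZoneTransfer]\nZoneId=3\n"
print(parse_zone_id(sample))
```

On an NTFS volume under Windows, the stream text can usually be read by opening the file path with ":Zone.Identifier" appended. Files extracted in ntfsdump mode should carry the same stream if the backed up file had one.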

Changes:
  • Corrected WHS cluster selection algorithm. This would have sometimes caused incorrect cluster data to be written out. Affects both rawdump and ntfsdump modes.
  • NTFS Alternate Data Stream (ADS) support. Alternate streams will now be written in ntfsdump mode if there are any.

Download 1.0.0.3 BETA

WhsDbDataDump - 1.0.0 Build 2 BETA

Posted by Alex at 2:59 PM

Notes:

WhsDbDataDump is now capable of extracting backed up files from a Windows Home Server backup database. It has its own Windows Home Server backup database parser and NTFS implementation built in and does not rely on any third-party libraries or dependencies. It tries to use the database in a fault-tolerant way, so that if something goes wrong it will try to continue rather than break.

This is the first build to have this feature, so consider it experimental.

The NTFS dumper supports sparse files but does not support NTFS Compressed files, NTFS Encrypted files and multiple data streams in this version.

Example Usage:

  • WhsDbDataDump.exe /ntfsdump="c:\WhsBackup"
    • This will read the Windows Home Server backup database files from the current folder and extract all files to c:\WhsBackup.
  • WhsDbDataDump.exe /dbfolder="c:\RecoveredWhsDb" /ntfsdump="c:\RecoveredPictures" /find="*.JP?G"
    • This will open the Windows Home Server backup database located at c:\RecoveredWhsDb and extract all files matching the pattern *.JP?G to c:\RecoveredPictures. The match is case insensitive.

There is also a /findregex parameter that can be used instead of /find for more fine-grained control over what gets extracted. It uses a regular expression (http://www.regular-expressions.info/reference.html) to filter the extracted files. The regular expression is matched against the full path of the folder and file in the backup volume; if it matches, the file is extracted.
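The difference between the two filters can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool's code. It assumes that /find is tested against the file name alone, that "?" stands for exactly one character, and that the regular expression may match anywhere in the path. None of those details are confirmed above, and the helper names are invented.

```python
import fnmatch
import ntpath
import re


def matches_find(full_path: str, pattern: str) -> bool:
    """Case-insensitive wildcard match against the file name (assumed behavior)."""
    name = ntpath.basename(full_path)  # understands backslash separators
    return fnmatch.fnmatchcase(name.lower(), pattern.lower())


def matches_findregex(full_path: str, regex: str) -> bool:
    """Regular expression match against the full folder and file path."""
    return re.search(regex, full_path) is not None


paths = [
    r"Pictures\2009\beach.jpeg",
    r"Pictures\2009\notes.txt",
    r"Documents\scan.JPEG",
]

# Wildcard filter: picks up both JPEG files regardless of folder or case.
print([p for p in paths if matches_find(p, "*.JP?G")])

# Regex filter: can also restrict by folder, here only files under Pictures.
print([p for p in paths if matches_findregex(p, r"^Pictures\\.*\.jpe?g$")])
```

The regular expression form is more verbose, but because it sees the whole path it can select by folder as well as by file name.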

Changes:
  • Added ntfsdump operating mode. Capable of dumping files directly from backed up volumes. Supports sparse files. No support for compressed files, encrypted files or alternate data streams yet.
  • File filtering option (find/findregex) for ntfsdump.

Download 1.0.0.2 BETA

WhsDbDataDump - 1.0.0 Build 1 BETA

Posted by Alex at 10:09 PM

Notes:

WhsDbDataDump is a new utility that is similar to WhsDbDump, but it will extract actual backed up data out of the Windows Home Server Backup database. For this first build it only supports one operating mode which is capable of dumping an entire backed up volume to a raw file. In future versions I will add more operating modes capable of extracting actual files from NTFS backup volumes.

One interesting feature of WhsDbDataDump is that it will try to work with broken databases. In particular, this build supports extracting data from databases with missing dat files. It will simply fill in the missing clusters with a special signature $MISSINGCLUSTER.
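A rough sketch of the idea follows. How the signature is laid out inside a missing cluster is not described here, so repeating it until the cluster is full is purely an assumption, as are the function names and the in-memory dictionary of clusters.

```python
MISSING_SIGNATURE = b"$MISSINGCLUSTER"


def placeholder_cluster(cluster_size: int) -> bytes:
    """Build one cluster-sized block by repeating the signature (assumed layout)."""
    repeats = cluster_size // len(MISSING_SIGNATURE) + 1
    return (MISSING_SIGNATURE * repeats)[:cluster_size]


def assemble_volume(clusters: dict, cluster_count: int, cluster_size: int) -> bytes:
    """Join clusters in order, substituting the placeholder where data is missing."""
    filler = placeholder_cluster(cluster_size)
    return b"".join(clusters.get(i, filler) for i in range(cluster_count))


# Cluster 1 is missing, so its slot is filled with the signature.
volume = assemble_volume({0: b"A" * 32, 2: b"C" * 32}, cluster_count=3, cluster_size=32)
print(volume[32:64])
```

Whatever the real layout is, a recognizable marker keeps every later cluster at its correct offset in the raw file and makes the damaged regions easy to find afterwards.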

Changes:
  • Basic implementation that supports raw data output.
  • Optimized raw cluster output.

Download 1.0.0.1 BETA

WhsDbCheck - 1.0.0 Build 15 BETA

Posted by Alex at 9:34 PM

Notes:

Well here it is, the new build with extended error information. I should warn that this build features sweeping changes in the code and therefore might not be as stable as the last build, but you never know.

So what does extended error information mean? It's really a twofold change:

  1. Errors are now reported with much more detail (with extended ASCII). They all feature extended descriptions telling you exactly what went wrong, and in many cases have relevant data from where the check failed, such as a file offset.
  2. Before this version there were 2 types of errors, an Exception and a check Error. When encountering either the check would abort and both meant that your data is most likely corrupt. Now there is a new type of error, well it's not really an error, it's a Warning. A Warning means that there was an inconsistency found within the database files, but the problem does not directly affect the integrity of your data. Unlike an Error or an Exception, when a Warning is encountered it will be noted on the screen, but the check will continue to run. All Warnings, Errors and Exceptions will be displayed at the end of the check in all their glorious ASCII detail.
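The three outcomes can be sketched as a small check runner. This is an illustration of the behavior described in the list, not WhsDbCheck's code; the class names, the check functions and the messages are all invented.

```python
from dataclasses import dataclass, field


class CheckError(Exception):
    """Hypothetical: raised by a check when data integrity is affected."""


@dataclass
class Report:
    warnings: list = field(default_factory=list)
    errors: list = field(default_factory=list)
    exceptions: list = field(default_factory=list)


def run_checks(checks) -> Report:
    """Run checks in order. Warnings are collected; an Error or Exception aborts."""
    report = Report()
    for name, check in checks:
        try:
            # In this sketch each check returns a list of warning messages.
            for message in check():
                report.warnings.append(f"{name}: {message}")
        except CheckError as exc:
            report.errors.append(f"{name}: {exc}")
            break
        except Exception as exc:  # anything unexpected also stops the run
            report.exceptions.append(f"{name}: {exc!r}")
            break
    return report


ran = []


def header_check():
    ran.append("header")
    return ["unexpected padding at file offset 512"]


def cluster_check():
    ran.append("clusters")
    raise CheckError("stored hash does not match cluster data")


def index_check():
    ran.append("index")
    return []


report = run_checks([("header", header_check),
                     ("clusters", cluster_check),
                     ("index", index_check)])
print(report)
```

Here the header check only produces a Warning, so the run continues; the cluster check raises an Error, so the index check never runs, and everything collected is available for the summary at the end.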

As you can imagine, both of these required a severe overhaul of every check’s logic and were pretty significant changes.

So in summary, WhsDbCheck tries to be as detailed as possible as it performs hundreds of checks on the backup database, depending on the size of your database really. But it will only register an Error if it is sure that the problem detected is causing data loss.

Also, you should be aware that just because WhsDbCheck considers something a Warning that does NOT mean that the Windows Home Server backup engine will be able to open the database. That will depend on how resilient it is to database problems.

So what use is a Warning, you might ask? If you can’t open the data with the Windows Home Server, then shouldn’t that be an Error? Yes, it should. Except that the next utility that I will release is called WhsDbDataDump, which will be able to open up damaged databases and extract backed up files from them. So that is how I decided what to make an Error and what to make a Warning. If WhsDbDataDump will be able to get the original data with 100% integrity, then it’s a Warning, otherwise it’s an Error.

In general, if a database is registering any Warnings or Errors, then it should be considered compromised. But if it’s a Warning, then you did not technically lose any data.

As for WhsDbDataDump availability, it’s in the planning stages now, and I will only give a very rough estimate of months until the first build surfaces.

There will be another build of WhsDbCheck soon adding another crucial feature and maybe fixes to .

Changes:
  • New status presentation.
  • Each check can now have warnings associated with it. A warning will be reported but will not stop the test. All warnings will be re-listed at the end of the run.
  • At the end of the run, all errors/warnings/exceptions will be reported with an extended description indicating what went wrong and advice on how to remedy the situation.

Download 1.0.0.15 BETA

WhsDbCheck - 1.0.0 Build 14 BETA

Posted by Alex at 11:18 PM

Notes:

This release completes the underlying progress reporting changes. Most of the changes are under the hood and invisible, but they give me the ability to tweak things at a finer level. The estimated time should be much better in this release, although I'm sure it's not perfect yet, as I tweaked it by hand.

So at this point the progress reporting message gives the overall test completion percentage, the estimated time remaining until the entire test completes, the current step, the total number of steps, and finally, the current step's progress. Note that the estimated time left will change dynamically with system load. It has a memory of sorts, so past slowdowns will affect the estimate. But the memory is not indefinite: it will completely forget a slowdown after a few minutes. So effectively, it always tries to compute the time remaining based on the recent rate of progress.
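One way to get an estimate with this kind of limited memory is a sliding window over recent progress samples. The sketch below is an assumption about how such an estimator could work, not a description of the actual algorithm; the class name and the 180 second window are made up.

```python
from collections import deque


class EtaEstimator:
    """Estimate time remaining from the rate of progress inside a time window."""

    def __init__(self, window_seconds: float = 180.0):
        self.window_seconds = window_seconds
        self.samples = deque()  # (timestamp in seconds, fraction done 0.0 to 1.0)

    def update(self, timestamp: float, fraction_done: float) -> None:
        self.samples.append((timestamp, fraction_done))
        # Forget samples older than the window, but always keep at least two.
        while (len(self.samples) > 2
               and timestamp - self.samples[0][0] > self.window_seconds):
            self.samples.popleft()

    def remaining_seconds(self):
        if len(self.samples) < 2:
            return None
        (t0, p0), (t1, p1) = self.samples[0], self.samples[-1]
        if t1 <= t0 or p1 <= p0:
            return None  # no measurable progress inside the window
        rate = (p1 - p0) / (t1 - t0)
        return (1.0 - p1) / rate


eta = EtaEstimator(window_seconds=30)
# A slow first stretch (100 seconds for 10%), then steady progress of 1% per second.
for t, done in [(0, 0.0), (100, 0.10), (110, 0.20), (120, 0.30), (130, 0.40), (140, 0.50)]:
    eta.update(t, done)
print(eta.remaining_seconds())
```

Once the slow stretch has dropped out of the window, the estimate depends only on the recent rate, which matches the "forgets after a few minutes" behavior described above.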

Next comes the status / error reporting overhaul. The idea is to have much more meaningful error messages for every check failure. This actually inspired the latest flurry of builds anyways, so it will be good to get that done.

Changes:
  • Better unreferenced file check that checks each individual backup file and is capable of reporting exactly which files are unreferenced/uncommitted.
  • Progress estimation extended to be more fluid and extensible. This also corrects an issue with the last build's progress estimation on multiple Level 4 passes.

Download 1.0.0.14 BETA

WhsDbCheck - 1.0.0 Build 13 BETA

Posted by Alex at 10:39 PM

Notes:

Re-architected the status/progress reporting for better integration into future products. Fixed a bug that was causing cluster checks to pass on incorrect data in some rare cases (Level 2+). Put in an explicit int parse check. Before this, corrupt data would cause an index out of range exception; now it produces a more informative exception. Again, this only happens if data is corrupt. Tiny bump in overall performance.

Be aware that this version is the first to display overall progress, along with an estimated time to complete the whole check. The time can be somewhat inaccurate at this point; I still have some work to do there that didn’t make it into this build. In particular, the time can shift when changing check types. Even though this shift happens, the estimation algorithm is an adaptive one and should catch on after 30 to 60 seconds and re-compute.

Changes:
  • Better status reporting architecture. Now allows for console color output.
  • Added /pause switch.
  • Better progress reporting architecture with overall progress report and less scrolling.
  • Fix for index cache. Can cause cluster checks to pass even though index was out of range. Now, cluster checks will fail correctly.
  • Added Windows Home Server int overflow checks.
  • Tiny bump in performance by optimizing out one loop in int parsing routine.

Download 1.0.0.13 BETA

Rehashing the Hash Cache

Posted by Alex at 10:48 PM

In the last post I explained what the hash cache is and what it’s for. So now let’s talk about the new /keephashcache switch. The hash cache itself is written to a file in a temporary sub-folder every time a level 4 check is performed. This is done once per index file and takes a significant amount of time.

Generally speaking, a level 4 check works in 2 simple steps:

  1. Data is read and hashed.
  2. Data is verified with stored hashes.

Step 1, which takes much longer, creates the hash cache and step 2 uses it. So in order to save the time that it takes to perform the reading and hashing for any subsequent checks, we can save the hash cache and reload it next time. That is, as long as the database has not changed. If it has, you must rehash, or else the hashes will not line up with the actual data and the check will fail.
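The trade-off can be shown with a toy version of the two steps. Everything here is an assumption made for illustration: the use of MD5 (suggested only by the .md5 file extension mentioned later in this post), the 4096 byte cluster size, the JSON cache format and all function and file names.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def hash_clusters(data: bytes, cluster_size: int = 4096) -> list:
    """Step 1: read the data and hash it cluster by cluster (the slow part)."""
    return [hashlib.md5(data[i:i + cluster_size]).hexdigest()
            for i in range(0, len(data), cluster_size)]


def load_or_build_cache(data_file: Path, cache_file: Path) -> list:
    """Reuse a saved hash cache if one exists, otherwise hash and save it."""
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    hashes = hash_clusters(data_file.read_bytes())
    cache_file.write_text(json.dumps(hashes))
    return hashes


with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "volume.dat"
    cache_file = Path(tmp) / "volume.hashcache.json"

    data_file.write_bytes(b"a" * 8192)
    first = load_or_build_cache(data_file, cache_file)   # hashes the data, saves cache

    data_file.write_bytes(b"b" * 8192)                   # the database changes
    stale = load_or_build_cache(data_file, cache_file)   # fast, but reuses old hashes

    fresh = hash_clusters(data_file.read_bytes())
    cache_was_stale = stale != fresh

print(cache_was_stale)
```

The second call returns instantly because it skips the hashing, but its hashes no longer describe the data on disk. That is the situation in which step 2 would fail even though nothing is actually wrong with the database.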

To do this, run WhsDbCheck with the /keephashcache switch. This will create additional files in the backup database folders. Now, if you run WhsDbCheck again, it will pick up the hash cache file and, instead of performing a fresh read and hash, perform a simulated read, which is much faster.

If you want to perform a fresh level 4 check, you will need to delete the Index*.md5 files created by this switch. They will be located in the backup database folder.
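If you would rather script the cleanup than delete the files by hand, something like the following would do. It is a sketch that assumes the files sit directly inside the folder you pass in; the function name and the dry_run parameter are made up, and the folder path in the comment is only an example.

```python
from pathlib import Path


def delete_hash_cache(db_folder: str, dry_run: bool = True) -> list:
    """Find Index*.md5 files directly inside db_folder and optionally delete them.

    Returns the names of the files that were (or would be) removed.
    """
    removed = []
    for path in sorted(Path(db_folder).glob("Index*.md5")):
        if not dry_run:
            path.unlink()
        removed.append(path.name)
    return removed


# Example (hypothetical folder): list what would be deleted, then delete.
# print(delete_hash_cache(r"c:\RecoveredWhsDb"))
# delete_hash_cache(r"c:\RecoveredWhsDb", dry_run=False)
```

The dry run default is there so that the first call only reports what it found.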

All in all, use this with care, because it can create a situation where your level 4 check will fail for seemingly no reason.

Up next, an explanation of the different check levels.