Wednesday, January 25, 2006

Binary Serialization of DataSets in .NET 2.0

Last week I mentioned that I would post some test results on the different options we now have for serializing DataSets: binary (new in 2.0), XML, and a custom serialization surrogate. First, let me direct your attention to an updated surrogate class that I used in the test; it can handle the new DateTimeMode property on columns.
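For readers who have not used the 2.0 switch yet, here is a minimal sketch of how the binary and XML options can be compared by size. It assumes the DataSet is pushed through a BinaryFormatter, as in the remoting setup described in this post, and that the "XML" option corresponds to the default RemotingFormat. The class name, method names, and sample columns are made up for illustration, and the code targets the .NET Framework (BinaryFormatter is obsolete in current .NET).

```csharp
using System;
using System.Data;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

public static class DataSetSizeProbe
{
    // Serializes the DataSet with BinaryFormatter and returns the payload size in bytes.
    public static long SerializedSize(DataSet ds, SerializationFormat format)
    {
        // RemotingFormat is the .NET 2.0 property that selects the XML (default)
        // or binary representation used during remoting/binary serialization.
        ds.RemotingFormat = format;

        using (MemoryStream stream = new MemoryStream())
        {
            new BinaryFormatter().Serialize(stream, ds);
            return stream.Length;
        }
    }

    // Hypothetical sample data; the real test used about 19 columns from a stored proc.
    public static DataSet BuildSample(int rows)
    {
        DataSet ds = new DataSet("Sample");
        DataTable table = ds.Tables.Add("Items");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Created", typeof(DateTime));
        for (int i = 0; i < rows; i++)
        {
            table.Rows.Add(i, "Item " + i, new DateTime(2006, 1, 25));
        }
        return ds;
    }
}
```

Calling SerializedSize once with SerializationFormat.Xml and once with SerializationFormat.Binary on the same DataSet gives the two sizes to compare. The absolute numbers will differ from the ones in this post because the schema is different.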

Now, before you read these numbers, I'd like to be clear about the scenarios I tested. To make things easy on myself, I ran the internet test against one of our current product's test beds, which was already set up with some real test data. There are two servers in this test bed. One is the application server, which hosts remote components in IIS (HttpChannel) that the client accesses using the BinaryFormatter. The second is the database server. So what I measured also includes the time it took to select the records from the database through the components (no business logic, just filling an untyped dataset from a simple select in a stored proc -- about 19 columns with various data types). The servers are in Florida and I work from home in California, where I was running the client over a cable modem. For the local test I ran the exact same code and database, with all tiers on my local development machine, to simulate no network latency.

That said, we don't want to look at the numbers per se; look at the trend -- that is what is important here. There are also a lot of factors that affect serialization over the internet on a cable modem, and you'll notice a few anomalies in the local numbers with small sets of data, probably because my dev machine hiccupped at that moment.

So to recap, this is what we're measuring:

1) The time it takes to make the remote call to activate the component,
2) The time it takes to select the records from the database and create the dataset,
3) In the case of the surrogate class, the time it takes to convert the dataset to a series of byte arrays,
4) The time it takes to transmit the data,
5) In the case of the surrogate class, the time it takes to convert the byte arrays back into a dataset.
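Since all five steps happen inside a single call as seen from the client, one way to capture them together is to time the whole round trip on the client side. The post does not show how the timings were actually taken, so the following is only a sketch of that idea. The delegate stands in for the real remote component call, and all names here are hypothetical.

```csharp
using System;
using System.Data;
using System.Diagnostics;

public static class RoundTripTimer
{
    // Stand-in for the remote component call; the real call is not shown in the post.
    public delegate DataSet FetchDataSet();

    // Times the complete call (activation, query, serialization, transfer,
    // deserialization) and reports how many rows came back in the first table.
    public static double MeasureMilliseconds(FetchDataSet fetch, out int rowCount)
    {
        Stopwatch watch = Stopwatch.StartNew();
        DataSet result = fetch();
        watch.Stop();

        rowCount = result.Tables.Count > 0 ? result.Tables[0].Rows.Count : 0;
        return watch.Elapsed.TotalMilliseconds;
    }
}
```

Running the same measurement several times per row count and looking at the spread helps separate one-off hiccups from the trend.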

My conclusion is that native binary serialization of datasets only beats the surrogate over a long wire once the number of rows is in the thousands (in the internet test, from about 2,000 rows up). Our application is 99% data entry forms, so we would never be returning that much data. The surrogate class does have a slight overhead when the network is fast and uncongested (that is, not the internet), but nothing that the user would notice. Therefore, for now, I'm sticking with the surrogate class for our application. I'll let you know if I change my mind later based on more formal load testing.

Okay here are the performance numbers:

LOCAL TEST (Low Network Latency)

# Records  Surrogate (bytes)  Binary (bytes)  XML (bytes)  Surrogate (ms)  Binary (ms)  XML (ms)
2          13,761             56,198          11,673       15.63           31.25        15.63
10         15,261             57,419          15,754       15.63           31.25        15.63
20         17,133             58,972          21,018       31.25           31.25        15.63
100        32,233             71,626          63,434       31.25           31.25        31.25
500        107,336            134,641         276,350      62.50           46.88        62.50
1000       203,943            216,285         550,832      93.75           78.13        109.38
2000       392,296            374,665         1,087,187    171.88          125.00       250.00
4000       772,451            694,776         2,174,403    343.75          265.63       515.63
8000       1,521,110          1,323,693       4,383,707    734.38          531.25       968.75

INTERNET TEST (High Network Latency)

# Records  Surrogate (bytes)  Binary (bytes)  XML (bytes)  Surrogate (ms)  Binary (ms)  XML (ms)
2          13,761             56,198          11,673       312.50          578.13       312.50
10         15,261             57,419          15,754       328.13          640.63       390.63
20         17,133             58,972          21,018       343.75          655.43       343.75
100        32,233             71,626          63,434       421.88          671.88       593.75
500        107,336            134,641         276,350      906.25          1078.13      1875.00
1000       203,943            216,285         550,832      1484.38         1562.50      3531.25
2000       392,296            374,665         1,087,187    2640.63         2515.63      6750.00
4000       772,451            694,776         2,174,403    5312.50         4500.00      13312.50
8000       1,521,110          1,323,693       4,383,707    9687.50         8406.25      26609.38

Saturday, January 21, 2006


One of the key features of our products is the ability to support multiple deployment scenarios without recompilation, ranging from single-user desktop to small workgroup to client-server to web-based n-tier with the option of NLB. This is all done with configuration files. All of our deployments up to now have been web-based n-tier against SQL-Server 2005. So yesterday the owner of the company asks me to load a single-user demo system on his laptop. No problem, this should be fun and a good test of SQL-Express.

I didn't realize how easy this was going to be. The first thing was to load SQL-Server Express edition on his machine (which also loads .NET Fx 2.0). Since he wanted some demo data, I needed to grab the database on our test DB server. I fired up SSMS, connected to the test server, detached the database, took a copy of the physical MDF file, then reattached the database. I placed the copy of the database file in the same folder as all our application files and then changed the application connection string to:
Data Source=.\SQLEXPRESS;AttachDbFilename=C:\MyAppFolder\DBNAME_Data.mdf;Integrated Security=True;User Instance=True
This sets up a single-user, file-based connection string. That's it! You don't have to perform an attach or run any SQL scripts if you just want single-user, file-based access. Wow. I love it! This is a REALLY REALLY REALLY nice feature of SQL-Server. This same database file can be attached to a SQL-Server 2005 database server as the scalability needs of our customers change. Combined with our own application deployment options, this lets our product provide solutions for a very broad customer base. Good job Microsoft!
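If the path to the database file is only known at install time, the same connection string can be assembled in code rather than typed by hand. This is a small sketch using SqlConnectionStringBuilder from System.Data.SqlClient. The class and method names are hypothetical, and the path is whatever folder the application was deployed to.

```csharp
using System.Data.SqlClient;

public static class SingleUserConnection
{
    // Builds the file-based, single-user connection string shown above.
    // mdfPath is the copy of the database file that ships with the application.
    public static string Build(string mdfPath)
    {
        SqlConnectionStringBuilder builder = new SqlConnectionStringBuilder();
        builder.DataSource = @".\SQLEXPRESS";   // local SQL Express instance
        builder.AttachDBFilename = mdfPath;     // attach the file on connect
        builder.IntegratedSecurity = true;      // Windows authentication
        builder.UserInstance = true;            // per-user instance
        return builder.ConnectionString;
    }
}
```

Passing the result of Build(@"C:\MyAppFolder\DBNAME_Data.mdf") to a SqlConnection and calling Open() should then attach the file on first use, provided SQL Express is installed on the machine.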

Thursday, January 19, 2006

Serializing data across time zones in .NET 2.0

This week I've been implementing support in our framework for the new DateTimeMode setting on the DataColumn class. This property is used on DateTime columns to determine a couple of things: 1) how dates should be stored in the column and 2) how a date should be serialized when marshalling across time zones. Setting the DateTimeMode to Local or Utc will not only affect serialization but will also affect how the data is stored in the column. Setting it to Unspecified or UnspecifiedLocal does not affect the storage, just the serialization. In .NET 1.x the dataset would always serialize dates as UnspecifiedLocal. This means it would apply the appropriate local time offset when it deserialized. So if you entered the value 12/1/2005 00:00:00 in Florida and passed that dataset to California, it would show up as 11/30/2005 21:00:00. This can be a problem depending on the meaning of the datetime value. If you are, say, storing this in a database and/or not using the time part, then when you display the date part to the user you have a problem: the date is now a day off.
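The Florida-to-California example is plain offset arithmetic, and it can be reproduced without a DataSet at all. The sketch below hard-codes the winter offsets (Eastern UTC-5, Pacific UTC-8) and uses DateTimeOffset, which arrived after .NET 2.0. It only illustrates the shift itself, not what the DataSet does internally, and the class and method names are invented for the example.

```csharp
using System;

public static class TimeZoneShiftDemo
{
    // Fixed winter offsets, hard-coded for illustration only.
    static readonly TimeSpan Eastern = TimeSpan.FromHours(-5);
    static readonly TimeSpan Pacific = TimeSpan.FromHours(-8);

    // Returns the wall-clock value a reader in California sees when a value
    // entered in Florida is adjusted by the time zone difference.
    public static DateTime AsSeenInCalifornia(DateTime enteredInFlorida)
    {
        // Treat the input as a plain wall-clock value with no time zone attached.
        DateTime wallClock = DateTime.SpecifyKind(enteredInFlorida, DateTimeKind.Unspecified);
        DateTimeOffset moment = new DateTimeOffset(wallClock, Eastern);
        return moment.ToOffset(Pacific).DateTime;
    }
}
```

For new DateTime(2005, 12, 1) this returns 11/30/2005 21:00:00, three hours earlier and on the previous calendar day, which is exactly why a date-only value ends up wrong.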

One way to solve this in 1.x is to use a surrogate class, which also provides the added ability to serialize the data as true binary. When wrapping the dates up in the surrogate before serialization, you can have it not apply the offset to the DateTime columns. However, this approach works across the entire set of columns in the dataset; there's no granularity.
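To make the surrogate idea concrete, here is a deliberately simplified sketch. It is not the surrogate class referred to in these posts. It only shows the general shape: copy a table into plain serializable arrays before sending, and rebuild the table on the other side. The type and member names are hypothetical, and the sketch ignores row state, deleted rows, constraints, relations, and DateTimeMode.

```csharp
using System;
using System.Data;

// Hypothetical, simplified stand-in for a surrogate: one object[] per column.
[Serializable]
public class TableSurrogate
{
    public string TableName;
    public string[] ColumnNames;
    public Type[] ColumnTypes;
    public object[][] ColumnData;   // indexed as ColumnData[column][row]

    // Copies schema and values out of the DataTable into plain arrays.
    public static TableSurrogate FromTable(DataTable table)
    {
        int cols = table.Columns.Count;
        int rows = table.Rows.Count;
        TableSurrogate s = new TableSurrogate();
        s.TableName = table.TableName;
        s.ColumnNames = new string[cols];
        s.ColumnTypes = new Type[cols];
        s.ColumnData = new object[cols][];
        for (int c = 0; c < cols; c++)
        {
            s.ColumnNames[c] = table.Columns[c].ColumnName;
            s.ColumnTypes[c] = table.Columns[c].DataType;
            s.ColumnData[c] = new object[rows];
            for (int r = 0; r < rows; r++)
            {
                s.ColumnData[c][r] = table.Rows[r][c];
            }
        }
        return s;
    }

    // Rebuilds a DataTable from the arrays on the receiving side.
    public DataTable ToTable()
    {
        DataTable table = new DataTable(TableName);
        for (int c = 0; c < ColumnNames.Length; c++)
        {
            table.Columns.Add(ColumnNames[c], ColumnTypes[c]);
        }
        int rows = ColumnData.Length > 0 ? ColumnData[0].Length : 0;
        for (int r = 0; r < rows; r++)
        {
            DataRow row = table.NewRow();
            for (int c = 0; c < ColumnNames.Length; c++)
            {
                row[c] = ColumnData[c][r];
            }
            table.Rows.Add(row);
        }
        return table;
    }
}
```

The surrogate object, rather than the DataSet, is what gets handed to the formatter. Because the conversion code is yours, it is also the place where date handling can be decided, although in a sketch like this the decision would apply to every column alike.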

So here comes the DateTimeMode in .NET 2.0. You can set each individual column to the mode you want and the serialization will behave exactly how you want. Great. So let's start designing our typed datasets, and you'll notice a DateTimeMode property in the property sheet in the designer. Set that to the mode you want and regenerate your typed datasets... right? WRONG. There's a bug in the designer (and XSD.EXE) where it refuses to emit the DateTimeMode in the generated code even though it's properly declared in the xsd file. Well that totally sucks. For more info see the bug report.

So the workaround is that you have to set the DateTimeMode at runtime before serializing. Also be careful if you're merging an untyped dataset into a typed dataset, because the DateTimeMode on the typed dataset will prevail in that case. (See the documentation on compatible merges.)
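A runtime workaround along these lines could look like the sketch below: walk every table and set the mode on each DateTime column. The helper name is made up, and applying one mode to all columns is an assumption made for brevity; the point of DateTimeMode is that you can choose per column. As far as I know, switching a column to Local or Utc is only permitted while its table has no rows, so a helper like this is best called right after the dataset is created and before it is filled.

```csharp
using System;
using System.Data;

public static class DateTimeModeWorkaround
{
    // Sets the given mode on every DateTime column and returns how many were changed.
    public static int Apply(DataSet ds, DataSetDateTime mode)
    {
        int changed = 0;
        foreach (DataTable table in ds.Tables)
        {
            foreach (DataColumn column in table.Columns)
            {
                // DateTimeMode is only meaningful on DateTime columns.
                if (column.DataType == typeof(DateTime))
                {
                    column.DateTimeMode = mode;
                    changed++;
                }
            }
        }
        return changed;
    }
}
```

For date-only values such as the 12/1/2005 example, DataSetDateTime.Unspecified is the mode that keeps the value from being shifted during serialization.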

So with all the workaround code I now have running I turn my attention to the new binary serialization support in the 2.0 DataSet. My initial performance numbers are NOT impressive at all. In fact, the Xml serialization is much faster for small sets of data. This is in drastic contrast to the surrogate class I was using in 1.x. The surrogate class performs better in all cases. I'm updating the surrogate to include the new DateTimeMode support and once I have it all tested I'll post a new GotDotNet sample. I'll also post some more formal numbers once I finish my tests.

Thursday, January 12, 2006

Happy New Year and all that jazz....


Okay I know it's a little late. Sorry I haven't posted in a long while but lots of things were going on for me during the holidays.

For one, I was heads down for the last month upgrading our framework to take advantage of .NET 2.0 and Visual Studio 2005. Things like generics, custom events, ADO.NET enhancements, new WinForms data binding stuff to mention a few. I've also been playing around with the Visual Studio Team System Tester load test tools. Love it.

Also, I was busy in December prepping for my first Christmas dinner. Thank god for my sister. She's a Martha Stewart Wonder Woman. I couldn't have done it without her. She's a scientist (a real one, not computers) and she had the entire Christmas prep and dinner planned out in an Excel spreadsheet. The roast leaving the oven was T-minus zero and everything before the roast was negative minutes and everything after the roast was positive numbers. Of course, at time of execution we ended up drinking too much champagne and the whole planning spreadsheet went out the window and the "Force" took over. We did great though; even Nona said it was a wonderful dinner. I love cooking but I need to watch more Emeril.

Finally, Alan and I threw a NYE party and it was off the hook (do people say that anymore?). Anyways, it was a lot of fun. All the neighbors came and we all got drunk on ice wine martinis my sister was shaking. Some got a little too drunk. Luckily I'm a beer drinker and maintained my cool (and my dinner unlike some people who used my bathroom). Only two broken glasses and no one in the hospital so it was a success.

Resolutions this year? Blog more, read more, get an Xbox 360 and set up media center, get new furnaces, and find a landscaper before the weeds outside grow to "Land of the Lost" size this spring.