Friday, December 17, 2004

My thoughts on Datasets...

I know a lot of people have been discussing the Dataset vs. O/R custom entity issue. I got involved in a similar topic on the Universal Thread today and thought I'd post my thoughts to this blog as well.

You should definitely check out O/R tools and the object approach to creating entities. However, taking a data-based approach to your .NET applications is not a wrong way to go about things, depending on your skill set and the type of application you are writing. In fact, you can end up with a very successful distributed entity-based application using datasets. If you prefer writing SQL statements to define your entities (views on your data), and are comfortable with having the complex rules and validations in separate objects, then the entity-based approach using datasets may be the best choice for you. For instance, if you are familiar with building FoxPro applications, this will probably be the easiest approach.

Using a pure O/R mapper means that you're taking a totally OOP approach and will create custom entities (instead of using the dataset as the entity). As many people point out, there is a compelling reason to do so: you can encapsulate your business logic right into the entity. Also, it may be easier to exchange your entity with other non-.NET systems. However, there are MANY compelling reasons to use datasets directly, especially if you're sending them to Winforms clients:

1. Databinding is much easier with the Dataset. This is especially true for Winforms apps where you have 2-way data binding. If you create custom objects you will need to implement a handful of interfaces to get it to work: IBindingList, IList, IListSource, etc. I'm not saying you can't do it manually, but there's some work involved.

2. Filtering, sorting, and views are all made very easy with the Dataset using the DataView.

3. Complex relationships between datatables can be easily created and managed and referential integrity can be enforced without having to go back to the database. Navigating from a parent row to the child rows in a related table is a snap.

4. AutoIncrementing columns automatically increment the keys when records are added.

5. Easy data manipulation and persistence. This is huge. The dataset takes care of remembering current and original values for you. It handles row state very well. If you add a record and then delete it again inside the dataset, no changes are sent to the database. It persists relations, constraints, errors (row and column), calculated columns (expression-based), and has an extended properties collection in which you can put any additional information you need. The dataset also lends itself well to dynamic columns because on a Merge you can specify (with MissingSchemaAction.Add) that any additional columns it finds are added automatically. This is very powerful. Sure, you can code all of this into your own entity object, but you need to ask yourself: is it worth it?

6. XML integration/serialization is a snap with the ReadXml/WriteXml methods.

7. Simple data validation is built in: AllowDBNull, MaxLength, referential integrity, uniqueness, data type. It also has an event model so that you can capture row/column changes. And with row and column errors (the RowError property and SetColumnError) you can easily indicate which rows/columns have problems and display them by databinding with the ErrorProvider. Complex validation or validator objects running on the middle tier can simply set the row and/or column errors and send the dataset back to the client for resolution.

8. Strongly typed datasets are very easy to generate from an XSD file. That XSD, which contains all of the schema information for the entity, can be created dynamically by calling your middle-tier interfaces that return the Datasets (just temporarily set the DataAdapter's MissingSchemaAction setting to AddWithKey while you're generating them). Then you can run the xsd.exe utility to create the strongly typed dataset code.
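To make a few of these points concrete, here is a minimal C# sketch of points 2 through 5 using an untyped DataSet. The Customers/Orders schema, the class name, and the helper methods are all made up for illustration; only the System.Data calls themselves come from ADO.NET.

```csharp
using System;
using System.Data;

public static class DataSetSketch
{
    // Hypothetical Customers/Orders schema; every name here is illustrative.
    public static DataSet BuildSales()
    {
        var ds = new DataSet("Sales");

        DataTable customers = ds.Tables.Add("Customers");
        DataColumn custId = customers.Columns.Add("CustomerId", typeof(int));
        custId.AutoIncrement = true;   // point 4: keys are assigned as rows are created
        custId.AutoIncrementSeed = 1;
        customers.Columns.Add("Name", typeof(string));
        customers.PrimaryKey = new[] { custId };

        DataTable orders = ds.Tables.Add("Orders");
        DataColumn orderId = orders.Columns.Add("OrderId", typeof(int));
        orderId.AutoIncrement = true;
        orderId.AutoIncrementSeed = 1;
        DataColumn orderCust = orders.Columns.Add("CustomerId", typeof(int));
        orders.Columns.Add("Total", typeof(decimal));

        // Point 3: adding the relation also creates a foreign key constraint by default.
        ds.Relations.Add("CustomerOrders", custId, orderCust);
        return ds;
    }

    public static DataRow AddCustomer(DataSet ds, string name)
    {
        DataTable table = ds.Tables["Customers"];
        DataRow row = table.NewRow();
        row["Name"] = name;
        table.Rows.Add(row);
        return row;
    }

    public static DataRow AddOrder(DataSet ds, DataRow customer, decimal total)
    {
        DataTable table = ds.Tables["Orders"];
        DataRow row = table.NewRow();
        row["CustomerId"] = customer["CustomerId"];
        row["Total"] = total;
        table.Rows.Add(row);
        return row;
    }

    public static void Main()
    {
        DataSet ds = BuildSales();
        DataRow alice = AddCustomer(ds, "Alice");
        AddOrder(ds, alice, 250m);
        AddOrder(ds, alice, 40m);
        ds.AcceptChanges(); // pretend these rows were just loaded from the database

        // Point 2: filter and sort without touching the database.
        var bigOrders = new DataView(ds.Tables["Orders"], "Total > 100",
                                     "Total DESC", DataViewRowState.CurrentRows);
        Console.WriteLine("Orders over 100: " + bigOrders.Count);

        // Point 3: navigate from the parent row to its child rows.
        Console.WriteLine("Alice's orders: " + alice.GetChildRows("CustomerOrders").Length);

        // Point 5: the row remembers its original value and its state.
        alice["Name"] = "Alice Smith";
        Console.WriteLine(alice.RowState + ": " +
                          alice["Name", DataRowVersion.Original] + " -> " + alice["Name"]);
        alice.RejectChanges();

        // Point 5: an added row that is deleted again leaves nothing to send.
        DataRow temp = AddCustomer(ds, "Temp");
        temp.Delete();
        Console.WriteLine("Pending changes: " + ds.HasChanges());
    }
}
```

With constraints enforced (the default), trying to add an order whose CustomerId has no matching customer row throws an InvalidConstraintException, which is the "referential integrity without going back to the database" part of point 3.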

Keep in mind that datasets are NOT business objects in the traditional sense; they are simply the business data. Data is separated from behavior. If you think about data as being something that passes through your tiers and is validated, manipulated, twisted and banged into or by other objects (or other pieces of data), then choosing to use datasets is the right way to go. However, if you are more comfortable with the data as a real business object which encapsulates its own rules and behavior, then use an O/R mapper and create custom business objects. But remember, if you are writing a complex Windows Forms business application, then datasets "just work".
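Here is a rough C# sketch of what keeping the rules in a separate object can look like. OrderValidator, the Orders table, and the rule itself are hypothetical and only meant to show the shape: the validator walks the data and records problems as row and column errors, and the data carries those errors back to whoever asked.

```csharp
using System;
using System.Data;

// Hypothetical middle-tier validator: the rules live here, not in the data.
public static class OrderValidator
{
    // Returns true when no row in the table has errors after validation.
    public static bool Validate(DataTable orders)
    {
        foreach (DataRow row in orders.Rows)
        {
            if (row.RowState == DataRowState.Deleted) continue;
            row.ClearErrors();

            // Illustrative rule: an order total must be present and not negative.
            if (row.IsNull("Total") || (decimal)row["Total"] < 0m)
            {
                row.SetColumnError("Total", "Total must be zero or greater.");
                row.RowError = "This order has invalid values.";
            }
        }
        return !orders.HasErrors;
    }

    public static void Main()
    {
        var orders = new DataTable("Orders");
        orders.Columns.Add("OrderId", typeof(int));
        orders.Columns.Add("Total", typeof(decimal));
        orders.Rows.Add(1, 250m);
        orders.Rows.Add(2, -5m);

        Console.WriteLine("Valid: " + Validate(orders));
        foreach (DataRow bad in orders.GetErrors())
            Console.WriteLine(bad["OrderId"] + ": " + bad.GetColumnError("Total"));
    }
}
```

On a Windows Forms client, an ErrorProvider bound to the same data can then display these errors next to the offending controls without any extra plumbing.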

Of course, some O/R mappers work well with the entity-based approach too. I suggest you read this excellent post by Frans Bouma which may help you figure out what approach you should take.


Rick Strahl said...

I've been down this line of thinking many times, and to this day I have not made up my mind which way I prefer. On paper I would much prefer entities over DataSets merely because they provide the ability to keep data and logic together, and it's a more OO-based approach which is much more in line with how I work. It just feels more logical to have an object that can be referenced directly rather than indirectly referencing DataSet rows. However, none of the OR frameworks I’ve looked at and played with make working with data as familiar and relatively logical as using DataSets. I use DataSets in my current business framework at the moment, and although I’m not 100% happy with this implementation it works well for the applications I built which tend to be medium sized and include lots of .NET based distributed. My implementation wraps DataSets internally and is more of a hybrid with lightweight entities that access the DataSet similar to a typed Dataset but without all the overhead of typed datasets. Because of this mixed interface there are some ‘programmatic inconsistencies’, but from a functional level it works extremely well because it essentially provides the best of both worlds.

Tough decision, but it seems to me either Microsoft needs to provide something in this field or the tools need to mature a bit before OR Mappers really take off.

Beth Massi said...

Rick, great comment. I have seen some frameworks out there that encapsulate the business rules AND the dataset into a sort of hybrid "business object" and they seem to work quite well. I, on the other hand, use a lot of code generation and generate all my typed datasets directly from my middle tier, so I chose to implement rules and validation in separate reusable objects. Plus, using datasets is much more in line with how I think, coming from a Fox background.

Anonymous said...

There's an interesting and long thread on this issue in


Andres Aguiar