Last year, John Bristowe at Microsoft Canada asked several of us to present some pre-written talks at Tech Days in Vancouver. I was interested in building up my presentation chops anyway, so I agreed.

Looking over the topics he needed presenters for, the majority involved Silverlight or IE8 — stuff concerned with the presentation layer. One, however, was for ADO.NET Data Services, and since I work in Habanero's BI practice, and get involved a lot with databases and data warehousing, I thought that this might be the best fit for me.

When I volunteered, I had never even heard of this technology, but it sounded like it had something to do with data, which I thought was pretty cool. They had given us a recording of the original presenter, and the first time I watched it (in the middle of the night on a Lufthansa flight over the Atlantic) I was hooked.

It turns out that the technology really doesn't have an application in the BI sphere, although like many Microsoft technologies, it could be made to do so. Basically, they give you a lightweight way to communicate with a backend database from an app without having to write either web services to get the data or your own intermediate data access layer. They expose individual entities and collections of entities (which would typically map to a row of a table or multiple rows of a table) via individual URIs, and automate a whole bunch of plumbing that you would have to write yourself, like filtering and paging. 

In order to take advantage of it, you have to map your database tables (and views, if you want to) into logical entities. The easiest way to do that is to use Microsoft's new Entity Framework technology, which can expose a physical database as a logical set of entities. In the future, there will be other techniques for doing this (most likely plug-ins from third parties). For example, suppose your database had a Customer table, and you've defined that as a "Customer" entity. You have a Silverlight app that starts off by displaying all customers in the table, ordered by Customer Name, and paged so that it shows 20 customers per page. A typical URI to get this would be:                

(Get) <a href="http://<dataserveraddress originalattribute=/">/Customer?$orderby=CustomerName&$top=20&$skip=0">http://<DataServerAddress>/Customer?$orderby=CustomerName&$top=20&$skip=0</a> 

The "Customer" says "give me the collection of customers". The $orderby=CustomerName defines the sort order. $top=20 means "give me back 20 customers", and $skip=0 means "start returning customers from the start". This would return a document like this (I've removed a bunch of the namespace stuff so it looks more concise):

<?xml version="1.0" encoding="utf-8" standalone="yes" ?>

<feed xml:base="<a href="http://DataServerAddress/bikes.svc/">http://DataServerAddress/bikes.svc/</a>">
  <title type="text">Customer</title>

  <id>http://DataServerAddress/bikes.svc/Customer</id>

  <updated>2009-03-12T18:16:35Z</updated>

  <link rel="self" title="Customer" href="Customer" />

  <entry>

    <id>http://DataServerAddress/bikes.svc/Customer(817)</id>

    <title type="text" />

    <updated>2009-03-12T18:16:35Z</updated>

    <author/>

    <link rel="edit" title="Customer" href="Customer(817)" />

    <link rel="<a href="http://schemas.microsoft.com/ado/2007/08/dataservices/related/SalesOrder"> http://schemas.microsoft.com/ado/2007/08/dataservices/related/SalesOrder</a>" 
    type="application/atom+xml;type=feed" title="SalesOrder" href="Customer(817)/SalesOrder" />

    <category term="BikesModel.Customer"/>

    <content type="application/xml">

      <m:properties>

        <d:CustomerID m:type="Edm.Int32">817</d:CustomerID>

        <d:CompanyName />

        <d:Phone />

      </m:properties>

    </content>

  </entry>

  <entry>

    <id>http://DataServerAddress/bikes.svc/Customer(1)</id>

    <title type="text" />

    <updated>2009-03-12T18:16:35Z</updated>

    <author/>

    <link rel="edit" title="Customer" href="Customer(1)" />

    <link rel="<a href="http://schemas.microsoft.com/ado/2007/08/dataservices/related/SalesOrder"> http://schemas.microsoft.com/ado/2007/08/dataservices/related/SalesOrder</a>" 
    type="application/atom+xml;type=feed" title="SalesOrder" href="Customer(1)/SalesOrder" />

    <category term="BikesModel.Customer"/>

    <content type="application/xml">

      <m:properties>

        <d:CustomerID m:type="Edm.Int32">1</d:CustomerID>

        <d:CompanyName>A Bike Store</d:CompanyName>

        <d:Phone>245-555-0173</d:Phone>

      </m:properties>

    </content>

  </entry>

</feed>

In this example, "bikes.svc" is the name of the service. The example I pulled this from is (not surprisingly for Microsoft!) a bikes manufacturing business. Now suppose you display the customer list with a link so that you can click on an individual customer name, and the app will return the details for that customer (based on the customer id). Suppose the user clicks on "ABC Company", which has a customer id of "1". The app would issue a request like this:

(Get) <a href="http://<dataserveraddress originalattribute=/">/Customer(1">http://<DataServerAddress>/Customer(1</a>)

This would return a document like this:

<?xml version="1.0" encoding="utf-8" standalone="yes" ?>

<entry xml:base="<a href="http://DataServerAddress/bikes.svc/">http://DataServerAddress/bikes.svc/</a>"> 
  <id>http://DataServerAddress/bikes.svc/Customer(1)</id>

  <title type="text" />

  <updated>2009-03-12T18:22:30Z</updated>

  <author>

    <name />

  </author>

  <link rel="edit" title="Customer" href="Customer(1)" />

  <link rel="<a href="http://schemas.microsoft.com/ado/2007/08/dataservices/related/SalesOrder"> http://schemas.microsoft.com/ado/2007/08/dataservices/related/SalesOrder</a>" 
    type="application/atom+xml;type=feed" title="SalesOrder" href="Customer(1)/SalesOrder" />

  <category term="BikesModel.Customer"/>

  <content type="application/xml">

    <m:properties>

      <d:CustomerID m:type="Edm.Int32">1</d:CustomerID>

      <d:CompanyName>A Bike Store</d:CompanyName>

      <d:Phone>245-555-0173</d:Phone>

    </m:properties>

  </content>

</entry>

One of the questions that I had immediately was, "great, so when would you use this?" Microsoft is great at releasing all kinds of new things that are often similar or overlap. What they often don't do very well, however, is help you decide which of a plethora of tools is better at solving a particular problem. A second observation is that the data services are entity-oriented (naturally!) 

What does this mean for a BI-type application? Suppose you want to return calculated metrics about customers (and not customer information directly) — maybe, "average sale amount for all customers in a given geographic area over the last 3 quarters). In this case, a customer-based data service isn't the best choice. There are a couple of ways around this. You could expose some stored procedures (ADO.NET Data Services allows this via "service operations") that do the calculations on the fly (at a potentially lethal performance hit). A more obvious way would be to pre-calculate the metrics (maybe in a scheduled nightly process) and then define a different set of entities — maybe "AverageSaleAmountByAreaThreeQuarters" — that could then be returned very simply by the data service, because then you would just be accessing rows in a table, or (in Data Services terminology) entities and collections.

One potential downside to using Data Services in this way is that (currently) they are not dynamic in that adding or reconfiguring an entity requires a change in the Entity Framework mapping. If you suddenly had a user request to show "total" sales amount by area over three quarters, and you didn't already have that in your data model, you would have to add a table for it, add the process to calculate it, add the mapping to the Entity Framework model, and redeploy the data service, and update the client app to use it. You're likely not going to be able to turn around client requests very quickly this way.

Another thing to remember is that Data Services are meant to be lightweight. That means that the client app typically makes LOTS of calls, and exchanges relatively little data each time. Since Data Services were really architected for web applications, they are meant to be consumed by highly-interactive pages, rather than big monolithic pages that get all their data up front and take forever to load. So if your app doesn't fit that model, then Data Services probably aren't for you.

One thing to note is that "a data service is not a database". When the beta version of ADO.NET Data Services was released, a bunch of people wrote in with database-specific questions like "what if I want to create an index on the fly?" or "What if I want to give the database a hint in the query?" The answer is "that's not what data services are for". Remember that data services are logical — and are intended to insulate your client application from the physical details in the database. If your app needs to do specific things against your database, then a generic data service isn't the way to go. You should probably consider writing a Web service with your own data access code, and you could do exactly what you wanted to in your SQL query.

Share