A little about Linq

Linq is a creation of Microsoft, designed to solve a number of problems.  Part of the idea of Linq is to abstract queries from the type of data source, so that a query is written the same way wherever the data comes from.

So, for example, if we have a query

From c In Customers Where c.Tel <> ""

We don’t actually know what Customers is – it could be from a database, XML file, a set of objects in memory, from a web service, a text file, a third party API or any number of different sources.  The point is that we can write that query without worrying about that, and if we decide later to change what we are querying, or to use that same query for multiple sources, it should work just as well.
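As a small illustration of that point, here is a minimal sketch in which the same query is run against two different sources.  The Customer class, the WithTel helper and the sample data are all made up for the example, and both sources here are in memory; a database-backed source would need a provider such as Linq to SQL, but the query itself would read the same.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical type, invented for this example.
Public Class Customer
    Public Property Name As String
    Public Property Tel As String
End Class

Module SameQueryDifferentSources
    ' Hypothetical helper: the query only knows it has "some sequence of customers".
    Function WithTel(source As IEnumerable(Of Customer)) As IEnumerable(Of Customer)
        Return From c In source Where c.Tel <> ""
    End Function

    Sub Main()
        Dim fromArray = {New Customer With {.Name = "Ann", .Tel = "555-0100"},
                         New Customer With {.Name = "Bob", .Tel = ""}}
        Dim fromList As New List(Of Customer)(fromArray)

        ' The same query runs unchanged against an array and a List.
        Console.WriteLine(WithTel(fromArray).Count())   ' 1
        Console.WriteLine(WithTel(fromList).Count())    ' 1
    End Sub
End Module
```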

So how does that work then?  When you write a Linq query against a provider such as Linq to SQL, the query is built into an expression tree.  Expressions are the basic building blocks of the query.  An expression may be something like c.Tel <> "", which would be broken down into…

  1. Access a property on the parameter c called Tel
  2. A constant, which is an empty string
  3. Take 1. and 2. (above) and use as the 2 sides of a binary operator, where the operator is not equals
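The three steps above can be sketched by building the same tree by hand with the System.Linq.Expressions API.  This is only an illustration: the Customer class and the BuildTelNotEmpty name are made up for the example, and the tree the VB compiler generates for a string comparison may differ in detail from this hand-built one.

```vb
Imports System
Imports System.Linq.Expressions

' Hypothetical type, invented for this example.
Public Class Customer
    Public Property Tel As String
End Class

Module ExpressionTreeByHand
    ' Builds a tree equivalent to: Function(c As Customer) c.Tel <> ""
    Function BuildTelNotEmpty() As Expression(Of Func(Of Customer, Boolean))
        Dim c = Expression.Parameter(GetType(Customer), "c")
        ' 1. Access the property Tel on the parameter c
        Dim tel = Expression.Property(c, "Tel")
        ' 2. A constant, which is an empty string
        Dim empty = Expression.Constant("")
        ' 3. Use 1. and 2. as the two sides of a "not equals" binary operator
        Dim notEqual = Expression.NotEqual(tel, empty)
        Return Expression.Lambda(Of Func(Of Customer, Boolean))(notEqual, c)
    End Function

    Sub Main()
        Dim tree = BuildTelNotEmpty()
        Console.WriteLine(tree.Body.NodeType.ToString())   ' NotEqual

        ' A tree is just data until it is compiled (or translated by a provider).
        Dim predicate = tree.Compile()
        Console.WriteLine(predicate(New Customer With {.Tel = "555-0100"}))   ' True
        Console.WriteLine(predicate(New Customer With {.Tel = ""}))           ' False
    End Sub
End Module
```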

What then happens is that when the query is executed, the provider for the particular type of data source takes those expressions and converts them into a form that it can execute (or possibly executes them directly).  For example, the Linq to SQL provider will, roughly speaking, take the parameter c and convert it into the string “[c]”.  The property access on it then gets tacked on the end as “.[Tel]”, giving “[c].[Tel]”.  The empty string constant becomes another string, '' (that’s two single quotes).  The not equals operator becomes “<>” with the above stuck on either side, giving “[c].[Tel] <> ''”.  The rest of the query is compiled into SQL in the same way and then run to get the results.  The results are then translated back into appropriate objects.
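To make the translation step more concrete, here is a toy translator that walks a tree and produces the SQL fragment described above.  It is a sketch only, not how Linq to SQL is actually implemented: the ToSql function and Customer class are hypothetical, it handles just the four node types used in this example, and a real provider is considerably more involved (for instance, it generally passes constant values as SQL parameters rather than inlining them).

```vb
Imports System
Imports System.Linq.Expressions

' Hypothetical type, invented for this example.
Public Class Customer
    Public Property Tel As String
End Class

Module ToySqlTranslator
    ' Hypothetical helper: turns a tiny subset of expression nodes into SQL text.
    Function ToSql(e As Expression) As String
        Select Case e.NodeType
            Case ExpressionType.Parameter
                Return "[" & DirectCast(e, ParameterExpression).Name & "]"
            Case ExpressionType.MemberAccess
                Dim m = DirectCast(e, MemberExpression)
                Return ToSql(m.Expression) & ".[" & m.Member.Name & "]"
            Case ExpressionType.Constant
                ' Assumes a string constant; embedded single quotes are doubled.
                Dim value = DirectCast(e, ConstantExpression).Value
                Return "'" & CStr(value).Replace("'", "''") & "'"
            Case ExpressionType.NotEqual
                Dim b = DirectCast(e, BinaryExpression)
                Return ToSql(b.Left) & " <> " & ToSql(b.Right)
            Case Else
                Throw New NotSupportedException(e.NodeType.ToString())
        End Select
    End Function

    Sub Main()
        ' Hand-built tree for c.Tel <> ""
        Dim c = Expression.Parameter(GetType(Customer), "c")
        Dim condition = Expression.NotEqual(Expression.Property(c, "Tel"),
                                            Expression.Constant(""))
        Console.WriteLine(ToSql(condition))   ' [c].[Tel] <> ''
    End Sub
End Module
```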

Now, if the same query is run on a set of objects in memory, no translation is needed.  The condition ends up as an ordinary lambda function (for in-memory collections the compiler can generate this directly rather than going through an expression tree), which would look like Function(c As Customer) c.Tel <> "".  Running the query is then a loop with some code a bit like the following (this is simplified a little)…

' The Where condition as a lambda function
Dim WhereFunc = Function(c As Customer) c.Tel <> ""
Dim ret As New List(Of Customer)
For Each c In Customers
    ' Keep only the customers that pass the condition
    If WhereFunc(c) Then ret.Add(c)
Next c
Return ret

As I said, this is a little over-simplified.  In actual fact, Linq queries are not executed until you explicitly try to access the results.  This typically happens either when you call one of the extension methods such as ToArray, ToList, Single, SingleOrDefault, First or FirstOrDefault, or when you start to iterate through the data, usually with a loop.
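A quick way to see this deferred execution in action is to change the source after the query has been defined.  The following is a minimal sketch using an in-memory list of made-up phone numbers; the variable names are just for illustration.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module DeferredExecutionDemo
    Sub Main()
        Dim tels As New List(Of String) From {"555-0100", ""}

        ' Defining the query does not run it.
        Dim query = From t In tels Where t <> ""

        ' Added after the query was defined, but still picked up when it runs.
        tels.Add("555-0199")
        Console.WriteLine(query.Count())    ' 2

        ' ToList runs the query now and stores the results.
        Dim snapshot = query.ToList()
        tels.Add("555-0123")
        Console.WriteLine(snapshot.Count)   ' 2 (fixed at the time ToList ran)
        Console.WriteLine(query.Count())    ' 3 (the query runs again)
    End Sub
End Module
```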

One of the more interesting things you can do with Linq queries is build queries on top of queries.  Behind the scenes this creates a new expression tree, which contains the old one.

So, for example…

Dim cs = From c In Customers Where c.Tel <> ""
If MobileRequired Then cs = From c In cs Where c.Mobile <> ""

This is a very useful way of implementing a search form.  If MobileRequired is True, the effective query at the end will be similar to the following SQL:
SELECT * FROM Customer c WHERE c.Tel <> '' AND c.Mobile <> ''
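Here is a runnable sketch of the same idea against an in-memory list, so the effect of the optional second filter can be seen without a database.  The Customer class, the Search function and the sample data are all assumptions made for the example.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical type, invented for this example.
Public Class Customer
    Public Property Name As String
    Public Property Tel As String
    Public Property Mobile As String
End Class

Module ComposedSearch
    ' Hypothetical search helper: starts with a base query and
    ' layers the second filter on top only when it is asked for.
    Function Search(customers As IEnumerable(Of Customer),
                    mobileRequired As Boolean) As IEnumerable(Of Customer)
        Dim cs = From c In customers Where c.Tel <> ""
        If mobileRequired Then cs = From c In cs Where c.Mobile <> ""
        Return cs
    End Function

    Sub Main()
        Dim customers As New List(Of Customer) From {
            New Customer With {.Name = "Ann", .Tel = "555-0100", .Mobile = "555-0150"},
            New Customer With {.Name = "Bob", .Tel = "555-0101", .Mobile = ""},
            New Customer With {.Name = "Cy", .Tel = "", .Mobile = "555-0152"}
        }

        Console.WriteLine(Search(customers, False).Count())   ' 2 (Ann and Bob)
        Console.WriteLine(Search(customers, True).Count())    ' 1 (Ann only)
    End Sub
End Module
```

Because nothing runs until the results are read, adding the second filter does not cause the first query to be executed twice; the combined query runs once when Count is called.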

It is also possible to manipulate expression trees in order to create new modified versions of them to execute.  This can be useful in all sorts of ways, but it is too large a topic to include in this post, so I will hopefully come back to it at some point in the future.

Another benefit of Linq that I haven’t mentioned is that because it is part of the actual language, the compiler can check that a query is well formed and that the fields it refers to actually exist.  This also means that Linq works well with IntelliSense, so Visual Studio can provide you with suggestions as you type a query, which is a very useful time-saving feature.

Anything you think I’ve missed?  Please let me know!
