Each data source provides its own technique for searching and manipulating individual items. What’s common to all data sources is the operations we perform on the data: We want to be able to query the data and select the values we’re interested in. It’s therefore reasonable to expect a common query language for all data sources. This common query language was introduced with version 3.5 of the Framework and is now part of all .NET languages. It’s the LINQ component.
LINQ stands for Language Integrated Query, a small language for querying data sources. For all practical purposes, it’s an extension to Visual Basic. However, LINQ has a peculiar syntax.
More specifically, LINQ consists of statements that you can embed into a program to select items from a collection based on various criteria. Unlike a loop that examines each object’s properties and either selects or rejects it, LINQ is a declarative language: It allows you to specify the criteria, instead of specifying how to select the objects. A declarative language, as opposed to a procedural language, specifies the operation you want to perform, and not the steps to take. VB is a procedural language; the language of SQL Server, T-SQL, is a declarative language.
Although defining LINQ is tricky, a simple example will demonstrate the structure of LINQ and its role in an application. Let’s consider an array of integers:
Dim data() As Int16 = {3, 2, 5, 4, 6, 4, 12, 43, 45, 42, 65}
To select specific elements of this array, you’d write a For...Next loop, examine each element of the array, and either select it by storing it into a new array or reject it. To select the elements that are numerically smaller than 10, you’d write a loop like the following:
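Dim smallNumbers(data.Length - 1) As Integer
Dim itm As Integer = 0
' The last valid index is data.Length - 1
For i As Integer = 0 To data.Length - 1
    If data(i) < 10 Then
        smallNumbers(itm) = data(i)
        itm += 1
    End If
Next
' Shrink the array to the selected items without discarding them
ReDim Preserve smallNumbers(itm - 1)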
Just the statements for indexing the smallNumbers array add a degree of complexity to the code. It would be simpler to store the selected elements into an ArrayList by using a loop like the following:
Dim smallNumbers As New ArrayList
Dim itm As Integer
For Each itm In data
    If itm < 10 Then
        smallNumbers.Add(itm)
    End If
Next
Let’s do the same with LINQ:
Dim smallNumbers = From n In data _
                   Where n < 10 _
                   Select n
This is a peculiar statement indeed, unless you’re familiar with SQL, in which case you can easily spot the similarities. LINQ, however, is not based on SQL, and not every operation has an equivalent in both. Both are declarative languages, though, and LINQ largely rearranges the basic elements of a SQL statement. The equivalent SQL statement would be something like the following:
SELECT *
FROM data
WHERE data.n < 10
(You can’t process arrays or other data structures with SQL; this example assumes the existence of a database with a table called data, and that this table contains a column named n.) You’d use the exact same LINQ query to select items from an ArrayList, and a similar statement to select elements from an XML document.
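As a rough illustration of the ArrayList case, the following sketch runs the same query against an ArrayList. The module name ArrayListQueryDemo and the variable names are made up for this example. It assumes a console project that targets .NET Framework 3.5 or later; because an ArrayList stores its items as Object, the range variable is given an explicit type here, which matters when Option Strict is on.

```vbnet
Imports System
Imports System.Collections
Imports System.Linq

Module ArrayListQueryDemo
    Sub Main()
        ' Same values as the data array, stored in an untyped ArrayList
        Dim list As New ArrayList()
        list.AddRange(New Integer() {3, 2, 5, 4, 6, 4, 12, 43, 45, 42, 65})

        ' The explicit "As Integer" converts each Object item to an Integer
        ' before the Where clause compares it.
        Dim smallNumbers = From n As Integer In list _
                           Where n < 10 _
                           Select n

        For Each value In smallNumbers
            Console.WriteLine(value)
        Next
    End Sub
End Module
```

Apart from the typed range variable, the query has the same three clauses as the one that selects from the array.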
Let’s start with the structure where the selected elements will be stored, which is the result of the query. The smallNumbers variable is declared without a type, because its type is determined by the type of the collection where the data will come from. We select elements from the data array, so smallNumbers is an array of integers. Actually, it’s not exactly an array of integers; it’s a typed collection of integers that implements the IEnumerable interface. The LINQ query starts with the From keyword, which is followed by a variable that represents the current item in the collection, followed by the In keyword and the name of the collection. The first part of the query specifies the collection we’re going to query. As with the result of the query, the variable need not be declared; it has the same type as the elements of the collection.
Then comes the Where keyword that limits the selection. The Where keyword is followed by an expression that involves the variable of the From clause; the expression limits our selection. In this extremely trivial example, we select the elements that are less than 10. The last keyword in the expression, the Select keyword, determines what we’re selecting. In most cases, we select the same value we specified after the From keyword, but not always. Here’s a variation of the previous query expression:
Dim evenNumbers = From n In data _
                  Where n Mod 2 = 0 _
                  Select "Number " & n.ToString & " is even"
Here we select even numbers from the original array and then form a string for each of the selected values. The Where part of the statement is an expression, which evaluates to a True/False value and determines whether the current element will be included in the result of the query. As you will see shortly, the criteria can get quite complicated, but the idea is to express a filtering expression that limits our selection.
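To see what the two kinds of query return, you could run a minimal sketch like the following as a console program. The module name QueryDemo and the variable names are chosen for illustration only, and the code assumes a project that targets .NET Framework 3.5 or later with Option Infer on.

```vbnet
Imports System
Imports System.Linq

Module QueryDemo
    Sub Main()
        Dim data() As Int16 = {3, 2, 5, 4, 6, 4, 12, 43, 45, 42, 65}

        ' Filter only: the result is an IEnumerable(Of Int16), not an array
        Dim smallNumbers = From n In data _
                           Where n < 10 _
                           Select n

        ' Filter and project: each selected number becomes a string
        Dim evenNumbers = From n In data _
                          Where n Mod 2 = 0 _
                          Select "Number " & n.ToString() & " is even"

        ' Enumerating the results is what actually runs the queries
        For Each value In smallNumbers
            Console.WriteLine(value)
        Next
        For Each message In evenNumbers
            Console.WriteLine(message)
        Next

        ' Call ToArray when a real array is needed
        Dim smallArray() As Int16 = smallNumbers.ToArray()
        Console.WriteLine(smallArray.Length)   ' Prints 6
    End Sub
End Module
```

The first loop receives numbers and the second receives strings, which shows that the Select clause, not the source collection, decides the type of the result's elements.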
But why bother with a new component to select values from an array? A For Each loop that processes each item in the collection is not really complicated and is quite efficient. For the time being, LINQ is actually less efficient than the equivalent loop. The promise of LINQ isn’t efficiency (not yet, at least), but its potential for becoming a universal querying language. LINQ isn’t limited to arrays: It applies to collections, XML files, objects, even relational data, and it provides a uniform querying language regardless of the data source. LINQ is an extension to the .NET Framework that allows developers to query any data source that implements the IEnumerable or IQueryable interfaces. Collections, XML files, and DataSets implement these interfaces and can be queried with LINQ. The DataSet is a structure for storing data you retrieve from a database at the client, and it’s discussed in detail in Chapters “Programming with ADO.NET” and “Building Data-Bound Applications”.
LINQ Components
To support such a wide range of data sources, LINQ is made up of multiple components, which are the following:
LINQ to XML
This component enables you to search XML documents in many ways. In effect, it replaces XQuery expressions that are used today to select the items of interest in an XML document. Because of LINQ to XML, some new classes that support XML were introduced to Visual Basic, and XML has become a basic data type of the language. The following statement declares an XML variable, and it’s quite valid VB code:
Dim Employees = <Employees>
<Employee ID="1001">
<Title>Developer</Title>
<Name>John Doe</Name>
</Employee>
<Employee ID="1002">
<Title>Manager</Title>
<Name>Joe Doe</Name>
</Employee>
</Employees>
If you enter this statement in a VB project and hover the pointer over the Employees variable, you’ll see that its type is XElement. This type belongs to the System.Xml.Linq namespace and was introduced with VB 2008. Notice that there are no line-continuation symbols in the XML segment, because line breaks are of no consequence to XML documents. Moreover, as you enter XML statements in the editor, the XML Editor’s facilities are activated.
LINQ to Objects
This component enables you to search collections of built-in or custom objects. If you have a collection of Color objects, for example, you can select the colors with a brightness greater than 0.5 via the following expression:
Dim colors() As Color = {Color.White, _
                         Color.LightYellow, Color.Cornsilk, _
                         Color.Linen, Color.Blue, Color.Violet}
Dim brightColors = From c In colors _
                   Where c.GetBrightness > 0.5
Likewise, you can select the rectangles whose area exceeds a given value by using a query like the following:
Dim rects() As Rectangle = _
    {New Rectangle(0, 0, 100, 120), _
     New Rectangle(10, 10, 6, 8)}
Dim query = From R In rects _
            Where R.Width * R.Height > 100
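Neither query does anything visible until you enumerate its result. The sketch below wraps both in a small console program; the module name ObjectQueryDemo and the loop variable names are assumptions made for the example, and the project needs a reference to the System.Drawing assembly for the Color and Rectangle types.

```vbnet
Imports System
Imports System.Drawing
Imports System.Linq

Module ObjectQueryDemo
    Sub Main()
        Dim colors() As Color = {Color.White, _
                                 Color.LightYellow, Color.Cornsilk, _
                                 Color.Linen, Color.Blue, Color.Violet}
        ' GetBrightness returns a value between 0 (black) and 1 (white)
        Dim brightColors = From c In colors _
                           Where c.GetBrightness() > 0.5

        For Each shade In brightColors
            Console.WriteLine(shade.Name)
        Next

        Dim rects() As Rectangle = _
            {New Rectangle(0, 0, 100, 120), _
             New Rectangle(10, 10, 6, 8)}
        ' Only the first rectangle has an area above 100 (100 * 120 = 12000)
        Dim largeRects = From R In rects _
                         Where R.Width * R.Height > 100

        For Each box In largeRects
            Console.WriteLine(box.Width & " x " & box.Height)
        Next
    End Sub
End Module
```

Both queries omit the Select clause, which VB allows; in that case the query returns the items themselves, so the loops receive Color and Rectangle values.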
LINQ to SQL
This component enables you to query relational data by using LINQ rather than SQL. You will find LINQ to SQL examples later in this chapter.
LINQ to DataSet
This component is similar to LINQ to SQL, in the sense that they both query relational data. The LINQ to DataSet component allows you to query data that has already been stored in a DataSet at the client. DataSets are discussed in detail later in this book, but I won’t discuss the LINQ to DataSet component, because the DataSet is an extremely rich object and quite functional on its own.
LINQ to Entities
This is similar to the LINQ to Objects component, only the objects are based on relational data. Entities are not discussed in this tutorial.