TheSharperDev

Posts about C# and F#

An Introduction to C# Expression Trees

Expression Trees are an interesting C# language feature that you might not have knowingly used before. Expression Trees are fundamental to Entity Framework being able to turn C# code into SQL queries. So if you’ve ever used Entity Framework then you’ve definitely taken advantage of this language feature.

For example, take the SelectExpression class in the Entity Framework Core (EF Core) library: searching it for the term Expression shows that it appears 652 times in this one class. Pretty important!

Lately I’ve been interested in how Entity Framework turns C# into SQL and wanted to write an article about Expressions to aid in my understanding of the subject.

As a note, there is a difference between Expressions and Expression Trees, but across the internet and in software such as Entity Framework you’ll find Expression Trees simply called Expressions, which makes the topic a bit confusing and hard to google.

Why do we need Expressions?

Most of the time, the code we write is intended to be executed in the exact place that we’re writing it.

For example, this method is used to sum numbers.
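A minimal sketch of such a method, written as a small top-level program (the name Sum and the int parameters are taken from how the method is described later in this post):

```csharp
using System;

// A plain C# method: the addition runs right here, inside the running program.
static int Sum(int a, int b) => a + b;

Console.WriteLine(Sum(2, 3)); // 5
```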

Whenever this method is called with two numbers, it returns the sum. It always operates inside this method, inside of a C# runtime.

What if I were to tell you that I needed you to write some piece of code that was intended to be run somewhere else? Like a database?

You can’t pass code like this to a database (more realistically, some library that communicates with the database) and expect it to know what to do with it. You need a different way to represent code.

That is what Expression Trees or Expressions are for.

Expression trees represent code in a tree-like data structure, where each node is an expression, for example, a method call or a binary operation such as x < y. ~ Microsoft Docs

Expressions are a different format for describing code. They’re a data structure that represents code. They’re also “portable” in the sense that Expressions can be passed around, and some other piece of code can inspect them to see what they’re supposed to do.

Here’s an example of the earlier Sum method but now it’s in expression format.
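A sketch of what that could look like, built by hand with the factory methods on System.Linq.Expressions.Expression. The variable names are my own choice:

```csharp
using System;
using System.Linq.Expressions;

// Each parameter is a ParameterExpression carrying its type (and an optional name).
ParameterExpression a = Expression.Parameter(typeof(int), "a");
ParameterExpression b = Expression.Parameter(typeof(int), "b");

// A BinaryExpression node that represents a + b.
BinaryExpression body = Expression.Add(a, b);

// The lambda ties the body to its parameters.
Expression<Func<int, int, int>> sumExpression =
    Expression.Lambda<Func<int, int, int>>(body, a, b);

// The tree is only data, so it has to be compiled into a delegate before it can run.
Func<int, int, int> sum = sumExpression.Compile();

Console.WriteLine(sumExpression); // (a, b) => (a + b)
Console.WriteLine(sum(2, 3));     // 5
```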

It performs the same function, adding two numbers, but it’s defined completely differently. That difference gives it its “portability”. The visitor pattern is a common way to investigate an expression to figure out what it contains or means, which I hope to explore soon on this blog.
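To give a feel for what “investigating” an expression means, here is a small sketch that looks inside a tree without running it. It is not a full visitor, only a few property reads. It also leans on the fact that the C# compiler will build the tree for you when a lambda is assigned to an Expression<T> variable:

```csharp
using System;
using System.Linq.Expressions;

// Assigning a lambda to Expression<T> makes the compiler emit a tree instead of code.
Expression<Func<int, int, int>> sum = (a, b) => a + b;

// Because the tree is plain data, other code can look inside it.
var body = (BinaryExpression)sum.Body;

Console.WriteLine(body.NodeType);        // Add
Console.WriteLine(body.Left);            // a
Console.WriteLine(body.Right);           // b
Console.WriteLine(sum.Parameters.Count); // 2
```

A library that translates code to SQL can walk nodes like these and emit text for each one instead of executing them.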

Both parameters are defined by a ParameterExpression which denotes the type. Those parameters get passed into a BinaryExpression which adds them. Then all three of those Expressions are passed into a Lambda Expression which results in an Expression<Func<int, int, int>>, which is a mouthful.

Expressions can’t be called directly, so in order to use one we need to compile it. Compiling turns it back into a regular Func<int, int, int>, which can then be called to return the sum.

I believe most anything that you can write as regular code can be written as an expression. I wouldn’t recommend it, but I believe it is possible.

Hello World Expression

What would Hello World look like if it was written in Expressions? Something like this.
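Here is a sketch of one way to write it, as a top-level program. I have laid it out so that its line numbers match the walkthrough that follows, which is why the comments sit at the end of each line; the variable names are my own:

```csharp
using System;
using System.Linq.Expressions;
var message = Expression.Constant("Hello World", typeof(string)); // the text to print

var writeLine = Expression.Call( // a call to a static method
    typeof(Console).GetMethod("WriteLine", new[] { typeof(string) })!, // Console.WriteLine(string)
    message); // the argument passed to it

var lambda = Expression.Lambda<Action>(writeLine); // wrap the call in a lambda

Action helloWorld = lambda.Compile(); // turn the tree into a delegate

helloWorld(); // prints Hello World
```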

Let’s go through this line by line.

Line 3 defines our “Hello World” message as a constant of type string.

Lines 5-7 define the call to the method that prints the message. The Expression.Call() method takes two arguments:

  1. Information about the method (line 6), which consists of the type the method exists on (Console), the method name* (WriteLine), and the types of the parameters the method accepts (one parameter of type string).
  2. The argument to pass to the method (line 7).

Line 9 pushes that expression into a lambda expression.

Line 11 compiles that expression into an Action.

Line 13 runs the Action.

*You’ll notice that the name of the method is a string, so there’s no compiler checking for you on that.

So that’s a Hello World example. Here are some more example expressions.

Instead of printing “Hello World”, this expression can now print any arbitrary string. The main difference is we’re defining a ParameterExpression instead of a ConstantExpression.
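A sketch of that parameterized version, with names of my own choosing:

```csharp
using System;
using System.Linq.Expressions;

// The message is now a parameter, supplied when the delegate is invoked.
ParameterExpression message = Expression.Parameter(typeof(string), "message");

MethodCallExpression writeLine = Expression.Call(
    typeof(Console).GetMethod("WriteLine", new[] { typeof(string) })!,
    message);

// Action<string>: one string argument, no return value.
Expression<Action<string>> lambda =
    Expression.Lambda<Action<string>>(writeLine, message);

Action<string> print = lambda.Compile();

print("Hello from an expression");
```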

Now we’re printing any int. We’ve updated some of our types, and we’re also defining the action expression and compiling it in a single step.
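A sketch of the int version, again with my own names. Note that the WriteLine overload being looked up changes along with the parameter type:

```csharp
using System;
using System.Linq.Expressions;

// Same shape as the string version, but the parameter and the overload are now int.
ParameterExpression number = Expression.Parameter(typeof(int), "number");

MethodCallExpression writeLine = Expression.Call(
    typeof(Console).GetMethod("WriteLine", new[] { typeof(int) })!,
    number);

// Build the lambda and compile it in a single statement.
Action<int> printNumber =
    Expression.Lambda<Action<int>>(writeLine, number).Compile();

printNumber(42); // prints 42
```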

Finding Max Number

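Here we are taking two ints and calling the Math.Max function. Notice that in this example we’re using a Func instead of an Action. The number of parameters isn’t the reason (Action is defined for 0 up to 16 parameters); the difference is that an Action never returns a value, while a Func does. Since we want the result of Math.Max back, we have to use Func.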
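A sketch of the Math.Max call as an expression (variable names are mine):

```csharp
using System;
using System.Linq.Expressions;

ParameterExpression a = Expression.Parameter(typeof(int), "a");
ParameterExpression b = Expression.Parameter(typeof(int), "b");

// Math.Max has many overloads, so the parameter types pick the (int, int) one.
MethodCallExpression maxCall = Expression.Call(
    typeof(Math).GetMethod("Max", new[] { typeof(int), typeof(int) })!,
    a,
    b);

// Func<int, int, int>: two int arguments, one int result.
Func<int, int, int> max =
    Expression.Lambda<Func<int, int, int>>(maxCall, a, b).Compile();

Console.WriteLine(max(3, 7)); // 7
```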

Calculate Slope

This one gets a bit more complicated, with 4 parameters. It calculates the slope of the line through two points, (x1, y1) and (x2, y2), using the formula (y2-y1)/(x2-x1).
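A sketch of the slope formula as an expression. I am assuming double for the coordinates so that the division is not truncated. The parameter order (x1, y1, x2, y2) is also my choice:

```csharp
using System;
using System.Linq.Expressions;

ParameterExpression x1 = Expression.Parameter(typeof(double), "x1");
ParameterExpression y1 = Expression.Parameter(typeof(double), "y1");
ParameterExpression x2 = Expression.Parameter(typeof(double), "x2");
ParameterExpression y2 = Expression.Parameter(typeof(double), "y2");

// Build the formula from the inside out: (y2 - y1) / (x2 - x1).
BinaryExpression rise = Expression.Subtract(y2, y1);
BinaryExpression run = Expression.Subtract(x2, x1);
BinaryExpression slopeBody = Expression.Divide(rise, run);

Func<double, double, double, double, double> slope =
    Expression.Lambda<Func<double, double, double, double, double>>(
        slopeBody, x1, y1, x2, y2).Compile();

Console.WriteLine(slope(1, 2, 3, 6)); // 2
```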

Extract Order Id

Now instead of dealing with primitives we’re dealing with an Order object that has the property OrderId. Notice that we have to define a MemberExpression to extract a member of a type.
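A sketch of the property access. The Order class here is a hypothetical stand-in with just the one property mentioned above, and the int type for OrderId is an assumption:

```csharp
using System;
using System.Linq.Expressions;

ParameterExpression order = Expression.Parameter(typeof(Order), "order");

// A MemberExpression that reads order.OrderId.
MemberExpression orderId = Expression.Property(order, nameof(Order.OrderId));

Func<Order, int> getOrderId =
    Expression.Lambda<Func<Order, int>>(orderId, order).Compile();

Console.WriteLine(getOrderId(new Order { OrderId = 17 })); // 17

// Hypothetical type, for illustration only.
public class Order
{
    public int OrderId { get; set; }
}
```

Using nameof instead of a string literal means the property name is checked by the compiler.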

Sum Order Costs

Then the last expression example for today: taking two Order objects and adding the cost of each one.
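A sketch of that last example. As before, the Order class is a hypothetical stand-in, and the Cost property name and its decimal type are assumptions:

```csharp
using System;
using System.Linq.Expressions;

ParameterExpression first = Expression.Parameter(typeof(Order), "first");
ParameterExpression second = Expression.Parameter(typeof(Order), "second");

// One MemberExpression per order, each reading its Cost property.
MemberExpression firstCost = Expression.Property(first, nameof(Order.Cost));
MemberExpression secondCost = Expression.Property(second, nameof(Order.Cost));

// first.Cost + second.Cost
BinaryExpression total = Expression.Add(firstCost, secondCost);

Func<Order, Order, decimal> sumCosts =
    Expression.Lambda<Func<Order, Order, decimal>>(total, first, second).Compile();

Console.WriteLine(sumCosts(new Order { Cost = 10m }, new Order { Cost = 5m })); // 15

// Hypothetical type, for illustration only.
public class Order
{
    public decimal Cost { get; set; }
}
```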

Final Thoughts

At first, building expressions by hand feels odd. The expression version definitely takes a lot longer to define, but those extra steps are what let an expression be inspected and reused in ways that a line of ordinary C# code can’t be.

I’m just getting my feet wet with Expressions and will continue to learn and blog about them here.

If you have any questions or observations I’d love to hear about them in the comments below.

Resources