Category Archives: Uncategorized

Mapping OO patterns to FP

Scott Wlaschin goes through many common OO principles and discusses how functional style languages solve the same issues.

https://vimeo.com/113588389


Error handling with Option/Maybe

Have you ever encountered the Option or Maybe type? It's a pretty common thing in the JavaScript world, especially around the Angular community, but it actually comes from the world of functional programming. Option/Maybe provides a way to communicate the possibility of an error occurring. In this talk Scott Wlaschin goes over a model for handling error cases in pipelined operations in a way that is clean, composable, and unobtrusive.

At one point DDD or Domain Driven Design is referenced. See this presentation for more info on that.

Using values in a mutable system

Andy Matuschak explores how to use a value oriented approach to data within a system that is built around mutable state.

http://realm.io/news/andy-matuschak-controlling-complexity

For more on the idea of what Andy calls the value layer, and perhaps a different framing I would suggest Gary Bernhardt’s talk Functional Core, Imperative Shell.

Functional Principles for Object-Oriented Development

Jessica Kerr presents some great topics that help frame why functional programming concepts are useful even within an object oriented system.

You can read more at her blog.

Same old tools, new uses

In case you missed it: Part 1

This is part 2 of my series on functional programming concepts for OO / imperative developers. If you're not really clear on what imperative means, I'd direct you to Part 1, where I attempt to illustrate some of the conceptual differences between imperative and functional programming. In this part we're going to take the two tools we established in the previous post and start doing things with them.

Loops

How the *%^@ are you supposed to iterate without mutating a variable? This was my first reaction as well; I think it's a fairly universal one. Let's take a look at our arsenal of traditional tools for dealing with iteration.

For loops

for (int i = 0; i < 10; ++i) {
    // do something interesting here
}

We declare the variable that should be used to keep track of our progress iterating through the sequence. I'm sure you've written such a line thousands if not tens of thousands of times throughout your programming career; it's canonical, it's idiomatic, it's simply the way things are done. It also definitely involves mutating a variable many times.

If we wanted to do the same thing to a collection that supports direct indexing such as an array or List<T> then we could basically do the same thing.

int[] numbers = new [] { 43, 75, 6, 2, 91, -5 };

for (int i = 0; i < numbers.Length; ++i) {
    // do something with numbers[i]
}

List<int> numbers = new List<int> { 43, 75, 6, 2, 91, -5 };

for (int i = 0; i < numbers.Count; ++i) {
    // do something with numbers[i]
}

This approach fails, though, for any collection beyond your standard lists and arrays. Structures like LinkedList<T> or any tree-type structure won't be indexable this way, and although Dictionary is indexable, it doesn't use a pre-defined sequence. There's a pretty standard solution to this: the Iterator pattern. Sure enough, all the collections in C# implement IEnumerable, which has one method, GetEnumerator, which returns our iterator. We call MoveNext on the IEnumerator and it knows how to go to the next element in the collection, and we call Current to get the current element out of the collection.

LinkedList<int> numbers = new LinkedList<int>(new [] { 43, 75, 6, 2, 91, -5 });

for (IEnumerator<int> iter = numbers.GetEnumerator(); iter.MoveNext();) {
    int num = iter.Current;
    // do something with num here
}

Not quite as nice as the for loop with all that getting of the IEnumerator, but it works for arrays, Lists, LinkedLists, Dictionaries, and, well, anything that implements IEnumerable (which is everything collection'ish). This version, like the last, has a counter-type variable that we keep doing an operation on to move to the next element. Where with the for loop we were incrementing the i variable, with the Enumerator we're calling MoveNext. You could argue that this is a lot nicer since the actual mutation bits happen inside the Enumerator, but it's still pretty clear you're telling an object to mutate to the next value. In C++ you actually use the ++ operator to move the iterator to the next element.

Now perhaps your language has gone out of its way to wrap up this concept in some syntactic sugar; C#'s foreach is a good example. It allows us to use the IEnumerator to traverse the list without having to see any of that gunk.

LinkedList<int> numbers = new LinkedList<int>(new [] { 43, 75, 6, 2, 91, -5 });

foreach (int num in numbers) {
    // do something with num here
}

This is much nicer, and has a lot less boilerplate than the initial for loop. This version doesn’t have variables floating around getting updated so that part is cleaned up. Generally I would be fairly happy with this solution in a language, it is certainly “good enough” for every day iteration needs.

Loops without loops

We still haven't answered the question of how you iterate without ticking some type of counter up and up and up until you hit your end condition and stop looping. The answer, and I'm sure you've seen it before, is recursion. Before you look at this code, know that this is not how I would advocate writing all your loops, as if you should ban for and foreach and copy-paste this little snippet around everywhere. This activity is about spotting patterns so that we can leverage some proper reuse of concepts, not about writing the most optimized version of this particular iteration.

void DoSomething(int[] numbers, int index) {
    if (index == numbers.Length) {
        return; // if we are at the end of the list, stop recursing
    }

    // do something with numbers[index] here
    // recursively call ourselves with the next index
    DoSomething(numbers, index + 1);
}

var nums = new [] { 1, 4, 5, 6, 7, 12, 1 };
DoSomething(nums, 0);

So what on earth is going on here? We have a function that checks whether its index parameter is equal to the length of the array; if so, it returns without doing anything. If not, we proceed to do whatever it is we want to do as part of this iteration, and then go to the next element in the collection by calling the same function again, but with an index of index + 1. You've probably seen something like this before as a canonical definition of recursion (usually it's factorial, but it comes up a lot when iterating over tree-type structures as well). It's not the prettiest thing, but hey, we iterated over a collection and did not in fact mutate a variable! What we did was let the local state of the function we called over and over hold the current position for us. If you think of the stack as we recursively call the DoSomething function, each stack frame is effectively what the i variable was in the for (int i = 0; i < numbers.Length; ++i) example.

<DoSomething index = numbers.Length>
...
<DoSomething index = 2>
<DoSomething index = 1>
<DoSomething index = 0>
<Main>

Obviously having to create a method and pass in 0 as the start point is not very nice, although that initial 0 is pretty analogous to the int i = 0 we use in the for loop. Now I'm not advocating that recursive iteration is the only way to do things or that it is inherently superior to, say, the iterator pattern used by C#'s foreach, but it is a way to iterate without any of your mutable state "leaking out" of your iteration. A very common misconception about functional programming is that it contains no state and doesn't allow any mutation of state. A more accurate way of describing the functional approach would be that state is contained in the local scope of functions. Once your functions aren't having side effects and you can get values in and out of them, the immutability of everything inside those functions becomes rather academic. So we could move forward in a functional style and still use foreach, and everything is hunky dory as long as we stick to the pure-functions-and-values parts.
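To show the same trick producing a value rather than just performing actions, here's a hedged sketch (the Sum name and shape are my own for illustration, not from the earlier example) of a recursive sum. The running total is never mutated; it is assembled as the calls return, each stack frame standing in for one tick of the i variable.

```csharp
using System;

// each frame contributes its element; the "running total" lives in the
// call stack, no variable is ever mutated
int Sum(int[] numbers, int index) =>
    index == numbers.Length
        ? 0                                    // past the end: contribute nothing
        : numbers[index] + Sum(numbers, index + 1);

var nums = new [] { 1, 4, 5, 6, 7, 12, 1 };
Console.WriteLine(Sum(nums, 0)); // prints 36
```

Note the base case returns 0, the identity for addition, so an empty array sums to 0 without any special handling.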

And now it doesn’t even have a body

In all of the examples we've done so far, whether using for, foreach, or a recursive function, one thing has been constant: we've done a lot of ceremony and work just to execute our actual logic. Wouldn't it be nice if we never had to write a loop again, if we could boil the essence of iteration down to its bare essentials, and then when we need iteration we could just say "hey, do this for each thing in the collection"? Well, good sir or madam, you are indeed in luck this fine day. C# 3.0 introduced a language feature called the lambda; it's going to be your new best friend. A lambda, simply put, is a function that you can declare and use without needing to give it a name. You can give it a name, but you don't have to. In this sense it is sometimes called an anonymous function. Really this sort of thing isn't all that strange. In the example above we could just as easily have written DoSomething(new [] { 1, 4, 5, 6, 7, 12, 1 }, 0) and passed in an anonymous int[], and since C# has the concept of delegates, we can pass in an anonymous one of those as easily as we can pass around an anonymous int[].

Fortunately for us, C# has developed some very good functional tendencies over the last few versions, and as a result this whole process is much smoother than it would be in, say, Java, which (as of this writing) does not have lambdas. Ruby has good support for lambdas, and Python has a fairly limited lambda that is restricted to a single expression. Still, better than nothing.

For those of you not up to speed on these newfangled "delegates" and "lambdas", let me give you a brief introduction. A delegate in C# is like a variable that points to a function, instead of a regular variable that references a piece of data. Like regular variables, delegates must have a type; specifically, they must match a type signature of some sort. You may see C# code that looks like this

delegate int MyDelegateType(int a, string b);

This declares a new type, MyDelegateType, which is a function that takes an int and a string and returns an int. You may see this written as (int, string) -> int or int * string -> int, which both mean "this is a function that takes an int and a string and returns an int". C# has already defined several such delegates for us, and apart from creating your own for the sake of conciseness, you can go a long way with just the built-in types. Action, Action<T>, Action<T1, T2> and so on represent a function that takes nothing, T, (T1, T2), etc. and returns void. So Console.WriteLine is an Action<string>. As we move forward we're not going to write many Actions; they don't return values, and any time you call a function that doesn't return anything you can be sure there are side effects, otherwise what possible purpose could it serve?

The other delegate type C# gives us is Func<T>, Func<T1, T2>, Func<T1, T2, T3> and so on. Unlike Action, Func always returns a value. If you specify a Func<T> then you are saying this is a function that takes zero parameters and returns a value of type T. If you declare a Func<T1, T2> then you are saying it is a function that takes a T1 and returns a T2. The int version of Math.Min is a Func<int, int, int>: it takes two ints as parameters and returns an int. We could create a new variable of type Func<int, int, int>, assign Math.Min to it, and then invoke the delegate, and it would in turn invoke Math.Min for us. In this way we can pass around functions as pieces of data, which turns out to be really handy. We can also create a new function (a lambda) and assign it directly to our variable. Using the Func<int, int, int> example from before, we could do the following.

// this is our variable of type (int, int) -> int
Func<int, int, int> intAndIntToInt;
// notice we're not calling Math.Min,
// we're assigning the function to our variable
intAndIntToInt = Math.Min;
// this calls Math.Min just as surely as if we wrote Math.Min(6, 3)
intAndIntToInt(6, 3);
// creates a new lambda of type (int, int) -> int that returns the
// sum of the params and assigns it to the variable
intAndIntToInt = (a, b) => a + b;
// this returns 9
intAndIntToInt(5, 4);

As you can see, we made up a new function on the fly, one that takes in two parameters, named a and b and returns the sum. C# has special syntax for one line lambdas that allows you to omit the return statement. Written out the long way (and the way you must use for multi-line lambdas), we would get the following.

Func<int, int, int> intAndIntToInt;
intAndIntToInt = (a, b) => {
    return a + b;
};

If you squint and turn your head to the side a bit this really looks pretty close to your normal everyday function, except there’s no return type and it’s got the => between the params and the function body. If we can just make our own functions willy nilly and pass them around like any old variable, then perhaps we can make “hey do this for each thing in the collection” actually happen. Since we’re approximating foreach I’ll name the function after that keyword.

void ForEach(int[] numbers, Action<int> action) {
    foreach (var num in numbers) {
        action(num);
    }
}

var nums = new [] { 1, 4, 5, 6, 7, 12, 1 };
// parens around the parameter are optional if there is only 1 param
ForEach(nums, num => { /* do something with num here */ });

So we call ForEach, passing in the nums array and our Action<int>, which will be invoked for each element in the array. This of course only works on int[], but if you've ever touched generics in C# or Java, or templates in C++, it's a short hop, skip, and a jump over to a nice generalized version.

void ForEach<T>(IEnumerable<T> values, Action<T> action) {
    foreach (var value in values) {
        action(value);
    }
}

The usage of ForEach doesn’t change one bit and now the majority of the code on our single line is the business end, not the “make the compiler happy” end. This is incredibly powerful. While we have not introduced new syntax into the language, with the power of C#’s extension methods, we can get fairly close.

static void ForEach<T>(this IEnumerable<T> values, Action<T> action) {
    foreach (var value in values) {
        action(value);
    }
}

// this allows us to do
var nums = new [] { 1, 4, 5, 6, 7, 12, 1 };
nums.ForEach(num => { /* do something with num here */ });

Even less non-business cruft to get in the way*. As you may have suspected, those smart people who work on C# have already implemented ForEach for us, although only on the List<T> class rather than as part of System.Linq (which was introduced alongside C# 3, and which seems like such an oversight). Thanks to Linq and a few other extension methods I wrote, I have not written a foreach loop in a very long time. This is not a coincidence.

*Note this example is slightly simplified. Extension methods must be declared as part of a static class.

These functions that take functions are called "higher order functions", and functional programs are absolutely chock-full of them. There are functions that take two or more functions and make a new function that runs all of them in sequence. There are functions that take in a function and wrap it in some functionality, such as logging or caching or communicating over a network, without the wrapped function having any idea it's being used in such a context. Higher order functions are very cool, and they are the reason we can build reusable structures when we follow a functional style.
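As a hedged sketch of both flavors just described (Compose and WithLogging are names I've made up for this example, not standard C# APIs): one function glues two functions into a new one that runs them in sequence, and another wraps a function in logging without the wrapped function's knowledge.

```csharp
using System;

// Compose: given f and g, build the new function x => g(f(x))
Func<A, C> Compose<A, B, C>(Func<A, B> f, Func<B, C> g) => x => g(f(x));

// WithLogging: wrap any Func<T, R> in logging; the wrapped function
// never knows it's being observed
Func<T, R> WithLogging<T, R>(Func<T, R> f, string name) => x =>
{
    Console.WriteLine($"calling {name}({x})");
    R result = f(x);
    Console.WriteLine($"{name} returned {result}");
    return result;
};

Func<int, int> addOne = n => n + 1;
Func<int, int> doubleIt = n => n * 2;

var addThenDouble = Compose(addOne, doubleIt);
Console.WriteLine(addThenDouble(5)); // (5 + 1) * 2 = 12

var loggedDouble = WithLogging(doubleIt, "doubleIt");
loggedDouble(21); // logs the call and the result
```

Notice that doubleIt is used both composed and logged without being changed in any way; that non-invasiveness is the whole appeal.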

Different but the same

Iterating over collections sometimes feels like the only thing our programs do, so naturally we run into situations where a foreach style loop by itself isn't sufficient. There are only so many cases where you want to "do something" directly to each value in a collection. What if you wanted to do an operation that spanned multiple values, or all of the values? The simplest of these is probably summing the numbers in a collection. In C# you could implement it this way.

int[] numbers = new [] { 43, 75, 6, 2, 91, -5 };

int sum = 0;
foreach (int num in numbers) {
    sum += num;
}

or using the magical new extension method

int[] numbers = new [] { 43, 75, 6, 2, 91, -5 };

int sum = 0;
numbers.ForEach(num => sum += num);

Now wait just a second here. The lambda that we're declaring takes only a single parameter, num, but we're adding it to sum; how the heck does the lambda have sum in scope? Even though we are declaring a new function, because it is a lambda, the scope of that function includes all the variables in scope at its declaration. So since sum is visible where the lambda is declared, sum can be used inside the function, even if our lambda isn't run until after the function that declared sum terminates. This is known as a closure, and it's quite important for some of the fancier things that are done with functional programs (such as the aforementioned higher order functions). For our purposes it means we have to pass one less variable into our Action.
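To see the "outlives the function that declared it" part in isolation, here's a hedged sketch; MakeCounter is a hypothetical helper whose local variable keeps living inside the lambda long after MakeCounter has returned.

```csharp
using System;

// the lambda closes over `count`; the variable survives after
// MakeCounter returns, living on inside the closure
Func<int> MakeCounter()
{
    int count = 0;
    return () => ++count;
}

var counter = MakeCounter(); // MakeCounter has terminated, count lives on
Console.WriteLine(counter()); // 1
Console.WriteLine(counter()); // 2
Console.WriteLine(counter()); // 3
```

Each call to MakeCounter would produce an independent closure with its own count, which is exactly the kind of thing higher order functions lean on.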

This works, it's not overly complex, and it gets the job done with only one extra line that can be located right next to where the iteration occurs. So far so good; let's try something slightly more complex. What about finding the largest number in the list?

int[] numbers = new [] { 43, 75, 6, 2, 91, -5 };

int largest = numbers[0];
numbers.ForEach(num => {
    if (num > largest) {
        largest = num;
    }
});

This looks pretty similar to what we did before; it's just our action that has gotten more complex. If we wanted to find the smallest element we would write the same Action but just flip the > to a <. It seems like those are really the same logic with a teeny tiny difference, the kind we could parameterize away. What we want is some iteration-type construct that, at each element, asks us: should this new element become the one we're going to return at the end?

static T Pick<T>(this IEnumerable<T> values, Func<T, T, bool> selector) {
    // do error handling here, if values is null, if selector is null, etc.
    IEnumerator<T> iter = values.GetEnumerator();
    iter.MoveNext();
    T current = iter.Current;
    while (iter.MoveNext()) {
        // if the lambda returns true, the user wants to use the
        // iterator's current value so save it
        if (selector(iter.Current, current)) {
            current = iter.Current;
        }
    }
    return current;
}

int[] numbers = new [] { 43, 75, 6, 2, 91, -5 };
// returns true when the current value is smaller than the current smallest
int smallest = numbers.Pick((num, currentSmallest) => num < currentSmallest);
// returns true when the current value is larger than the current largest
int largest = numbers.Pick((num, currentLargest) => num > currentLargest);

As you have surely encountered in your own work, once you generalize a solution enough to cover two variations, it's generally pretty easy to make n different variations as long as your constraints don't change very much. In this case the constraint is that the function will accept a lambda that it invokes at the right time during the iteration process and then either uses the result or ignores it, depending on what kind of operation it's trying to do. Basically everything except a ForEach will care about the result and do something with it.
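This "thread an accumulator through the iteration and let a lambda decide how to combine it with each element" shape is usually called a fold, and it's the pattern Linq ships as Aggregate. Here's a hedged sketch (Fold is my name for it here, not a built-in) showing how summing and picking the largest both fall out of one function:

```csharp
using System;
using System.Collections.Generic;

// Fold threads an accumulator through the sequence; the lambda decides
// how to combine the accumulator so far with each new element
TAcc Fold<T, TAcc>(IEnumerable<T> values, TAcc seed, Func<TAcc, T, TAcc> combine)
{
    TAcc acc = seed;
    foreach (var value in values)
    {
        acc = combine(acc, value);
    }
    return acc;
}

int[] numbers = new [] { 43, 75, 6, 2, 91, -5 };

// both operations are the same iteration, differing only in the lambda
int sum = Fold(numbers, 0, (acc, n) => acc + n);                        // 212
int largest = Fold(numbers, numbers[0], (acc, n) => n > acc ? n : acc); // 91
Console.WriteLine($"{sum} {largest}");
```

The only moving parts are the seed and the combining lambda; everything else about the iteration is written exactly once.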

For an example of some of the awesome ways this manifests, you should become acquainted with Linq. Working with collections will never be the same for you in C#. Here are some examples of things that are vastly simpler than their foreach implementation would be.

Find all words that are less than 6 letters long and sort them in descending order.

string[] words = new []	{
	"Lorem",
	"ipsum",
	"dolor",
	"sit",
	"amet",
	"consectetur",
	"adipisicing",
	"elit",
	"sed",
	"do",
	"eiusmod",
	"tempor",
	"incididunt",
	"ut",
	"labore",
	"et",
	"dolore",
	"magna",
	"aliqua"
};

IEnumerable<string> descendingOrderedWords = words.Where(word => word.Length < 6).OrderByDescending(w => w);

Convert a list of strings that represent ints into a list of ints

string[] numbers = new [] {
	"1",
	"7",
	"45",
	"9",
	"81",
	"37"
};
IEnumerable<int> parsedNumbers = numbers.Select(numberString => int.Parse(numberString));

Count the number of occurrences of an element in a list and then order the elements of the list in order of number of occurrences descending.

int[] numbers = new [] { 9, 27, 8, 3, 12, 27, 6, 8, 9, 8, 27, 1 };
IEnumerable<int> numbersOrderedByFrequency =
    numbers.Aggregate(new Dictionary<int, int>(), (count, number) => {
        if (!count.ContainsKey(number)) {
            count[number] = 0;
        }
        ++count[number];
        return count;
    }).OrderByDescending(keyValuePair => keyValuePair.Value)
      .Select(keyValuePair => keyValuePair.Key);

As you can see, these sorts of things can get fairly complex, and certainly functions like Aggregate deserve a nice sit-down with the documentation page to really grok what's going on, but hopefully you're getting a feel for the shape of these things. We've got functions that take other functions as inputs and produce new IEnumerables as a result, which we can then chain together to get a whole lot of functionality for very little noise. Whether these functions use recursion or an iterator internally doesn't particularly matter, as that detail is totally hidden away from us. Since Linq functions don't mutate any state themselves (although the lambda you pass in certainly could), they are quite functional and thus composable. Composability is a big buzzword you'll be hearing about more; it's one of the reasons for pursuing a functional approach (apart from all this cool Linq goodness).


Every journey begins with a single step

Purpose

I started this blog because I wanted to understand functional programming better. Having taught traditional programming for over 7 years now I firmly believe the best way to really understand something is to be forced to explain it to someone else. Secondarily, I feel that there’s a lot of interest in functional programming concepts but not a lot of good text on the subject that doesn’t presuppose a lot of terminology, concepts, syntax, and even language constructs from the functional world. My aim here is to introduce as much as I can about functional programming but through the lens of a traditional OO programmer. If you use Java or C# you should feel right at home here. Ruby and Python practitioners will also be able to follow right along although I’ll be using the curly brace and semi-colon syntax of the C lineage.

Another problem I often see is that as soon as anything functional gets mentioned it’s approximately 0.21 seconds before the author then says “so since it’s soooooo much easier to do this in Haskell, I’m going to show all my examples using that now”. While this may be convenient for the author, it is rather unfortunate for the intended goal of communicating any useful information to the reader who almost surely would not be reading an “intro to functional programming” post if they already knew Haskell well enough to read your code example. Sigh. So, my intent here is to use C# as the lingua franca of learning about all this fancy pants functional business.

Some context

Let’s get to it then, what is the “functional” in functional programming about? Honestly it’s nothing radical, no new keywords or libraries or such things, just a different application of the tools we already have. You probably are familiar with many of the concepts I’m going to talk about, but perhaps you’ve never heard them presented as a cohesive set of techniques. I would liken it to the shift that took place with structured programming, a development that took place long before I was born.

The TL;DR version is that at one point, it was common practice to use goto statements to jump all around your program as needed. You can do some nifty things with your control flow, and clever individuals loved to squeeze extra little bits out of their machines, both by making smaller programs and by consuming fewer CPU cycles. The downside to this approach was that reading such programs became very difficult. So it was decided that, although requiring every subroutine (aka function/method) to return to the point in the program that called it costs you some flexibility in how you put your program together, the maintainability upside was pretty huge. This did not mean that the usage of goto disappeared. Your compiler is free to (and does) make great use of such a construct to achieve all sorts of optimizations, and even to implement the very structures that were advocated as replacements for goto, but you never see this part. To you, goto statements probably don't exist except as the internal business of various tools you don't need and/or want to know the internals of. We don't maintain the binary output of our compiler; we instead compile a new version from our source code, so it really doesn't matter what the compiler does in the process of converting our source to something runnable, as long as the meaning of the instructions is not altered.

Today we don’t even bat an eye at the concept of structured programming, “of course your method returns to where you called it from!” you might exclaim, how could it be any other way? Madness! However, to the goto-loving programmer of 1968, the idea of doing things differently might have seemed as if it was a bad trade, after all they were giving up precious control over their program and perhaps even believed it to be impossible to write certain types of programs under such draconian restrictions. Again, a laughable position from a modern programming viewpoint but I hope you can empathize just a teeny tiny bit. We are that goto programmer of the 1960’s.

Part 1: The function

Being named functional programming, you may have had an inkling that functions might play a role, and right you are! Lest I be guilty of the sins I have accused others of above, let me start at the beginning. I promise I’m not trying to be patronizing to anyone here, I just want to be super clear on the way functional programming sees functions.

In the beginning, we had instructions that came one after the other. This is machine code, which was then slightly abstracted into assembly code. Your program starts at the first instruction and proceeds down the program unless something redirects it (that would be the goto statement). So a totally unrealistic made up example program might look like this

A
B
C
D
E
..more instructions here..
H
I
B
C
D
J
K

Now you may have noticed that some of those instructions are repeated. It seems rather wasteful to have to keep putting the same instructions in over and over each time we might want to use them, so we can instead put them in a single place and then go use that chunk of functionality on demand.

:subroutine // this is a label and is the target of a goto statement
B
C
D
goto place I was called from

:main // program starts running here
A
goto subroutine
E
..more instructions here..
H
I
goto subroutine
J
K

Of course the "goto place I was called from" would need some information to actually work; clearly we can't just use a single goto here, as we need to return to two different parts of the program (and possibly many more). This idea of how you wire up the going to and coming back from different parts of your program is a good part of what structured programming was about, and in the process we got the concept of a function in our programming languages.

So there’s the classical view of functions, they’re nice little containers for parts of your program so you don’t have to copy/paste them everywhere. Well, that’s one view of functions, one specifically in the imperative line of thinking. Imperative programming is about meeting the computer half way. We write programs that are kind-of-sorta doing what the underlying CPU will actually do and the compiler takes care of bridging the gap. When you say:

int x;
x = 5;
int y = x;

You're being fairly explicit about both what the computer should do (bind the names x and y to the value 5) and how the computer should go about doing it (allocate two int-sized areas in memory and write the bits that represent 5 to those memory locations). In assembly the gap between what the CPU does and what your program tells it to do is very small; in C it's a bit bigger, and by the time you get to languages with runtimes such as Java/C#/Ruby/Python/Perl/PHP you're at best dealing with generating instructions for a virtual machine that then get converted into the real instructions your computer knows how to understand. Still, even with all that abstraction, we're fundamentally telling the computer to allocate memory, put values in that memory, copy that memory to other places, and so on. Object oriented programming puts yet another organizational layer on top of all of this but does not fundamentally alter the relationship between yourself and the machine. The difference between assembly and C# is conceptually less than the difference between C# and a functional language like Lisp. C# and Lisp will look syntactically more similar than ASM and C#, but how you go about building your app will be a much wider gap. This is one reason people find learning functional programming so difficult: it looks in many ways so similar syntactically, but requires you to think about your problem in a whole new way.

There is another view that comes at the whole business of writing programs from the opposite approach. You may have guessed that this is indeed functional programming. In this view, a function has nothing to do with memory and CPU opcodes and registers and such, a function is an algorithm that takes in one or more values and produces a result. Of course, to be useful this algorithm must be expressed in some sort of form that can be translated to bits that do indeed allocate memory and care very much about registers and CPU opcodes, but from the programmer’s perspective that sort of thing is an implementation detail. In the same way you probably don’t fret over not knowing exactly how the JVM or CLR interfaces with the operating system to enable “Hello World” to be printed out, a functional approach to programming doesn’t care about how exactly the runtime allocates memory and in many cases even how most of the functions in the program work.

Now this is the point at which you probably think that functional programmers are lazy or stupid or some amazing combination of the two. Why would willful ignorance be of any use to a professional who needs to build real applications that solve real (and hard) problems? It does indeed seem as if there is something magical about controlling where your data goes and being able to alter it at will. But consider this: we don't (generally) fret that our for loops won't get turned into efficient jump statements when translated to opcodes. The compiler has demonstrated the ability to do this competently for so long that no one worries about it anymore, although this was not always true. I can recall reading over many threads of argumentation where programmer A claims that the compiled output of compiler B is not as efficient as their hand-rolled assembly, and thus they should continue using assembly over C/Pascal/etc. This same argument happened with the transition from C to C++ and then from C++ to Java. At each point, we as programmers gave up some control over the specifics of the execution of our program and got in trade a lot fewer things to worry about. In the functional world, things have always been this way; the problem was that enabling a blissfully ignorant view of the program's execution was not particularly feasible for quite some time, leading to dreadfully slow functional programs. Thankfully this problem has largely been rectified, thanks to the amazing work of our electrical engineering brethren who give us faster and faster computers to run our terrible bug-ridden programs on. This excess computing power has allowed traditional imperative languages to take on more aspects that functional languages once embodied, such as garbage collection, and has conversely allowed functional languages to become fast enough to participate in our everyday computing needs.

TL;DR – Imperative languages started off fast and then got good, functional languages started off good and then got fast.

Clearly it’s not just the functional programming converts who abstract away parts of their systems so they don’t have to deal with the complexity; that is a fundamental requirement of programming a large, complex system using a human brain (or a team of human brains). A functional style adopts the mantra of “I don’t know, I don’t want to know,” if I may steal the phrase from Rich Hickey. Of course, in a functional program you do have the ability to look under the covers and see what every part is doing, but the point is that you don’t have to do this when you don’t actually need to. What is the secret to this amazing way of building programs, you ask? Ah, young padawan, you already know the answer: it is indeed the humble function.

Part 2: The function

Wha? You just spent like 10 minutes talking about functions in imperative languages and how functional has a better way of doing things, and now you’re telling me it was the same thing all along? Pretty much, with one eensy teensy little modification. Functions in a functional language want to only operate on their parameters. This may seem like a tiny, stupid distinction; of course functions operate on their parameters, why on earth would you pass in a parameter if you’re not going to use it? Ah, but the distinction here is only operate on their parameters. We can easily write a function that operates on its parameters but also touches some other variable along the way.

// assuming a C style language where globals exist, or
// alternately assuming all this is inside a class for a
// Java / C# style language
int x;

int Sum(int a, int b) {
    x = a; // side effect: modifies a variable outside the function
    return a + b;
}
}

As you can see, even though x was not passed into the function, it was modified as a result. This sort of thing is referred to as a “side effect”. When functional languages talk about functions, they usually mean “pure” functions, which are side-effect free. Some languages, such as Haskell, even go so far as to prevent side effects at the language level.

Once you have functions that are pure, you can begin to make assumptions about them. Given the same input they will always produce the same output, so adding memoization is safe and in many cases trivial to implement. Also, when you need to dive into a function to figure out why you’re getting the color Blue when you fully expected to get the color Red, you can safely confine your searching to the function itself. Since there can be no outside interference from other functions or variables, when something breaks inside a function, it stays inside that function. Needing to read fewer lines of code to understand what is happening is a pretty big motivator for our industry. It could be argued that this idea is one of the primary drivers of the adoption of OOP: grouping all the related functions (aka methods) and the variables they operate on in one place where you can see them as a whole. Pure functions do the same sort of thing at the per-function level.
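To make the memoization point concrete, here is a minimal sketch in Java (the names slowSquare and memoSquare are made up for illustration). Because the underlying function is pure, caching its result by argument can never change the program’s behavior, only its speed:

```java
import java.util.HashMap;
import java.util.Map;

public class Memo {
    // Stands in for any expensive *pure* function:
    // same input, same output, no side effects.
    static int slowSquare(int n) {
        return n * n;
    }

    private static final Map<Integer, Integer> cache = new HashMap<>();

    // Purity is what makes this cache safe: if slowSquare could read or
    // write outside state, a stale cached result could be wrong.
    static int memoSquare(int n) {
        return cache.computeIfAbsent(n, Memo::slowSquare);
    }

    public static void main(String[] args) {
        System.out.println(memoSquare(7)); // computed: 49
        System.out.println(memoSquare(7)); // served from the cache: 49
    }
}
```

Try memoizing the Sum function from earlier in this post and you hit the problem immediately: it writes to x, so two calls with the same arguments are not interchangeable.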

Part 3: Values

We’ve been talking about variables and functions, and I hope it’s beginning to become clear how a function in the imperative world differs from a function in the functional world. However, there is still one more thing functional languages want to stick their oh-so-mathematically-pure fingers into: your variables. This again is an area where the imperative and functional viewpoints differ. In the imperative world, a variable is a label for a location in memory. You can use the variable to get at the bits it refers to and to send new bits to reside there. The bits that you get or set at a particular memory address (aka variable) comprise some sort of data; at the simplest they are things like the number 9 or 5.2. There’s a good reason our languages let us do things like

int x;
x = 5;

but not

int x;
5 = x;

Assigning the bits that represent 5 to the memory address referred to by the name x makes sense. Changing the value 5 to be a different value does not. 5 is 5; it can be no other way. This sets up a variable/value dichotomy. We often conflate the two (they are often referred to as identity and value), but as you can see they are very different. For any given variable we can put a new value in that memory location, and now x is 7 instead of 5, or 23, or… well, whatever bits conform to the type rules of our compiler.

Functional languages do not generally operate this way. In a functional language, a variable name is bound to a value, and from then on that name refers to the value, not some memory address. Yes, of course, under the hood that is exactly what is happening, but conceptually we don’t care about that; we care that if I say x = 5, then as long as that variable is in scope, it will be 5. There can of course be other x’s in other functions, or in different calls to this same function, each with their own values bound to them; we’re not talking about some magical global variable that exists throughout your entire program, just a variable that is only assigned to once.

In C# you can get this behavior for primitive types: just add the readonly keyword and voila. Paired with functions that don’t rely on any outside variables to do their job, now you really have something. Unfortunately, readonly only works on member variables, not local variables, so while we’d like to do this

int DoSomething(int a, int b) {
    readonly int x = a + b;
    // ... hundreds of lines of code (please don't actually write functions this way) ...
    // I know for a fact that x is still a + b; it can not have changed
    return x;
}

That is unfortunately not possible. You can see, though, that if we could guarantee x never changed (either at the language level or via our own discipline and standards), that knowledge would allow us to look at a very small part of our function and reason about it more clearly; we wouldn’t have to be concerned with the preceding lines (except the one that set the initial value of x, of course).
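For what it’s worth, some imperative languages do give you exactly this guarantee for locals. Java’s final keyword, for example, does roughly what the hypothetical readonly local above wishes it could. A quick sketch, purely for illustration:

```java
public class FinalDemo {
    static int doSomething(int a, int b) {
        final int x = a + b;
        // ... imagine many lines of code here ...
        // x = 99; // would not compile: cannot assign to a final variable
        return x; // guaranteed to still equal a + b
    }

    public static void main(String[] args) {
        System.out.println(doSomething(2, 3)); // prints 5
    }
}
```

The compiler enforces the single assignment, so a reader can skip straight past those imagined hundreds of lines and still know what x holds.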

The terms you will hear for these concepts are mutable (as in changeable) and immutable (as in not changeable).

It is now that I will direct your attention to a talk by the aforementioned Rich Hickey, who does a fantastic job of explaining why the concept of values is so powerful and desirable in our applications.

No seriously, go watch the talk, it is amazing.

Part 4: Recap

Ok, so it’s been about 3000 words now, and so far we’ve covered… functions… and, um, variables. Yeah! Off to a good start, I’d say. All joking aside, functional programming is far less about new tools than it is about a different approach to how you think about your tools and how you put them together. Hopefully you have lots of questions, like: how on earth can you do something as simple as adding up a list of numbers if you can’t change the value bound to your variable? That is indeed a fantastic question, and it will be one of the topics in the next post.

Continued in part 2
