Chapter 03: 순수 함수를 통해 진정한 행복을 만나보세요
순수해지기
우리가 당장 해야할 건 순수 함수의 정의부터 살펴보는 것이다.
순수 함수란, 같은 인풋을 주면 항상 같은 결과값을 반환하고 관측가능한 부작용이 전혀 없는 함수를 말한다.
slice
와 splice
를 봅시다. 이 두 함수는 완전히 동일한 일을 합니다. (방법은 완전히 다르지만 그래도 같은 일을 한다고 볼 수 있습니다.) 우리는 slice
는 동일한 입력에 대해 매번 동일한 출력을 보장하기 때문에 순수 하다고 부릅니다. 어쨌거나, splice
는 배열을 잘근잘근 씹어먹은뒤에 영원히 변경된 채로 뱉어냅니다. 이게 바로 관측가능한 영향이지요.
함수형 프로그래밍에서는, splice
같이 데이터를 변형하는 다루기 힘든 함수들을 싫어합니다. 이러한 경향은 함수형 프로그래밍에서 splice
처럼 지저분한 함수가 아닌, 항상 같은 결과를 반환하는 믿음직스러운 함수들을 위해 맞서싸우고 있기 때문에 절대 바뀌지 않을 겁니다.
다른 예제를 살펴보겠습니다.
반대쪽의 순수한 코드를 보면, 완전히 자급자족하고 있죠. 우리는 minimum
의 상태가 절대 바뀌지 않을 것이기 때문에 이것의 순수성을 보존하기 위해 불변으로 만들 수도 있습니다. 이렇게 하려면, 객체를 생성하고 얼려야 하죠.
부작용에는 이런 것들이 포함될 수 있어요
우리의 직관을 키우기 위해 이 "부작용"들을 좀 더 살펴봅시다. 그래서 대체 순수 함수의 정의에서 이야기하는 의심할 여지없이 범죄적인 *부작용(side effect)*이란 어떤 것일까요? 우리는 함수 실행중에 반환값 계산을 제외하고 발생하는 모든 것을 작용(effect) 이라고 봅니다.
작용(effect)는 본질적으로 사악한것이 전혀 아니며, 우리는 앞으로 남은 챕터들에서 계속해서 사용할 겁니다. 부(side) 라는 부분이 부정적인 의미를 담고있는 부분입니다. 보통 물에서는 벌레가 살지 않지만, 고여있는 물은 좋은 인큐베이터와 같아서 벌레들이 군집을 형성할 수도 있습니다. 그리고 장담건대, 부작용 또한 여러분의 프로그램에서 이와 비슷한 훌륭한 번식지가 됩니다.
부작용은 반환값 계산 도중 발생하는 시스템 상태의 변화 또는 외부 세계에 발생하는 관측 가능한 상호작용 입니다.
Side effects may include, but are not limited to
changing the file system
inserting a record into a database
making an http call
mutations
printing to the screen / logging
obtaining user input
querying the DOM
accessing system state
And the list goes on and on. Any interaction with the world outside of a function is a side effect, which is a fact that may prompt you to suspect the practicality of programming without them. The philosophy of functional programming postulates that side effects are a primary cause of incorrect behavior.
It is not that we're forbidden to use them, rather we want to contain them and run them in a controlled way. We'll learn how to do this when we get to functors and monads in later chapters, but for now, let's try to keep these insidious functions separate from our pure ones.
Side effects disqualify a function from being pure. And it makes sense: pure functions, by definition, must always return the same output given the same input, which is not possible to guarantee when dealing with matters outside our local function.
Let's take a closer look at why we insist on the same output per input. Pop your collars, we're going to look at some 8th grade math.
8th Grade Math
From mathisfun.com:
A function is a special relationship between values: Each of its input values gives back exactly one output value.
In other words, it's just a relation between two values: the input and the output. Though each input has exactly one output, that output doesn't necessarily have to be unique per input. Below shows a diagram of a perfectly valid function from x
to y
;
To contrast, the following diagram shows a relation that is not a function since the input value 5
points to several outputs:
Functions can be described as a set of pairs with the position (input, output): [(1,2), (3,6), (5,10)]
(It appears this function doubles its input).
Or perhaps a table:
1
2
2
4
3
6
Or even as a graph with x
as the input and y
as the output:
There's no need for implementation details if the input dictates the output. Since functions are simply mappings of input to output, one could simply jot down object literals and run them with []
instead of ()
.
Of course, you might want to calculate instead of hand writing things out, but this illustrates a different way to think about functions. (You may be thinking "what about functions with multiple arguments?". Indeed, that presents a bit of an inconvenience when thinking in terms of mathematics. For now, we can bundle them up in an array or just think of the arguments
object as the input. When we learn about currying, we'll see how we can directly model the mathematical definition of a function.)
Here comes the dramatic reveal: Pure functions are mathematical functions and they're what functional programming is all about. Programming with these little angels can provide huge benefits. Let's look at some reasons why we're willing to go to great lengths to preserve purity.
The Case for Purity
Cacheable
For starters, pure functions can always be cached by input. This is typically done using a technique called memoization:
Here is a simplified implementation, though there are plenty of more robust versions available.
Something to note is that you can transform some impure functions into pure ones by delaying evaluation:
The interesting thing here is that we don't actually make the http call - we instead return a function that will do so when called. This function is pure because it will always return the same output given the same input: the function that will make the particular http call given the url
and params
.
Our memoize
function works just fine, though it doesn't cache the results of the http call, rather it caches the generated function.
This is not very useful yet, but we'll soon learn some tricks that will make it so. The takeaway is that we can cache every function no matter how destructive they seem.
Portable / Self-documenting
Pure functions are completely self contained. Everything the function needs is handed to it on a silver platter. Ponder this for a moment... How might this be beneficial? For starters, a function's dependencies are explicit and therefore easier to see and understand - no funny business going on under the hood.
The example here demonstrates that the pure function must be honest about its dependencies and, as such, tell us exactly what it's up to. Just from its signature, we know that it will use a Db
, Email
, and attrs
which should be telling to say the least.
We'll learn how to make functions like this pure without merely deferring evaluation, but the point should be clear that the pure form is much more informative than its sneaky impure counterpart which is up to who knows what.
Something else to notice is that we're forced to "inject" dependencies, or pass them in as arguments, which makes our app much more flexible because we've parameterized our database or mail client or what have you (don't worry, we'll see a way to make this less tedious than it sounds). Should we choose to use a different Db we need only to call our function with it. Should we find ourselves writing a new application in which we'd like to reuse this reliable function, we simply give this function whatever Db
and Email
we have at the time.
In a JavaScript setting, portability could mean serializing and sending functions over a socket. It could mean running all our app code in web workers. Portability is a powerful trait.
Contrary to "typical" methods and procedures in imperative programming rooted deep in their environment via state, dependencies, and available effects, pure functions can be run anywhere our hearts desire.
When was the last time you copied a method into a new app? One of my favorite quotes comes from Erlang creator, Joe Armstrong: "The problem with object-oriented languages is they’ve got all this implicit environment that they carry around with them. You wanted a banana but what you got was a gorilla holding the banana... and the entire jungle".
Testable
Next, we come to realize pure functions make testing much easier. We don't have to mock a "real" payment gateway or setup and assert the state of the world after each test. We simply give the function input and assert output.
In fact, we find the functional community pioneering new test tools that can blast our functions with generated input and assert that properties hold on the output. It's beyond the scope of this book, but I strongly encourage you to search for and try Quickcheck - a testing tool that is tailored for a purely functional environment.
Reasonable
Many believe the biggest win when working with pure functions is referential transparency. A spot of code is referentially transparent when it can be substituted for its evaluated value without changing the behavior of the program.
Since pure functions don't have side effects, they can only influence the behavior of a program through their output values. Furthermore, since their output values can reliably be calculated using only their input values, pure functions will always preserve referential transparency. Let's see an example.
decrementHP
, isSameTeam
and punch
are all pure and therefore referentially transparent. We can use a technique called equational reasoning wherein one substitutes "equals for equals" to reason about code. It's a bit like manually evaluating the code without taking into account the quirks of programmatic evaluation. Using referential transparency, let's play with this code a bit.
First we'll inline the function isSameTeam
.
Since our data is immutable, we can simply replace the teams with their actual value
We see that it is false in this case so we can remove the entire if branch
And if we inline decrementHP
, we see that, in this case, punch becomes a call to decrement the hp
by 1.
This ability to reason about code is terrific for refactoring and understanding code in general. In fact, we used this technique to refactor our flock of seagulls program. We used equational reasoning to harness the properties of addition and multiplication. Indeed, we'll be using these techniques throughout the book.
Parallel Code
Finally, and here's the coup de grâce, we can run any pure function in parallel since it does not need access to shared memory and it cannot, by definition, have a race condition due to some side effect.
This is very much possible in a server side js environment with threads as well as in the browser with web workers though current culture seems to avoid it due to complexity when dealing with impure functions.
In Summary
We've seen what pure functions are and why we, as functional programmers, believe they are the cat's evening wear. From this point on, we'll strive to write all our functions in a pure way. We'll require some extra tools to help us do so, but in the meantime, we'll try to separate the impure functions from the rest of the pure code.
Writing programs with pure functions is a tad laborious without some extra tools in our belt. We have to juggle data by passing arguments all over the place, we're forbidden to use state, not to mention effects. How does one go about writing these masochistic programs? Let's acquire a new tool called curry.
Last updated