Wednesday, October 05, 2016

RubyConf 2015 - Messenger: The (Complete) Story of Method Lookup by Jay McGavren

wanna make time for questions
so I'll move at a brisk pace.
If I'm overwhelming you, just raise a hand,
I'll slow it down a little bit.
So, how y'all doing?
I'm Jay McGavren, I wrote this thing,
and this is Messenger: the (Complete) Story
of Method Lookup in Ruby.
This talk's gonna take place in two major parts,
the first is all about methods defined on classes,
including some places you might not expect them to be.
The second part's all about modules.
So, classes part.
We're gonna start in a slightly unusual place,
singleton methods.
Those are methods that are defined
on only a single object, if you've ever stubbed methods
out for a test, you've used singleton methods.
So, for example, we've got a WebSpider class here,
it sends out an actual HTTP GET request across the network
and returns whatever HTML it gets back
from the remote server, you probably don't want that
in your test, though, so this test right here,
you can see it creates a new WebSpider object,
it calls get, and that sends an actual network request.
You don't want your test to have to wait for that
and you don't want whatever random response
you get back from the remote server.
You want it quick and you want it predictable.
That's a good case for writing a singleton method
to override it.
So here's a quick singleton method,
we create a WebSpider object and we define a new get method
that just always returns a static HTML string.
It's fast, it's predictable.
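A minimal sketch of that kind of stub. The slides aren't reproduced in this transcript, so the WebSpider body, the URL, and the stubbed HTML below are assumptions made for illustration:

```ruby
require "net/http"

# Hypothetical stand-in for the WebSpider class described in the talk.
class WebSpider
  # Sends a real HTTP GET request and returns the response body.
  def get(url)
    Net::HTTP.get(URI(url))
  end
end

spider = WebSpider.new

# Singleton method: defined on this one object only, so no network call is made.
def spider.get(url)
  "<html><body>Stubbed page</body></html>"
end

puts spider.get("http://example.com")
p spider.singleton_methods          # => [:get]
p WebSpider.new.singleton_methods   # => []
```

Only the stubbed object is affected; any other WebSpider instance still uses the real get.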
Now, let's take a look at how method lookup
works with those.
So, to create a singleton method,
we need an object, for starters.
And before we can create an object,
we're gonna need a class, so, here's MyClass.
When we create an instance of MyClass,
behind the scenes, Ruby will create a singleton class
specific to that object.
And you can access any object's singleton class
via the singleton_class method.
So, there's the output you'll see
if you call singleton_class on our new object.
Now why does Ruby do this?
Chief reason is consistency.
The same logic that lets you call methods
on an object's class also lets you
call methods on its singleton class.
When we define singleton methods on an object,
they're added to its singleton class
and that class makes the methods available
exclusively to that object.
So we define a singleton method here.
And it'll live on the singleton class.
When you call a method on an object,
Ruby dispatches a message to that object,
looking for that particular method.
Ruby's first stop in its search for a method
is always gonna be the object's singleton class
cause that's what the object refers to first.
And as soon as Ruby finds a method
with a matching name, it'll invoke it.
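Here's a small sketch of what's been described so far, using the MyClass and my_method names from the talk (the return string is made up for illustration):

```ruby
class MyClass
end

object = MyClass.new

# Every object has a singleton class of its own.
p object.singleton_class   # inspects as #<Class:#<MyClass:0x...>>

# A singleton method is stored on that singleton class.
def object.my_method
  "my_method on the singleton class"
end

p object.singleton_class.instance_methods(false)  # => [:my_method]
p object.my_method
p MyClass.new.respond_to?(:my_method)             # => false
```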
Now, what would happen if we got rid
of that singleton method, and
defined a method by the same name on the class instead?
If we call my_method on the object,
Ruby will look on the singleton class first,
but there's no method by that name there.
Well, where can the method be found?
Each class maintains a pointer
to the next class that Ruby should look on for methods.
So, the ancestors method, that's your cheat sheet
for understanding the places that Ruby's going to look
for a given method.
And you can access that list of classes
that Ruby's going to look on via that ancestors method.
So, here we create an instance of MyClass.
Call singleton class on that to get the singleton class
and call the ancestors method on that
and we get this array in response.
Our singleton class is up first,
and it's followed by MyClass.
That's the place that Ruby's going to look next
when it doesn't find that method on the singleton class.
So, when Ruby fails to find my_method
on the singleton class, it gets directed
to the next class on the chain, MyClass.
Goes there, invokes the method, and we're good to go.
Now, suppose we were to move my_method again
to a superclass of MyClass.
So we define my_method up there on MySuperclass
and say that MyClass is a subclass of that.
We can create an instance of MyClass
which will get us a singleton class.
If we call ancestors on that singleton class again,
we can see all the places that Ruby will look for my_method.
It starts, of course, with the singleton class,
proceeds to MyClass as before,
and the new addition here is MySuperclass.
So, we've got our output from the ancestors method up there,
and you can see it exactly mirrors the places
that Ruby looks for the method,
it starts with the singleton class,
moves onto MyClass, moves onto MySuperclass
where it finds my_method and invokes it.
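That chain can be checked with a sketch like this one, assuming the class names from the talk (the return string is illustrative):

```ruby
class MySuperclass
  def my_method
    "my_method on MySuperclass"
  end
end

class MyClass < MySuperclass
end

object = MyClass.new
chain = object.singleton_class.ancestors

# The first three entries mirror the lookup order described above;
# Object, Kernel and BasicObject follow them.
p chain.first(3)
p object.my_method
p object.method(:my_method).owner   # => MySuperclass
```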
Now what happens if a method appears
in more than one place in the lookup chain?
For example, if we defined it on MySuperclass
and then defined it on MyClass as well.
If a subclass method
has the same name as a superclass method,
it'll override that superclass method.
Ruby will simply invoke the first method
with a name matching the one it's looking for and stop.
The my_method on MySuperclass never gets invoked.
But what if you want to call that overridden method?
Well, you could just use the super keyword
within the overriding method; that'll cause Ruby
to resume its search on the next class in the chain.
So it proceeds to MySuperclass,
finds my_method there, and invokes that as well.
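A short sketch of overriding plus super. Returning arrays of class names is just a device chosen here to make the call order visible; it isn't from the slides:

```ruby
class MySuperclass
  def my_method
    ["MySuperclass"]
  end
end

class MyClass < MySuperclass
  def my_method
    # super resumes the search at the next class in the chain.
    ["MyClass"] + super
  end
end

p MyClass.new.my_method   # => ["MyClass", "MySuperclass"]
```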
Okay, slides are all well and good, of course,
but I find that actually doing stuff in a terminal
makes things a little bit clearer,
let's play around with some classes from the terminal.
So, here we've got a superclass
and a subclass that derives from it,
whoops, I skipped right past it, I'm sorry.
This is a pre-recorded video,
y'all don't wanna watch me type in real time, trust me.
Okay, so we've got a superclass with my_method on it
that simply prints the class name,
and then we've got a subclass where
we'll override my_method and call super as well.
We create an instance of MyClass,
and then we're gonna override my_method
on the singleton class as well, and call super there.
We'll access the singleton class
and print the ancestors list for it,
and then invoke the method.
So if we drop back to our shell and invoke it,
you see our ancestors list starts with the singleton class,
proceeds to the subclass, and the superclass follows that.
And it exactly reflects the order
that the print statements follow when you call my_method.
Singleton class first, super invokes my_method on MyClass,
super there invokes my_method on MySuperclass.
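The recorded terminal session isn't part of this transcript, so what follows is a reconstruction under that assumption, not the speaker's actual script:

```ruby
class MySuperclass
  def my_method
    puts "MySuperclass"
  end
end

class MyClass < MySuperclass
  def my_method
    puts "MyClass"
    super
  end
end

object = MyClass.new

# Override on the singleton class as well, and keep the chain going with super.
def object.my_method
  puts "singleton class"
  super
end

p object.singleton_class.ancestors.first(3)
object.my_method
```

The three lines printed by my_method come out in the same order as the first three entries of the ancestors list.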
And what would happen if you were to call a method
that doesn't exist?
Well first, Ruby will search the entire ancestors chain
looking for the method you called.
So if we call a method named nonexistent,
and pass it the arguments one and two,
Ruby will proceed through the entire method lookup chain
looking for that nonexistent method,
but when it doesn't find it,
it'll call another method named method_missing,
and the search will start over on the singleton class.
It'll proceed through the entire lookup chain,
and there's a default version of method_missing
on the BasicObject class which all Ruby classes
inherit from.
That raises the exception that you're probably
used to seeing, NoMethodError.
But you can also override method_missing
in your own classes.
So, up here, on MySuperclass, we override method_missing.
We take three parameters, the name of the method
that was invoked, and two arguments.
So first, Ruby will look for the method name
that you actually call, that nonexistent method.
It'll go right past that method_missing definition,
looking for it.
And if it doesn't discover it, it'll start back over
at the singleton class looking for a method named
method_missing.
It'll find it there on MySuperclass and invoke it,
and, like I said, Ruby passes the name of the method
that you attempted to call to it as the first argument,
plus any arguments you called it with after that.
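A sketch of a method_missing override along those lines. It deviates from the slide as described in two ways, both assumptions made here: it collects the arguments with a splat instead of two named parameters, and it handles only the name nonexistent, passing everything else on with super so unknown methods still raise NoMethodError:

```ruby
class MySuperclass
  # name is the method that was called; args holds whatever was passed to it.
  def method_missing(name, *args)
    if name == :nonexistent
      "You called #{name} with #{args.inspect}"
    else
      super # falls through to the default, which raises NoMethodError
    end
  end

  # Keeps respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    name == :nonexistent || super
  end
end

class MyClass < MySuperclass
end

object = MyClass.new
p object.nonexistent(1, 2)   # => "You called nonexistent with [1, 2]"
```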
Okay, let's take a look at class methods.
You know class methods, they're methods you can call
on the class itself, without needing to create
an instance of it first.
So, up here, we defined a class method
on MyClass.
We say def self and the name of the method we want to define
instead of just def and the method name.
And then we can invoke that without
creating an instance of MyClass first,
we just say MyClass.class_method.
There's alternate ways to define class methods;
you can use the class constant instead of self
within the class body, or you can
define a class method outside of the class altogether,
again, using the class constant.
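The three ways of defining a class method just mentioned, side by side. The method names other than class_method are made up for illustration:

```ruby
class MyClass
  # 1. Using self inside the class body.
  def self.class_method
    "class_method"
  end

  # 2. Using the class constant inside the class body.
  def MyClass.another_class_method
    "another_class_method"
  end
end

# 3. Using the class constant outside the class body.
def MyClass.third_class_method
  "third_class_method"
end

p MyClass.class_method
p MyClass.singleton_methods.sort
# => [:another_class_method, :class_method, :third_class_method]
```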
Now let's take a moment and compare
that class method declaration where we use the class constant
to a singleton method declaration,
they look pretty similar, right?
The truth is, in Ruby, a class method
is just a singleton method on the class.
A class is just another object,
an instance of the Class class.
And since the class is an object,
it has a singleton class of its own.
That singleton class lets us define
singleton methods on it, of course.
Once you call a class method, Ruby will find it
on the singleton class and invoke it.
And you can confirm for yourself
that MyClass is an instance of Class
by calling class on it, and there you see the Class class
in response.
You can also call singleton_class on the class object
if you want to take a look at that singleton class.
So how does this notation work,
where you say def self and the class method name
within the class body.
Within the body of a class, Ruby sets self
to point to that class.
So, def self.class_method is exactly equivalent
to def MyClass.class_method.
And the result is the same, a singleton method on the class
that you can call.
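These claims are easy to check for yourself. In this sketch the SELF_IN_BODY constant is a hypothetical helper, there only to capture what self is inside the class body:

```ruby
class MyClass
  # Inside the class body, self is the class itself.
  SELF_IN_BODY = self

  def self.class_method
    "class_method"
  end
end

p MyClass.class                           # => Class
p MyClass::SELF_IN_BODY.equal?(MyClass)   # => true
p MyClass.singleton_class                 # => #<Class:MyClass>

# The class method lives on the singleton class of the MyClass object.
p MyClass.method(:class_method).owner == MyClass.singleton_class  # => true
```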
What about methods defined at the top level?
Within a Ruby source file, you can just declare methods
and call them without surrounding them in a class or module.
And you can call them immediately after defining them
without needing to create instances of any class first.
The way it works behind the scenes is pretty simple,
top-level methods just get defined
as private methods on the Object class.
And since all other classes inherit from Object,
you can call methods defined at the top level
from any instance method of any class.
So here we create MyClass,
define call_my_method within it,
and we could call that method defined at the top level
from within that instance method.
Ruby will start its search at the singleton class,
proceed all the way up to the Object class
where it finds my_method and invokes it.
But then how can you call the method at the top level?
After all, what we're invoking here
is a private method, and we're not within a class.
Well, the secret is, Ruby just sets self
to an instance of the Object class
when you're at the top level, meaning you don't have to
specify a receiver when you call the method.
You can see here that we print out self
and we get main in response, which,
if we check the class of that object,
the class is Object.
At the top level, self gets set to an instance of Object.
And since self is the implicit receiver of that method call
and it's an instance of Object,
the whole thing just works.
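A sketch of the top-level behavior, assuming it's run as a script file (irb treats top-level definitions a little differently). The return string is illustrative:

```ruby
# Defined at the top level of a script: ends up as a private method on Object.
def my_method
  "my_method at the top level"
end

class MyClass
  def call_my_method
    my_method # implicit receiver, so calling the private method is allowed
  end
end

p self        # main, when run as a script
p self.class  # Object, when run as a script
p my_method
p MyClass.new.call_my_method
```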
Alright, so that's everything for classes.
Now let's move on to modules, which work behind the scenes
a lot like classes.
You can define methods on them, after all.
And you can also use a module kind of like a superclass
by mixing the module into a class.
Doing so will add the module to the ancestors list
of the class.
So, if we take a look at the ancestors of MyClass,
you see MyClass comes up first,
followed by MyModule, kind of like if it was a superclass.
In fact, internally, Ruby will treat the module
as if it was a class.
That means the method lookup works the same
with a mixin as with a superclass.
So we create an instance of MyClass down there,
invoke my_method on it,
when we create that instance it creates
a new singleton class.
Ruby starts its search there, at the singleton class,
moves onto MyClass, because that was what was next
in the ancestors list, and then moves on to MyModule
where it finds my_method and invokes it.
Since mixins use the same lookup mechanism as classes,
all the same rules apply.
For example, method overriding,
we can override my_method within MyClass.
Ruby will encounter that first when you invoke the method.
And the my_method on MyModule never gets invoked.
Super works as well.
You can use super
within the overriding method in MyClass,
and that'll invoke my_method up on MyModule,
again, just as if it was a superclass.
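A sketch of a mixin behaving like a superclass, using the names from the talk. As before, the arrays are just a way to make the call order visible:

```ruby
module MyModule
  def my_method
    ["MyModule"]
  end
end

class MyClass
  include MyModule

  def my_method
    # super continues to MyModule, as if it were a superclass.
    ["MyClass"] + super
  end
end

p MyClass.ancestors.first(2)   # => [MyClass, MyModule]
p MyClass.new.my_method        # => ["MyClass", "MyModule"]
```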
So you can override a method from a module
with a method from a class, but not vice-versa,
not if you use include.
So let's suppose that we had a method
on MyClass that throws an exception,
and you want to override it using MyModule.
Unfortunately, if you use include in MyClass,
it's not gonna work.
The reason is that the include method
adds the module
after the class in the ancestors list.
Meaning that my_method on MyClass
overrides the method on the module.
And that's what Ruby encounters first,
it invokes it, and it throws an exception.
And we can confirm all this
if we call the ancestors method on MyClass.
You'll see that MyClass is first on the list,
followed by MyModule.
However, if you were to use the prepend method
instead of include to mix the module in,
that'll add it to the lookup chain before the class,
so, let's use prepend
and let's call ancestors again,
and you can see the difference.
MyModule appears first, followed by MyClass.
Prepend allows the module method
to override the class method.
So, we use prepend here.
Which makes the module appear first in the lookup chain,
causing my_method on MyModule to override
my_method from MyClass.
It finds that first, invokes it,
and no exception gets thrown.
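To compare include and prepend side by side, this sketch uses two hypothetical classes, IncludingClass and PrependingClass, in place of the single MyClass on the slides:

```ruby
module MyModule
  def my_method
    "my_method on MyModule"
  end
end

class IncludingClass
  include MyModule # module lands after the class in the chain

  def my_method
    raise "my_method on the class was called"
  end
end

class PrependingClass
  prepend MyModule # module lands before the class in the chain

  def my_method
    raise "my_method on the class was called"
  end
end

p IncludingClass.ancestors.first(2)    # => [IncludingClass, MyModule]
p PrependingClass.ancestors.first(2)   # => [MyModule, PrependingClass]
p PrependingClass.new.my_method        # => "my_method on MyModule"
```

Calling IncludingClass.new.my_method would still raise, because the class's own method is found first.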
Alright, back to the console.
Let's play around with modules a little bit.
So, we've got two modules here,
one of which we include in MyClass
and the other which we prepend.
We're gonna define a method named my_method
that simply prints the module name.
We're gonna define that same method
within the prepended module, but we'll use super there,
and we'll define it on the class as well,
and use super there as well.
We'll create a new instance of MyClass,
call ancestors on it, and then invoke my_method.
And if we drop back to the shell and run that,
you'll see that once again, our ancestors list
matches the order of output when we call the method.
The prepended module comes first,
then the class, then the included module.
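Again, the terminal recording isn't in the transcript, so this is a reconstruction. The module names IncludedModule and PrependedModule are assumptions, and the sketch asks the class for its ancestors list:

```ruby
module IncludedModule
  def my_method
    puts "IncludedModule"
  end
end

module PrependedModule
  def my_method
    puts "PrependedModule"
    super
  end
end

class MyClass
  include IncludedModule
  prepend PrependedModule

  def my_method
    puts "MyClass"
    super
  end
end

object = MyClass.new
p MyClass.ancestors.first(3)   # => [PrependedModule, MyClass, IncludedModule]
object.my_method
```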
Okay, let's wrap up our tour of modules with refinements.
The refinements feature was added in Ruby 2.0,
and improved in Ruby 2.1.
Basically, the goal of it, as hopefully
you all saw in James Adam's talk earlier,
was to make monkey patching safe again.
Let's go with an example just to make things
a little bit clearer, let's suppose you want to change
the way that the capitalize method
on String instances works.
So let's say you've got a bunch of movie titles,
and you want them in title case,
where each word is capitalized.
Unfortunately, the default version of capitalize
capitalizes only the first letter of the string.
So, you can reopen the String class if you want,
this sounds like a good idea, right,
and redefine the capitalize method.
That is, you monkey patch it.
And this works fantastically well on your MovieTitle class.
You can see that it takes the string,
The Matrix there, all in lower case,
and capitalizes each word of it.
Unfortunately, you may have other portions of your app
such as this Sentence class down here
that's assuming you still have the original implementation
of capitalize, which capitalizes only the first letter.
Doesn't work so good for sentences.
It's situations like this that refinements were created for.
So, you can take your monkey patch from the String class
and convert it to a refinement.
I won't go into details on all the syntax of refinements,
just google Ruby refinements and you'll find
several excellent tutorials.
Basically, this is a refinement of the String class here.
A refinement is basically a module
that gets prepended to an existing class,
but only within lexical scopes,
that's files, classes, or modules,
that explicitly activate it.
That's a bit of a mouthful, so let's just see
what it does in action.
Now that your refinement is defined,
you can activate it within the MovieTitle class.
That'll override the capitalize method,
but only within MovieTitle.
Outside of MovieTitle,
in lexical scopes where we haven't
explicitly activated the refinement,
we still get the original version of capitalize,
and only the first letter of our string gets capitalized.
So now, the version of capitalize you get
depends on your starting point, where the method is called.
In a scope where refinements haven't been activated,
the Sentence class for example,
it proceeds to the capitalize method on the String class,
the original version, and invokes that.
But, in a lexical scope where the refinement
has been activated, MovieTitle,
that prepends the refinement module
to the String class, and that's the version of capitalize
that Ruby encounters first, that's the version it'll invoke.
And we get each word capitalized.
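A sketch of the MovieTitle and Sentence example as a refinement. The module name TitleCase, the class bodies, and the title-casing logic are all assumptions, since the slide code isn't in the transcript:

```ruby
module TitleCase
  refine String do
    # Capitalize every word rather than only the first letter of the string.
    def capitalize
      split(" ").map { |word| word[0].upcase + word[1..-1].downcase }.join(" ")
    end
  end
end

class Sentence
  def initialize(text)
    @text = text
  end

  def to_s
    @text.capitalize # refinement not active here: original String#capitalize
  end
end

class MovieTitle
  using TitleCase # activates the refinement for the rest of this class body

  def initialize(text)
    @text = text
  end

  def to_s
    @text.capitalize # refined version: every word capitalized
  end
end

puts MovieTitle.new("the matrix")            # The Matrix
puts Sentence.new("the matrix is a movie.")  # The matrix is a movie.
```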
Alright, one more time back to the terminal
to play around with refinements.
So, here we've got a class and a refinement of that class.
And on MyClass, we'll define my_method,
which will simply print the name of that class.
We'll then override that method within the refinement,
so we print MyRefinement and then call super,
because, since a refinement is just a module,
super works just like you're used to.
We'll create a new instance of MyClass,
and call my_method, and then we'll activate the refinements,
and call my_method again.
All works fantastically well.
We'll also put in a couple of markers
so that we can tell whether the refinements
are active or not.
Okay, so now we'll drop back to the shell,
and invoke it, and you can see that
without refinements active, we get just MyClass,
all by itself, but after we activate the refinements
with using MyRefinement, we get MyRefinement first,
and then super invokes my_method on MyClass.
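A rough reconstruction of that demo, assuming it's run as a script, since using at the top level is only permitted there. It returns arrays instead of printing, which is a change made here so the two calls can be compared directly:

```ruby
class MyClass
  def my_method
    ["MyClass"]
  end
end

module MyRefinement
  refine MyClass do
    def my_method
      # super continues to the original my_method on MyClass.
      ["MyRefinement"] + super
    end
  end
end

object = MyClass.new

# Refinement not yet active at this point in the file.
without_refinement = object.my_method

using MyRefinement

# From here to the end of the file, the refinement is active.
with_refinement = object.my_method

p without_refinement   # => ["MyClass"]
p with_refinement      # => ["MyRefinement", "MyClass"]
```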
Alright, just about done.
I wanna recap everything we've covered real quick,
we'll have a few minutes for questions,
I wanna thank a few folks who've helped out with this talk,
and then I've got a few resources you can go to
to learn more.
So, to recap, you can call the ancestors method
on any Ruby class to get a list of its ancestors.
Every Ruby object has a singleton class
that Ruby will look on for methods first.
If a class inherits from a superclass,
that gets added to the ancestors chain
right after the class.
If you include a module in a class,
that will add the module to the ancestors chain
between the class and its ancestors.
And if you prepend the module,
that will add that module to the lookup chain
prior to the class, allowing it to override methods.
If you have a refinement, then in lexical scopes
where it's added, that gets prepended
to the ancestors chain before the class it refines.
You can define methods anywhere along this chain
that you want.
Ruby always starts its search for those methods
at the singleton class
and then proceeds along the chain
looking for those methods.
Soon as it finds a method, it'll invoke it.
If a method's defined at more than one point
along the chain, Ruby will stop
when it finds the first occurrence of that method
and invoke that.
And that's how method overriding is implemented.
Alright, that's all I have.
So what questions do you have?
Yes.
(audience member asks question inaudibly)
Yes, corr-
Okay, so the question is, if you include two modules
in a class, the one that you include second
comes before the one that you included first,
I believe I'm quoting you correctly there.
Do I know the reason for that,
I'm afraid that I don't, and that's why I
advocate using ancestors to confirm what order methods
will get called in.
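Checking with ancestors, as suggested, might look like this sketch (FirstModule and SecondModule are hypothetical names). The result fits the recap's description of include: each call places the module directly after the class, so a later include sits ahead of an earlier one:

```ruby
module FirstModule; end
module SecondModule; end

class MyClass
  include FirstModule
  include SecondModule
end

p MyClass.ancestors.first(3)   # => [MyClass, SecondModule, FirstModule]
```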
Yes.
(audience member asks question inaudibly)
What happens if you define a singleton class-
(audience member responds)
Right.
(audience member continues)
If you have a singleton class,
you get a singleton class of that
and define a method on that, does it still proceed
up the chain?
I'm not even sure what the code would look like
for that exactly.
(audience member responds)
I'm not sure that singleton class
is a method you can invoke on a singleton class,
that-
(audience member responds)
Okay.
(audience member continues)
Okay, I'm hearing back there that the singleton method
lookup chain can recurse, and it gets really messy,
which I totally believe, so.
(audience member responds)
Awesome.
Okay folks, we got people watching the video at home
that can't ask questions, please ask on their behalf.
Yes.
(audience member asks question)
That is a weird syntax, okay,
so the question is, if you go class
less than less than self, right,
do I have that syntax right?
Then yes, you can define methods within that scope.
They do wind up on the singleton class for that object.
And I can't remember what the official name
for that syntax is, I believe - yes?
(audience member responds)
Ah, okay, question is, is there a difference
between class and s-
(audience member responds)
So, the question is, is there a difference
between class shuffle, which is those two less than symbols,
right- oh shovel, shovel operator, yes, sorry.
Is there a difference between the class shovel self
syntax and def self dot method.
The end result is exactly the same, I can tell you that.
Whether Ruby treats them differently, I'm not sure.
In the references section,
I mention Ruby Under a Microscope.
The answers you need would be there,
I almost guarantee.
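The "end result is the same" part can be checked with a sketch like this. The method names are made up, and it says nothing about how Ruby handles the two forms internally:

```ruby
class MyClass
  def self.with_def_self
    "defined with def self."
  end

  class << self
    # Inside this block, self is the singleton class of MyClass.
    def with_class_shovel
      "defined inside class << self"
    end
  end
end

# Both methods end up on the same singleton class.
p MyClass.method(:with_def_self).owner == MyClass.singleton_class      # => true
p MyClass.method(:with_class_shovel).owner == MyClass.singleton_class  # => true
p MyClass.singleton_methods.sort   # => [:with_class_shovel, :with_def_self]
```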
Alright, going once.
Going twice.
Alright.
Thanks to a few folks who helped out with this presentation.
Avdi Grimm, right up here, did technical review on it,
thanks very much.
Thanks also to Pat Shaughnessy,
author of Ruby Under a Microscope,
and to James Adam, who did a talk earlier today
on refinements for lots and lots of info
regarding refinements.
Thanks also to these Wikimedia users,
and Openclipart users for artwork that got used
in the course of the presentation.
And now, here are those helpful resources I promised you.
Great place to start is The Well-Grounded Rubyist
by David Alan Black.
It starts out with singleton classes,
moves on to modules, basically
everything you want to know
regarding method lookup is covered in there,
it's a fantastic resource.
If you want to go even deeper,
look up Ruby Under a Microscope by Pat Shaughnessy.
Basically, that's a tour of the Ruby language
from the perspective of its C source code.
Once you understand that, you will fully understand Ruby,
believe me.
And, finally, I want to mention the RubyTapas episode
on refinements, also by Avdi.
He made it freely available, it's a fantastic guide
to refinements in general.
Also, once James Adam- he just recorded it earlier today,
so I didn't get a chance to include it in the slides,
but James Adam's talk on refinements,
once that hits Confreaks, another fantastic resource
regarding refinements.
Quick plug, if you want to get your friends
hooked on Ruby, Head First Ruby
is a great resource for you, find me during RubyConf,
I have little cards with discount codes on the book,
I also have a few up here at the podium.
That's all I got.
Thank you very much, folks.