Wednesday, October 05, 2016

RubyConf 2015 - A Muggle's Guide to Tail Call Optimization in Ruby by Danny Guinther

Good morning everybody.
Going to try to start just a few seconds early
because chances are I will go over time.
But I'd also like to give you guys an opportunity
to ask questions
in the event that you can't read any of the slides
or anything like that.
Be able to shout at me and say,
"Hey, could you read that for me because..."
Lots of good stuff.
All right.
So the first thing I want to ask you guys
is how many of you guys went
to Aaron's keynote yesterday?
Nice.
That's great because I was so psyched
about his keynote because it was such
a great introduction to a bunch of things
we're going to talk about today.
So without further ado, let's do that.
So today, my presentation, as you all guys know,
clearly, because you're all here,
is A Muggle's Guide to Tail-Call Optimization in Ruby.
I'll get into what I mean by that a little bit more
in just a second, but first let's consider
some alternate titles I was thinking of.
Harry Potter and the Well of Eternity.
Because SystemStackErrors can be a nightmare,
especially those backtraces that go on for days.
And I just kind of wanted to get this out upfront:
Harry's a Parseltongue, and his scripting language
of choice is probably going to be Python, not Ruby.
So if we can start to kind of come to terms with that now,
that'd be great.
So, kind of actually in that vein, I want to also
give you a warning.
Maybe not a warning, but like
for whatever reason, this topic
like I guess quite a few topics in our industry,
gets pretty religious at times,
whether it's like Vim or Emacs.
In this case, we run into that divide
between functional and object oriented languages.
So if you'll indulge me for a moment,
I'd like you to imagine, if you dare,
a world without MINASWAN.
And for those of you who don't know what MINASWAN is,
it stands for "Matz is nice and so we are nice."
I don't know where that picture came from.
I actually can't find it on the internet any more
now that I found it
It's Matz with a Matz puppet.
What the hell?
All right, so, from the blog of Guido van Rossum,
who, for those of you who may not know,
is the creator of Python:
back in 2009 there was a lot of talk going on
about tail call optimization.
Here are just a few select quotes, comments
from his blog related to it.
"How very pythonic of you to make up stupid reasons
"for not implementing a very simple optimization.
"This is very pythonic because it shows
"poor decision making, poor performance,
"and immature ideology."
So that's very clearly somebody who likes the more
functional type of things and just can't believe
that Python is not going to include it.
Total opposite end of the spectrum.
"Refreshing that you stuck with your intuitions
"rather than submitting to these TCO-requesting
"functional fiends."
It blows my mind a little bit that a language feature
can incite so much argument.
The good news is no matter which side you are on
in this particular matter, because it doesn't matter,
you're going to be right.
Because everything,
it's all about your point of view, right?
So as Obi-Wan will tell you here,
everything kind of depends on your point of view.
I also should warn you there's a few
Star Wars references in here despite
the Harry Potter theme of this talk
because I am so excited for next month.
But, ah, stay focused.
Stay on topic.
Move along.
So, all right, good.
I'm glad you guys laughed because somebody
was threatening that I would get a Twilight crowd
and I am not prepared for that.
So. (laughs)
What the hell is a muggle?
So for those of you who may not know,
a muggle in the Harry Potter books is someone
who lacks magic ability.
Now if you're wondering why you care about this
in this particular context,
the way I want to present this talk is
without magic.
And by magic, I kind of mean like,
I think you most often hear it with Rails,
like Rails magic.
It's doing all these magical things behind the scenes
for you, and on the other side of that magic
emerges something wonderful.
But you have to suspend disbelief in that middle process.
But if you're one of those people who loves magic,
I don't mean to exclude you either.
That works.
But it's going to be up to you to decide where
the line between knowledge and magic falls.
Because as much as I'd like to explain to you every
minute detail of this, we only have so much time.
And I just can't do it.
So, let's get down to our subject.
So, what is a tail-call?
Now if when you hear the term, tail call
you think about certain types of phone calls
that happen between midnight and three a.m.,
I'm going to need you to put down the Bud Light Lime,
focus, and let's do this.
So the dictionary-style definition of a tail call is
a subroutine call performed as
the final action of a procedure.
So that's a little bit terse.
So why don't we like look at this in action?
So here's kind of like your canonical example.
Mostly canonical because it is just about
as simple as can be.
You have this method, a call in tail position,
and you have another method inside of it
that is very clearly the last thing that
the outer method does before it is complete.
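For reference, here's a minimal sketch of that canonical example (the method names are just illustrative):

def call_in_tail_position
  other_method   # the very last thing this method does, so it's a tail call
end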
To contrast that with kind of your canonical counterexample,
you have another method, not a call in tail position,
which calls other method and then adds one to it.
So this kind of defies that definition we looked at
because other method is not the last bit of action
that is happening in this method.
You have to still add one.
And I know there are some clever folks among you
who are thinking, "Ah, but what if we do
"one plus other method?"
Still not a tail call because again
it's not necessarily that it has to be
the last thing at the end of the line
or on the last line.
It really just needs to be the last thing
that happens in that method before it's done.
And in this case, both cases, once the result of
other method comes back, we still have to add one to it.
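A minimal sketch of those counterexamples (again, names are illustrative):

def not_a_call_in_tail_position
  other_method + 1   # other_method returns first, then we still add 1, so no tail call
end

def also_not_a_tail_call
  1 + other_method   # same problem: the addition happens after other_method returns
end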
So the kind of interesting thing about this
is it means there are only certain circumstances
in which you can make a tail call.
So here are some examples of those kinds of
positions where a call still counts as a tail call.
So in this case we have, like, in the middle of a method,
not even at the end, kind of behind an if statement,
we have this return some_call.
This is actually a valid tail call
because it's attached to a return.
So really the last thing this method does
is this call to some_call.
Another one here kind of like those counterexamples
we just looked at is this call with
an expression inside the arguments.
So this is different from the one we looked at
before because by the time we actually call
other_call, the expression my_int + 1
will have already been evaluated.
So other_call will really be our last operation there.
You can also attach it to boolean operations,
so you have false or other_call.
That would work, as well as true and other_call,
because in both of those cases
that other_call is really just going to be
the last thing that happens in the method.
And a final example here, and kind of what gets us
into some of the real power that comes with
tail call optimization, is this recursive_count call,
which really just recursively will keep calling itself
until it gets down to zero.
So you kind of get like a five, four, three, two, one, zero
type thing.
Again this one I think is a little bit closer
to that first example we looked at where
it's not too complex.
You can really see that recursive count is the last thing,
the last operation that will happen in this function
before it returns.
And there are many more, which I will be happy
to talk about later if you're interested.
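Here's a rough sketch of those tail-call-friendly positions, loosely following the slides (method names are illustrative):

def early_return(my_int)
  return some_call if my_int.zero?   # behind an if, but the return makes some_call a tail call
  other_call(my_int + 1)             # my_int + 1 is evaluated first, so other_call is a tail call
end

def boolean_tail
  false || other_call                # other_call is still the last thing that happens
end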
So let's take a moment here
to pause and reflect on recursion.
So if we were to look at that recursive
count we were just looking at again,
you can kind of see that the output
of this can be, as I said, five down to zero.
And if we unwind the stack, we can kind of see
the recursive count gets called,
it puts five.
Checks to see if that's zero.
And then calls itself again with four.
And it continues all the way down
until you have your base case,
where you'll hit zero and you'll exit it afterwards.
When you have recursion plus a tail call like this,
you end up with what is called tail recursion.
And so it's a special kind of recursion
that's particularly useful when combined
with tail call optimization
because it allows you to do
some really neat things from a kind of like
functional perspective.
Like if you're familiar with Church's lambda calculus,
one of the things it was able to
take advantage of is this kind of
tail recursion to get by with
far fewer operators than some of
the modern languages that we use today.
And as we'll see in a moment,
you can actually totally replace loops
if you have tail call optimization.
In this case, you have, side by side,
two methods really doing the same thing,
but in one case it uses recursion to
implement that loop and tracks the state
of the loop in the arguments to the call,
versus iterative_count, which tracks the state
inside of the actual method body.
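Roughly, the two versions side by side might look like this (a sketch, not the exact slide code):

def recursive_count(from)
  puts from
  return if from.zero?
  recursive_count(from - 1)      # the loop state is carried in the argument
end

def iterative_count(from)
  from.downto(0) { |i| puts i }  # the loop state lives inside the method body
end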
So the approaches are equivalent,
but there's a catch.
So the thing is tail recursion heavily depends
on having tail call optimization in many cases.
Because there's some things functionally you just can't do
if you don't have tail call optimization
because you'll just run out of stack eventually.
Depending on things like the arguments
your method takes in Ruby,
whether it takes a block, uses special variables,
and some of those types of things,
usually it's somewhere under 10,000 frames
by default before you run out of stack.
So for those of you kind of getting up in arms
again about our flame war here,
chill out.
Let's just continue.
So why don't we actually get an idea of what
tail call optimization is.
And perhaps you've also heard of this
other phrase.
What is tail call elimination?
Well let's look at both of those.
So the trick is actually that they're
really kind of the same thing.
Tail call elimination, I secretly want it to be
kind of like a, I don't know,
tail-call-destroying Terminator.
I don't know why someone on the internet made up
Harry Potter Terminator.
I swear it was not me.
I guess I shouldn't be surprised.
So the trick is that the tail call elimination
really is kind of the optimization.
So it's getting rid of the additional
stack frames that you need when making
a tail call.
That is the optimization.
But I'm getting a little bit ahead of myself here,
so let's actually pause for a confession
before kind of digging into it.
So you'll remember before we had
our recursive count function.
And, again, kind of as Obi-Wan alluded to,
this is only true from a certain point of view.
And the point of view I've actually depicted here
is an environment that already has
tail call optimization enabled.
The thing is I didn't mention to you that Ruby
is not tail call optimized by default.
But we'll get more into that later.
So what's really happening in reality is
each of those calls, as you can kind of see
by the indentation here,
is adding another stack frame,
and allocating more memory and what have you,
so it can execute that method.
Which is fine if you're counting down from five.
But if you're counting down from even like
10,000, like I'm not sure who really thinks
that this is like a good way to move forward
with trying to use recursion in any language, really.
So you can see here kind of your typical
"stack level too deep" error that you'll run into.
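Without tail call optimization, that looks something like this (the exact depth where it blows up depends on your configured stack size):

recursive_count(100_000)
# => raises SystemStackError: stack level too deep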
And again we kind of come back to, oops,
this quote, whoa.
So a lack of tail call optimization here
has proven that in this case we really can't
use this recursive solution because we just
run out of the stack.
But on to actual tail call optimization to the rescue.
So let's actually figure out this secret magic
that is tail call optimization
and how it helps some of these situations.
So as we discussed before, a tail call is by definition
the final action of a procedure,
which means that we can make some
assumptions about how our program
is going to work kind of after making that call.
And the most important of these ideas
that we can kind of derive from this
is that nothing in the calling method is needed anymore.
So let's hold on to that idea
and kind of figure out why we might be
holding on to it if we don't need any of it.
So, actually, there's a guy,
Guy L. Steele Jr., who I think was one of the
original authors of the Scheme language,
who has this quote that I think also kind of
frames this up in a good way.
"While we may prefer to think of a procedure
"call as meaning 'go do this other module
"and come back', that is only one possible interpretation.
"We must recognize that there are situations
"where coming back doesn't matter.
"And these situations may be exploited."
And that again I think is kind of
the crux of tail call optimization.
So again kind of focusing on
if we don't need this parent stack frame,
why are we even keeping it around?
There are two main reasons.
One is if you'll kind of remember
Aaron yesterday mentioned that
in the YARV VM, keeping the stack around
allows you to see your path through the VM.
If you do kind of like a puts caller from IRB
or in a debugging session,
what you're going to see is really those
stack frames that are being kept around.
The other thing it's useful for is
keeping a record of where the result
of each method needs to go.
So the result needs to get passed back, eventually,
down to wherever it needs to get to.
So of those two things, I'll say something
perhaps controversial, but keeping a record
of the execution stack, I think, is just a convenience.
That's really nice for debugging, but in reality,
it's not something that's needed
by any means for the program.
So if we, you know, skip all the arguing
and pretend like we don't care about
the complete call stack,
what can we do to try to avoid that
and maybe cut it down to only one reason
we need to keep these other frames around?
So is there any way we can get around
needing to know where to return the result of the tail call?
As it turns out, no.
Thank you guys for coming.
Enjoy the rest of RubyConf.
Sorry, I don't know.
Rhetorical questions.
Who knows?
So if the stack frame of the tail call
needs to know to return its
result to its parent anyway,
why don't we just like circumvent the whole stack
and just jump straight from like your deepest call
to where it needs to get to?
And that is really like the magic of tail call elimination.
You can avoid some of those stack frames
and save a lot of computation and memory needed
and enable a lot of functional solutions.
So I kind of went through that quickly.
Any questions right now?
Tylenol?
Anything?
Pause for a drink while you guys think of your questions.
All right.
I'm either doing a really good job
or I lost you guys all five minutes ago.
Oh, sir.
- (indistinct talk)
So the question was I said before it was not
enabled by default and the gentleman suggested compiling
your own MRI to get it to work.
We'll actually get into a bunch of ways
to make it work in just a moment.
So, if I can ask you to bear with me.
So one thing to kind of clear up before moving on,
why stop kind of with that?
If you're going to just like circumvent
the parent anyways, why create a new stack frame
and then, I don't know, cut the parent out after that.
In practice, what really happens is
you don't end up circumventing the parent.
You just reuse that same frame
to do the next thing that you're going to do.
Something about "elimination" to me kind of
suggests that there's this extra stack frame
that you avoid creating, or
I guess this extra stack frame that you get rid of.
But the reality is you just reuse it.
It's kind of like recycling.
It's like green programming.
So here's kind of our countdown function again.
On the left, you can see what it looks like
without tail call optimization.
You go three, two, one, zero.
But on the right, with tail call optimization,
you can see, we're just reusing one stack frame
to go through every one of those iterations.
So that's all well and good,
but as this guy said, "If it's not Ruby, then
"no offense, but I really don't care."
So about that: Ruby is tail call optimized,
as it turns out, sort of.
It's actually been built into the Ruby VM since Ruby 1.9.2.
But it's actually never been enabled by default.
There is some discussion of perhaps enabling it by default
around the time that Ruby 2.0 came out,
but nothing ever came of it.
So let's look at how we can get in there and play with it.
Though this should be a warning for you here
as the dragon will illustrate.
I imagine not a lot of people are using this
in production, so don't ship it to production
and then write me later and say,
"Listen, man.
"You said this was going to be great,
"and it was not great."
I don't know.
Test it out.
I think there's a lot of opportunity for it,
and I've not run into any weirdnesses
where you suddenly run into seg faults
or anything like that.
But again I would say use caution.
So there are a few different ways
that you can enable this.
So as this gentleman back here suggested
you can compile your Ruby with the flags switched around
so that it's enabled by default.
You can get into RubyVM::InstructionSequence,
which Aaron mentioned yesterday.
And it's actually one of those optimization
options in there.
You can also create your own instance
of RubyVM::InstructionSequence.
Like, there's a global set of compile options versus per-instance ones.
Probably best not to screw with the global one,
though sometimes it's handy because
things like require and load will use
the global compile options to compile that stuff.
Whereas even if you set up
an instruction sequence with tail call optimization enabled,
if you call require or load from it,
you're not going to get tail call optimized code loaded.
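For reference, flipping the global compile options looks roughly like this (it affects anything compiled afterwards, including code pulled in with require and load):

RubyVM::InstructionSequence.compile_option = {
  tailcall_optimization: true,
  trace_instruction: false
}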
And so the last one here,
full disclosure,
I am the author of this gem.
I created this to play around with tail call optimization.
There's a tco_method gem that allows you to
experiment with different APIs
to try to make it accessible and kind of feel normal.
So, I'd like to pause here for awe
because there's something to me that I think
is really astounding about a VM that you can like
have tail call optimization enabled over here
but not over here.
And, I don't know.
That's pretty awesome, and I think really speaks
to the power as you kind of saw with some of the
C code that Aaron dug into yesterday,
the complexity of the VM that Ruby runs on top of.
So, how to patch your Ruby to have
tail call optimization enabled by default.
I don't necessarily suggest that you just go out there
and take anyone's patch from the internet
to modify your RubyVM,
but you can trust me.
I'm a software engineer.
What can go wrong?
So, this is really the diff.
You really just need to flip two flags.
Something worth noting is that when
tail call optimization is enabled,
you can't use the trace instruction,
which you may be familiar with
as set_trace_func.
I don't think I've actually ever seen anyone use it,
but I'm sure it's out there somewhere.
As I mentioned, RubyVM::InstructionSequence is also
kind of a gateway to getting in there.
Won't go into great detail about that class itself,
but there's a lot of like kind of neat things
you can do to kind of like, I don't know,
get in there and like play with the VM
and get an idea of what's going on.
So I definitely encourage you to look at it.
Aaron yesterday mentioned the book,
Ruby Under a Microscope by Pat Shaughnessy.
I would definitely recommend that to you.
There's a few things he gets into in there
which I feel like will help you better understand
how you can use this to get an idea of what's going on
under the covers.
But also I think will give you a ton of insight
into how the RubyVM is running
and again it's kind of one of those
awe-type moments where it's really
like, "This is crazy."
This is how you could selectively
compile code at run time
using RubyVM::InstructionSequence,
just if you wanted to go totally vanilla
and not pull a gem, another dependency, into your project.
Hold on to this instruction sequence options constant
up here at the top because I'm not going to
repeat it later, just to make things easier
for you guys to read.
But again, you can kind of see it's really the same thing
we were doing in the diff where we
turn on tail call optimization and have to
turn off the trace instruction.
And it's worth noting that even if
you turn on tail call optimization but
don't also turn off the trace instruction
you will not get tail call optimization.
I guess the trace instruction kind of
like overrides it, which is something to watch out for.
There were many times when I was like,
"Why doesn't this work?"
But, that's the trick.
So in this case,
let's actually like see if we can look at
like a real kind of example of how that works.
So one thing I find
kind of clunky about this particular
approach, as you may have noticed
in Aaron's talk yesterday,
is that you have to just pass it the raw code strings,
which this particular code highlighter
is doing a good job, or
I'll say a bad job actually, because it's
highlighting a string as though it were code.
But in Vim or something else like that,
you'd probably just have a whole block
that looked like a string there.
But, yeah, you have to take just this
string and then pass it to RubyVM::InstructionSequence
to evaluate it
and actually turn it into a runnable bit of code
that has tail call optimization enabled.
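A minimal sketch of that approach, passing a code string plus the compile options to RubyVM::InstructionSequence (the factorial is just an illustrative example):

iseq = RubyVM::InstructionSequence.new(<<-RUBY, nil, nil, nil, tailcall_optimization: true, trace_instruction: false)
  def recursive_factorial(n, acc = 1)
    return acc if n <= 1
    recursive_factorial(n - 1, acc * n)
  end
RUBY
iseq.eval   # defines recursive_factorial, compiled with tail call optimization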
So to try and combat that, I try to encapsulate
at least some of the stuff of actually
generating that instruction sequence object.
Here you still have to use code strings,
which I think are always going to be clunky.
But, I don't know, I guess you at least have
to have a little less knowledge of what's
required to actually make this work
recursively, or with tail call optimization.
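With the gem, that ends up looking something like this (assuming the gem's tco_eval entry point; check the tco_method README for the exact API):

require "tco_method"

TCOMethod.tco_eval(<<-RUBY)
  def recursive_factorial(n, acc = 1)
    return acc if n <= 1
    recursive_factorial(n - 1, acc * n)
  end
RUBY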
Another API I've been playing around with
for this is to use like a mixin that does
kind of like method decorators.
So you can see here, it extends this TCOMethod mixin,
which gives the module this tco_module_method method.
But after I've defined recursive_factorial in this module,
I can then use this like method decorator to say,
"Hey, that should actually be compiled with
"tail call optimization."
And behind the scenes, I don't know,
it's kind of hack-y so again probably not the
production quality stuff, but it'll go
find that method, find the source for it,
and basically run through everything we just
saw with the VM instruction sequence.
Which I think personally is definitely prettier
than a SystemStackError.
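A sketch of that mixin style, as best I can reconstruct it from the talk (the exact method names may differ slightly from the gem):

module MyMath
  extend TCOMethod::Mixin

  def self.recursive_factorial(n, acc = 1)
    return acc if n <= 1
    recursive_factorial(n - 1, acc * n)
  end

  # Decorator-style: recompile the method that was just defined, with TCO enabled.
  tco_module_method :recursive_factorial
end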
And another kind of API I've been playing with,
this one's kind of new, but I think
it's actually getting close to more idiomatic Ruby
as you can just do a block of code
and whatever is in that block will be evaluated
with tail call optimization.
The big catch for this one though is
still under the covers it's turning this into
a string and evaluating it.
So even though you have this block here,
you're going to lose your block scope.
You're going to lose your scope.
You're not going to be able to access things
outside of it.
And vice versa.
If you define something inside this block,
things outside of it won't be able to get to it.
Maybe that's a trade off you're willing to make.
And it's actually part of the reason in this case
I kind of treated it more like,
I don't know, I think it's really nice
if you're using a strategy pattern,
where you have a particular strategy for something
that perhaps would benefit from
tail call optimization.
So in this case, I used this TCOMethod block
and then defined the module inside it.
It kind of takes the strategy-type approach.
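That strategy-style usage would look roughly like this; with_tco is my guess at the block API's name, so treat it as hypothetical:

TCOMethod.with_tco do   # with_tco is a guessed name for the block form described here
  module CountdownStrategy
    def self.countdown(n)
      return if n.zero?
      countdown(n - 1)
    end
  end
end
# The block gets re-evaluated as a code string behind the scenes,
# so it can't see locals from the surrounding scope, and vice versa.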
Let's talk kind of like again about
what some of the benefits are here.
Ruby is a multi-paradigm language,
which really means that Ruby kind of
takes the best of a bunch of different languages
and consolidates that into the awesomeness
that we all know as Ruby.
But in its multifaceted approach, you run into
that same flame war, where you have
the very object oriented people in Ruby, which
makes a lot of sense because Ruby itself
is so object oriented.
But with the presence
of things like Procs and Lambdas,
you really have a lot of functional power too.
So there is still, even within Ruby,
this divide, and it kind of falls into this multifaceted thing.
On the functional side of things, tail call optimization
enables a wider range of functional solutions
by better enabling recursive solutions
including tail recursion.
But that's not to say that there is not
an object oriented benefit.
The example of it is a little too much
to go into in this particular case,
but again the guy I mentioned before,
Guy Steele, a few years ago in a blog post
made a good demonstration that
by having tail call optimization,
you can actually have better abstractions.
The idea being that with some things,
if you can call them and not have to worry about,
like, running out of stack because of recursive calls,
you in many cases need to know less
about the internals of another object.
I think the straw man he put together
is like kind of, I don't know, it's not
something you'd use everyday,
but I think the fact that he,
I don't know, I think the fact that there is
kind of a demonstration of it at all
demonstrates, at least to some degree,
tail call optimization definitely does
benefit object oriented designs.
Not just functional ones.
I did some experimenting also,
and I found that tail call optimization also
helps performance.
What may or may not be obvious is
that one, when you kill those stack frames
that are otherwise just lying around,
you can actually garbage collect those objects
a lot sooner.
So if you have like, I'm not sure
what a good example would be,
but if you had a method that did a lot of
object allocations and ended up holding on
to a lot of memory somewhere back
in the stack, even if you're never going
to go back there, you potentially are
still holding it, and actually can't release
that memory until you unwind the stack,
get rid of that frame, and move on.
You also get better performance and a benefit
from reusing stack frames instead of just
creating new ones every time
when perhaps you don't need to.
I think there's also a good language crossover
opportunity here because with ECMAScript 6,
JavaScript is now supposed to be tail call optimized.
I think it'd be really interesting to see
how they make that transition.
I don't know that any of the major browser vendors
yet have actually released an engine that has
tail call optimization enabled,
but it's definitely in the spec.
And I haven't heard anything about people
like trying to pull it out yet.
We'll see what happens.
So kind of on the flip side,
why not tail call optimization.
And again, so the thing I kind of brushed over earlier:
the stack frames that would be part of a
tail call optimization are eliminated from the stack,
which makes debugging much more difficult.
And it actually breaks certain gems.
Do you ever use pry-byebug or pry-nav
or pry-stack_explorer? These in particular,
I imagine,
would probably segfault
if you tried to run them against a Ruby
compiled with tail call optimization enabled.
So definitely not as nice a debugging story.
As I also mentioned earlier, there's
that set_trace_func, which I haven't seen
really in action, but it's something to watch out for
being broken if you were to turn on
tail call optimization by default.
As you saw with some of the examples
you have to work with like these
strings of code, it can be pretty cumbersome
to use as it stands.
And again as I mentioned,
since it's relatively unused and untested,
despite its availability, that's something
to watch out for.
I actually tried this:
I spun up a version of Ruby 2.2.3
with tail call optimization enabled,
tried to run the Rails test suite,
and it was pretty good,
but there were definitely things that failed
that didn't fail with a vanilla build
of 2.2.3.
Who knows if it's,
I don't know.
I'm not sure if it's really important things.
I didn't like kind of dig into each of the failures.
But it's definitely an argument for not
running your Rails server with
tail call optimization enabled.
Again like I think there's still potential
for situations where perhaps you want to use
a strategy.
I don't know.
I don't think you really need it every where,
but I think there are cases where
you can maintain a lot of elegance
and you can get a lot of performance
and just have solutions that you wouldn't
otherwise be able to get to without
tail call optimization.
And in my experimentation, I found
that there's still some weirdness
around what constitutes a tail call.
We looked at a few examples earlier,
but for the VM it is hard to figure out
when you don't
need the rest of that method body.
And so in some cases you have
to do kind of like weird things to
get to a point where you can actually tell it
or it will actually know,
"Okay, I don't need the rest of this method.
"I can leave."
I don't have an example here,
but in some cases I just like made sure
that even though no code path could reach
the end of the method, if I just put a nil there
it would optimize it.
Where as in other cases it would just be like,
"I'd better hold on to this.
"I'm not sure what's going to happen."
So some weirdness there,
but it may be unavoidable without
I don't know, changing some how the VM works.
That's it for the presentation.
If you guys are curious about me,
you can find me on GitHub.
I have written a bunch of stuff about this
on my blog.
Actually, if you enjoyed Aaron's walkthrough
yesterday, where he was kind of
digging into some of the instruction sequence
stuff and what have you, I went on a voyage
to try to figure out the source of tail call
optimization inside of Ruby.
There's lots of C code, so beware,
but I found it very informative
and, again, kind of
awe-inspiring in that,
I don't know, to me it seems pretty magical.
I work at Datto in Boston.
We do business continuity,
which is a fancy way of saying backup.
You may have heard of us in connection with Hillary,
because we're actually one of the companies
that had Hillary's emails.
So if you guys are interested,
I could maybe hook something up.
Just let me know.
But that is it.
Thank you guys all very much,
and if you have any questions,
I'm here and we have time.
(audience claps)
The question is if you have to do
any Ruby recompilation if you use the mixin.
And you do not, because luckily,
again, that's kind of what is awesome
about it: you just tweak a setting
and suddenly you can say,
"Eh, I don't need those stack frames. Whatever."
And then you can just have things running
in totally different places that have those
different ideologies like,
"Yes I want the stack.
"No I don't want it."
And they work fine together.
(indistinct talking)
Right, yea.
Exactly.
Because you can just you can just isolate it.
Sorry, he said, "Then you wouldn't crash rails
or your whatever."
And that's true because yea it'd just be isolated
to whatever scope you had kind of compiled
with tail call optimization enabled.
Absolutely.
Yea, so I'm sorry.
The question was
let's say you kind of went the other way
and you're like, "Okay.
"I'm going to compile my Ruby with
"tail call optimization enabled by default."
Could you then have blocks of code
that you compiled without
tail call optimization?
Yes, you definitely could.
You just would kind of flip those options
around: potentially turn back on the trace instruction,
and turn off tail call optimization.
You should be able to compile it
as you might expect.
I think that's a good point.
I think there's lots of situations where just by.
I'm sorry.
So the question is how often when you run into a stack error
is tail call optimization the solution.
And I think there are definitely
a lot of situations where
accidentally you've called something that's
infinitely calling itself in a loop,
so obviously in those situations
it's not going to help.
But I think there are situations where
if for example I guess, I suppose
like one of the ones we looked at
was like generating kind of sequences
like Fibonacci or what was the other one
that was in here?
Factorials.
But again kind of in the keynote,
if you guys weren't there at the keynote yesterday
afternoon, if you were doing that for work,
I'd like to know where you work.
Because I don't know who's generating
Fibonacci and factorials for work.
But I think there's definitely an argument
for in some algorithms being able to say
we don't care about this stack here,
and thus allowing you to kind of proceed
in a recursive strategy.
Perhaps something like parsing HTML
would be really good for this.
Anything that kind of
has a tree structure is really
going to match a recursive strategy
really well, I think.
In those situations it might be nice to say,
"Okay, let's change the game here,
"and let's just recurse as much as we want."
Yes.
So the question was if I've benchmarked
any of these comparisons.
And I have some,
but frankly I didn't see a tremendous
benefit from it.
In at least my experiments,
I think it was probably with generating
large factorials, there was not a significant
speed up.
But I was mostly looking at execution time.
Potentially there's benefit there from
allocating fewer objects and thus
avoiding garbage collection pauses
and things like that.
But generally speaking it seemed
fairly similar to me.
Yea.
Nice.
Good question.
This is actually something I ran into
with some.
Sorry, so the question is if there's any way
to tell if a method is tail call optimized
without actually just trying to force it
into a stack overflow.
And so this is a problem
I actually ran into in trying to test the gem,
to make sure that when it compiled something
it was actually doing it with tail call optimization.
Depending on how you structure the method,
there are a couple of different ways.
Probably the most rigorous way to do it is,
using the RubyVM::InstructionSequence object,
you can actually take the chunk of code,
and as I think Aaron showed yesterday,
you can actually get it to translate it into the
YARV instructions that it'll run,
and when it is tail call optimized,
there are certain flags that will appear
in the instructions.
So it's not like a bulletproof method,
but I think it's the most reliable.
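For example, you can compile a snippet with the options enabled and inspect the disassembly; when the optimization applies, the call info on the recursive send carries a TAILCALL flag (a sketch; the exact output format varies by Ruby version):

iseq = RubyVM::InstructionSequence.new(<<-RUBY, nil, nil, nil, tailcall_optimization: true, trace_instruction: false)
  def countdown(n)
    return if n.zero?
    countdown(n - 1)
  end
RUBY
puts iseq.disasm   # look for TAILCALL in the call info flags on the recursive send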
The other thing you can do is,
you don't have to go all the way to the extreme
of a stack overflow: if you just call
the method once and check your stack,
if it worked you should have eliminated the
last stack frame, so the stack shouldn't really grow.
It should kind of stay static,
which is another way you can test it.
And that's the way that I went with for
the gem, just because I
feel more confident about actually
seeing that the stack frames are eliminated
compared to the VM telling me that it compiled
it with tail call optimization.
Other questions?
The question is if I've tried to force an exception
with this, in trying to see if something like New Relic's RPM
agent or things like that would break
because of perhaps a dependency on
being able to analyze the stack.
For New Relic I have not tried it.
For Rails it definitely seems like there's
some potential for it.
Like, some of the exceptions I remember,
it seemed like perhaps
for some of the lookup stuff
it uses some of the call stack to try to
figure out which parents are whose.
Because some of the failures that were happening
were modules not knowing the right nesting.
So that's potentially one place you could run into
issues.
Where else might it cause a problem?
I think New Relic is a great candidate for a problem
because I would imagine that it probably
does some of that tracing of a call stack
to figure out where and what your program is doing
to kind of give you some of the metrics that it does.
The question is what do I think would need to happen
in order for the greater Ruby community
to be interested in tail call optimization
and to see it more easily accessible in MRI.
That's something I've actually asked myself a lot
in regards to this talk because
in light of there being some talk about it
around 2.0, it seems like another opportunity
to kind of reflect and say is this something
we want?
Does this benefit the community?
Personally, I think the best way to do that
would be to, I don't know,
add some kind of language
primitive to be able to do it,
so you could have a method that is compiled with
tail call optimization.
To me that makes the most sense
because it's usually like you have a
particular method that you want to work
with it in.
I don't know.
You potentially run into some issues
where it's calling things that aren't getting
tail call optimized.
And so you're getting like some parts of the stack
left behind and others not.
I'm not sure if that would work.
I think actually on the like the Ruby issue to talk about
making it enabled by default,
there was a note there saying that if they wanted
to introduce some kind of language primitive
it would be somewhat difficult the way that the VM
was constructed at that time.
But personally, I think that would be
the way to do it.
To make it so it's easier for people
to explore and to get curious about it.
That's actually a lot of what drove me
into learning about tail call optimization at all
is kind of like, you know.
I was at a stand up meeting at work
and someone was like,
"Oh if Ruby had proper tail calls,
"we could do this, this, this, and this."
And I was like I have no idea what you're talking about.
But then there's a guy, Nathan Bacall,
posted a blog post saying, like,
"Oh, Ruby does have tail call optimization
"under the covers."
And presented with that, I finally had
somewhere I could go explore and say,
"What is this tail call optimization?"
And personally I think making that more
available for people to explore their curiosity
I think would really help, number one figure out
if we care about it.
And hopefully push it forward as something
in the language.
Follow up question?
Not,
I mean, yes.
There definitely are.
I was going to say not that I know of,
but I don't have personal experience
with any of them.
The question, sorry for those who couldn't hear,
was are there any other optimizations in Ruby
in the VM that we might be interested in playing with.
I think Aaron showcased a couple yesterday,
though it was never quite clear to me
how those played into the stuff he talked about
but that might be my fault.
But I think there's a hash
of options you can hand that particular thing.
And I think there are like eight or nine different things
you can turn on or off.
And they're meant to be VM optimizations, but
there's definitely something there that in the right use
case could help out.
The question is do I know the status
of tail call optimization in other Rubies
like Rubinius or JRuby.
I don't actually have any idea.
So one thing I know from researching this:
Java is one of the major languages that got
lampooned a bunch for not having
tail call optimization.
That said, that doesn't mean JRuby couldn't have it,
because it's a VM.
Like, JRuby's running on the JVM, which runs Ruby.
So I would think they definitely could do it
if it were something that
Headius and the other JRuby folks were into.
I have less knowledge of Rubinius,
but I would think that all of these things
are probably building a VM in some other
kind of lower level language on which they
compile and run Ruby.
So I would think it's possible, but
I don't know if they've tried to achieve it,
again mostly because in MRI Ruby
it's there, but it's not really needed.
It's not required for anything.
Actually, sorry, a follow up to that
I don't know.
You'd have to speak to Charles more specifically
about this, but I'm pretty sure I've seen
him tweet that running the Ruby test suite
against JRuby has been successful in places.
Maybe fully.
Don't quote me on this, but there are tests
to test that it works.
So if JRuby is really running the Ruby test suite
on their VM, then I would say it's in there.
But that's, I don't know, guesswork.
Oh, what situation might you enable it in production.
I would think like certainly not a web server
given kind of what my experience was with Rails there.
But I think there are probably native-type applications
where, just perhaps for performance or to minimize
the amount of memory used by the Ruby VM,
you might want to use it for those reasons.
That said, depending also on your style of programming,
if you were more functionally oriented,
you might want to do it so you can have all of the
niceties that come with it kind of from
a functional perspective.
And also if you have a particular algorithm
that you think would benefit from
being able to recurse indefinitely.
I think that would be another situation for it.
Again I think anything that is kind of related to
trees, linked lists, which is kind of
like a tree with just one limb,
any of these types of things I think
work really nicely with recursive solutions,
and I think if those are things you work with a lot
it might benefit you to try turning on
tail call optimization.
Last chance.
All right, guys.
Thank you so much for coming.