Software Development: RubyConf 2015 - Ruby's Environment Variable API by Jack Danger Canty

Wednesday, October 05, 2016

Danger.
It's a silly, silly name, but it is mine.
And I work at a company called Square.
We're in San Francisco and a few other places.
We have a New York office, Atlanta, St. Louis,
Waterloo, Canada, and Tokyo.
It's a pretty great place to work.
Because Square sent me here I feel like I should,
it's obligatory that I tell you a little bit about
how we do engineering.
I'll just sum it up by saying that we do
billions of dollars in financial work with
our own manufactured hardware that we create,
code the firmware for, manufacture
and then send out to all of our people.
And then all the iOS, Android, and JavaScript applications
that all the customers that we care about use.
And on the other side of the internet,
we rack our own hardware and lease data center cabinets
and manage all the networking, all the way up
and all the services that run on those completely,
and we do it all with just about three or four hundred engineers.
And we'd love for you to join us
cause we do it all in a really secure, superbly secure
and really, really pretty way.
Very well designed way.
And it's a lot of difficulty. (laughs) It's really hard.
So, please come join us.
We'd love to show you how we do it
and have you help us do it better.
All right. This talk is about the
Ruby Environment Variable API,
which, in my description of the talk,
I actually said, "I'm not going to call it that"
and then Abdi was like,
"No, that's great. You should totally call it that."
So, I'm calling it that and I actually kind of like it too,
because what we're gonna talk about is
the things that you can do with environment variables,
which I'll explain in a second if you're new to them,
to tell Ruby how to behave.
And, this is the core, actually, of how all of the tools
that we use, that are not Ruby itself, operate.
How Bundler and RVM and rbenv, chruby,
and, um
I'm missing some of them,
how they all actually work,
they work through environment variables.
And we're gonna talk about how to
debug your machine when they go wrong.
Another name for this talk could be,
"require, cannot load such file"
because, when you get this error,
something in the chain of stuff I'm about to show you
has been misconfigured.
To begin with, let's talk about the ideal world.
So, the perfect world is this.
There's one version of Ruby.
It's at, let's say, bin/ruby
and you run it as, let's say, root
because you don't have to worry about users
and permissions and stuff
and there's one argument and one argument only
and it's one file that contains
all of the code that there is.
You've just copied out of your favorite gems,
just copied out of the Rails source,
just paste into the top of this file, at the top.
And then, as you go down, you paste in other dependencies
until the very bottom, there's your app.
And you run this, then it runs
and it never has to call a require.
And that's perfect.
And, that's what we're gonna do.
That's not what we're gonna do. That's ridiculous.
That's also just not the real world.
The real world has a lot of rubies.
Like, like a lot of rubies
and your application requires a lot of files.
This is the reality of it and your application,
you don't say require slash full path to file
because that would be extraordinarily tedious, error prone,
too tied to specific versions
and it would be impossible to manage.
Instead, you just say, I want this thing,
give me this thing.
In fact, give me the right version of this thing
and the right version of this thing
considering these other things that I also want.
And it should just work.
Except when it doesn't.
You know, you can just require nokogiri,
"cannot load such file nokogiri",
Oh, also, please forgive me, the syntax highlighting,
it's hard to do it on slides.
So, sometimes I get it wrong.
Also, I'm color blind, so you're gonna see
like seven shades of yellow that I can't distinguish.
I tried to make them all the same, I really did.
(audience laughter)
Now, can we make this make more sense?
Leslie Knope thinks so and so do I.
And we've done that, mostly through these projects.
We have a bunch of tools that wrap up the complexity
of working with different versions of Ruby and different
Ruby dependencies and make it really easy, intuitive
to just run your code and not have to think about things.
Also, I would like it very much if you would just
pretend there are logos on the slide.
Because, some of these projects don't have logos
and I didn't want them to look bad
next to the projects that did.
Now, in spite of the heroes who've built these projects,
and the incredible amount of work that has gone into
making them work across many different kinds of systems
and with many different dependencies.
This is how most people
solve problems with their environment.
And, this is like, kind of written up as a joke,
but then, I've been asking people, during the conference,
so, if you can't find the right file
or if Bundler doesn't seem to be working the way you want,
what do you do? (laughs)
And it's almost exactly this list.
Like, nobody actually said farmer, right, but
if you can't read it or if you like to hear my voice say it,
it goes, rake, or whatever command,
then, oh, whoops, oh bundle exec rake, oh right.
Oh, wait no rvm use the right thing
and then bundle exec rake
or actually, you know what, close terminal, start over
or ask a coworker and then, you know,
if Danielle can't figure it out she'll tell you,
close your terminal, start over.
And then, after that, you're like, well, whatever,
rvm implode (explosion sound)
just start over completely, reinstall rvm
and then, like, install some rubies,
you're compiling them from scratch.
Sure, that'll fix it.
Or, rerun the bespoke setup script
that you and your team, or your company, or your client use.
In fact, reboot your machine,
Actually, that probably should've been up there higher.
Or, just buy a new computer,
but then definitely become a farmer.
This is blindly trying things
to stem the bleeding from the wound that
was caused because our tools were so sharp, they cut us.
And what we need, is not, um
like, there are people who blame their tools, right
and we all blame our tools sometimes.
But, the people who built the tools
that I mentioned in the previous slide,
they've done incredible work.
They continue to do amazing work
and all those tools are getting better.
What we need is not for those people to stop doing it.
What we need is just more heroes, like them, to do it.
And that's what this talk is about.
This is indoctrination.
You are all about to be equipped
with all the tools necessary
to join the folks on projects like RubyGems
and Bundler and RVM and Chruby.
By the end of this, if you don't want to do it,
it'll be because you choose not to do it,
not because you're not capable.
Now, how to become a superhero,
Parts number one through infinity.
There's one step. You just learn the tools.
If you want to be somebody
who everybody else calls a wizard.
If you've ever talked to someone who's not a programmer
after you started programming, you'll realize
that they think there's something about your brain
that just works differently, and that's not true.
You just sat down and got really frustrated for a long time
until you got a little bit less frustrated.
And we call that understanding.
And now you know how a thing kind of works.
And that's all there is.
And then they're like, "Wow, how'd you do that?"
And you're like, I mean, Stack Overflow mostly, right.
(audience laughter)
So, you learn the tools.
We're gonna talk about a couple tools here.
And the main thing, because it's in the title of my talk,
I'm gonna start with, is environment variables.
Now, environment variables, I've seen trip up
even really, really brilliant people.
The people who seem to work with them really easily,
are just people who've worked with them a lot.
But, what they are is, it's a global hash of strings
that, or HashMap or dictionary, however you want to call it,
that your program inherits from its parent program.
So, if you are in a shell session, you've inherited
it from something earlier on in a boot cycle
that created your shell session.
And any program you execute
will inherit your environment variables.
Now, if you want a better image,
I think of it, like a really big purse.
Like a silly, large purse that you carry around with you
and that your parent gave to you
and then you can give to your kid.
That is how, well a copy of it,
you don't take the actual purse,
you make a copy of the purse.
Give your kid their own gigantic purse.
That's how the environment works.
And the environment looks like
a bunch of individual environment variables,
but think of it as just one thing.
The environment variables are all just the keys in a hash
and they're strings, and the values are also strings.
And, you can have some things (bumps mic)
that are just for yourself, whoops
but, uh, anything you export gets passed on
to your child processes.
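A minimal Ruby sketch of the copied purse, assuming a `ruby` binary is reachable through `RbConfig.ruby`. The variable name `HAPPY` is only for illustration. Ruby exposes the environment as `ENV`, and each child process gets its own copy of it.

```ruby
require "rbconfig"

# ENV behaves like a hash whose keys and values are both strings.
ENV["HAPPY"] = "yes"

# A child process inherits a copy of the parent's environment.
child_value = IO.popen([RbConfig.ruby, "-e", 'print ENV["HAPPY"]'], &:read)

# The child can scribble on its own copy without touching ours.
IO.popen([RbConfig.ruby, "-e", 'ENV["HAPPY"] = "no"'], &:read)

puts "child saw:    #{child_value}"
puts "parent still: #{ENV["HAPPY"]}"
```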
One other note about this presentation,
please dink around on your computer
and follow along and type in whatever you want here.
I'd actually love that.
I'd love if you got a little bit of muscle memory
in some of the things we're gonna do.
I won't be offended.
In fact, now that I've said that,
I will give myself permission to pretend
that everyone on their computer is following along closely.
(audience laughter)
You can also just watch me do it, not live coded,
because I'm a scaredy pants, but in slides code.
So if you were to type env,
you would get a print out of all of your environment.
But, some of those might be things like
back, control characters and like,
funnily escaped codes that make colors show up funny.
So, if you just want to look for something specific,
especially if you have a giant env,
just grep for something interesting like HOME.
And you'll find the HOME variable
will point to your home directory.
It won't be particularly surprising.
You could also play around with setting env.
They're usually all caps, the keys,
that's just a convention, they don't have to be.
Here, I'm setting ohdear equals uhoh
and I type env and I grep for ohdear and
well, that's weird, if you're following along,
where did it go?
The output is blank.
And that is because of this other interesting property
of environment variables, which is that
when you inherit them from the process that created you,
you get, what looks like,
all of your parent's environment variables,
but not really. You got all of the exported ones.
So, anything that has at any point
in the hierarchy been exported
you'll get, and your children will get,
but you can have some just for you,
just for your own internal bookkeeping.
Ohdear belongs to my shell session.
And whatever process ID it has,
that environment variable is tied to in the kernel.
Now, if I were to echo it, I can see it,
but it's not going to show up when I type env.
Well, that's funny.
Now, if I export it
and there's two ways to do this,
you can either export it same time you set it,
export x equals y, or export x
and x equals y on different lines.
Either way works.
So, here I'm kind of like doubly setting it.
But, you can play around, you'll see that
it's actually pretty intuitive.
And then, when I type env I grep it
and I'm like, ok now I see it.
It'll show up in the output.
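Here is a small sketch of the same experiment driven from Ruby, assuming `sh` and `env` are available and that `OHDEAR` is not already set in your environment. Each call starts a throwaway shell, so nothing leaks into your own session.

```ruby
# Run a tiny sh script and capture what its child `env` process prints.
def env_output(script)
  IO.popen(["sh", "-c", script], &:read)
end

not_exported = env_output("OHDEAR=uhoh; env")
exported     = env_output("OHDEAR=uhoh; export OHDEAR; env")

# Shell-only variable: the env child process never receives it.
puts "without export: #{not_exported.include?("OHDEAR=uhoh")}"
# Exported variable: the env child process inherits it.
puts "with export:    #{exported.include?("OHDEAR=uhoh")}"
```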
Here's a visual way to imagine that.
When your computer starts,
there's one program or one process.
That process has a process ID of one.
Process IDs are just integers.
And, if you're on Linux, this is called init.
That's the process name.
If you're on OS X, it's called launchd.
If you're on Windows, the whole of this talk,
I can't help you with. I'm sorry.
(audience laughter)
And, process one starts all the other processes.
All the processes on your computer exist in a big hierarchy.
There's no other parent process.
There's one that starts first and it starts the others.
It's actually pretty simple.
And the first one might set some values.
One of which exports, one that doesn't.
Happy equals yes, excellent.
All the children will then be happy.
And then food equals pizza, just for itself (laughs).
And that was just like, muscle memory, food equals pizza.
That must've been like from a third grade exercise
or something.
I don't know why that just came up,
but it made it to the slides.
And then, later on, some process starts
then you click on something and your shell starts.
And happy equals yes and you actually,
you just inherited that, also you said ohdear equals uhoh.
And then when you type env, you are not
printing your variables.
You're running a new process called env
that prints its variables.
And it did not inherit ohdear, but it inherited happy.
However, if you had typed export ohdear equals uhoh
then env, would in fact, identify that,
cause it would've inherited it.
So far so good?
Awesome, I got some nods.
I'm gonna pretend you all nodded.
That'll make me feel great.
Ok, let's talk about the most important variable of all.
This one absolutely dominates things. This is the PATH.
Now, the PATH is slightly misnamed,
because it's actually a list of paths.
It is a list of places that the operating system looks
for commands to be run.
Now, you can type env pipe grep PATH and see it.
You can also just echo $PATH.
One note, environment variables,
you set them without the dollar sign.
x equals y, but you read with the dollar sign.
Echo dollar sign x. That's just the syntax.
Now, I wanna show you something (laughs)
I did a little animation here
before I knew the projectors would be so big.
I'm gonna show you a little something that you can type
that will make this more readable.
Where you can echo path or anything
and you can run it through the translate tool,
which turns colons into new lines.
Which, for the PATH, is pretty handy
because then you get it printed like this.
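The same colon-to-newline trick can be sketched in Ruby, assuming a Unix-like system where `File::PATH_SEPARATOR` is a colon.

```ruby
# PATH is one string; File::PATH_SEPARATOR is ":" on Unix-like systems.
path_entries = ENV.fetch("PATH", "").split(File::PATH_SEPARATOR)

# One directory per line, roughly what piping `echo $PATH` through tr prints.
puts path_entries
```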
So, this is a slightly edited version of my path.
And, what the computer will do when I type a command
is it will look in each one of those things,
those paths, for the command.
We're gonna see exactly how it does it.
It is not very sophisticated.
Apologies that you can't run this if you're on a Mac.
Strace, or system call trace, is something on Linux.
It's really, really powerful.
It really demystifies some stuff and makes you feel smarter
because it makes the computer seem dumber.
And you can't run it on a Mac.
There's a thing called dtrace and a script around that
called dtruss, which is kind of like this,
but usually requires sudo
and in El Capitan sudo is kind of broken,
so you're not going to have as much fun running it on a Mac.
But, take my word for it,
the slightly edited output of this on Linux,
when I'm running this command.
And to explain this, what I'm doing
is I'm typing strace, then some command.
That some command happens to be the simplest shell, sh,
followed by -c, which is a string to run.
It's the same as if I typed sh, then got a prompt,
and then typed that command.
So, I'm saying, run this program,
which is just a shell running one command
that doesn't exist.
And show me all the system calls it makes.
And now we're gonna step back a second
because a system call needs a little bit of explaining
for somebody who doesn't know what it is.
A program at any point in time
can be doing one of two things.
This is true of all computers.
It can either be reading and writing to memory,
basically just doing math and keeping track
of its own internal state,
or it can be asking the computer,
the kernel, to do something.
And every kernel, every operating system
has a different set of functions that you can call,
and different ways to do it, but what you do
is you're either doing math with your own memory,
variables and stuff,
or you're asking the computer to do something.
You're opening something, you're writing to something,
you're reading something, you're getting the time of day,
that sort of stuff.
Now, when I tell strace, hey run this command
and tell me all the things it asks the computer to do.
The kernel to do and show them to me.
It tells me this. It first, well it'll exec this thing,
let's just ignore that, it ran a command.
And then it called stat on a bunch of files.
And stat just means status or file status.
You can just run stat in your terminal right now.
Stat space dot, and you can like stat the current directory
and you'll see some information about the current directory.
Who owns it, is it writable, that sort of stuff.
When was it created.
So it's trying to stat all these files
to basically find out, is this a file and can I run it?
Is this a thing? Is this a thing?
And it gets this error on the end,
which, ENOENT, is a C error code.
It expands to no such file or directory.
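A rough Ruby equivalent, as a sketch: `File.stat` asks the kernel about a path, and a missing path surfaces as `Errno::ENOENT`. The missing path below is a made-up name chosen so that it should not exist.

```ruby
# File.stat asks the kernel about a path, like the stat calls in the trace.
here = File.stat(".")
puts "directory? #{here.directory?}  writable? #{here.writable?}"

# A path that does not exist (hypothetical name) fails with ENOENT.
missing_error = nil
begin
  File.stat("/no/such/dir/missingcommand")
rescue Errno::ENOENT => e
  missing_error = e
  puts e.message
end
```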
But it's looking for, if you notice,
some path, and then the exact thing I typed.
Missing command.
Some little thing and then the exact thing I typed.
Over and over and over.
Let's line these up.
Path, just printed out on the left
and then strace, what is it doing, what is it asking
the computer to do on the right?
And you can see that the computer
is not all that sophisticated.
What we're doing, is we're just iterating over,
we split on the colon,
we iterate over the entries in the path
and you just jab on to the end
whatever the person typed.
You type missing command or git,
it says, all right is there a (mumbles) git?
Oh, no? Ok, well is there a (mumbles) git?
Is there a (mumble)? All the way through,
until it finds one that's executable and it runs it.
That's how commands work.
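That lookup can be sketched in a few lines of Ruby. `naive_which` is a hypothetical helper, not how any real shell is written, and it skips details such as empty PATH entries. The fake command lives in a temporary directory so the example is self-contained.

```ruby
require "tmpdir"
require "fileutils"

# Walk each PATH entry, glue the command name on the end, and return the
# first candidate that is an executable file. Returns nil if none match.
def naive_which(command, path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, command)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Try it against a throwaway directory holding a fake command.
Dir.mktmpdir do |dir|
  fake = File.join(dir, "fakecommand")
  File.write(fake, "#!/bin/sh\necho hi\n")
  FileUtils.chmod(0o755, fake)

  puts naive_which("fakecommand", "/nonexistent:#{dir}").inspect
  puts naive_which("missingcommand", "/nonexistent:#{dir}").inspect
end
```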
You can see a little bit closer up,
it's just iterating through.
How excited are you? (whispers) How excited are you?
(audience laughter)
This is great because that concept of a path,
I went a little bit slow there,
I know a lot of people in here are actually
probably already very familiar with paths.
That concept is gonna be really portable
to the world of Ruby in a second.
Because there are two other paths we're gonna work with.
The next one is called the gem path.
And this is the one that mystifies most people,
myself included until I had to write this talk
because we don't do things like activate gems very often.
If that phrase doesn't sound familiar,
it's probably cause you don't have to do it.
Cause you're not like hacking inside RubyGems or something.
But, there's a gem path
and you can see it by running some of these commands.
You don't have to. It's probably not set on your machine,
unless you use a tool like rvm
or something that sets it by custom.
But you can always get what the default one is
with that top command.
And, you'll notice that it's
just a colon separated list of paths on your machine.
And, let's talk about how it's used.
You can do this on your computer.
This is something that I tested a few times.
It definitely works.
If you set GEM_PATH equal to itself,
but with a new directory prepended to it.
And let me walk through this line-by-line.
GEM_HOME means where gems get installed.
Gem install, whatever, will place that gem in GEM_HOME.
And if GEM_HOME isn't set,
there's like a default one inside Ruby
that's like inside the Ruby install directory.
But, if you set it,
it'll definitely install the gem right there.
And if the directory doesn't exist,
it will create temp lol wut on your machine
and put the gem there.
And then, because you want to be able to find
the gems in the place where you're installing the gems,
we're gonna make sure that the
gem home we set, GEM_HOME,
is at the beginning of our GEM_PATH, so we can find it.
And notice there's no, like, double quotes
around this setting of GEM_PATH.
That's just because everything in bash defaults to a string,
so it just sort of assumes strings,
you don't have to use quotes that much.
GEM_PATH equals something colon GEM_PATH
just concatenates.
Now, when I gem install eallydumb,
which is a gem I'm sometimes proud of, sometimes not.
You could look at the source to find out why.
And then gem list, you'll see that it's there.
It's in gem list.
And if you ls to the directory,
you'll see not only has the directory been created
and not only is the gem there,
but it created the whole tree structure
of gem folders.
There's things like a specifications directory and a cache
and a gems directory. They're all there.
So, it's now a place where lots of gems could be put.
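To see what RubyGems makes of those two variables without installing anything, here is a sketch that starts a child Ruby with a temporary GEM_HOME. `rubygems_view` is a hypothetical helper. `Gem.dir` and `Gem.path` are the RubyGems calls that report the install directory and the search path.

```ruby
require "rbconfig"
require "tmpdir"

# Ask a child Ruby how RubyGems interprets GEM_HOME and GEM_PATH.
# Returns the install directory followed by the search path entries.
def rubygems_view(gem_home)
  env = {
    "GEM_HOME" => gem_home,
    # GEM_HOME goes first in GEM_PATH so freshly installed gems can be found.
    "GEM_PATH" => [gem_home, ENV["GEM_PATH"]].compact.join(File::PATH_SEPARATOR)
  }
  IO.popen([env, RbConfig.ruby, "-e", "puts Gem.dir; puts Gem.path"], &:read)
    .lines.map(&:chomp)
end

Dir.mktmpdir do |gem_home|
  install_dir, *search_path = rubygems_view(gem_home)
  puts "installs into: #{install_dir}"
  puts "searches:      #{search_path.join(', ')}"
end
```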
Oh, I had an animation to make that bigger. That's nice.
So, we have two paths so far.
PATH and GEM_PATH.
There's one more. And this is the one
that dominates our problems in Ruby
Require, what was it, command not found?
No, yeh, the require problems,
where you try to require something and it just blows up.
And that is the LOAD_PATH.
Now, you'll see this -e and some other things.
I'll try to explain them just once, whenever I
encounter them. I apologize if I miss one.
But -e means execute the string that follows,
as if that string were in a file
and you just said ruby that file.
It's just a short hand for one lining some Ruby code.
So, I'm printing the LOAD_PATH.
Now, on my machine, it looks like that.
And also, that -r means require.
-r pp -e something,
is the exact same as just saying Ruby,
run a script where the first line is require pp
and the second line is pp LOAD_PATH. It's identical.
-r stuff means require, you can do it a bunch of times.
-e means execute this, I think you can do it just once.
So here is my LOAD_PATH.
Now, this LOAD_PATH might look suspiciously similar
to what we saw on the operating system path.
The strace of this simple Ruby program,
which I had to do on Linux,
which had a 1.8 install, but it's still true.
This is what Ruby asks the kernel to do
when it tries to require a missing file.
If I line this up with the other
and there's spaces on the left I had to add
because you'll notice that
Ruby will try to require two things for each LOAD_PATH.
And that's because .rb files it looks for,
if it can't find it, it'll try to find a .so file,
which is a compiled Linux shared object.
It's basically just compiled C code, not an executable,
but a thing that can be used by an executable.
And, if you run this on a Mac you'll find
that there's also a .bundle it looks for,
which is a Mac OS executable format.
But, anyway. It is the exact same thing as we saw before.
Is that visible? Yeah.
So, all it does is it iterates over the LOAD_PATH,
when you require missing file
and it just sticks forward slash,
then missingfile.rb on the end,
over and over and over until it finds one.
You can imagine how this is written right?
It's basically LOAD_PATH.each do string
or do dir, dir plus thing you're requiring
and then return if you find it.
Or, actually, eval File.read, the file.
Otherwise, raise an error.
That's basically what it is.
Slightly more complex, but not really.
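Written out, that loop might look like the sketch below. `naive_require` is a hypothetical stand-in, not Ruby's actual implementation, which also remembers what it has already loaded and looks for compiled extensions.

```ruby
require "tmpdir"

# A simplified stand-in for require: try each load path entry in order,
# evaluate the first matching .rb file, and raise LoadError otherwise.
def naive_require(name, load_path = $LOAD_PATH)
  load_path.each do |dir|
    candidate = File.join(dir, "#{name}.rb")
    next unless File.file?(candidate)

    eval(File.read(candidate), TOPLEVEL_BINDING, candidate)
    return true
  end
  raise LoadError, "cannot load such file -- #{name}"
end

Dir.mktmpdir do |dir|
  # The file only sets a global so we can tell that it ran.
  File.write(File.join(dir, "myfile.rb"), "$myfile_loaded = true\n")
  naive_require("myfile", ["/nonexistent", dir])
  puts $myfile_loaded.inspect
end
```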
And if we were to try it out, you can see
I'm printing some very simple code into a new Ruby file.
And then, I'm requiring my file.
Which I put in the current working directory.
Which is not actually in the LOAD_PATH,
for the same reason that current working directory
is not in the operating system's PATH by default.
For security reasons, if you git checkout my project
and I have at the top level an executable called cd
and you try to leave that project,
I can do whatever I want, right.
That's terrible if it picks my cd over yours,
so, by default current working directory
is not in either the PATH or the GEM_PATH or the LOAD_PATH.
But, you can add it, so I'm going to add it.
In my -e, I'm just going to add the little string: $LOAD_PATH,
jabby, jabby stab thing, dot.
And now,
when I iterate over the LOAD_PATH,
it'll actually expand that to the current working directory
and stick a myfile.rb on the end
and when we find it, we run it
and it prints out like that.
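A self-contained sketch of that before and after, using a temporary directory in place of the current working directory. The file name `loadpath_demo_file` is made up for the example.

```ruby
require "tmpdir"

Dir.mktmpdir do |dir|
  # Hypothetical file standing in for the myfile.rb from the slide.
  File.write(File.join(dir, "loadpath_demo_file.rb"), "LOADPATH_DEMO = 'found it'\n")

  begin
    require "loadpath_demo_file"
  rescue LoadError => e
    puts "before: #{e.message}"
  end

  # The same move as pushing "." onto $LOAD_PATH, with an explicit directory.
  $LOAD_PATH << dir
  require "loadpath_demo_file"
  puts "after: #{LOADPATH_DEMO}"

  # Tidy up so the temporary directory does not linger in the load path.
  $LOAD_PATH.delete(dir)
end
```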
And now, because this will be important
when we see more about how
the environment variables are used.
I'm gonna show you two other ways to modify the LOAD_PATH.
There are three ways to mess with it
and all three are used by things like Bundler.
ruby -I . is the exact same
as adding . to the LOAD_PATH.
And you can add -I
as many times as you want when you're calling ruby.
If you ever look at the verbose output
of some complex rake command,
you might see, especially if it's like building a gem
or like, trying to install nokogiri or something complex,
you might see something like rake -I ./lib
-I some other thing, 'cause
that sets it up so that when you just require something
it knows which thing to require.
And it's able to find that file.
You can add as many things as you want to it.
And if you're on a big rails app
and you're on a version of Ruby pre, like, 2.1
the reason require is so slow,
is because every lib directory of every gem
on your machine and many other places,
is in a LOAD_PATH,
it could be hundreds and hundreds of entries long
and every time any of those gems,
any of them has a file that calls require
it iterates over that and looks for
every single file appended to the end.
You'll see things like, require rails and it'll be like,
(mumbles)/nokogiri/lib/rails not found,
well, then it'll look at all these places.
It's really bizarre.
But, you can get it huge and really inefficient,
though modern Ruby does it pretty well.
And the third way, besides the load path variable in Ruby,
and the -I, is this thing called RUBYLIB.
Now, this is just your environment variable.
This lives along side PATH.
You can modify your PATH, you can also modify your RUBYLIB.
As long as it's exported, ruby will pick it up.
And it is a (laughs),
in the documentation, if you man Ruby, it says,
it's a colon separated list of paths
to add to the LOAD_PATH.
That sounds a lot like PATH, right?
That's exactly what it is.
And anything you add to there, when you require something,
the file will be searched for right there.
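Here is a sketch of the two routes that do not involve editing `$LOAD_PATH` inside your code: the `-I` flag and the RUBYLIB variable. `child_require` and `three_ways_demo` are hypothetical names. Each call boots a separate child Ruby so the setting has to arrive from outside.

```ruby
require "rbconfig"
require "tmpdir"

# Run a child Ruby that requires the demo file and prints its constant.
def child_require(env, *flags)
  code = "require 'three_ways_demo'; print THREE_WAYS"
  IO.popen([env, RbConfig.ruby, *flags, "-e", code], &:read)
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, "three_ways_demo.rb"), "THREE_WAYS = 'loaded'\n")

  via_flag = child_require({}, "-I", dir)         # command line flag
  via_env  = child_require({ "RUBYLIB" => dir })  # environment variable
  puts "-I:      #{via_flag}"
  puts "RUBYLIB: #{via_env}"
end
```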
Ok. The three paths so far,
they all work the same, they're all editable by you,
and they are each colon separated list of directories.
Now, if you get nothing else out of this talk,
it's that, these paths are how things
are found on your machine
and if you're in a bind and nothing's working
and you're on a deadline, you can do on your machine
or in staging, or heaven forbid, but maybe in production,
you can remove Rubygems and Bundler, all that.
Just get rid of it. Not require it, not use it,
and you can hand edit these values
and your app will boot and it'll work perfectly.
You don't want to do that, because it's brittle.
You want to use a tool that does it automatically for you.
But, that's all you actually need.
So this comes down to the basic strategies we have
for how we find the right version of a gem.
If you have a version of rails
that requires a specific version of rack,
and two versions of rack installed,
one too new maybe that doesn't work
with the version of rails you're using,
how do you activate the right version?
There's two approaches.
One is, for each project you have,
use a different GEM_HOME and GEM_PATH,
and it's got just the gems you need for that project.
And this is what pip and virtualenv,
if you have any Python exposure, do.
This is what rip did.
Chris Wanstrath of GitHub, back in the day,
started doing some Python stuff
and he was like, "Wow, this pip just has a file
with like a list of the requirements I need
and the version numbers and it can just install them
and keep a local cache in the project. This is great."
So, he wrote rip as a Ruby port of it,
which Yehuda Katz saw and knew of as
he was building the initial forms of Bundler.
But Bundler does a different approach.
Bundler, it's not quite right
to say it ignores GEM_PATH and GEM_HOME,
but it doesn't use them, except when necessary.
And, Bundler identifies each gem you need,
cause it has the lockfile
or the Gemfile and Gemfile.lock.
It knows what versions you need.
And it knows how to find those on disk.
And so it looks at each one
and it finds what path would be necessary
to add to your LOAD_PATH, what directory,
so that when you require rails, say,
the right rails has its files found.
Don't put the wrong rails in the LOAD_PATH,
make sure just the right one.
So, that's the other approach.
You can have either a directory or a path,
with only the gems in it,
or you can have all your gems in a big blob,
but something has to be smart enough
to know which gems to pick
and add a LOAD_PATH by hand.
And with that, we're gonna see how Bundler does it
given a giant bag of possibly unlimited gems
with mismatched dependencies.
So, here and I apologize if
this is getting maybe a little bit
removed from what you thought you'd be learning,
but bear with me for a second cause these variables
will make more sense in a moment.
This is, this one slide, is how Bundler works.
If you ignore the installing of gems,
Bundler has a thing where it'll read your Gemfile,
then it'll call gem install,
it actually uses RubyGems for that.
It uses the actual RubyGems facility of fetching,
and like, installing and unpacking and all that stuff.
But, if you ignore how Bundler gets your gems,
when you type bundle exec some thing,
bundle exec does this slide.
It sets RUBYLIB equal to
the LOAD_PATH necessary so that when you require
bundler you get bundler.
And then, RUBYOPT equal to bundler/setup,
or rather -rbundler/setup.
If you haven't noticed, the -r and the -I,
the spaces after them are optional,
which makes it kind of weird.
So this RUBYOPT in Bundler is -r, no space, bundler/setup,
but Ruby processes that as -r with a space,
it knows how to do that.
I find that kind of confusing sometimes.
And RUBYOPT is, I kid you not, just a string of stuff,
that anytime you call Ruby, will just get inserted
right after the word Ruby and before all the rest of it.
It's, just like, it changes the way you boot Ruby,
no matter how it's invoked.
Rake, IRB, Ruby, whatever, as Ruby is booting,
it's like, oh, is there a RUBYOPT?
Well, let me just jam all that stuff
into my own flags, my own command line flags
and pretend it was there.
And by doing these two things,
by setting those two, what bundle exec does is
it makes sure that it is available
when you type require bundler/setup
you get a gems bundler, whatever libs/bundler/setup.rb
It's found, it's read, and then RUBYOPT,
go ahead and require that so it is found and read.
And then, before your script is even running
just bundle exec itself, just right at the beginning,
it looks at all your gems, it finds load paths for them
and it finds the paths where you can require them
and it adds those all to the LOAD_PATH.
So, if you type env, you'll see something
and if you type bundle exec env,
you'll see something completely different.
And I recommend actually running those through,
diffing those two, because you'll see
exactly what Bundler is and isn't doing.
And that's all that bundle exec does.
bundle exec just sets these,
so that when you're booting Ruby, it's doing something.
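The two-variable trick can be imitated without Bundler at all. In this sketch `fake_bundler/setup.rb` is a made-up stand-in for Bundler's setup file and `run_with_hook` is a hypothetical helper. RUBYLIB makes the hook findable and RUBYOPT makes every Ruby require it at boot.

```ruby
require "rbconfig"
require "tmpdir"
require "fileutils"

# Run `code` in a child Ruby with RUBYLIB and RUBYOPT pointing at a setup hook.
def run_with_hook(lib_dir, code)
  env = { "RUBYLIB" => lib_dir, "RUBYOPT" => "-rfake_bundler/setup" }
  IO.popen([env, RbConfig.ruby, "-e", code], &:read)
end

Dir.mktmpdir do |lib_dir|
  # Stand-in for bundler/setup.rb; the real one would edit $LOAD_PATH here.
  FileUtils.mkdir_p(File.join(lib_dir, "fake_bundler"))
  File.write(File.join(lib_dir, "fake_bundler", "setup.rb"), "$hook_ran = true\n")

  # The script never mentions the hook, yet the hook has already run.
  puts run_with_hook(lib_dir, "print $hook_ran.inspect")
end
```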
And just so that, I don't want to leave you confused,
so we're gonna talk through the difference between
the environment variables and how they fit
with the command line flags.
Maybe you're new to both, maybe you're new to just one,
but RUBYLIB is the LOAD_PATH and -I is also the LOAD_PATH.
And $:, or $LOAD_PATH, is the LOAD_PATH.
RUBYOPT -r rails, is the exact same as saying ruby -r rails.
And setting RUBYOPT to
-I fakedir something, -I something else, -r something,
will actually just totally work
because RUBYOPT is a great way to
take somebody completely by surprise
and do whatever it is you want to do with their code.
All right.
You've now learned, or at least been exposed to
everything about environment variables
in Ruby that there is.
There's no more hidden,
like, oh if I knew that API,
I would understand how Rubygems works
or I would understand how Bundler works.
Like, this is it, this is how it works.
These are the only hooks to get some code to run.
After this point, once Bundler setup starts running,
and sets the LOAD_PATH,
the problem is entirely in Ruby code.
And, like maybe it is doing or isn't doing
something you want, but you can debug that separately.
As long as you know what environment variables are set,
you'll be able to debug the situation you're in.
If you're in a crisis, something's not able to be found,
that's all you need.
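
As a starting point for that kind of debugging, here is a small sketch that prints only the variables mentioned in this talk. The helper name ruby_related_env and the exact list of names are assumptions made for illustration.

```ruby
# Names the talk singles out, plus anything starting with BUNDLE_.
RUBY_ENV_NAMES = %w[PATH RUBYOPT RUBYLIB GEM_HOME GEM_PATH].freeze

# Pick the Ruby-related entries out of an environment (ENV by default).
def ruby_related_env(env = ENV)
  env.to_h.select do |name, _value|
    RUBY_ENV_NAMES.include?(name) || name.start_with?("BUNDLE_")
  end
end

ruby_related_env.sort.each { |name, value| puts "#{name}=#{value}" }
```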
And, we're gonna talk a little bit about the tools
that have automatically managed this for us, so far,
and where we are in current history.
So, RubyGems is totally old. It's great.
It started early. We needed it. It's been the bedrock
of how we've managed dependencies in Ruby
and I'm certain it will continue to be
because the gem format is what we all use.
It came out in 2003. Then, Rip came out in 2009,
more as a prototype, although I'm not sure
what Chris intended with it.
And then, RVM also came out,
because people were straddling
1.8 and 1.9 at the same time.
It was a very confusing time.
And Bundler followed fast after.
A couple of years later Sam Stephenson
built a thing called rbenv.
He meant it to replace RVM.
Now they exist kind of parallel.
A year later, a guy named Hal Brodigan built chruby,
which is my personal strong recommendation.
It has like tens of lines of code. It's tiny.
It does just the right thing.
You say, change ruby to this ruby,
and it's like, all right
there's your path entry, there's your GEM_HOME,
there's your GEM_PATH.
That's basically all it does. It's great.
Strongly recommend it.
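
To make the "path entry, GEM_HOME, GEM_PATH" idea concrete, here is a rough sketch in Ruby. It is not chruby's code (chruby is a shell script); switched_env and all the directory names are hypothetical, and a real switcher may put more than one directory in GEM_PATH.

```ruby
# Hypothetical helper: given where a Ruby is installed and where its gems
# should live, compute the three variables a version switcher sets.
def switched_env(ruby_root:, gem_home:, path:)
  bin_dirs = [File.join(gem_home, "bin"), File.join(ruby_root, "bin")]
  # Drop earlier copies so repeated switches don't keep growing PATH.
  rest = path.split(File::PATH_SEPARATOR) - bin_dirs
  {
    "PATH"     => (bin_dirs + rest).join(File::PATH_SEPARATOR),
    "GEM_HOME" => gem_home,
    "GEM_PATH" => gem_home
  }
end

env = switched_env(
  ruby_root: "/opt/rubies/example-ruby",      # made-up install location
  gem_home:  "/home/me/.gems/example-ruby",   # made-up gem directory
  path:      ENV.fetch("PATH", "")
)
env.each { |name, value| puts "#{name}=#{value}" }
```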
Now, can we, you think,
talk about how all these parts fit together in practice?
Yeah, let's try. All right.
I'm gonna show you two versions of this.
One, I'm gonna talk about which
of the environment variables
each tool cares about.
And then we're gonna talk about
how the tools actually relate together.
And, I'm gonna leave you with
a bit of a challenge, that I don't think
I'll even have to say explicitly,
because it'll be pretty obvious
when you see how these tools relate.
It's somewhat problematic.
Ruby. It cares about its own internal LOAD_PATH.
It cares about RUBYOPT,
a lot, it really lets you just beat it up with RUBYOPT.
And it cares about RUBYLIB to set the LOAD_PATH.
And, just to belabor the point,
you've heard of Ruby as a virtual machine,
PATH is to your machine as LOAD_PATH
is to your virtual machine.
Insomuch as running commands on Linux or Mac
is the thing you do,
requiring files in Ruby is the thing you do.
And the paths are one-to-one,
this is the path for governing how
to configure this machine, virtual or not.
So, that's all Ruby cares about.
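
The PATH and LOAD_PATH analogy comes down to the same search: walk a list of directories in order and take the first match. A simplified sketch, with first_match as a hypothetical helper; a real shell also checks that the file is executable, and a real require also tries other file extensions.

```ruby
# Walk a list of directories in order and return the first existing file
# named `name`, or nil if no directory contains it.
def first_match(name, dirs)
  dirs.each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate)
  end
  nil
end

# A shell resolving the command "ruby": search PATH.
puts first_match("ruby", ENV.fetch("PATH", "").split(File::PATH_SEPARATOR)).inspect

# Ruby resolving require "tempfile": search $LOAD_PATH for tempfile.rb.
puts first_match("tempfile.rb", $LOAD_PATH).inspect
```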
If you grep the Ruby source and you exclude lib/rubygems,
which is a separate project actually,
that's periodically just imported in,
you won't see things like GEM_, any of that.
RVM cares about setting the path,
so that you find the right ruby.
And then, setting GEM_PATH and GEM_HOME.
And, RVM asterisk: the same goes for the other tools that
set your different Ruby versions.
And, Rubygems cares about reading GEM_PATH and GEM_HOME.
So that things can be installed in the right place
and found in those places.
Bundler doesn't care about any of these.
It'll read them all and mess with them,
but you configure Bundler through its own
environment variables.
BUNDLE_, and if you look through the docs,
there's like a ton of stuff.
You can really, really customize it.
More than I've ever needed to.
You can set your own Gemfile in a different directory.
You can set your own startup script over here.
But, aw man, I really went
back and forth on leaving this pun in.
That's just not even, it's not even a necessary slide.
It doesn't need to be there (audience laughter)
But, I'm kinda glad I left it in.
Ok, we're gonna, we're gonna blitz through
all of how they connect again in a second.
And I promise you this will be over soon.
We start with, Ruby.
Ruby has a require method.
When you require a file,
Ruby will look through the LOAD_PATH and find it.
We looked at that, we saw through strace how that works
and it's pretty reliable.
Rubygems aliases require away
to gem_original_require, just to hang on to it and remember it,
and then replaces it with its own require.
So the original Kernel#require goes away.
So, Kernel#require is now not the thing you thought it was;
it no longer just iterates over the LOAD_PATH.
This is not, maybe all that intuitive.
And then,
Bundler, has a method (laughs)
amazingly, I love this,
called reverse_rubygems_kernel_mixin
that just undoes that,
puts things back where they found them
and then takes the gem method that Rubygems defined
and deletes it and recreates its own Kernel.gem method.
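
The alias-then-replace pattern, and undoing it, can be shown on a toy class. This is only an illustration of the technique: ToyLoader and its method names are made up, and neither RubyGems' nor Bundler's actual code is reproduced here.

```ruby
# A toy stand-in for Ruby's built-in behaviour. Not Ruby's real require.
class ToyLoader
  def load_feature(name)
    "searched load path for #{name}"
  end
end

# Layer 1 (RubyGems-style): keep the original under a new name,
# then define a replacement that does extra work before delegating.
class ToyLoader
  alias_method :original_load_feature, :load_feature

  def load_feature(name)
    "activated gem for #{name}, then " + original_load_feature(name)
  end
end

puts ToyLoader.new.load_feature("rails")
# prints "activated gem for rails, then searched load path for rails"

# Layer 2 (Bundler-style): put the original back where it was found.
class ToyLoader
  alias_method :load_feature, :original_load_feature
  remove_method :original_load_feature
end

puts ToyLoader.new.load_feature("rails")
# prints "searched load path for rails"
```

Because classes stay open in Ruby, any library loaded later can redefine a method an earlier one installed, which is why the order in which these tools load matters.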
So, you could say, that
Rubygems is the bottom, or Ruby is the bottom,
then Rubygems is built on top of that,
then Bundler is built on top of Rubygems.
And that would make, like a simple, like you know,
digraph of dependencies; it's pretty straightforward.
That's a reasonable assumption.
That you would think that
those three work in exactly that way.
In fact,
Rubygems, thanks to the great work
of many people who build it,
but for this particular feature it's Eric Hodel and Tenderlove.
They built, in Rubygems, the ability to
parse and understand a Gemfile.lock.
So you can actually read your Bundler Gemfile.lock
in Rubygems, using only Rubygems.
You could have an app with a Gemfile.lock
just require standard library Rubygems
and you can read that, which means we now have
Bundler and Rubygems, which are mutually aware
somewhat. Yeah, they can delete each other's methods.
These are two mutually aware projects
that together, are the system that we depend on
for managing all of our dependencies.
And this, is actually,
oh, and Bundler is managed separately from Ruby,
but Rubygems, while managed separately,
is checked into Ruby core,
but I think, raise your hand if you use Bundler.
I think, yeah, right, ok.
You also use Rubygems because Bundler uses Rubygems.
But Rubygems is in Ruby core.
So, this setup here is the reason for this talk.
And, lest anyone misunderstand me,
there's nothing wrong with the work
that's been done by anybody on any of these projects,
but these projects are not finished.
This is not what Rubyists deserve.
They deserve, the straw man example,
they deserve to use Ruby
and then the project called, like, gem or something,
that everybody uses, that works perfectly,
that everybody contributes to, that solves all their needs
for making sure gems are in the right place,
and packaged and locked and deployed and everything.
That's what people new to this environment deserve,
and it should be that clean.
Now, these projects are getting there,
but folks like André Arko
are working super hard to make it better
and Tenderlove and everybody at Seattle
and outside of it working on Rubygems
are working super hard to make this great,
but what they need is you.
And now that you know how environment variables work,
and how the paths involved work
to work with these tools and to manage your gems,
it would be really great if you would look into
maybe contributing to one of the
open issues on either Bundler or Rubygems
and seeing if you can become a contributor
that can make some of these rough edges
a little bit smoother for the next person
who comes down our path.
And with that, thank you very much.
(audience applause)
The question is, can I explain more about how
Rubygems uses the Gemfile.lock and
what it does with it when the Gemfile isn't present,
or when Bundler isn't present.
Uh, I can't necessarily. I know that there's
a full featured parser
and it gives you a DSL for working
with the contents of that.
And if I were to look closer in the Rubygems source,
but Eric Hodel would come by and correct me in a moment,
and say, like, "No, Jack. I built a whole thing
that downloads Rubygems for you and it solves them",
but I don't know.
Suffice to say, there's a, in the Ruby source
or in the Rubygem project,
look for, I think it's called lockable or lockfile,
lockfile I think,
and it's actually pretty simple.
It's just, like, Tenderlove wrote up a parser,
as he's wont to do and which I trust him to do,
and it works pretty well.
I guess the only answer I can really give you,
is it's open to use however you want.
You can just use it.
Or you can just like, read that file
and do what you need with it.
Maybe, it's like more for your own scripting. I'm not sure.
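
For the "just read that file" route, a hand-rolled reader is enough for simple scripting. This is a minimal sketch, not the parser that ships in Rubygems: locked_versions is a hypothetical helper, the gem names in the sample are made up, and it assumes resolved gems are listed as "name (version)" lines indented by exactly four spaces.

```ruby
# Collect the "name (version)" lines that list resolved gems in the
# text of a Gemfile.lock.
def locked_versions(lockfile_text)
  lockfile_text.each_line.each_with_object({}) do |line, versions|
    # Resolved specs sit at exactly four spaces; their own dependencies
    # are indented six spaces and are skipped here.
    match = line.match(/\A {4}(\S+) \(([^)]+)\)\s*\z/)
    versions[match[1]] = match[2] if match
  end
end

SAMPLE_LOCKFILE = <<~LOCK
  GEM
    remote: https://rubygems.org/
    specs:
      example-gem (1.2.3)
        example-dep (>= 0.1)
      example-dep (0.4.0)

  DEPENDENCIES
    example-gem
LOCK

locked_versions(SAMPLE_LOCKFILE).each { |name, version| puts "#{name} #{version}" }
```

For a real project, File.read("Gemfile.lock") would supply the text.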
All right, thank you very much.
