Please wait... loading: a tale of two loaders

Myles Borins

Recorded at JSConf EU 2018

hey how y'all doing today it's hot right
um so thank you all for coming into this
room and bearing the heat hello
so this is a fun one I do this a lot if
you've seen me talk before I'm sorry I'm
going to do it again but I kind of like
walk into the audience nope nope nope
nope nope okay I'll do the short version
I go like this and then I like do this
with the cat and then it calms me down
this has backfired I am not calm that did
not work so improvisation is a fun thing
I'm Myles I work for Google as a
developer advocate on Cloud Platform
this little line at the bottom put this
in all your talks these are my opinions
not my company's so we're gonna start
with a quick glossary because I'm gonna
use lots of terms ESM ECMAScript
module CJS CommonJS module
interoperability this is the ability to
access ESM from CJS and vice versa this
is a really important core concept of
trying to get Node and the web to play
nicely together and for us to move
forward in a way that doesn't you know
fracture the ecosystem transparent
interoperability this is the ability to
require ESM in CJS and import CJS in
ESM without having to know the type of
the module so the ability to npm install
a module and then just use ESM or CJS
and not have to worry about it this is
kind of how a lot of the transpiled
loaders work right now so if you use
Babel or TypeScript or webpack you
don't really need to know the type of
module that you're using a goal is a
pairing of a top level grammar and a top
level execution model I'll get back to
that one later bare imports is the
ability to require a module by its name
so if you ever do you know like require
lodash or import underscore or vice
versa the ability to just reference it
by the name of the module
and not have to give the full path to
things that are in your node_modules is
referred to as a bare import existential
dread this is the feeling I get trying
to get all this to work and thinking
about this problem space and you know
there are a handful of existential
problems that we have in JavaScript the
language right now you find me later we
can have a beer and talk about it but
how how do we get to where we are today
so who remembers ES4 it never
happened but ES4 introduced the
concept of packages these were primarily
based on C++ namespaces and the intent
was you know to create something similar
to the Java jar system for those of you
who don't know the joke I just made was
like ES4 never shipped we went straight
to ES5 from ES3 and it was ripped out
of the standards track and it was never
seen again all these ES4 features kind
of just vanished CommonJS was then
introduced members of TC39 worked on it
but it was never standardized it was
intended primarily for server-side JS
and we'll get into in a minute
why this is specifically for servers and
doesn't really scale to the web and
Node.js implemented a variant of the
CommonJS spec who remembers AMD
keep your hands up if you liked writing the
files it's ok all right put them down
but AMD actually wasn't specified it was
mostly implemented in RequireJS and
it was more of a convention than really
a standard and because of all these
different things we had a thing called
UMD that we used to package them all up
so we didn't have to think about the
modules like UMD was I guess like the
transparent interop of you know 5 years
ago but ES modules landed in ECMA-262
in 2015 and at that point we had modules
they were standardized but they weren't
anywhere so let's go back to that
glossary again
what's a loader loader is a generic term
for a workflow it includes fetch
transform and an evaluation hook these
are the phases that happen when you try
to load a module and when I reference a
loader it is an
implementation that does all of these
things and this is where we get into
like the first really weird thing that
makes things hard CommonJS technically
doesn't have a separate loader phase it
has synchronous load and inline
execution when you run node and you
start a file it just starts executing
when you hit a require that just starts
executing that file and it does this
synchronously so when you try to load
that file off the disk it just starts
executing it and it does that through
essentially the whole graph of your
modules and this is why it's not
really good for the web if you had to
wait for something to go over the
network every single time you hit
require and this isn't able to be done
asynchronously you're gonna have a long
time to wait until your first paint
which is no good and as I said there's
no load step everything just kind of
happens and this is actually how Babel
originally implemented ESM if people are
from Babel you can find them afterwards
and maybe I'm over stretching the
implementation but essentially they took
the import statements converted them
into require and then converted that
directly just into files being linked
together so the original Babel loader was
this kind of inline execution model that
was a breaking change to Babel when they
implemented the asynchronous model
that's closer to what is specified
because ES modules specify an
asynchronous load and synchronous
execution these are separate phases your
entire module tree is determined first
so top-level await may change some
of those guarantees I just said there's
synchronous execution it is possible
that the actual execution of your graph
could become asynchronous that's the
only way to really support it but that's
not what this talk is about it's a whole
other talk grab me at the party I'll
tell you all about this this is the fun
part though the loader isn't specified
it's implemented by the embedder the way
in which it executes is specified the
way in which the graph is crawled and
everything is linked is specified but
how you actually
all those resources is left up to the
embedder there's been attempts to
specify this both at the W3C as well as
the WHATWG but most of these
processes have probably
stalled and it's primarily being
implemented and it may be specified in
the future so I mentioned that load
phase the load phase first goes and
fetches all of the texts for all of the
module graphs so it can go it finds all
the symbols it finds all the things that
needs to load over the network and it
can asynchronously grab all of these
and you can imagine how this actually
pairs really nicely with a concept like
HTTP/2 push where you can go and start
grabbing everything really quickly it
resolves the specifiers within the
source texts and then moves into the
linking phase look for the linking phase
the module graph all those files need to
be in memory and after they're in memory
they're linked in pre-order traversal
so this means that from the top down
it's actually going and linking your
entire graph this is important when you
have things like cycles so if you have
modules that are referencing modules
that are referencing modules that are
the as you can imagine it gets
you don't need multiple instances of
those to be executed because everything
in the graph is returning a singleton
when you grab a particular keyword so
you import React in another file you
want that same version of React because
you need to be registering the
components in the same area or things
get weird and then that moves into the
execution phase which requires your link
to be completed and actually is done in
post-order traversal so it actually
starts from the leaf nodes and executes
all the way up to your root node and
this could almost be thought of as a
little bit of an instantiation phase so
you're actually going through all of the
modules that are in the graph loading
everything up into memory creating all
the singletons so that when you
eventually make a call to react later
it's ready to be used this isn't
something that CommonJS ever needed
to worry about because everything's just
executing inline and this is where
things get weird these two things are
not interchangeable and it can actually
result in different execution order the
exact same graph defined by require and
then defined by import can actually
cause things to execute in a different order
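The ordering difference described here can be sketched with a toy simulation. This is not Node's loader or the spec algorithm: the `graph` object, the module names, and the two runner functions are all hypothetical, and a "module" is just a list of steps. It only illustrates how inline execution and dependency-first (post-order) execution visit the same graph in a different order.

```javascript
// Toy module graph: each entry is an ordered list of steps.
// A string stands for code that runs; { dep } marks where another module is pulled in.
// All names and messages here are hypothetical.
const graph = {
  root: ["root: start", { dep: "a" }, "root: middle", { dep: "b" }, "root: end"],
  a: ["a: body"],
  b: ["b: body"],
};

// CommonJS-style: a dependency runs inline, at the point where it is required.
function runInline(name, seen = new Set(), log = []) {
  if (seen.has(name)) return log; // already loaded, reuse the single instance
  seen.add(name);
  for (const step of graph[name]) {
    if (typeof step === "string") log.push(step);
    else runInline(step.dep, seen, log);
  }
  return log;
}

// ESM-style: dependencies are discovered up front and evaluated first,
// and only then does the importing module's own body run.
function runPostOrder(name, seen = new Set(), log = []) {
  if (seen.has(name)) return log;
  seen.add(name);
  const steps = graph[name];
  for (const step of steps) {
    if (typeof step !== "string") runPostOrder(step.dep, seen, log);
  }
  for (const step of steps) {
    if (typeof step === "string") log.push(step);
  }
  return log;
}

const inlineOrder = runInline("root");
const postOrder = runPostOrder("root");
console.log(inlineOrder.join(" | "));
console.log(postOrder.join(" | "));
```

In the inline run the root module has already started before `a` executes; in the post-order run both leaves finish before any of the root's code runs, which is the kind of difference that bites code relying on side effects at load time.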
and if you're relying on the order that
your modules are going to execute that
can bite you more behavior differences
the specifier resolution algorithm is
also different
so TC39 leaves it up to the hosting
environment how to do specifier resolution
and the specifier is the name of the
module that you're loading that
specifier in the browser is you know a
relative path to things that are on the
disk and in Node this could be a
combination of relative paths or it
could also be a bare import as we
mentioned before so Node.js has a
specific resolution algorithm for
dealing with specifiers we support bare
imports so you can you know import
lodash we allow you to import JSON we
allow you to drop the file extension
that's a nice one you can just import a
file with no extension and you can also
import a directory and those play nicely
with each other
because you can you know name a file
capability and then that can turn into a
directory with all sorts of things in it
later without you ever having to
refactor your code but this isn't
exactly how things work on the
web the web doesn't support bare
imports currently the web only supports
those paths you can only specify either
absolute or relative paths so if you had
a module that you wrote for Node that's
using a bare specifier it's just going
to blow up when you put it in the
browser for now because there is
actually a proposal it's called package
name maps which is going to allow there
to be a lookup process to actually find
the bare imports that are in the tree
but it has a slightly different feature
set than Node's model specifically you
can do bare imports but you can also do
deep module traversal by file names so
like the name of the deep import slash
and then some file that's deep in that
tree but you can't import directories
and you have to give the file extension
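The split between the two resolution styles can be sketched roughly as follows. Both helper names are hypothetical, the category labels are made up for illustration, and the candidate list is a simplified picture of a Node-style CommonJS lookup (it leaves out the package.json "main" field and the node_modules search entirely).

```javascript
// Classify a specifier: path-like and URL-like specifiers are what the web
// accepts directly, while a bare name needs some extra lookup step.
function classifySpecifier(specifier) {
  if (
    specifier.startsWith("/") ||
    specifier.startsWith("./") ||
    specifier.startsWith("../")
  ) {
    return "path";
  }
  try {
    new URL(specifier); // succeeds only for a full absolute URL
    return "url";
  } catch {
    return "bare";
  }
}

// Simplified list of files a Node-style lookup may try when the
// specifier has no extension or names a directory.
function nodeStyleCandidates(specifier) {
  const extensions = [".js", ".json", ".node"];
  return [
    specifier,
    ...extensions.map((ext) => specifier + ext),
    ...extensions.map((ext) => specifier + "/index" + ext),
  ];
}

console.log(classifySpecifier("lodash"));
console.log(classifySpecifier("./capability"));
console.log(classifySpecifier("https://example.com/capability.js"));
console.log(nodeStyleCandidates("./capability"));
```

Each entry in the candidate list is a cheap check against a local disk, but would be a separate request if the same guessing were done over a network.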
the reason for this is that you don't
want to be doing multiple Network calls
for all of your modules in Node this is
fine it arguably can actually be a
problem but if you import a thing
without a file extension we can check
and say oh first is it JS ok is it a
native module oh is it JSON oh is it
this thing and maybe we're only like
checking four or five or six different
extensions but those hits to the disk are
cheap but you don't want to be doing
that in the browser you don't want for
every single module in your graph to be
making like seven or eight Network
requests and for each of those network
exactly how to respond and then handling
you know 404s so this kind of
explicitness is necessary more behavior
differences the code actually executes
differently too so we were talking about
goals before and ESM and CommonJS actually
run in different goals and we talked
about it before that it was a mix of two
different things and it was kind of
confusing so let's dig into it there's
four types of goals you have script
strict you have script sloppy you have
ES module and Node.js and now when you
start seeing these you can get an idea
of what a goal is it's a combination of
a grammar and an execution mode so when
we talk about a top-level grammar we're
specifically talking about things like
strict mode versus sloppy mode the
ability to specify you know like what
symbols what APIs what is actually
exposed by the language strict versus
sloppy is able to be denoted by a pragma
use strict which was a best practice you
know 5 years ago and still is and that
removes capabilities from the language
generally we don't want to break the
language we don't want to change the way
things work so strict mode was a way to
remove some dangerous things from the
language but you did it explicitly and
it was opt-in so the chance of breaking
things that have been on the web before
was really low now the module goal which
is different from both the strict and
sloppy goal actually has some extra
grammar changes that are on top of that
for example you can't use HTML comments
quick raise of hands who knew you could
actually use HTML comments in JavaScript
so you should change that or it's gonna
break when you make modules await is
a reserved keyword and this was actually
explicitly done by the committee to make
sure that top-level await was a thing that
could eventually happen and these
divergences may change over time that
await is a reserved keyword
thing also makes it hard to use you know
any variable that's named await
it's not allowed anymore
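A small sketch of these grammar differences, under one assumption worth stating: a plain script cannot parse text in the module goal directly, so the code below only shows what the script goal accepts (through `new Function` and indirect `eval`, which both parse script code), and the module behaviour is described in comments. The helper name `parses` is hypothetical.

```javascript
// Returns true when the source parses as a sloppy-by-default function body.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

// The "use strict" pragma removes capabilities: `with` stops parsing.
const sloppyAllowsWith = parses("with ({}) {}");
const strictAllowsWith = parses('"use strict"; with ({}) {}');

// In the script goal, outside async functions, `await` is an ordinary identifier.
// In the module goal the same declaration is a syntax error.
const awaitAsIdentifier = new Function("var await = 1; return await;")();

// HTML-style comments are accepted in the script goal and rejected in modules.
const htmlCommentResult = (0, eval)("<!-- an HTML-style comment\n42");

console.log({ sloppyAllowsWith, strictAllowsWith, awaitAsIdentifier, htmlCommentResult });
```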
so this is one of the really hard parts
ES modules do not have an in-source way
of determining the goal strict mode did
you had a pragma you could say use strict
ES modules are just code there is no
pragma the committee decided not to use
a pragma because it is a new goal it's
not an enhancement or de-enhancement it's not new
capabilities it is its own thing and
when you're using the browser you're
using type=module there's no
need to have that pragma but in node we
have no way of really knowing and
there's also more goals that may be
coming in the future that are in the
standards track and this includes binary
AST wasm we have Lin Clark in the
audience right now who's really driving
making it that you can import wasm and
that's going to be like ridiculously
cool web packages which if you haven't
heard about it it's really cool when we
think about bundlers today bundlers for
scripts work because you just make a
giant script but bundlers don't really
work for modules because you can't just
make a script that has multiple modules
in it so web package is one approach
that would allow you to bundle modules
all together in a way that could be
understood by the browser and HTML
modules are another one that would
probably be pretty nice to have if
you're you know doing modern web dev JSX
has a lot of this kind of inline ability
to make modules so we have to pave a
cowpath we have to think not just how do
we make ESM and CJS work together we
have to think about a solution that
works for every number of infinite
possible future goals and this is hard
especially when we're thinking of
interoperability so if we're loading ESM
from CJS today in Node you can use
dynamic import it returns a promise it's
not really the best way of doing it you
can't export the things that are on that
symbol but you can't require ESM and
that's because ESM has this asynchronous
loader grab me later and I could talk
more about the inconsistencies there you
can also Google it's called zebra
striping but when you move from async
to sync and then back to async just
things get weird from CJS to ESM
alternatively though there's a lot of
different ways that we could do this
in our implementation you can import
from fs and as of Node 10.2 which we
released about two weeks ago you can
even do named exports from core modules
but the core modules in Node have an
advantage because Node is compiled
and we can do all sorts of things in
advance for it and we allow you to
import a thing from CJS in our ESM
implementation but not with named
imports you only get the default and
this capability makes it a little bit
more difficult the idea of transparent
Interop is not having to know the
difference between the types of modules
but you kind of do because you only have
a default and this is one of the things
that we're still trying to work out
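One direction of this interop can be tried from an ordinary CommonJS script with dynamic import. The sketch below assumes a Node release where `import()` works in CommonJS without extra setup; at the time of the talk the feature sat behind the experimental flag, so what you need to enable depends on the Node version you run. The function name `loadFs` is hypothetical.

```javascript
// A CommonJS script. import() returns a promise for the module namespace,
// so the result can only be used asynchronously.
function loadFs() {
  return import("fs");
}

const fsNamespace = loadFs().then((ns) => {
  // The default export is the whole module object, as with any CommonJS
  // module imported from ESM; core modules additionally expose named exports.
  console.log(typeof ns.default.readFileSync);
  console.log(typeof ns.readFileSync);
  return ns;
});
```

Because the namespace only arrives through a promise, nothing here can be re-exported synchronously from the CommonJS module, which is the limitation the talk points at.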
import.meta.require is something that
we're talking about doing import.meta is
the ability to put scoped things I'll
talk about in a second but import.meta.require
would allow you to just require
CJS right into ESM without having to
worry about transparent interop more
behavior differences ESM does not have a
way to inject lexically scoped variables
so as we were just talking about with
import.meta CJS has lexically scoped
variables and this will be another one
of those times where I start listing
them and all of a sudden you like it
makes sense what a lexically scoped
variable is in Node we wrap every single
module in a lambda and we inject these
variables into the scope of that module
for you so these are actually not
globals they're specific to your module
they're the specific filename the
dirname a shared instance of require and
the module that's exposed all these are
things that we inject for you
but ESM doesn't have that so this is
where import.meta came in and it's stage
3 at TC39 it's a proposal by Domenic and
import.meta allows the runtime to inject
variables into the module itself when
it's executing so inside of your module
you could say import.meta.url and that's
a replacement for the filename or
dirname and that's somewhere where we could
hang require for example to essentially
have a backdoor or an injection to allow
you to get those CJS modules in without
having to worry about transparent
interop and as I said you can use ESM
in Node today but it is behind a flag
--experimental-modules and I highly advise
you not to use it in production because
everything could change from under you
because we're still trying to figure it
out but
a quick summary of our current
implementation everything is behind a
flag we have support for named exports
for core modules ESM files need to use
the .mjs file extension and we're gonna
need that no matter what even if we
allow .js for ESM the ability to do an
out-of-band solution to say hey this
thing is an ES module is necessary even
if you just say node thing.mjs
that's our way that like
you're able to just boot into an ES
module no matter what we have support
for bare imports we have support for
transparent interop in ESM without named
exports but no support for transparent
interop in CJS because with an async
loader it's currently not technically
possible but we're exploring options at
the standards bodies of how we could do
this in a way that would actually work
we also have support for dynamic import
in both ESM and CJS you name a module
and dynamic import it returns a
promise to resolve that top-level await
will make this really great in ESM but
currently the top-level await spec is
not targeting the script goal so it's
unavailable in CJS we also have support
for import.meta but currently we're only
exposing import.meta.url and some of the
decisions of this implementation have
made some people grumpy and in response
we spun up a Node.js modules team and
this is really cool
we made it open and 42 members signed up
in the first week this included
representation from at least
10 countries including Canada China
Germany Israel Italy Japan Nigeria the
United Kingdom and South Africa and the
US we also have representation from 10
different corporate organizations
including Airbnb Applitools Bloomberg
GoDaddy Google Groupon IBM Microsoft
Mozilla and NearForm and NodeSource
I'm off by one on all of these it's a
big problem in technology we've key
stakeholders from at least 14 open
source organizations and
programming platforms you know
such as Angular Babel Bluebird Fastify
libuv jQuery Jas
we also have lodash Meteor npm Polymer
tap KS and V8 and webpack but as you
can see like we've got a lot of people
who are working on all the tools that
we're using right now trying to work
towards consensus trying to make sure
that this is happening and we also have
representation from three of the major
standards bodies we have people who do
work at TC39 people who do work at WHATWG
and people who do work at
W3C this one was really easy there's
only three of them but we're all really
hard at work on this problem this is a
really really hard problem and doing it
right is this kind of existential thing
if we do it wrong we could create this
like irreconcilable drug-like schism in
the community and we all take this
really really really seriously and I
know that probably every single person
in this room is worried about this being
done right and the whole purpose of this
talk I was hoping to kind of just show
under the hood a little bit of like all
the things that make this really hard
and really weird and really tough but I
also wanted to reinforce to all of you
that like all those people all those 42
people from all those companies all
those projects all those standards
bodies we're like pulling our hair out
we are working really hard bi-weekly
meetings giant threads like you can't
even keep up trying to bash out a
solution so that you can just keep
writing code and not have to worry about
this and I really hope you know in the
next year year and a half we'll be able
to ship something my dream is that you
can npm install something and just run
it in the browser or just run it in
Node with absolutely no build
step which would be like in my opinion
like absolutely amazing and with that
here's a surfing dog thank you very much