Power Use of UNIX

Dan North

Recorded at GOTO 2013

Now, I don't have a Mac. I'm neither a Mac nor a PC,
which limits which queue I join. OK, can we see that?
I can't see this. Great, brilliant. And timer, go.
Yeah, I was playing around with a new timer app, and I
first put in 15 and I pressed go, and I was like, that's
going fast: I'd given myself 15 seconds. I thought,
that's not going to bloody work.
Right, morning everyone. So, UNIX: I'm trying to describe
all of UNIX in 15 minutes. What could possibly go wrong?
So I've been using UNIX for 25 years. The first time I saw
a UNIX prompt, or a UNIX login, was in 1987, when I started
university and they had computers. Then, I was using a
thing called Pyramid OSx, so yes, my first UNIX was OSx.
You kids know nothing. Right, so why does UNIX work? UNIX
works because it's based on an incredibly simple model, and it
still is. So UNIX was invented by a bunch
of very smart people at Bell Labs
in 1969, and the cool thing
about UNIX is it's as old as me. So I was born just before
the epoch: time starts in UNIX on the 1st of January 1970,
so anyone born before the 1st of January 1970, you're born
in prehistory. OK, so I am a dinosaur.
And so yes, UNIX turned 40 when I turned 40. It's very
exciting. Oh cool, we share a birthday. We didn't have a
beer or anything, which is a shame.
But here's the thing. The UNIX model, the UNIX philosophy,
is this: everything is a file. Which is an incredibly
simple assertion, but very, very difficult to implement and
amazingly powerful in practice. So what does that mean?
Well, that means files are files: a file on the file system
is a file, I can open it and manipulate it like a file.
Devices are files. Let's take a look. So I've got a thing
called /dev. Under /dev I have all these things that look
like files, because I can just do an ls on them; I can look
at them as files. Take /dev/sda, or look at these: ls
/dev/sda, ls -l. You'll see that sda1 has things like
read/write permissions and all this kind of stuff. These
funny characters here: the first three are things I can do,
the middle three are things my friends can do, and the last
three are things other people can do. So I can read and
write into this thing, as root. If you look at the next
column here, this is root: root can read and write into
this thing. disk, which is the group that root's in, can
read and write into this thing. No one else can do
anything. The magic starts on the left: this b. This b says
it's not really a file, it's actually a block device, and
then there are other things called character devices, like
terminals. But you can look at this thing like a file.
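
A minimal sketch of reading that first column of `ls -l`, assuming a Linux-like system where /dev/null exists as a character device. The helper names `file_kind` and `perm_triples` are made up for illustration and are not from the talk.

```shell
# Map the first character of `ls -ld` output to a file kind.
# file_kind is a hypothetical helper, not something shown in the talk.
file_kind() {
    # e.g. "brw-rw----" -> "b"
    type_char=$(ls -ld -- "$1" | cut -c1)
    case "$type_char" in
        -) echo "regular file" ;;
        d) echo "directory" ;;
        l) echo "symbolic link" ;;
        b) echo "block device" ;;
        c) echo "character device" ;;
        *) echo "other ($type_char)" ;;
    esac
}

# Characters 2-10 are three rwx triples: owner, group, everyone else.
perm_triples() {
    perms=$(ls -ld -- "$1" | cut -c2-10)
    owner=$(echo "$perms" | cut -c1-3)
    group=$(echo "$perms" | cut -c4-6)
    other=$(echo "$perms" | cut -c7-9)
    echo "owner=$owner group=$group other=$other"
}

file_kind /dev/null
file_kind /
perm_triples /dev/null
```

The same helper pointed at a disk partition such as the /dev/sda1 from the demo would report a block device, provided that device exists on the machine.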
What I can do is this: /dev/sda1 actually represents the
first partition on the drive, but I can treat it like a
file. I can open it and read it and write it just like it
was a regular file on the file system. Which means when
you're doing programming on UNIX you've got a really lovely
logical model. What else? Devices are files; processes are
files. This is mad, right? So running processes are also
files. What does that mean? Well, let's see.
echo $$. So $$ is the process ID of my current shell; my
shell is running as process 3045. So if I do ls /proc:
/proc has a load of things in it, most of them are numbers,
and those numbers are actually directories. In fact, if I
do ls -F (oops, ls -F, big F) of /proc, it puts a trailing
slash after those things. So most of these things are
directories, and each directory represents a running
process. So if I say ls -l /proc/$$, this is the directory
representing my running process. So in that process I've
got a cmdline, there we go: -zsh, or zsh, is how my process
was started. I can look at the open file descriptors that
it has, which is crazy. So it has file descriptors 0, 1, 2
and 10 for some reason, and so on. So processes are files.
File handles are files; I'll come back to that, that's a
bit of a weird one.
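
The /proc walk above can be sketched as follows. This is Linux-specific and assumes /proc is mounted; the variable names are illustrative.

```shell
# Linux-specific sketch: /proc/<pid> is a directory describing a process.
pid=$$

# cmdline holds the arguments separated by NUL bytes; make them readable.
cmdline=$(tr '\0' ' ' < "/proc/$pid/cmdline")
echo "shell $pid was started as: $cmdline"

# Each entry in fd/ is a symlink named after an open file descriptor.
fds=$(ls "/proc/$pid/fd")
echo "open file descriptors:" $fds
```

The exact descriptor numbers depend on the shell and on how it was launched, so they will not necessarily match the ones seen in the demo.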
There was one divergence from this back in the day, where
shared memory wasn't files, and it broke everything and
people were really upset about it (I need to speed up), and
then they fixed that. So shared memory is files now. So
everything running is a process. If I get time then I'll
explain why. Basically the UNIX philosophy is that parents
sacrifice their life to create children, which is weird and
dark, because Russ didn't get dark enough, and you have
zombies that refuse to die. So the whole story of UNIX
processes is great. Now, every file is text or data. Text
files: text processing is legendary. Everything else is
data. And the UNIX philosophy is that each command does one
thing well. There's one exception: Emacs does everything
really badly, particularly being an editor. I believe
someone's going to talk about that in a bit.
Other than that you've got a pretty consistent world. Oh,
and you see that little 1 in parens at the end? That's a
manual page for Emacs. The UNIX documentation is amazing.
OK, so you've got man pages for pretty much everything, and
they're split into sections: section one is commands,
section two is system library stuff. The C language and
UNIX grew up together, and so there's an enormous library
for C programming on UNIX as well. And you've got this
wonderful, wonderful separation of concerns, which means
composition is a joy: pipelines, and you get anything that
can read or write. So the idea is you have three standard
file things for any process: it has the stuff it can read
and two things it can write. One of the things it writes is
most of what's happening, and the other thing it writes is
when bad things happen. And that's everything, and you can
compose pretty much any type of communication,
instrumentation stuff using those three things. So when
people build logging frameworks I just point them at UNIX
and say: no, that's a solved problem. Right, you do not
need logging in your app; you just need to send stuff to
stdout, and then syslog will be your friend.
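
A small sketch of the three standard streams described above: one thing to read (stdin) and two to write (stdout and stderr). The function `check_lines` and its input are invented for illustration.

```shell
# A command that uses all three standard streams:
# reads stdin, reports normal results on stdout, problems on stderr.
# check_lines is a hypothetical helper for illustration.
check_lines() {
    while IFS= read -r line; do
        case "$line" in
            ERROR*) echo "problem: $line" >&2 ;;   # stderr: when bad things happen
            *)      echo "ok: $line" ;;            # stdout: most of what's happening
        esac
    done
}

# Capture stdout and stderr separately to show they are independent.
err_file=$(mktemp)
out=$(printf 'started\nERROR disk full\nfinished\n' | check_lines 2>"$err_file")
err=$(cat "$err_file")
rm -f "$err_file"

echo "stdout was: $out"
echo "stderr was: $err"
```

Because the program only writes to its streams, where the output ends up is decided from outside. One option on systems that ship the `logger` utility is to pipe it onward, for example `check_lines < input | logger -t myapp`, where `myapp` is an arbitrary tag.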
The shell is your gateway to everything. A bit of history
here, since we have movie references and it's very
important. The original shell was called, OK, it's called
the Bourne shell, and it was actually written by Jason
Bourne, who was working at the CIA at the time. And then
what happened was: if you have seen The Bourne Identity, it
starts with this guy waking up with no recollection of who
he is, no memory. What had actually happened was that
someone had written the C shell, and the C shell was so
toxic and so broken and so unlike anything UNIX-y that
Jason Bourne went into a complete depression, and that's
how he lost his memory. So that's where the movie starts.
Now, a bunch of Linux heads and GNU heads got together and
said, well, Bourne was pretty good, but we should improve
it; there's a bunch of things we'd like it to do better. So
that's where bash came along. People assume bash is
ubiquitous; that's because they're on Linux. Bash isn't
ubiquitous, but it's pretty good. So there's the Korn
shell, which is another variant of the Bourne shell, and
then Z shell, which is the best of them all. It's the only
command for which there is a man page, zsh-lovers, which
doesn't reference the command: it's just a man page of
really cool hacks for Z shell. OK, so have a look at that,
it's great fun. So where are we going? UNIX is good at
three things. I'm over halfway through, I need to be
faster. It's really good at three things. It's good at
finding things. Right, it's really... minutes from now? Ah,
relax, marvelous. So UNIX is very good at three things,
timekeeping not being one of them. I'll ignore my little
clock now. So yeah, it's good at finding stuff. The command
find is just wonderful. With find, if you do a dash and
(this is a joke) hit tab, it gives you a list of all the
things. And so I can say find -type and then hit tab again,
and it says what the different types are. Isn't that nice?
Your shell should do that. Oh, mine does.
OK, so if you're using bash, this is what bash wants to be
when it grows up. OK, so go install Z shell, it's
wonderful. So I can mooch around the place. The thing that
I'm using to jump between terminals is called tmux, which
allows you to do things like popping open terminals or
windowing terminals and that kind of thing. So you can sort
of do tiling, and I think it's great fun. Go learn tmux;
again, a very simple tool. So let's say find . -name .git.
So I'm under my source tree; let's see what all the git
directories I have are. Here we go. So there's a bunch of
things, and they're all the git directories. Except I don't
really know they're git directories, because look, I could
do this: I could do mkdir bogus (oops, OK) and echo "not a
git directory" into bogus/.git, and now run the find
command again. Oh, there it is, it made it into the cut.
But if I say find . -name .git -type d, that's only
directories. Ha ha, it got removed. Find, it turns out, is
almost entirely replaced by Z shell, and so I can also say
print -l (which is like dash list) **/.git. So it does
recursive globbing, recursive pattern matching. Oops,
sorry, .git. There they are, and you see up here there's
that bogus thing. So I say, actually, only the ones that
are directories. OK, so you put little filters on the end.
I could also say find me everything under here that is a
symlink. Oh, there's one. Just to prove it, ls -al: what do
you know, it's a symlink. How about that? So go learn about
that.
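
The find part of that demo can be sketched against a throwaway tree. The directory and file names below are stand-ins chosen for illustration; only the `-name` and `-type` tests come from the talk.

```shell
# Build a throwaway tree: one real .git directory, one bogus .git file,
# and one symlink, then query it the way the talk does.
tree_root=$(mktemp -d)
mkdir -p "$tree_root/project/.git" "$tree_root/bogus"
echo "not a git directory" > "$tree_root/bogus/.git"
ln -s project "$tree_root/link-to-project"

# By name only: matches both the directory and the bogus file.
by_name=$(cd "$tree_root" && find . -name .git | sort)

# Adding -type d keeps directories only.
dirs_only=$(cd "$tree_root" && find . -name .git -type d)

# -type l finds symbolic links.
symlinks=$(cd "$tree_root" && find . -type l)

echo "$by_name"
echo "$dirs_only"
echo "$symlinks"
rm -rf "$tree_root"
```

The Z shell glob filters mentioned in the talk do the same job inside the shell itself; they are not used here so that the sketch also runs under other shells.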
What else? Text in files: you can look in files and you can
find stuff. So I can do find and pipe it into grep, so this
will help me find files that match a particular name. Or
grep -r, so I've got recursive grep. I won't show you that,
I want to do some other stuff. So UNIX is really good at
finding things like needles in haystacks. What I did is I
took the liberty of creating a haystack. So ls -l
data/haystack. wc is word count; wc -l tells you how many
lines something has. So wc -l data/haystack, and I have
30,002 lines in my haystack. OK, so let's have a look at
this. Oh look, lots and lots of hay, bloody loads of hay.
Right, less: hey, it's not even showing how far it's gone
down. Oh my god. Right, so I don't know where it is: grep
needle data/haystack. Bam! Oh, there it is. Right, hang on:
grep -n, and it was on line 7916. Let's see... OK. How cool
is that? So you can find needles in haystacks. By the way,
there's a Python one-liner that built the haystack, which
I'll show you in a minute.
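
A scaled-down sketch of the needle hunt. The haystack here is generated in the shell rather than with the Python one-liner from the talk, and its size and the `hay` helper are arbitrary choices for illustration.

```shell
# hay N: print N lines of "hay" (hypothetical helper for building test data).
hay() {
    awk -v n="$1" 'BEGIN { for (i = 0; i < n; i++) print "hay" }'
}

# Build a stand-in haystack with one needle hidden inside.
haystack=$(mktemp)
{
    hay 500
    echo needle
    hay 300
} > "$haystack"

line_count=$(wc -l < "$haystack" | tr -d ' ')   # how many lines in total
hit=$(grep -n needle "$haystack")               # -n prefixes the line number

echo "lines: $line_count"
echo "match: $hit"
rm -f "$haystack"
```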
So here's a fun one: words with all the vowels in order. In
order to be able to do stuff with UNIX you really need to
learn regular expressions. Regular expressions are at the
core of a bunch of stuff. The name grep comes from this:
the first editor for UNIX was called ed, because they
weren't very good at naming things; they gave most things
two-letter names. And they had a thing to copy and convert,
which they called dd, which stands for copy and convert.
Yeah, they wanted to call it cc, but a couple of guys down
the corridor, Brian Kernighan and Dennis Ritchie, had
written a C compiler and called it cc, so they went, oh
crap, we'll go with dd. This is a true story. Now, inside
the editor ed you have commands, and one of the commands is
p. p would print a line (this is a line-based editor, not a
screen-based editor), and you could say g, globally, p,
print, and that would print all the lines, sort of a way of
catting. g/re/p: globally find some regular expression and
then just print the lines that match. So we can do this. I
want to find all the words with all the vowels in order. So
here's what I want to do: less /usr/share... On most
UNIXes you'll have a word dictionary; let's have a look,
dict: british-english. OK, and this has all the words.
There's a command called spell, aspell; they all use this
list of words. wc -l: there's 99 thousand. All the vowels
in order. So I can do this: I can say grep
.*a.*e.*i.*o.*u, and find that (oops) in there. Oh, there
we go.
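
The vowel search can be tried on a tiny hand-made word list. The list below is an assumption made so the sketch is self-contained; a real dictionary usually lives under /usr/share/dict/, though the file name varies by system.

```shell
# A tiny stand-in word list for illustration.
words=$(mktemp)
printf '%s\n' abstemious education facetious sequoia unix vowel > "$words"

# a, then e, then i, then o, then u, with anything (or nothing) in between.
in_order=$(grep 'a.*e.*i.*o.*u' "$words")

echo "$in_order"
rm -f "$words"
```

Note that this pattern only checks that the five vowels occur in that order somewhere in the word; it does not require each vowel to appear exactly once.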
That didn't take very long, did it? The annoying thing with
doing UNIX demos these days is it's too quick. Right, so I
was working on quite a big trading app, and one of the
things they did was put loads and loads of logging data
into a NoSQL database (I won't say which one, it doesn't
really matter), and then they would run queries over this
NoSQL database. And they found that one of the limiting
things was how fast they could get stuff into this
database. They then just dumped it into flat files on the
file system, and the query tool they were using over the
logging was grep, and it was faster to store and faster to
search than the built-in query tools with indexing. I'm
just saying. OK, so what else? Now, it's really good at
creating and transforming things. Let's take a look. Like,
I want to turn some straw into gold. So, OK: remember I had
this thing, I've got lots of hay. So now I can say sed...
Oh, hello, how are you? Dismiss. Stop making noises. There
we go.
How many minutes have I got? Still got 10 minutes,
fantastic. So now I want to turn some straw into gold, like
I said. And here again, sed gets its name from being
streaming ed; see, all these things are related. ed was the
editor; sed is a streaming version of that, so it takes
input, does stuff, and produces output. So I want to turn
straw into gold, like that, in my haystack. Oh, it didn't
do anything: I haven't got straw. Oh, I know what I'll do:
I'll turn the hay into straw. Right, so s is substitute:
turn hay into straw, and then turn straw into gold. Oh,
it's not much gold, is it? Hmm. OK, the reason for that is
this: I need to do it everywhere. So g at the end says
globally: turn all the hay into straw, turn all the straw
into gold. Whoo! And so I now have a needle in a pot of
gold, which is kind of fun. What else?
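
The effect of the trailing g can be seen on a single line of input. The sample text is invented for illustration; the substitutions mirror the hay, straw and gold steps from the talk.

```shell
sample='hay hay needle hay'

# Without the trailing g, only the first match on each line is replaced.
first_only=$(echo "$sample" | sed 's/hay/straw/')

# With g, every match on the line is replaced; the two -e expressions
# run in order, so hay becomes straw and straw then becomes gold.
gold=$(echo "$sample" | sed -e 's/hay/straw/g' -e 's/straw/gold/g')

echo "$first_only"
echo "$gold"
```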
So the other thing it's really good at transforming is
files. I want to move some stuff around. So I have some
files: I did a talk yesterday (no, Monday) and I used lots
of photos, and here are the photos. There's a bunch of
photos with JPEG names like this, and then I turned those
into a storyboard by going to cd, like there. So they have
00, 01: each one of these is a kind of part of the talk, if
you like, and then there are all the different photos as
each line unrolls. And I thought, well, they're all in a
flat directory here, it's really confusing. So what I want
to do is put them into subdirectories, 00 through 16 or
something, and then in there put the actual files. So I
need to create directories with all those names. I could do
mkdir one at a time, but that's pretty dull. Again, because
UNIX is good at text and files, I can think of these file
names as just a list of text. So ls piped to less: oh, I'm
just looking at text now. So I could do this: I could say
ls | cut -d, the delimiter is a dot, and the field I want
is one, and I get this, you see. And now I can say, well, I
don't want that, there's lots and lots of duplication, so I
want to pipe that into uniq. OK, and now, oh, there we go,
that's the list of things. So now I can capture that inside
these parens, dollar paren, and I can say mkdir that. OK,
so far so good. This is going all right.
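
A sketch of the directory-creation step. The file names are hypothetical stand-ins for the photos, assumed to follow a `<part>.<detail>.jpg` pattern with no spaces in them.

```shell
# Hypothetical photo names standing in for the ones in the talk.
work=$(mktemp -d)
touch "$work/00.title.jpg" "$work/01.intro.jpg" \
      "$work/01.intro-2.jpg" "$work/02.demo.jpg"

# ls gives plain text; cut keeps field 1 using "." as the delimiter;
# uniq drops adjacent repeats (ls output is already sorted).
prefixes=$(cd "$work" && ls | cut -d. -f1 | uniq)
echo "$prefixes"

# Command substitution hands that list to mkdir as separate arguments.
# Left unquoted on purpose so the shell splits it into words.
(cd "$work" && mkdir $(ls | cut -d. -f1 | uniq))

created=$(cd "$work" && ls -d */ | tr -d '/')
echo "$created"
rm -rf "$work"
```

The unquoted substitution relies on word splitting, so it would misbehave on names containing spaces; for names like these it is fine.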
What I want to do is this: I want to say for file in *
(I'll cut that to f), for f in *, do print $f, let's just
start there, done. OK, so yep, that makes sense. Except I
want it in *.*, so it's just files; I don't want
directories. Yep, there we go. Right, so now what I want to
do is move files around. What I do when I'm building up a
command like this is I lead with echo: echo mv $f to...
well, let's see, what's the directory name? Well, the
directory name is d=$f... well, there's a lovely bit of
what's called parameter substitution you can do in the
shell, and bash does this as well. I can say it's f, and
what I want to do is strip off the end, so %.*. In fact
(oops), you've got these two operators, hash and percent.
What hash does is it strips a pattern off the front of a
variable, and what percent does is it strips it off the
back. And you've got regular versions of these, which are
hash and percent, and you've got greedy versions, hash hash
and percent percent. So I want the greedy version of the
thing that strips off the back. So I'm now going to say mv
$f to $d, and so that does... oh, that's not... Ah, thank
you.
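
The four operators can be compared side by side. The file names are invented for illustration; the loop prints the mv commands rather than running them, in the echo-first style described in the talk.

```shell
f=0001.opening.shot.jpg

front_short=${f#*.}    # shortest match stripped from the front
front_greedy=${f##*.}  # longest (greedy) match stripped from the front
back_short=${f%.*}     # shortest match stripped from the back
back_greedy=${f%%.*}   # longest (greedy) match stripped from the back

echo "$front_short" "$front_greedy" "$back_short" "$back_greedy"

# Dry run: print the mv commands instead of executing them.
plan=$(for name in 0001.a.jpg 0001.b.jpg 0002.a.jpg; do
    d=${name%%.*}
    echo mv "$name" "$d/."
done)
echo "$plan"
```

Once the printed plan looks right, removing the `echo` in front of `mv` turns the dry run into the real thing.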
Yay! Right, there we go. And now, because I'm a bit
paranoid, I put that slash dot at the end, like this. This
is UNIX good practice. What this does is: back in the day
you could have like 20 files, you'd try to move them into a
place, and if you mistyped the place you just blatted all
the files into a new file called whatever the place was
called. And that's where the word Slashdot comes from: you
put the slash dot to guarantee it's a directory, because if
that doesn't exist the command will fail. So, a little bit
of history here. So now I'm going to just take that echo
away (oops), and I can go, and then tree . Look at that.
Thank you very much. So this is live. What else do I want
to do? How many minutes have I got? Excellent. So I want to
show you a few little things now. Managing things: it's
really good at managing things.
Let's have a look at disk space. So I can find out, for
instance (let's go here again): I've got a bunch of
directories and stuff here; how much space is each taking?
So du -h is in human terms (du is disk usage), and I can
say show me what's going into these. Oops, hello, what?
Sorry: du -s is a summary. There we go. So it says, right,
in bitbucket you've got 652K, code.google.com/apis/console,
and so on. And so I can start moving into things and
figuring out what's going on there. And df does the same
thing but at a file system level. So I can say df -h and it
says here's all your file systems. You'll see there's loads
of crap there; if you look here there are things like udev,
tmpfs, and I don't understand what those things are. So
what I want is for it just to tell me things that aren't
based on some kind of pseudo file system, tmpfs. I say -x
tmpfs, and it says, OK, here are the ones that you care
about. So now you'll see that on home I've got seven gigs
left, and /home/dan, it's kind of fun, that's mounted on
/home/dan/.Private. What does that mean? Well, let's find
out: it's got -T, big T, and it will tell me the types. Ah,
ecryptfs. So what I've got is an encrypted home partition
mounted on /home/dan, but the actual directory is
/home/dan/.Private. So that means that if I reboot the
machine it's not mounted any more, which is kind of fun. So
I can use df and du to figure out where stuff's going.
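
A sketch of the du and df steps on a throwaway directory. The directory names and sizes are made up; `-k` is used instead of `-h` so the sizes can be sorted numerically.

```shell
# Throwaway directory with a known amount of data in it.
space=$(mktemp -d)
mkdir "$space/big" "$space/small"
head -c 1048576 /dev/zero > "$space/big/blob"     # 1 MiB of data
echo hello > "$space/small/note"

# -s: one summary line per argument, -k: sizes in KiB (-h gives human units).
usage=$(cd "$space" && du -sk big small)
echo "$usage"

# Largest first: sort numerically on the size column, keep the name.
largest=$(echo "$usage" | sort -rn | head -n 1 | cut -f2)
echo "largest: $largest"

# File-system view of wherever that directory lives.
df -h "$space"
rm -rf "$space"
```

The `-x tmpfs` and `-T` options shown in the talk are available in GNU df; other df implementations may spell them differently or lack them.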
Processes: ps, pgrep, pkill. So I can say, well, there's a
command called htop, or top. top shows me what's going on
on my machine. It says right now there's not that much CPU
being used. You can see... oh, Mongo! Wow, how did you get
on my machine? It's like this everywhere, it's like a
virus. So I had no idea I was running Mongo. Who knew? So
that's fine, OK. And then there's a lovely thing called
htop, which is the same as that, and I can move across and
I can see things going on here; again, Mongo's doing that
thing. So now, how do I get the equivalent thing on the
command line? Well, again, I've got /proc, remember, so I
can figure out from /proc what resources are being used. I
can use pgrep; pgrep is like grep for processes, who knew?
Right, so pgrep mongo: there it is. Right, pgrep -fl mongo,
and it says, OK, I'm this process ID and this is the
command line I'm running with. So now I can say, brilliant,
what else is using the /etc file system? I can say fuser
/etc (oops, sudo fuser /etc). Now, no one's actually in the
/etc directory. fuser . What's going on here? OK, so it
says one process is using the current directory, and it's
process 30471, which is me. Oh look, there I am. So you can
find out what's going on a bit. lsof is your friend: lsof
tells you any open resources, open files, open sockets, all
that kind of thing. Oops.
And then files. What I just wanted to mention is things
like rsync. rsync's made of magic. OK, so most things on
UNIX are written in C; rsync's made of magic. No one knows
how rsync got there or how it works; it just magically
moves files around. And you can give it like terabytes of
files, put one new file in there, and it goes: that one.
Yeah? How did you know? I'm magic. Right, OK. So rsync
stands for remote sync, but you can just use rsync locally
for syncing up local directories. So rsync is a fantastic
deployment tool, for instance, because it is incremental:
it'll figure out what's changed. And you can make it
incremental bidirectionally, so if I move a file and use
the right rsync flags, it'll delete the stuff at the other
end as well as moving stuff across. So that's pretty cool.
Last thing I
wanted to look at, and I'm going to get there in time,
hurrah, is process redirection, which I think is one of my
favorite things about the shell. So say I've got a program,
a command, and it only takes files. So imagine cat... no,
that's not going to help, is it? Let's go somewhere that's
got files. cd github: who's going to have files? angular...
I don't even know what this is. Right, OK, what's in here?
README, there we go. cat README. So there's a README. So
cat takes files. Now, what if I wanted to treat the output
of a command like a file? OK, because everything is a file,
so surely I can use the output of a command as a file. And
I can say this thing here, if I wrap this... Remember you
saw this thing, dollar paren: this thing says run this
command and give me the output. This other thing says run
this command, hook it up to a file descriptor, and then
give me the path to that file descriptor under /dev/fd. So
it's a pseudo file that represents an open file descriptor
of a process that I'm about to pass to another process. Did
you get that? You're passing file handles around, wrapped
in a thing that looks like a file name, a file path. But
when the kernel unwraps the file path, when the open
command hits the file path, it says: oh, all right, this is
under /proc, in fd, I know what to do with this. You really
want this one? There it is. And it's really, really clever.
So I can do this, I can say
echo that, and it says, oh, here we go: /proc/self. So for
any process, ls -l /proc/self: oh, for me it's a link to
there. echo $$. Oh, why are those numbers different?
Anyone? Bingo: it was the ls process. When ls asked for
/proc/self it was handed a reference to its own process ID.
It's that clever. It doesn't go, you're running ls in a
shell. So for the purposes of this, /proc/self/fd/11 says,
well, I've opened file descriptor 11 on your cat thing. So
I can now say this: I can say less. Whoa. So if you look at
the bottom, it thinks I'm lessing a file called
/proc/self/fd/11; what I'm actually lessing is the output
of catting that README. So now what I've done is I've got
the README into another process. So you can do things like,
for instance, you could scp a file across, or at least cat
it; you could curl something straight off the interwebs. So
say curl http://google.com. Oh, there it is. Right, so grep
document, and "the document has moved". Oh, and I also got
some output from curl, which is kind of fun. Let me try and
get rid of that noise. Ah, what, --silent? Oh, just -s.
There we go. So I'm now grepping the output of a curl. grep
thinks it's got a file open; it doesn't need to know any
different. So you can do really quite sophisticated things,
bouncing around.
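
What the talk calls process redirection is usually documented as process substitution, written `<(command)` in bash, zsh and ksh. A minimal sketch, assuming bash is installed and a Linux-style /proc; the sample text is invented.

```shell
# <(...) is a bash/zsh/ksh feature rather than POSIX sh, so each demo is run
# through `bash -c` to behave the same whatever shell executes this script.

# 1. The shell replaces <(...) with a path that names an open file descriptor.
fd_path=$(bash -c 'echo <(true)')
echo "$fd_path"

# 2. grep is given what looks like a file name, but reads another
#    command's output through it.
hay_lines=$(bash -c 'grep -c hay <(printf "hay\nneedle\nhay\n")')
echo "$hay_lines"

# 3. /proc/self resolves to whichever process asks (Linux-specific):
#    here that is the process running readlink, not this shell.
self_pid=$(readlink /proc/self)
echo "shell is $$, readlink saw itself as $self_pid"
```

The descriptor number in the printed path is chosen by the shell, so it will not necessarily be the 11 seen in the demo.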
So, to sum up (oops, where are we?): to sum up, then,
everything is a file. OK, everything running is a process.
UNIX is incredibly good at managing things, at finding
things and at transforming things, if you start thinking of
everything as a file and as text. OK. And 25 years ago, if
you'd told me that the text editor that I opened on my
first day of computer science at Brunel University, a thing
called vi, would be the same editor I was doing a
presentation in, quite a lot of years later, I would have
probably thought you were some kind of