Fisher: Gonna talk to you
about Chrome internals. So I’m Darin Fisher. I’m a member
of the Chrome team. I’ve actually been with
the project since inception, back in 2006. And today is just
an opportunity to go through all of the stuff that’s
underneath the hood. And so there’ll be things
that you may have heard about before
if you’ve read the comic. But, you know,
I just wanted to cover as much ground as I could and discuss
all the different things that go into
Chrome’s architecture. So this was our guiding
principle from day one. We really wanted
to make sure that we were stressing a very
simple interface to Chrome. And, you know,
that would allow people to very easily and naturally get to a very fast browser,
a very powerful browser, leveraging this, you know,
powerful architecture, but they don’t see
or be burdened by the architecture. This sort of reminded us of the way’s
search experience was, and so here’s just
a little image of Chrome, very simple Chrome
that you’ve all seen in a little photo of– taken off of
one of the design docs of– for Chrome’s architecture. Just a little aside, this is actually rendered
using HTML. Actually thought I would
just quickly show this, because it’s interesting. It’s using
a WebKit gradient effect on a straight-off image. So it’s kind of neat that you can do all these
kinds of stuff in WebKit, which Chrome is based on. Okay, so here’s
a little quote that I’m not really sure
who to attribute it to, but back in 2006
it was something that we were all thinking
about at the time. “Modern browsers
really resemble co-operatively multi-tasked
operating systems of the past.” So what do I mean? You can think back to
older versions of Windows, older versions of Mac OS,
where if any process, any application
on your system decided to go into
an infinite loop or crashed, it could bring down your
whole entire operating system, and you’d lose
all your applications. And this was what the state
of browsers was back then. And it seemed
very unfortunate, especially as web applications
were getting larger and more complicated. I mean, oftentimes
you’d have the experience of going–composing an e-mail
and suddenly deciding, you know, I want
to go do a search, I want to go do something else, and branching off
to go get some information and venturing to a website
that maybe somehow unfortunately triggers
a browser bug, and then you completely lose
what you were working on in that other tab. This just seemed like a very
unfortunate situation to us, and we were just thinking,
you know, modern operating systems
solved this by separation of applications
into different processes. So couldn’t we exploit that? And the goals of– Speaking more about goals
for Chrome back in the day, beyond just a simple interface,
we really wanted to see what we could do
to really move the bar in terms of speed,
stability, and security. We felt like, as applications
were getting more complicated, you know,
speed was very important. As applications
were getting more complicated, it was more likely that you
could have instability issues causing problems, and so how could we solve
all these problems? Again, multi-process
architecture. If we could divide
the application over multiple processes,
you know, it’d be great
if you could have web apps each having their own thread,
and you get this if each web app
has its own process. Separate address spaces. If each web application
has its own address space, its own process,
then there’s some insulation in case one of the applications
happens to trigger a bug and cause a crash. Similarly, if your applications
are consuming a lot of memory and they’re all sharing
one giant heap together, you can have
performance problems. But with separate app processes
and separate address spaces, each one having
its own memory pools, you have a lot
of smaller memory pools, and better performance ensues. Security. So given that we were
actually leveraging operating system processes, and operating systems
already have the idea that a process might have
certain capabilities based on the user
associated with that process. Seems like we could leverage
the operating system’s support for taking away privileges
from a process or limiting the privileges
of a process, and thereby achieve a sandbox
to run web applications. And this would be great
because, well, we know that software
is just enormously complicated. And if there’s
any kind of bug, and that bug
allows an attacker to get code running with
the privileges of the user, then they can do things that would potentially
harm the user, ’cause they have
the power of the user. And so we really
liked the idea of there being
this belt and suspenders, this secondary level
of protection, the sandbox. And we get all this,
it’s all possible because we can divide
the application, the browser into multiple processes. Okay, but I mentioned before
speed. Speed is very important. And so in the early days
we were thinking it’s one thing
to divide the browser into multiple processes, and that allows
the applications to not stomp on one another and to achieve
good performance there, good scaling
across multiple cores. But we just want
the rendering engine to be very fast as well. So that informed our choice
of rendering engine, and we went with WebKit
because it was, well– We really wanted to– We knew that Chrome
was going to be open source, and we chose upon– took a look at all
the open source options and WebKit was just
really fast, and it has
a very small footprint. Back in ’06,
you could see that WebKit was starting to become
the popular choice amongst mobile browsers. And this all had to do
with, you know, just how good WebKit is. But there was one thing
that we saw as a big opportunity, and that was
JavaScript performance. Back in ’06, the JavaScript
engine in WebKit was a very straightforward,
simple interpreted JavaScript engine. And we had a team at Google
who recognized that there was
this great opportunity to apply some of
the more modern approaches to VM technology, VMs,
to build a JavaScript engine that was JIT-based and leveraged a lot
of other techniques to make
dynamic programs like– Dynamic programming languages,
like JavaScript, run so much faster. And the V8 team was very
successful, in my opinion. I mean, compared to where
we were in ’06, they’ve come
an order of magnitude, or one or two
orders of magnitude depending on what kinds of tests
you’re talking about. And that’s just outstanding. Don’t you just love
silent auto-update? I do. Okay, so under the hood. The major components
of Chrome. So this is sort of– gives you a big picture
of some of the major modules. And Chromium
is the source base from which Chrome is derived, Chrome being the application. And Chromium depends
on WebKit. I’ve mentioned
WebKit already and V8. And Skia,
Skia’s our graphics engine. This actually comes
from the Android project. It’s a 2-D graphics engine that’s highly optimized
for mobile environments. And so it really runs well
on a desktop system. And so
in the Chromium world, how these
are all glued together, Chromium embeds WebKit. WebKit renders to Skia. Chromium also renders
its own UI to Skia. And WebKit talks to V8
to run JavaScript. But Chromium also talks
directly to V8 for things
like proxy auto-config and other usages
of JavaScript in a browser besides just that
related to web content. So Chromium itself,
the code base, includes all the other stuff
that’s not WebKit, that’s not V8, not Skia, and other libraries
that we incorporate. But it includes
things like the UI, the tab strip,
and the Omnibox– these are all
native UI elements. We chose to go
with a native UI for Chrome because we really wanted
to make sure that we could get all of
the look and feel just right. We wanted to have a very
lightweight, fast browser. We wanted to have a browser
that was highly tuned for Windows
and then again for Mac and then again for Linux. And so for Windows,
we went about– we went about building
this native tab strip, this native Omnibox. And, however, not all
of our UI is native. The new tab page,
the downloads page, the history panel, these appear
in the tab contents area. And so they are naturally
HTML-based so that you get
the same look and feel that you would
for a web app. And this turned out
to work very well for us. So basically,
the amount of native code is minimized to just that of the frame
or the window manager. So that’s pretty much
all I’m gonna talk about as far as UI is concerned. The other main components
in Chromium though are multi-process
architecture. That’s sort of the thing that–
the goop that binds this whole collection
of processes together. And the history system
is fairly complicated, based around SQLite, and there’s
a full text index there. The network stack
runs in the main process. I’ll actually talk more
about processes in detail later. And then there’s a sandbox
piece, a module within
the Chromium source base. So moving on. Multi-process architecture. Just getting into this
a little bit more. So I’ve been talking
about processes. So this is what I mean. This picture here shows this main coordinating
browser process that all of the child processes
talk to. And we have a sequence–
a set of different renderers. Each of the rendering processes are the processes
that embed WebKit to render web pages, and they talk back
to the browser for things like IO. So because of the sandbox, the renderers
actually don’t have access to direct–direct access
to the system, and they must proxy through
the browser for everything. So this is sort of
a hub and spoke design. I’ve shown some other
processes here– worker processes,
plug-in processes. So worker processes are
the new web workers coming of– that you start to see
various browsers implementing. These allow for a web page
to have a background thread or a background set of threads
that do work. And there’s
a very asynchronous way of communicating to those
background threads, which lends itself
very naturally to having those workers
be out of process. And there can be
many of them. Plug-ins–Plug-ins actually
run outside the sandbox in their own process. So when I say plug-ins here,
I don’t mean, like, Firefox extensions
or things like that. I mean the traditional
browser plug-ins like Flash, Java,
Silverlight, and so on that use things like NPAPI
or ActiveX. And so for these–
for each of these plug-ins, we load them
in their own process, so there’s a Flash process
and a Java process and so on. Now I’ve made
some mention here of trust, untrusted,
and so on. So it’s important to note that we really would like
to have sandboxed plug-ins. We would have liked to have
included them in the sandbox, but because they are
basically already software that’s on the system,
they have dependencies on being able to actually
access the system directly. They just can’t be
included in the sandbox. A good example is Flash. It actually needs to be able
to auto-update itself. Well, if we denied its ability
to access your system, it wouldn’t be able
to update itself. So it has to run
outside of the sandbox. But it still runs
in its own process so that we get some insulation
from, you know, page faults and so on. So inter-process
communication, just a few words
about that. We decided early on
that we really– You know, there were
many applications that you could point to that
were multi-process already, on your Windows desktop
or wherever. And a lot of times,
performance can– It’s very easy to build
a multi-process application that doesn’t perform well. And one of the key things
we really went for was this so-called
apartment model, where we want to make sure
that individual processes or really individual threads really can run
as independently as possible and rely on more of
an asynchronous communication, locking between
the processes, locking between the threads
as minimally as possible. So we went for an asynchronous
communication system. Early on, we tried using
existing systems on Windows, like Windows messages that– You can do asynchronous
Windows messages. We looked at using COM. All these systems
were fairly complicated, and in order to achieve– In order to get the kind
of control we wanted over the IPC,
we went with named pipes. Plus named pipes
are much more portable. And we knew we would
eventually be bringing this to Mac and Linux,
and so it made sense to go with named pipes. We do have some support
for limited blocking calls, because that’s required
for some of the– to support some of
the web interfaces that we need to support. And I can talk
about that more later. There’s some use
of shared memory, but we basically
don’t use shared memory unless it really
gives a big benefit, because it just adds
complexity to the system. So basically you can imagine,
if you will, these different processes that are basically having
a stream of events asynchronously flowing
to one process or the other. And it’s sort of like
a big interconnected network of data being passed around. And we’re trying
to keep this thing running as smoothly as possible
and avoiding hiccups. That’s basically the goal of the inter-process
communication. This diagram here just shows that in order to keep things
running smoothly, we dedicate a thread
in each process to run and service the IPC. That way, if the UI thread,
T(UI), in the browser, were to go off
and produce a whole bunch of IPCs that he wants
to send to the renderer but then not yield control
to the system that actually sends
the IPCs, that could be bad. So instead,
there’s a background thread who picks up that queue
of outbound requests and sends them along. Meanwhile, the browser’s
doing other things, possibly expensive things,
but the IPC is flowing. And then the renderer,
the same deal. JavaScript application could be
consuming T(WebKit), but we still have
an IO thread that’s pumping events
and doing things and keeping things
running smoothly. That’s just sort of
a bird’s-eye view of what the inter-process
communication looks like. Process assignments. What do I mean by this? Given that Chrome
has many rendering processes and we really would love,
ideally, to be able to assign
a single web application to a single process,
thereby giving the best separation
between web applications. There’s some realities
to the web that force us to group
some applications or some– What do I mean by applications?
Web pages. Some web pages have to be
grouped together in processes. And then there’s
opportunities to decide when we should actually
create a new process. And coming up with
a good formula here is a little tricky, ’cause there’s
some realistic limitations. Like we don’t want to have
too many processes. At some point,
you start to pay a cost by having many processes. So we have a process limit. If that limit’s reached,
we’ll start reusing processes. If there’s a potential
script connection between two web pages, well, they need to be
in the same process. We could have built
the system so that there was
a complex way of– a bridge or so
between the script running in one page
and the script in another page, but that seemed like it would
be overly complicated and potentially lead to a lot
of performance problems. So instead we thought,
well, a web application– Well, if we think
of a web application that might have many pages as probably sharing
script connections, then it kind of makes sense
for those many pages to all be grouped
in the same process. And so this idea of just
looking at any page that’s opened from WebKit
as potentially being a page that should be
in the same process, that seemed to make sense. And it works out pretty well
for a lot of applications. You’d think,
interestingly enough, Target blank–
I put this up here ’cause you’d think that when
a user clicks on target blank, and that means
open a new window, that there should be
no script connection between that and the page
that the link was clicked on. Well, it turns out that there is
a very real script connection. When that new page
is opened, it’s actually got
a dot opener property, and the dot opener property
allows it to see the guy– the page from which
he was opened. Now that’s okay. So okay, well,
then in that case, if it’s the same origin,
you know, ’cause JavaScript allows
scripting provided origin A
and origin B are the same, then they–
they then, you know– Okay, we would only group them
in the same process if– if the new page
is of the same origin. And that’s actually
not good enough, because it might be that
you are clicking a link from– opening a new window
from origin A to origin B, but origin B
might have a sub frame that’s also on origin A. And origin–
the page over here can actually find
that frame by name. And so sort of after the fact,
after a window’s created, there might be
a script connection that becomes established. So we need to keep those
in the same process. Okay, but then there’s
some heuristics to try to get out of
these sort of situations. Turns out that a lot
of web applications use
to navigate from the application
to a new page. For example, Gmail,
when you click a link, it wants to take you
to a new page for that link. And the thing though
is that Gmail really has no interest in there
being a script connection between Gmail
and that new page. It even goes so far as
to set the opener property of that new page
to null before navigating
that new page to the URL
that you clicked on. Those kinds
of tricks are done to sort of sever
the script connection, at least as far as
that rendering process would be concerned. Well, we sort of use that
as a clue, as a heuristic to determine that–
Oh, I see. There’s no script connection
because it’s been removed. We can just fork that off
into another process. And so that’s the kind
of heuristic that we’ve applied to get sort of a good
distribution of processes when, say, Gmail is the source
of all your new tabs, because you’re
clicking links there, or other applications
like Gmail. All right, another topic
about process assignments. When you are just typing
a URL in the location bar, you’re saying,
“Replace the current tab with new contents
from this new location.” And we recognize
that if what you’re typing is a URL with a new domain, effectively it’s
an opportunity for us to say, “It’s like closing the tab
and opening a new tab.” And so what you achieve there
is an opportunity to switch processes
out from under the system without the user
really being aware that there was
a process switch. And this becomes a really
natural and nice point for garbage collection. So frequently in Chrome, you’re just typing
in the location bar, you go to a–
enter a new URL up there, and unbeknownst to you, we’re actually swapping
processes underneath. Oh, yes,
process per domain. So some people
might have heard about the research project,
the Gazelle browser, which basically
implemented this idea of actually having a separate
rendering process per domain, rather than a separate
rendering process per tab. This was something that
we looked at very hard as well, because it seems
very attractive. And our conclusions
were very similar to theirs– that you suffer some– It turns out, you suffer
some web compatibility hit for trying to make
this kind of change. But it still seems
so attractive. Wouldn’t it be nice
if you could have one rendering engine
dedicated for one domain? And then you could
start to apply some of these
multi-process benefits to actual domains. You could say, “All the data
associated to would only be available
to that rendering process.” And you wouldn’t be able
to ever see data for That’s sort of the holy grail
of multi-process architecture. But there’s some real
web compat challenges that we face with that. For example,
third-party cookies are a big problem. It turns out the web page
can dabble with cookies that are not its own. All right, so sandbox. So I’ve been talking
about the sandbox, but here’s a little more
detail about it. Primary goal of the sandbox was really all about protecting
the user against malware. There’s so many things
that you might wish to sandbox. Like you might wish to,
as I was just saying, you might wish to protect
one origin from another origin– from But that wasn’t something
we strove to achieve initially. The first goal of the sandbox was really just
as a catchall to protect the user
against malware. If somehow there was a bug exploited
in the rendering engine, it should not turn itself
into a vehicle to allow people
to distribute malware. So Chrome
has several features, several anti-malware features. It has the sandbox,
of course, but it also has
the safe browsing feature, much like what you find
in other browsers, where once you venture
onto a URL or a page that contains content
known to be bad, we’ll show an interstitial,
an error, allowing the user
to go through or not. That’s sort of a first cut. You know, obviously
if we don’t know– If the service
that you’re using to supply that data
doesn’t know about the bad URL, it’s not gonna work. So you need some kind
of extra protection. It was very interesting–
a very interesting story. One of our engineers
was tracking down a crash that he saw in Chrome
in the renderer, and he was like,
“Oh, it’s hard to reproduce–” It’s hard to debug
when he has the sandbox engaged. So he thought, “I’ll just
take off the sandbox, and then I’ll load this page up
in my debugger, and go to town
trying to figure it out.” Well, it turns out that
that was actually a site serving malware
that exploited his system and corrupted his XP. And he was like, “Oh, crap,
now I have to re-image,” just ’cause he was trying
to debug this problem. So sandbox
is a really nice tool. What are we trying
to restrict? Well, trying to restrict the process’s access
to the file system and network and other kinds of devices
on the system. We’re also trying
to restrict his access to the windowing system. So he shouldn’t be able to mess
around with your desktop, shouldn’t be able
to mess around with any of the hardware,
your keyboard or your mouse and so on. The mechanisms
to achieve this– Well, as I said, every process
has some associated user, and the user has
a certain set of capabilities. Under Windows, this is
represented by a user token. And we can strip
all of the rights off of that user token,
thereby denying things like system access
and file system access. And there’s other techniques
in Windows, including job objects
that can be used to further restrict
the capabilities of a process. And even just running it
on a separate virtual desktop, this is a great way
to sort of limit the process’s ability
to get access to some of the input devices. Okay, but sandbox
doesn’t actually– A browser actually needs
to be able to load file URLs. So how do we achieve that
given that sandbox doesn’t allow you
to access file URLs? Well, two big examples
of this are A,
loading file colon URLs, and the other one is actually
being able to upload a file that is specified
in a web form. So to deal with file uploads,
what we do is when WebKit
wants to show a file picker, it actually just asks
the browser, “Please show a file picker.” Or a rendering process
asks the browser, “Please show a file picker.” The browser
shows a file picker. The user makes their selection. And then what we do
before we return that result to the rendering engine
is we put it on a whitelist associated
with that rendering engine. And now that rendering engine
is allowed to say that he wants
to upload that file. It’s pretty simple. File colon URLs, well,
what is this? This is, you know,
the rendering engine being able to load
any file URL, because if you go to some
foo.html on your hard drive, it can load images from
anywhere else on the hard drive. To make that work,
what we do is we actually just dedicate a separate
process for file colon URLs. So we’re never mixing
web content and file colon URLs in the same process. For example, if you go
into the location bar and you type file colon, and you’re sitting
on a web page somewhere, what we’ll actually do
is leverage that technique I mentioned before of swapping
the rendering engine out underneath the hood, so that really
what we’re doing is loading it
in a fresh rendering engine that is now just dedicated
to local files. Okay, so what isn’t protected? I talked a little bit before
about, like, we’re not separating origins. So what that means
is that cookies– A rendering engine can read all of your cookies
from anywhere. So if somehow there was
a bug in the rendering engine, that guy could then
go and ask for cookies from any domain. And this is just–this is
something we have to support because of document.cookie. There’s a JavaScript API
to allow the web page to read your cookies. The one exception here
is the HTTP-only cookies. We actually don’t have to allow
the renderer to see those. He can make HTTP requests, and then the network stack
running in the browser will add those
HTTP-only cookies. Passwords. Passwords,
much like cookies, it turns out that the web page
can read the passwords. So if you have an input type
equals password, and the user enters
a value there, it’s actually supported
by the web that the web page
should be able to actually read that value out
and do something with it. It’s just the way the input type
equals password, it’s just a mechanism
for having some sort of secret that’s entered into
the web page. And so this generic mechanism
allows web pages to do things like
do client-side authentication. So it’s not
necessarily the case that this password field
corresponds to data that will be in a form post. The reason why
that’s interesting is ’cause you might
imagine a system where the browser returns
a dummy password to the web page. And then
when the dummy password appears in the form post, the browser transparently
replaces it with the real password. That’s a very clever way
to avoid exposing the real password
to the web page. But again,
because the web page has the power
to read that value and do some computation on it, it’s–there’s a potential– there’s a huge web
compatibility problem there. So we end up
exposing the passwords to the rendering engine. Other things like
HTML 5 database, local store, session store,
they’re much like cookies– same kind of deal. They’re origin-based, and the renderer
gets access to them. Cross-site attacks,
user data in the cloud, that kind of stuff
is not what the sandbox is trying to protect against. You could imagine
a better sandbox that did do that kind
of stuff. Okay, so moving on
to rendering in a sandbox. So given that we have, like,
almost no access to the system, how do we render? Normally,
a web rendering engine is designed to render
to a native widget, like an HWND
or an NSView or an X Window. And that kind of thing
doesn’t work in a sandboxed renderer
because we literally are trying to deny
access to HWNDs. HWNDs are in
the Windows desktop. We’re trying to deny that. So instead, very simply,
render to a bitmap. Send the bitmap
to the browser process. The browser puts that bitmap
on the screen. So it’s like
a glorified image viewer. Complexities. Well, the rendering engine can’t get away without
using some system calls. For example, we need a way
to render fonts. And on Windows, to get
the best sort of fidelity and the best sort of
native look and feel, you really need to use
the Windows GDI APIs to render fonts. Well,
an interesting story here. This turned out to be
really challenging, ’cause once you take away
the rendering engine’s ability to access the file system, well, what this actually means
is that you can’t load fonts. And all the font APIs,
transparently under the hood, will load fonts. Interestingly enough,
they don’t do it in user land. They do it in kernel mode. But they use the token of the process
that was calling them to actually decide
whether or not the kernel should load
that font. Well, our token doesn’t have
those privileges. So it turns out that Chrome
implements a glorious hack where, when we encounter
a font API call that fails, we pause and let the browser
repeat the API and load the font, thereby populating the kernel’s
cache of all fonts, and then we repeat the exercise
of calling that API in the renderer and this time
hope it succeeds because it hits some cache that the kernel maintains. So fascinating complexities
just trying to deal with some OS APIs. Other things
that we really wanted to try to achieve
with this– We really wanted to make sure
that there’s no way that a hung renderer
can screw up the browser. This really impacts
the rendering model, which I’ll talk about
a little more in detail later. But fundamentally, you don’t
want to be in a situation where the renderer
is acquiring a lock that the browser
then has to also acquire, because the renderer
might never release it. And that–
As an end result, you know, a bad renderer might
bring down the whole browser, and we just really don’t want
to be in that situation. And of course
it needs to be really fast. So speed is very important and somehow there should not
be any situations where the renderer
can lock up the browser. So painting and scrolling, lock free
painting and scrolling. This is a system
that we implement of maintaining
a backingstore in the browser. A recent Pixmap
of what the tab looks like. Very simple idea. And then the renderer, whenever he needs
to update a region, he sends a bitmap
over shared memory to the browser
and the browser blits that into its backingstore and then blits that result
onto the screen. Very straightforward. And then when the browser is finished
putting pixels on the screen, he acknowledges
to the renderer, “Hey, look, and now
it’s a good time to produce another bitmap.” So this turns out to work
fairly well, except that, you know,
we recognize that you end up
in this sort of staggering kind of situation where the renderer paints,
then the browser paints, then the renderer paints,
browser paints. And in order to get
better performance, we allow the renderer
to sort of prefetch or do one additional render
ahead of the next– in parallel
to the browser painting. So if you’re on
a multi-core system, you would be able
to light up both CPUs, one of them is the renderer producing the next
rendered output, and the other one
is the browser putting the pixels
from the previous rendering on the screen. Scrolling works
very similarly. In order to achieve
very good scrolling performance, you don’t want to just
repaint the whole thing. You want to tell
the windowing system, “Hey, look,
move your pixels down and backfill
this exposed region.” So we send commands
from the renderer to scroll the backingstore
and fill the exposed region, and then the browser
repeats the exercise of commanding
the windowing system to scroll the pixels and then
backfill the exposed region. And again, it acknowledges
once that’s been done. Actually one more comment
about painting. So whenever you need
to resize the browser window, because we have a backingstore
of the old rendering, we don’t necessarily have–
that backingstore is not the right size
when you resize. And so it turns out
to be necessary to play some games to try
to get good performance there, because we’re in a situation
where a resize happened and we want to update
the backingstore, and we obviously
don’t have it yet. But we have asked the renderer
to produce the data, and we’re just waiting
for the renderer to produce the data,
but Windows has asked us to paint right now,
because it just resized. So we do implement
a little bit of a pause to allow the renderer
to produce pixels. If it can produce them
fast enough, then on resize we will
paint the correct pixels and not the old pixels. But it’s fairly common
in Chrome when you use it, if you resize very quickly,
you will see a little gutter, and that’s
what’s going on there. Resource loading. So as I mentioned before, the browser’s this proxy
for all the IO. He takes great care
to restrict the types of URLs that can be loaded
based on protocols. So if a rendering process
was not asked by the browser to load a file URL,
then the rendering process shall not request file URLs. And if he does,
they’re denied. Chrome colon URLs
are like file colon URLs in that they’re trusted. Chrome colon URLs
are used to load things like the new tab page,
the downloads panel, and other web content
that part of Chrome’s UI. This browser network code
will actually perform all the safe browsing checks
I talked about. He’ll do all kinds of things
like vending the cookies to the renderers
and managing all the data associated
with resource loading. Finally, before WebKit
sees any data, the browser’s actually performed
a lot of different things. He’s handled
HTTP authentication if prompting was necessary. He’s done all
the SSL verification and potentially put up UI to ask users if they’re– You know, put a warning dialog
that’s like, “This, you know– This certificate
is not necessarily valid.” He’s done all these things by the time
WebKit sees any data. He’s also potentially
handled downloads, so content sniffing
happens in the browser. MIME detection happens there
if necessary and so on. One of the things that’s
interesting about downloads is that if you imagine
a scenario where you were
returning data to WebKit or to the rendering process, and that process
was to determine whether or not the data
should be treated as a download, well, you would be
in an unfortunate situation of having to echo that data
back to the browser so that he could
save it to disk. And so moving the detection
into the browser allows for better performance. All the decisions
happen there, and the browser can decide
at that time to send the data to disk
instead of to the renderer. The history system. So…much like painting, it was important to us
that we could achieve a good way to manage
visited links without locking. We wanted to make sure that– so that the renderer
could very quickly check whether or not
a link was visited so it could color it
as visited or not, and do so in a way
that didn’t require acquiring any kind of lock. So there’s sort of simple idea
of shared memory containing a bitmap that’s treated like
a hash table. We apply a crypto hash
to the URL, take 64 bits of that,
index this bitmap, and then see if
the link’s visited or not. And then the browser, whenever he needs
to update this thing, he’s either setting the bits, or if he finds that
he needs to grow the table, he creates a whole new table, sends the new table
down to the renderer. The renderers
drop the old table and then pick up
the new one. Works great
with the minor exception that it causes us to fail
one of the Acid3 tests. This is why Chrome gets
that link test error in Acid3, because it turns out that this
is a very asynchronous model, where the browser’s the only one who writes
to the visited link table. The renderer’s
the one who reads, but this Acid3 test
actually requires the renderer to have visited a link
and be able to check right away that this link
is actually visited. But we’re working
on a fix. So after a page loads, data is actually extracted
from the page so that it can be sent
into the full text index, and that’s what drives
things like the Omnibox and the whole history search
mechanism in Chrome. And new tab is populated
with thumbnails. Those thumbnails are captured
at this time and so on. And this data is collected,
sent up to the browser. All this history management
happens up there. Plug-ins. We support
Netscape-style plug-ins and even ActiveX controls
through a shim. I’ll talk a little bit more
about that. Supporting plug-ins
turns out to be probably one of the hardest
things about doing Chrome, because we knew
they could not run in the sandbox renderer because they required
a lot of privileges, and we knew that we wanted
to have a sandbox that was very aggressive. And so that meant
pushing the plug-in outside of
the rendering engine, which meant taking an API
that was synchronous and designed to run
inside of a rendering engine and forcing it out of process. This turned out to be
very complicated. But what we tried to do here by allocating a single
plug-in process per plug-in was to give plug-ins sort of
this environment to run in that looked to them much like
an ordinary browser environment. So unbeknownst to them,
hopefully, they don’t realize
that they’re actually in a separate process. And this works– Through a lot of
sweat, tears, and toil, we’ve gotten this to work. So there’s two types
of plug-ins, two modes of rendering
for plug-ins. One is called windowless
and one is windowed. Windowed means that
you actually get– the plug-in actually
has its own native widget, like its own hWnd. And then once it has
its own hWnd, it has its own
render loop basically. And it has a lot of control
and access to the system. But we’re running
that hWnd out of process, and so we are managing
a windowing hierarchy that spans processes, and Windows,
outside of our control, will be doing IPCs
between those windows, and that leads to a lot
of complexity. So we do
some interesting tricks to try to minimize
the synchronous communication that Windows does
underneath the hood so that scrolling a page
with plug-ins will actually perform okay. I think we still have
room to improve here, but it’s gotten
a lot better since the Chrome 1 days. Or I should say
since our initial launch. Caching the rendering
of windowless plug-ins. What we do when there’s
a windowless plug-in– This is a plug-in type that has no hWnd
associated with it, no window. Instead the rendering engine
just asks it, “Please paint
into this buffer.” And it turns out
that windowless plug-ins do something interesting. They actually
do the compositing inside the plug-in. What this means is– And they can be
in the Z-order of CSS. And so what this means
is that as WebKit is painting and it encounters a plug-in, it sort of stops,
asks the plug-in, “Now please draw your pixels,” and then WebKit
continues painting on top. So you get
this interesting stack. And so if we were
trying to achieve, as I mentioned earlier on,
good asynchronous communication between our processes
and asynchronous separation, here, rendering
windowless plug-ins, is a case
where we don’t have that. So our solution
was to keep a cache of what the windowless
plug-in rendered last time in the rendering process, so that when we’re painting
in WebKit, he can quickly just paint
from that cached representation of the plug-in. And once we implemented this,
we got much better performance for windowless plug-ins. Of course you suffer
a frame rate hit for windowless animations, but overall the results
are much better. Another big challenge
with all of this is porting, because NPAPI is not
a platform-independent API. It’s really just glue,
bridging glue, binding the browser
to other native APIs for rendering. And so the work required
to support Mac and Linux is of similar magnitude
to the work that was required
to support Windows, and it’s very challenging. But it’s coming along. The current Chromium
Mac and Linux builds do not yet have plug-ins
enabled for this reason, but they’re in development and can be enabled
through a switch, I believe. So I want to talk
a little bit about WebKit, just briefly about WebKit. So WebKit,
for those who don’t know, is comprised of,
what I’d say, is about three major modules. There’s JavaScriptCore, the WebKit’s
JavaScript engine. This is obviously
what Chrome doesn’t use. We use V8 instead. WebCore represents
all of the code to do the HTML, CSS,
and DOM rendering. SVG is at this layer
and other features. A lot of the new
APIs, database and so on, are at this layer. And then there’s
the WebKit layer of WebKit, which is the API layer. And we don’t use this because
the layers that exist here in the WebKit repository are things like
a COM API to WebKit, an Objective-C API
to WebKit, a GTK API to WebKit, a Qt API to WebKit, a wxWidgets API to WebKit. None of those
are really appropriate for our use case where we need
to run the renderer in this insulated world
in the sandbox where it really doesn’t
have access to the system. And so any native
toolkit you can imagine, or even any
cross-platform toolkit that, under the hood,
really deals in native widgets, just wouldn’t work. So, for us,
we were not able to use the existing embedding APIs, but we had to have our own. And so we’re
building our own that we will be putting into
the WebKit repository in good time. So if you were to look
into the WebCore reposit– WebCore directory today, you would see some
interesting pound defines. Things like
PLATFORM(CHROMIUM), which represents changes
that we made to WebKit to support
Chromium-specific things. PLATFORM(SKIA), again for Skia-specific
things. Basically WebCore
has a graphics layer where you are to implement
a graphics context. And so we have
a Skia implementation of graphics contexts. USE(V8) is the pound define
to select V8. The V8 bindings
all live inside of WebCore. Mostly. WebKit versions. So when we shipped
Chrome 1, when we did
our initial launch, we were very nervous
about our selection of the version of WebKit. We wanted to make sure
that we didn’t inadvertently introduce web compatibility
issues. So Safari 3 had shipped
earlier in the year, and we chose to continue
shipping WebKit based on the version of WebKit
that Safari 3 shipped, mostly just to achieve
compatibility. We didn’t want to worry about
being on a different WebKit that had changed. Again for Chrome 2, we’re pretty lined up
with Safari 4’s WebKit. I think we’ve taken
a slightly older– a slightly newer version
of WebKit than they shipped. And one of the big hurdles
for Chrome 2 was actually moving
from this older WebKit to the newer WebKit. And so that has a lot to do with what I want
to talk about next. It’s WebKit development. So members
of the Chromium team are very active
in the WebKit community now. We have a number of reviewers,
a number of contributors, and we’re just trying
to increase our presence and increase our contribution
to the WebKit project. So we believe strongly
in WebKit being– We’re very happy
to have used WebKit. We’re just very much enjoying
all the things it does for us, and we want to continue
to contribute to it. I say status: unforked. So when we initially launched, because we were on
an older version of WebKit and we needed
to fix problems, we accumulated a large number
of forks to WebKit. And indeed to even
support V8 in WebKit, we needed to make
modifications. So in order to get
to a point where we could actually work
directly on WebKit and be a first-class citizen
in the WebKit world, we needed to unfork. And so we went through
this massive, I call massive, undertaking to sort of push all
of our changes into WebKit and refactor things so that
our changes could live there and finally end up in a world
where we’re able to work with Chrome,
develop Chrome, on the tip of tree
of WebKit alongside of other developers who are developing
tip of tree WebKit using Safari. So our focus of WebKit
development going forward is, as I mentioned before, we want to establish
a WebKit API for Chromium. Our goals here
are very simple. We want to build a very simple
C++ API to WebKit that is not dependent
on any other toolkit other than WebCore,
all right? That’s basically just about
a very thin layer to WebKit so that if anybody else
wants to embed WebKit, they have
an easy path forward. It’s not a matter of picking
a certain flavor of toolkit that is compatible
with your world, but here,
suppose you just want a simple API
that’s not particular, you could use this as well. We have a lot of folks
working on the open web platform, HTML 5, et cetera. Web workers
is included in that and a number of other things, like the video feature
that was just recently launched in the Chrome developer
channel. And of course
we’re very interested in improving web compat. I think web compat is probably
one of the number one challenges for any fledgling browser, certainly for
any rendering engine that doesn’t have
the kind of market share that MSHTML has
or Gecko has. And so the things
that we can do here to make a difference
I think would go a long way. I think WebKit is fortunately
getting to the point now where it’s– you know, it’s approaching
10% market share if you add up all the browsers,
mobile included, that embed WebKit. And so once–
my experience with Firefox tells me that, you know,
once you get to that 10% point, that’s when people
don’t have a choice but to care. Because if one out of ten
visitors to their website can’t view it or can’t purchase something
off of their site, well, then they’ll feel that in a financial manner
at least. And so the hope is that
once you reach that threshold, suddenly it’ll be
a snowball effect and people will care
and people will– and compatibility problems
will disappear faster. And of course we continue
to be very interested in how we can
improve performance. Performance is– Speed is important to the whole essence
of Chrome. Open web platform. Just a few
more notes about that, some of the things
that are in progress. Audio/video, well,
you saw the recent release. And we’re very excited
about that. There’s lots more
that can be done to improve HTML 5
video support in Chrome. So we’ll be working on that. Application cache,
database, local storage,
session storage. These are all features about
enabling offline applications. So very excited to adopt
the same APIs as all the other
major browsers. Some of these things are
already enabled in WebKit. So you might say, “Well, how come Chrome
doesn’t have them already? Safari 4 has it.” And, you know,
it turns out that when you throw a sandbox
around a rendering engine and you divide it up
into multiple processes, some of these things
become much more complicated to support. What do these have in common? Well, application cache,
database, local storage, they’re all touching
the disk. So there’s file system access. File system access
that needs to be proxied or managed in some way so that these
rendering processes can securely access
these APIs. So these are all
in development. And the way we tend to develop
new features in Chrome is that we put them
into the main line but behind a command line flag so that they’re not
actually enabled by default until they’re ready. But by having them
in the main line, it means that as Dev Channel
releases go out, Beta channel releases go out, or stable pushes even happen,
these features are there, and people who are interested can set
the command line option to turn them on, try them out,
and give us feedback. So we don’t have
to require people to do custom builds in order
to try out new features and give us feedback. They can just take
the standard build, try it out,
give us feedback. And this is huge for supporting
a community of testers. Notifications
is a new thing that has a lot of interest. It’s all about trying
to make it possible for web applications
to do better notifications to the user,
better than window.alert. Window.alert
is very, you know, a very unfortunate API. It sure would be nice
if Calendar could have a less annoying,
a less obtrusive way of notifying the user that
there is a meeting coming up, but a notification
that is still effective at getting your attention. So that’s the kind of stuff
that we’re working on. And of course web workers
I talked about before. There’s this idea
of shared workers, which is really interesting
that’s still being worked out, details are being worked out,
but sure would be– It would be very powerful
if you could have a worker that lives on beyond– Well, I should back up
and make sure everybody’s familiar with workers. This is the idea
of background threads that you can run
JavaScript on. A dedicated worker
is one that a web page just creates for itself. So if it wants
a background thread, it can create
a dedicated worker. When the page goes away,
the dedicated worker goes away. Persistent workers. Well, there the worker
can live on a little longer. But shared workers are kind of
like persistent workers, but different
in that they have a name. And you can find them lazily. You can find them
and connect to them. And so it could give
web applications a very powerful way
to have a context, a JavaScript context that’s hidden
and off to the side that they can
later connect to. So you can imagine
a web application might run a portion
of its back end there in a shared worker and let the actual web pages
just be front ends that render the data that
that shared worker manages. There’s a lot of really
interesting ideas in HTML 5 and the open web platform that we’re very excited
to be implementing. So I just want to briefly
talk about the network stack since I mentioned before that that’s
a very important element of browser architecture. Making a better wheel. So one wonders, why go and invent
another networking stack? There’s so many out there. Indeed, when we first
started the project, we thought, well,
let’s not do this– let’s not write this code
if we don’t have to. In fact, can’t we just use
the networking stacks that ship with
the various operating systems that we wish to target? So on Windows,
the natural choice was probably Wininet. That’s the networking stack under the hood
of Internet Explorer. And there’s an API to that
that’s well established, and, well, let’s use that. And we started
down that path. And interestingly enough,
we learned that we really needed
to have control over our own web cache
and our own cookie store. Okay, so WinINet,
you tell it, “Don’t use your cookies.
We’ll supply our own.” Turning off the cache though turned out to be
kind of difficult. Well, we wanted to be able
to turn off the cache because we needed to be able
to support incognito mode. We also wanted to be able
to turn off the cache because if the cache
contained data that was fetched
by Internet Explorer, then it might be data
specific to Internet Explorer’s user agent string. And so that
could be a problem. So we really are in
a situation where we have to manage
our own cache. And it turns out
that there was no way to tell Wininet,
“Hey, look, I don’t want you to supply
data from your cache.” And so we then switched
to WinHTTP, which is another library
that Microsoft ships that just provides
the HTTP layer. You bring your own cache.
You bring your own cookies. Okay, that sounds great. And that’s actually what
we shipped with in Chrome 1. But, you know,
we recognized that there’s– we were missing opportunities
to improve performance, missing opportunities
to improve this layer and even to fix
certain bugs. And so we set about
developing our own library, and that now lives
in src/net/http/ in the Chromium code base. And that’s what shipped
in Chrome 2. So DNS prefetching is something
we shipped in Chrome 1 that you’ve probably heard about,
which is all about predicting which host names
you might visit so that we can get
the DNS results started earlier. This has a very measurable
impact on performance. And we’ve been spending
a lot of energy– You know, whenever anybody
checks the little thing to help supply anonymous
usage data to Google to make Google Chrome better, that anonymous data
is used to help support tailoring these kinds
of algorithms, understanding
when this is working, when it’s not working, and various other aspects
of Chrome as well. So in development,
there’s lots of things still to be done
for the new network stack– feature parity. And there’s things missing
like SOCKS support, IPv6 literals. Regular IPv6,
where a host name resolves to IPv6, would work fine, but if you actually use
an IPv6 literal, there’s some little,
subtle bugs. We’re working out
all those kinks. Those are things that are happening
for future releases. Other cool new features
that might help– that will have a big impact
on performance. Sparse caching. So video support
really demands the idea that you should be able
to advance the movie to a certain location and start downloading
from there and cache all that data. Well, we want to still
be able to jump back to the old location
and hit the cache again. Sparse caching is gonna help
with this a lot. This will also improve the way
Acrobat Reader runs inside of Chrome,
because Acrobat Reader, when you advance to a page,
like the 27th page, Acrobat Reader
will ask the browser, “Please fetch me the data
at this range.” And so we want to be able
to cache that. Other browsers don’t cache
the ranges like that because managing
a sparse cache is actually
pretty complicated. There’s a bunch of other ideas
we have that we’re brewing and trying to actually
test out. Looking at ways
that we can actually employ HTTP pipelining
in a manner that’s safe. Looking at ways
that we can improve the establishment
of TCP connections, ’cause these are– TCP connection establishment
is a point at which you can really lose
a lot of time. And anything you can do
to kinda like hide that time or avoid paying that cost,
you know, when the user actually tries to go somewhere
would be beneficial. But of course
we have to be very careful not to put too much burden
on servers and things like that. So parallel proxy auto-config. Well, if you’re ever using
proxy auto-config, you know that performance
can be a real problem with it. And so all of us at Google
use a proxy auto-config, and so we feel this pain
every day, and so we really want
to do something about it. And hopefully it benefits
other people as well. And that’s it. I want to open
the floor to questions. And let me know
what you’re interested in. man: Hi. As a script developer
in JavaScript, I’m not that familiar
with the Chrome yet. When I write
an XMLHttpRequest, I can do it synchronous
or asynchronous. And the synchronous
is usually blocking based on IE. The new Firefox, I realized,
it’s non-blocking. Is it the same with Chrome? Fisher:
So it turns out that that’s a very interesting
thing you mentioned. I’m actually the developer
who implemented Firefox’s behavior. And I kind of regret it,
and I’ll explain why. Because JavaScript
has the assumption of run to completion,
being single-threaded. But if you– If JavaScript blocks on
XMLHttpRequest to do synchronous network IO, it still looks like
a function call to JavaScript. And if you interrupt
that function call with, like, running
another JavaScript– some other thread
of JavaScript somewhere– I mean, it’s all
on the same real thread, but to the programmer it looks
like somehow JavaScript, in another context,
was running in the middle
of this function call. And so it breaks
the run to completion behavior that you come to expect. And so really I worry that
the Firefox implementation– You know, I was a fan of it
at the time, because it frees the UI
from being locked up. But I worry
that it would result in some very strange bugs
for web developers, because when
they expect their program to just run
like a sequential program, it’s suddenly not. And so it is
a very big concern to me. In Chrome,
we took the approach of, well, the rendering engine
is a child process, and we have a bitmap of the recent
representation of it. So when the web page says XMLHttpRequest send,
in a synchronous manner, we just let that process
suspend itself. And during that IO time, we show the old representation
of the page. Of course the page is not
interactive at this time. But I feel like that’s
probably the right way for it to behave. And basically my belief
is that synchronous XHR is really a pretty evil API. It’s not one that works well
with the web. The web is asynchronous. Asynchronous APIs
work much more naturally in a browser environment. So–So that’s– I can imagine Chrome
doing smarter things though in the future. Like, for example,
giving the user some UI so that they could interrupt
a synchronous XHR that’s taking too long. Today, the only way
to interrupt a synchronous XHR in Chrome
that’s taking too long is to close the tab. Close the tab,
you kill the process. But if you could interrupt it and return control
to JavaScript with an error, that seems like it’d be better. Any other questions? man: You know,
I almost want to encourage you to do some alternative
to NPAPI. I’ve struggled
with trying to port across the different browsers
on PGP. And we’ve been trying
to do something to look at secure text,
encrypted text, encrypted– you know, like an embedded
encrypted content and display it. And it’s just–
it’s really difficult to go across
the different browsers. There’s a lot
of little fallbacks. NPAPI–Well, put it this way. If you’re gonna do it,
could you, like, at least put out some sample code,
one that compiles and actually works
in the different browsers and does something– Fisher: I agree with you. NPAPI is really the bane of developers
who have to deal with it. I wish there was
something better as well. man: Well, I mean,
even the Safari– The Safari, you know,
WebKit plug-in thing is way– at least that works
on Safari. Maybe leverage some of that. Fisher: Yeah,
I’m very interested in a better solution and one that is well documented
and full of samples that actually make sense. Next question. man: I have a question
about your strategy for increasing compatibility, like, with the rendering
in Safari. It seems like, in Safari, when they’re about
to make a release, they do, like,
a WebKit tag, and then maybe they commit
into that tag. What do you do to, like,
ensure that your– whatever you do for a release is compatible
with their rendering? Fisher: If I understood
the question right, you’re saying what happens
if, like, a new tag is introduced
that’s sort of half-baked? man:
Or what do you do, like– I guess what’s your strategy
before– When you’re deciding on,
you know, where to– what revision to go off of. Fisher: Yes. Okay, yeah, so it’s
a very tough question. Like, because we are trying
to release early, release often, and not just be beholden to when Apple decides
to launch WebKit, it does mean
that we potentially pick up half-baked APIs,
half-baked things. But fortunately the way
the WebKit community tends to work is that they hide those things
behind a pound define. And so it’s up to us
to decide when to add that pound define
so that it’s compiled in. And that’s basically
been our approach. So we don’t expose them
until we’re ready. man: I was curious
a little bit more about the sandboxing. You were saying in Windows it kind of strips
the credentials away from it, and I was curious
about other operating systems, if you’re like–
if it would be more of, like, a memory bounds checking
at the code level or if you’re familiar
with, like, Valgrind, which is, like, object code. Like, how would you
be implementing that? Fisher:
There’s a lot of– a lot of crazy ideas,
you know, afoot for how to sandbox
on Linux. You have a lot
of traditional tools like chroot,
different user– totally different user. You know, there’s a lot
of things. But all of those tools
seem to require some change or some support
from the host system, so something that’s installed
by admin or whatnot. I would encourage you
to get up on Chromium Dev and chat with the engineers who are actually looking
into this problem, ’cause there’s just so many
different approaches you could take
with sandboxing. man: Are you guys interested
in any sort of, like, it’s not really portable,
but it’s, like, an object type– like, how Valgrind does it, where it just
reads object code and just runs that inside
of, like, an emulator, like a user space emulation? Fisher: Big concern
is performance really with anything we do. man: Okay, thank you. another man:
One more thing about plug-ins. You know,
it would also help if maybe you streamlined the installation
process a little bit. I understand the security
implications of having plug-ins. It’d be nice,
from a user’s point of view, if they could click somewhere
and actually get a plug-in without having
to reboot the browser and– Fisher: Well, one of the things
we’re trying to solve with the Chrome extension system
is to actually make it easy for extensions to deliver
all manner of things that extend the browser. And so just like an XPI
can be used on Mozilla to deliver an NPAPI plug-in, a Chrome extension bundle
should be able to do the same and give the user
very good control over managing
that plug-in. man: Yeah,
we’ll talk offline. Fisher: So anyways,
that’s it. I’m afraid we’re
all out of time for questions, but I’m happy to talk
to folks afterwards. Thank you so much
for coming, and I’m supposed to remind you of this. Please leave your comments.