Hello, my name is Pablo Soler and I will be speaking about attacking embedded languages.
First of all, I'm a security researcher at Immunity.
My main job is research: developing client-side exploits for the CANVAS attack
framework and writing custom binary analysis tools.
Thanks for that.
I'm also part of the Immunity Debugger developer team, so that is most of what I
do.
First we will go through some introduction and background on embedded
scripting languages, using two different PDF viewers as a case study, Acrobat
Reader and Foxit Reader. Then we will use Immunity Debugger to decode some JavaScript
objects, their properties, methods and arguments.
Using this information we'll build a SPIKE script (SPIKE is a fuzzer language), and we will
finish by showing how to find and exploit an already patched bug, the collectEmailInfo
bug.
Vulnerability assessment is becoming a really specific task,
because the generic fuzzers, the blind fuzzers and the source analysis tools are already
used by the vendors themselves in their test beds, so we can't rely on the same tools for attacking.
We have to create application-specific tools that get into the structures behind the
application, to reach the things that a blind fuzzer can't really find.
So we created Immunity Debugger to achieve this task.
We added a Python scripting language inside and a complete library to help with static
and dynamic analysis.
So we write these reverse engineering scripts using the Immunity Debugger library,
which itself uses the Immunity Debugger core.
The kind of applications we will see now are almost like browsers.
They have user interaction, they have plugins, multimedia access and of course scripting,
all the same things you get in a browser.
Recovering the interesting and important information about how it works and how the
JavaScript information is stored inside the application is really complex.
The data flow inside is really complex.
So we made analysis tools that allow us to get inside the JavaScript core, inside
the plugins and inside the binary objects.
We can get information from each of these parts and mix it to build a really nice fuzzer.
The kinds of bugs that you will find in these applications: first, sandbox escapes. A JavaScript
script inside a PDF document runs inside a sandbox.
You can't access the complete API.
You can only access a restricted section.
Another kind of bug you will find is object initialization.
You have the frees and all the other kinds of JavaScript object initialization bugs.
And there are also buffer overflows inside the native methods, and some shell escape
bugs.
This is fertile ground for research because, as we said before, you need very specialized
tools to get across the scripting level.
You have to mix the binary level and the JavaScript level to get the whole picture.
And sometimes the team making the JavaScript, Visual Basic or whatever scripting engine
is different from the team doing the rest of the application.
So sometimes there's a lack of communication between both teams and, well,
there are some bugs there.
Also the data flow is really complex to follow, because these are all huge objects with
tons of function tables and that kind of stuff, and it's really painful.
So first of all, the case study.
Adobe. Adobe implemented an ECMAScript engine, which is a kind of standardized JavaScript
engine, and inside Adobe Reader all JavaScript is executed as a response to a particular
event.
So when you open a document, when you open a page, when you click a button, that's
your chance to run some JavaScript.
Here we have some examples.
Each event is a pair: a type and a name.
You have an event type, which is App, Batch, Console, Doc, Page, Menu, and you
have an event name, like Init, etc.
Each event type defines a security context.
You have the document security context, the application context and so on, and you have two levels
of privileges.
We have privileged contexts, which are the application-level contexts, that is, the batch and console
events, and you have non-privileged contexts, which is almost everything else.
Everything you can touch from a document is non-privileged.
So you don't have access to the complete API.
You can of course raise your security context by creating a trusted function,
but to create a trusted function you need to be in a privileged context already.
So it's not an option for an attack.
So the first kind of threat that you can find is breaking the non-privileged sandbox.
For example, you can break the sandbox using the checkForUpdate function, which is an undocumented
function.
This function raises its own privileges and then calls a user-controlled callback.
So this user-controlled callback function runs in a privileged context and can run anything.
Here's an example where we define the myCallback function, which is user-controlled
and calls the newDoc function, which is privileged. Then we call
checkForUpdate with the oCallback argument set to myCallback, the function we created
before.
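To make the logic of that flaw concrete, here is a toy model in Python. It is only a sketch of the idea, not Adobe's code: the Sandbox class, the event names and the method names are all made up for illustration.

```python
# Toy model of the privilege logic described above; not Adobe code.
# Event names and method names are illustrative assumptions.
PRIVILEGED_EVENTS = {("Batch", "Exec"), ("Console", "Exec")}


class Sandbox:
    def __init__(self, event):
        # The security context is derived from the event that started the script.
        self.privileged = event in PRIVILEGED_EVENTS

    def new_doc(self):
        # Stand-in for a method that only works in a privileged context.
        if not self.privileged:
            raise PermissionError("new_doc needs a privileged context")
        return "new document created"

    def check_for_update(self, callback):
        # The flaw being modeled: privileges are raised before the
        # user-supplied callback runs, and only restored afterwards.
        saved = self.privileged
        self.privileged = True
        try:
            return callback(self)
        finally:
            self.privileged = saved


def my_callback(sandbox):
    # User-controlled code that now runs with raised privileges.
    return sandbox.new_doc()


doc_ctx = Sandbox(("Doc", "Open"))
try:
    doc_ctx.new_doc()
    direct = "allowed"
except PermissionError:
    direct = "blocked"

via_callback = doc_ctx.check_for_update(my_callback)
print(direct, "/", via_callback)
```

The direct call is blocked, while the same call made from the callback goes through, which is the whole sandbox break.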
Also there are some overflows in built-in functions.
There's a stack overflow in the collectEmailInfo function, which is also undocumented.
Using the message argument you can overwrite the whole stack. We will use
this collectEmailInfo bug as an example later.
And there's also an extensive undocumented API.
These undocumented functions tend to receive less testing because they're for internal
use.
As an example, the Collab object in the Adobe Acrobat Reader API has only
three documented methods and no documented properties, when it actually has 48 members.
That's a lot of undocumented functions.
So some summary.
As part of the initialization process, Adobe Acrobat Reader registers a series of JavaScript
objects.
These JavaScript objects are just containers for different properties and methods and many
other objects.
A property has two associated functions: a getter, which is used to get the
property value, and a setter, which is used when you want to change the property value.
A method has one associated function, which is what gets executed when you call the
method.
An object has a constructor function, which is executed as part of the instantiation process.
The first thing we see is in the EScript plugin: there's a static list of objects that describes
their properties.
First you have a pointer to the object's name.
In this case it's app.
Then you have a pointer to an array of members, which has 0x52 members.
Then you have this, which is the structure for each member.
I use the browseForDoc method as an example.
You have the pointer to the method name and a pointer to a security structure
that we will see now.
This security privileges structure sets the allowed events, that is, where you can
execute this method.
In this case, the browseForDoc method can only be executed in the application initialization
and console execution events.
So if you try to execute this restricted method inside a document open event, it's not going to
work.
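As a rough sketch of how such a table can be walked, the Python below decodes an object entry, its member array and each member's security structure from a fake memory image. The field order and sizes, the FakeMemory helper and the sample names are assumptions for illustration; the real layout has to be read from the binary.

```python
import struct


class FakeMemory:
    """Flat little-endian image standing in for the debugged process."""

    def __init__(self, base, size):
        self.base = base
        self.data = bytearray(size)

    def write(self, addr, blob):
        off = addr - self.base
        self.data[off:off + len(blob)] = blob

    def dword(self, addr):
        return struct.unpack_from("<I", self.data, addr - self.base)[0]

    def cstring(self, addr):
        off = addr - self.base
        end = self.data.index(0, off)  # first NUL byte after the start
        return self.data[off:end].decode("ascii")


def decode_object(mem, entry):
    # Assumed object layout: [name ptr][members ptr][member count]
    name_ptr, members_ptr, count = (mem.dword(entry + 4 * i) for i in range(3))
    members = []
    for i in range(count):
        # Assumed member layout: [name ptr][security ptr]
        member = members_ptr + 8 * i
        sec = mem.dword(member + 4)
        # Assumed security layout: [event count][event name ptr]...
        n_events = mem.dword(sec)
        events = [mem.cstring(mem.dword(sec + 4 + 4 * j)) for j in range(n_events)]
        members.append((mem.cstring(mem.dword(member)), events))
    return mem.cstring(name_ptr), members


# Build a tiny image with one object that has one restricted method.
mem = FakeMemory(0x1000, 0x200)
mem.write(0x1100, b"app\0")
mem.write(0x1110, b"browseForDoc\0")
mem.write(0x1120, b"App/Init\0")
mem.write(0x1130, b"Console/Exec\0")
mem.write(0x1040, struct.pack("<III", 2, 0x1120, 0x1130))  # security structure
mem.write(0x1020, struct.pack("<II", 0x1110, 0x1040))      # member array (1 entry)
mem.write(0x1000, struct.pack("<III", 0x1100, 0x1020, 1))  # object entry

name, members = decode_object(mem, 0x1000)
print(name, members)
```

Inside the debugger the same walk would read process memory instead of a bytearray, but the pointer chasing is the same.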
This is just a summary I put here in case you want to try something.
Also, each plugin associates a function pointer with each method.
Here you have the collectEmailInfo registration, which points to an internal function,
a native function.
They do the same for the property getters and setters.
Here we have the defaultStore property.
This is the setter and the getter for the defaultStore property.
Some methods also set security restrictions inside the function itself.
How do they do this?
They build a list of allowed events at runtime and compare it with the executing event.
That's how they decide whether you can execute that method or not.
Every method call is made from a single dispatcher.
Here the last instruction is a call to the real native function, which receives as arguments
the function name and a JavaScript object that holds an array of arguments.
So all the JavaScript arguments that you see there are passed as a pointer to the function.
Inside each method there's also a generic argument parser.
Each method that you call expects its arguments in a very specific format, and this format
is here.
This argument parser structure holds first the name of the argument, in this case "to", then
the type of the argument: double, integer or whatever.
In this case it's six, which is a Unicode string.
Then you have flags.
In this case this is an optional argument.
You have the buffer, which is set by the parser itself.
So after the parser executes, you get the buffer that you passed in the JavaScript inside your
binary structure.
And you have this used field, which is set by the parser if the argument was correctly
parsed.
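A minimal sketch of decoding one such descriptor after the parser has run, in Python. Only the type value 6 for Unicode strings comes from the talk; the 20-byte layout, the flag bit and the sample data are assumptions.

```python
import struct
from collections import namedtuple

ArgDesc = namedtuple("ArgDesc", "name type optional buffer used")

TYPE_UNICODE = 6     # type value mentioned in the talk
FLAG_OPTIONAL = 0x1  # hypothetical bit value, for illustration only


def decode_arg(image, offset):
    """Decode one assumed 20-byte descriptor; pointers are offsets into image."""
    name_ptr, type_id, flags, buf_ptr, used = struct.unpack_from("<5I", image, offset)
    name = image[name_ptr:image.index(b"\0", name_ptr)].decode("ascii")
    value = None
    if used and type_id == TYPE_UNICODE:
        # Walk 2-byte characters until the UTF-16 terminator.
        end = buf_ptr
        while image[end:end + 2] != b"\0\0":
            end += 2
        value = image[buf_ptr:end].decode("utf-16-le")
    return ArgDesc(name, type_id, bool(flags & FLAG_OPTIONAL), value, bool(used))


# Fake image: descriptor at 0x00, argument name at 0x40, parsed buffer at 0x50.
image = bytearray(0x80)
image[0x40:0x43] = b"to\0"
payload = "a@b.c".encode("utf-16-le") + b"\0\0"
image[0x50:0x50 + len(payload)] = payload
struct.pack_into("<5I", image, 0, 0x40, TYPE_UNICODE, FLAG_OPTIONAL, 0x50, 1)

arg = decode_arg(bytes(image), 0)
print(arg)
```

The buffer is only read when the used field is set, because that is the point where the parser has filled it in.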
Foxit Reader uses almost the same approach, registering its objects, methods and
properties at runtime as part of the initialization.
But here the arguments are checked manually, so you don't have a central argument parser:
each method re-implements the argument parsing.
You do have a central dispatcher for all method calls, as in the Adobe case.
Here we see how to register a new object.
The last argument is the object name.
Then how to register a new property.
You have the setter function, the getter function, the property name and the parent object.
This parent object is the object identifier that we get here.
So you first get an object identifier here, which you use here, and you use it again here when you
register a new method.
A method also needs the number of arguments it expects,
plus the function pointer, the function name and the parent object.
And here you have the method call dispatcher.
This dispatcher sends the argument array and the argument count to the real function
that is going to execute our method.
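The registration scheme just described can be modeled in a few lines of Python. This is a toy registry under assumed names (ScriptEngine, native_alert and the parameter order are all hypothetical), meant only to show the shape: an object identifier, per-object properties and methods, one dispatcher, and argument checks living inside each method.

```python
class ScriptEngine:
    """Toy registry; class, method and parameter names are hypothetical."""

    def __init__(self):
        self.objects = {}     # object id -> object name
        self.properties = {}  # (object id, name) -> (getter, setter)
        self.methods = {}     # (object id, name) -> (function, expected argc)

    def register_object(self, name):
        obj_id = len(self.objects) + 1
        self.objects[obj_id] = name
        return obj_id

    def register_property(self, setter, getter, name, parent):
        self.properties[(parent, name)] = (getter, setter)

    def register_method(self, argc, func, name, parent):
        self.methods[(parent, name)] = (func, argc)

    def dispatch(self, parent, name, args):
        # Central dispatcher: passes the argument array and the argument
        # count to the native function.
        func, _expected = self.methods[(parent, name)]
        return func(args, len(args))


def native_alert(argv, argc):
    # No shared parser: each native method validates its own arguments.
    if argc != 1 or not isinstance(argv[0], str):
        raise TypeError("alert expects one string argument")
    return "alert: " + argv[0]


engine = ScriptEngine()
app = engine.register_object("app")
engine.register_property(None, lambda: "1.0", "version", app)
engine.register_method(1, native_alert, "alert", app)

result = engine.dispatch(app, "alert", ["hi"])
print(result)
```

Because every method repeats its own checks, each one is a separate chance for a mistake, which is what makes hooking the registration calls worthwhile.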
This is an example of a decoder I made using Immunity Debugger, which shows the whole registration
process as all the methods, properties and so on are registered.
You have there the function pointers for the getters, for the setters, for the methods,
the argument counts, et cetera.
So we'll try to decode this whole Adobe JavaScript engine using Immunity Debugger.
We have two approaches to this:
static analysis and dynamic analysis.
The static analysis approach tries to decode all the structures we saw
before, to get the object names, the method and property names and the security
privileges.
This is an example.
This is all Python using the Immunity Debugger library.
First we read the pointer to the object name, then we read the string with the object name,
and we have the pointer to the members and the member count.
Then we decode the member structures.
It's always done the same way.
You get the pointer to the member name, then you get the name, and then the other structure.
Here we get the allowed events array, which is this: application initialization and
console execution.
This is another one.
This is another snapshot, doing all this in an automated fashion.
In the dynamic approach, we put some hooks on the method dispatcher and the
argument parser.
A hook is a breakpoint that allows us to execute a Python script.
Inside this Python script, we can control everything about the program state.
We can read the register state, we can change memory, and whatever we need to do, we do it in
Python.
That's awesome.
We created an extensive API for all this, which is called the Immunity Debugger library.
We have a heap library, a control and data flow library, lots of tools to assist this
dynamic analysis.
So we put a hook on the method dispatcher, right at the call instruction.
On the stack, the second DWORD is a pointer to our function name.
So by putting this hook, we know exactly what is getting executed in the application.
If you put this hook during application initialization, you see that Adobe Reader itself
executes tons of JavaScript methods that you never knew about.
This is the script itself.
We have this method call hook, which is a Python class.
We get the ESP value, then we get the string from there,
and we print it in the log window.
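The shape of such a hook can be sketched outside the debugger with a stub. The StubDebugger class below is a hypothetical stand-in, not the Immunity Debugger library API; it only serves to show the logic of reading the second DWORD on the stack and logging the string it points to.

```python
import struct


class StubDebugger:
    """Hypothetical stand-in for the debugger: memory is a dict of addr -> bytes."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = []

    def read_dword(self, addr):
        return struct.unpack("<I", self.memory[addr])[0]

    def read_cstring(self, addr):
        return self.memory[addr].split(b"\0", 1)[0].decode("ascii")

    def log(self, text):
        self.lines.append(text)


class MethodCallHook:
    """Runs each time the breakpoint on the dispatcher's call is hit."""

    def __init__(self, dbg):
        self.dbg = dbg

    def run(self, regs):
        # Second DWORD on the stack: pointer to the method name.
        name_ptr = self.dbg.read_dword(regs["ESP"] + 4)
        self.dbg.log("method call: " + self.dbg.read_cstring(name_ptr))


# Simulate one breakpoint hit with a made-up stack and string address.
dbg = StubDebugger({
    0x0012FF04: struct.pack("<I", 0x00400000),
    0x00400000: b"collectEmailInfo\0",
})
MethodCallHook(dbg).run({"ESP": 0x0012FF00})
print(dbg.lines)
```

In the real script the class would derive from the library's breakpoint hook type and the reads would go to process memory; the addresses here are invented.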
Then we have the hook on the argument parser.
We have to put our hook at the end of the parser function, so that all
the buffer and used fields are actually filled with the information that's going to get into
the real function.
This is how the argument parser call looks.
And this is the script itself.
You can decode the name of the argument, the type and, most importantly, you can
get the buffer that's going to be used by the function,
and whether each argument is used or not. We have an example.
Here we have a call to the collectEmailInfo function.
This is actually a snapshot from the exploit for the collectEmailInfo bug.
Fuzzing with SPIKE.
First, using the static approach, we get a list of all the methods we can execute in
the Adobe JavaScript engine.
Then, using the dynamic approach, we get all the arguments of each method.
And finally, because we know the type of each argument, we can fuzz each argument using SPIKE
and build a huge JavaScript that fuzzes every method.
So we get all the methods and we make PDF files that call each function inside a try/catch
statement, to collect all these arguments.
Then we attach Immunity Debugger.
We execute the Python hook to get the argument parser data.
And then we make a SPIKE script like this, fuzzing all the ASCII or Unicode strings with
something like this.
And obviously this is automatically generated.
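The generation step can be sketched in Python. This is not SPIKE and not the generator used in the talk; it is a minimal illustration that takes a method list with argument names and emits one JavaScript call per fuzzed string argument, each wrapped in try/catch. The fuzz strings and the argument names are assumptions.

```python
import json

# Hypothetical sample values; a real fuzzer library has far more cases.
FUZZ_STRINGS = ["A" * 64, "A" * 0x4000, "%n" * 16]


def js_case(obj, method, arg_names, target, value):
    """One JavaScript call with a single fuzzed argument, wrapped in try/catch."""
    # Named arguments are passed as an object literal; JSON is valid JavaScript here.
    args = {name: (value if name == target else "x") for name in arg_names}
    return "try { %s.%s(%s); } catch (e) {}" % (obj, method, json.dumps(args))


def generate(methods):
    # Fuzz one argument at a time so a crash points at a single input.
    for (obj, method), arg_names in methods.items():
        for target in arg_names:
            for value in FUZZ_STRINGS:
                yield js_case(obj, method, arg_names, target, value)


# Method list as it would come out of the static and dynamic passes.
# The argument names are assumed for this example.
methods = {("Collab", "collectEmailInfo"): ["to", "msg"]}
cases = list(generate(methods))
print(len(cases))
print(cases[0][:60])
```

Each generated line would then be embedded in a PDF that runs it on document open.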
Now the collectEmailInfo bug.
Using this fuzzing strategy, we can find this bug.
We put our hook in the registering function.
We execute the collectEmailInfo method, so we get its arguments.
We fuzz each argument.
This method opens a new window on each successful execution,
so you need some Python or whatever to close that window automatically.
This is a snapshot of the function that actually fails in Acrobat Reader.
At the beginning, it allocates 0x2000 bytes to hold the buffer.
But then it checks the boundary as 0x2000 Unicode characters.
That's wrong, so you can write 0x4000 bytes of memory and not just the allocated
0x2000.
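The arithmetic behind the bug, as a small Python sketch. The sizes are the ones from the talk; treating the check as a clamp on the character count is an assumption made to keep the example short.

```python
BUFFER_BYTES = 0x2000  # size of the allocation, in bytes
LIMIT_CHARS = 0x2000   # bound used by the check, counted in characters
CHAR_SIZE = 2          # bytes per Unicode character in the buffer


def bytes_written(msg_chars):
    """Bytes copied for a message of msg_chars characters under the flawed check."""
    copied_chars = min(msg_chars, LIMIT_CHARS)  # assumed: the check clamps the length
    return copied_chars * CHAR_SIZE


worst_case = bytes_written(0x3000)
overflow = worst_case - BUFFER_BYTES
print(hex(worst_case), hex(overflow))
```

A message of 0x1000 characters fills the buffer exactly; anything longer writes past it, up to 0x2000 extra bytes.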
So you can overwrite the structured exception handler with an arbitrary value.
That's what we did in the exploit.
And we heap sprayed the memory with our shellcode.
So when the exception handler gets called, we get execution.
This is a proof of concept.
Here we first heap spray with our int 3 shellcode, and then we overflow the
collectEmailInfo function.
Okay, some conclusions.
JavaScript engines are still very fertile ground.
The data flow is really painful to follow, so there are not many useful tools and there is
not much work done in this area.
But blind fuzzing is not an option, because the vendors already use that kind of fuzzing.
Immunity Debugger has tons of tools and scripts already written for this kind of
task.
And well, I think that's all.
Thank you for your time.
And if you have any questions, this is the moment.
Yeah.
What's the only question to is how public-level you mean for this version of debugger which
is based on that one?
Do you understand the other option?
No, no, it's a different branch.
It's our debugger, we already changed tons of things inside, so it's not much.
Okay.
Well, thank you for your time.