Hello, so my talk will be about NetWare kernel stack overflow exploitation.
It will mostly be about NetWare kernel exploitation, because there is not much to say about
stack overflows and heap overflows themselves. The interesting part is what to do once
you have a stack overflow, in this case a remote one, because in the
NetWare world no one had really reversed it before and no one really knows how the kernel works.
So the interesting part was to apply all the previous techniques to NetWare.
So a quick overview of the talk: a quick introduction to NetWare, what the OS is all about, and
the debugger for NetWare. After that, I will present a kernel mode stager, which is a reverse
TCP one. Then I will present two kernel mode stages, one connect-back shellcode and one
add-user payload. And finally I will conclude.
So this is NetWare, a beautiful GUI in Java. It's painfully slow. That's almost everything
you can say about NetWare, actually. So why are we reversing it or trying to exploit
it? Well, you could ask yourself, is this OS still alive? Actually, yes it is, especially
in big corporate networks in the US. When I first started working on
NetWare, it was because of a lot of crash reports from customers. We had released
a plug-in just to detect a service which was absolutely not related to NetWare, and
one day after that we had about 20 customers calling us to say their whole NetWare system
had crashed. And the point was: okay, it crashed, but what can we do with that? Another point
is that no one had published anything about reversing NetWare, so that was another fun part of
the challenge. Another point is that I had to reverse everything on the kernel side, because there
is absolutely nothing, no publication except the NDK, the NetWare development kit, and
that mostly covers the basic structures, nothing really low level. And finally,
is it possible to apply all the previous x86 techniques to NetWare? So NetWare is a kind of modern
OS. It's based on x86 CPUs. It has support for multiple processors. Since version 5.0,
there is a separation between kernel and userland. It's not that old, which is why it can be
considered modern. There is support for NX in user space. And since the latest
version, 6.5, it's even compiled with Xen support. In NetWare, the equivalent of PE and ELF is
NLM, for NetWare Loadable Module. You can find the definition of the NLM format in the NetWare
SDK, the Novell NetWare SDK, and the file format is fully supported by IDA. So it's a modern
OS, but at the same time it's a bit old, because to launch itself it uses a modified version
of DOS. It starts in real mode. DOS launches server.exe, and server.exe creates the full
NetWare user space and kernel space and extracts both server.NLM and loader.NLM, which are the
core of the NetWare kernel. When that's done, the CPU is switched to protected mode, and
the NetWare system starts. There are lots of NetWare versions, 4, 5, 6.0, and 6.5, and
multiple service packs. For example, for the latest version, 6.5, there are six service
packs. So the challenge was to make the payloads generic, or generic enough, to
work on all those versions. However, I did not make them work on 4.0, because it's really
too old and everything is different. For 5.0, I'm not sure; it may work, but the problem
is you can't really use a 5.0 system these days, because it will crash from just
a port scan. So the good point with NetWare is that it comes with a kernel debugger.
It's fully integrated in the kernel and in user space. All system modules and NLMs are
compiled with debug symbols, so the kernel knows everything about all NetWare modules.
To activate the debugger, we press left Alt, left Shift, right Shift, and Escape.
A quick screenshot of the kernel debugger. You can see all the registers, basic commands, and
each time you see one line of disassembly. Useful commands: help, really useful. It's
the only way to really know how to use this debugger. After that, I will
just present five or six basic commands. So cd is to change a dword. That's the memory
manipulation part. You have cd, c, or cw, which change a dword, a byte, or
a word. The format is cd, the address, equals, and the value you want to set at this address.
Same thing for dd, which dumps the value. Another useful command is m. You use m
to search for a pattern in memory. You give the start address, the size of the block
you want to search, and the pattern you are looking for. The last one is b, for breakpoint.
It's pretty easy to use: b, equals, the address where you want to set the breakpoint. After
that, you have an optional condition, like EAX equals two. You have other commands which
are more related to the system, like .m to find a module in memory, or dm to dump this
module. It's useful for finding addresses in the system. You have .g and .i and other
x86-related commands. But the problem is there is no command to dump memory to a file. That
was a problem because the kernel core, server.nlm and loader.nlm, is not available
directly on the system; it is inside server.exe. So my first approach was to dump
them from memory, but that's not possible, so you have to extract them from the exe.
So for all of this, I needed a remote exploit. I used a stack overflow in the
RPC stack. It was in LSA. The RPC stack runs in kernel space. It took me one minute
to find it. The problem is you need a stable return address across all versions of NetWare
and all service packs, and that's not really easy to find. The exploit is actually available
in Metasploit. You can find the exploit, the reverse TCP stager, and the shellcode. I
will add the add-user payload soon. At this time, this flaw is still not fixed, so you
can still use it. And if you want, you can find many other flaws in just two minutes.
So now I present the kernel mode stager, the reverse TCP one. Quick overview: how to resolve
kernel function addresses by finding the debug symbols, how to resolve those symbols,
how to migrate the payload, which is useful to avoid crashing the whole OS, and how to receive
the next stage. Resolving kernel function addresses is useful because we need to
create a TCP connection, we need to restore the system, and we need to execute commands.
There is no interrupt we can use to do that, so we have to know how the kernel works
and how to resolve the kernel functions. The problem is that NetWare, at startup, destroys
the kernel symbols, so we can't find them easily. But the thing is, if you go into the
debugger, it is able to resolve those symbols, so we can do it too. So how to do that? By
reversing the kernel. After some time, I found most of the information, especially
about the debugger. The debugger is in kernel.nlm, and it stores all the symbols in the hash
table debugger symbol hash table. So we need to locate this table in memory, and the way
we find its address must be generic enough to work everywhere. The function remove
all temp debug symbols is stable across all NetWare versions, and it contains a reference to this
table. But we have the same problem again: now we know where to find a reference to the hash table,
but we need to find the function which holds that reference. To do that, I have
three different techniques. The first is using a hardcoded address for server.nlm; the problem is
that every service pack version changes the address space. That's because all libraries are loaded in
memory one after another, so if a service pack changes one library, the whole
address space changes. Another solution is to hook the SYSENTER EIP in the MSR. It
works well, but only on NetWare 6.5. The opposite, well, not the opposite, but we can also
hook the GDT system call gate, and this particular solution works really well on version
6.5. The GDT system call hook is like any other one you can find on x86. We
just get the address from the GDT, and we scan up to find the debugger hash table, and that's
it. After that, we know where the hash table with all the debug symbols is, but we
need to parse this hash table to find the particular function we are looking for. To optimize that,
the payload only uses function names, not module names, because in kernel space
there is no conflict between two function names. If we look at the hash table, it's
a linked list of debug symbols, each with the next element, a pointer to the symbol, a pointer
to the name, and a pointer to the module of the symbol. The problem is that the symbol
names are actually encrypted. It's not really encryption; it's more of a hash function,
because the full symbol table is actually a hash table, and the encryption
function is used to improve the hash table. So in the payload, we must directly use the
encrypted function names to make the payload as small as possible. Here you can see the crypt function
used by NetWare: pretty easy, just an XOR function. Now that we are able to resolve
kernel symbols, we need to find how to migrate the payload to a safer place, because it's
a stack overflow and the stack will be erased, so we have to move it somewhere or we can't
continue. To do that, there are two main techniques. The first technique consists of copying
the payload to the GDT, because in NetWare there are lots of free slots in the GDT. Another
solution is to allocate new memory and copy the payload into it. I used the
second one, not because it's safer, but because it makes everything smaller, especially
for receiving the next stage. To allocate kernel memory, there are lots of functions.
It's kind of complex in the kernel itself, but the good thing is there are a lot of wrappers
around those functions, like LBMalloc. We just have to use this function to allocate a huge
chunk of memory, copy the payload, and jump to this memory. After that, to receive the
second stage, we need network functions. All network functions in NetWare are in TCP.NLM
and TCPIP.NLM. The problem is that those functions are really complex. It's a callback
system, and we could use them, but it would make the payload too big, so we have to find
a better solution. And a better solution is actually to find wrappers around those functions.
Those wrappers are in BSDsock, which provides the BSD equivalents: socket, receive, and send.
The problem is those functions are not exported in the debug symbol hash table, but there is
another wrapper in libc around the BSD functions, which are themselves wrappers around the kernel
functions, and we can use the libc references in the debug symbol hash table. The final part
of the stager: we have to recover so we don't crash the whole system. It's often the most complex
part, well, not the most complex, but it's often a non-generic part of kernel exploitation.
In the case of NetWare, I first had to remove a lock on the file system. I'm not sure if
that's generic or only related to this exploit, but I had to do it. The good thing
with NetWare is that to restore the whole system, all we have to do is call k worker thread,
and NetWare will do everything for us and restore everything.
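
As a rough illustration of the symbol lookup described for the stager, here is a minimal Python sketch of walking a chained hash table whose entries hold only a hashed name. It is a model, not kernel code: the `name_hash` function, the bucket count, the record layout, and the symbol names and addresses are all assumptions made for the example, and the real NetWare hash function is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional

NUM_BUCKETS = 16  # assumed bucket count, for illustration only


def name_hash(name):
    """Hypothetical 32-bit rotate-and-XOR hash (not NetWare's real function)."""
    h = 0
    for byte in name.encode("ascii"):
        h = ((h << 5) | (h >> 27)) & 0xFFFFFFFF  # rotate left by 5 bits
        h ^= byte
    return h


@dataclass
class DebugSymbol:
    next: Optional["DebugSymbol"]  # next element in the chain
    address: int                   # address of the symbol
    hashed_name: int               # the name is only stored in hashed form
    module: str                    # owning module


def build_table(symbols):
    """Build the chained hash table from (name, module, address) tuples."""
    buckets = [None] * NUM_BUCKETS
    for name, module, address in symbols:
        hashed = name_hash(name)
        index = hashed % NUM_BUCKETS
        buckets[index] = DebugSymbol(buckets[index], address, hashed, module)
    return buckets


def resolve(buckets, wanted_hash):
    """Walk one chain and compare against a precomputed hash, as a payload would."""
    node = buckets[wanted_hash % NUM_BUCKETS]
    while node is not None:
        if node.hashed_name == wanted_hash:
            return node.address
        node = node.next
    return None
```

The point of the design is that the caller only carries a precomputed integer for each function it needs, so no name strings and no hashing code for module names have to travel with it.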
So now I will present two different kernel mode stages. Most of the time, what
we do is switch back to user space and use common exploitation techniques, but I
decided not to do that and to stay in the kernel, just for fun. The first stage is
a connect-back shellcode. The most common technique to get a shell is to spawn a
new user shell and redirect both the input and output to the socket. The problem with
NetWare is that there is no local user, so there is no shell at all. So how can we
get a shell when there is no user and no shell? The thing is, there is no shell, but there
is a system console, and this console lets you manage the whole NetWare system. But it's
not a shell, it's a console; it's integrated in the kernel, so there is nothing to interact
with the console through. There is no file descriptor. Actually, there are file descriptors in
libc, but only in user space, and we are still in kernel space. So we need to find
something else. Another problem with the console is that it's not a scrollable output, it's
a bitmap. So we have to handle it. In the payload, we can handle the bitmap and convert
it to a kind of scrollable output, or we can do better and create a client that will display
the bitmap. I first did that in Metasploit, but it was not generic enough, so I came back
to the earlier approach, and the payload actually converts the bitmap to a scrollable output
in the kernel. For reading the console screen, after some reversing, I found the kernel functions
to do that, by reversing the libc wrappers and everything and working back to the
kernel. So you have get system console screen to get
the ID of the main system console screen, get screen size, which returns the size
of the bitmap, and read screen into buffer to convert the bitmap into readable
text. There is no newline in it, so that has to be handled by the payload too. Writing
to the console, like when you've got your shell and you want to send a command, is a complex
task too. The solution I use is to inject key codes into the console by emulating
keystrokes. It's the solution, and I think the only one. The kernel actually provides
the function add key to insert a key like A to Z, or 1 to 9, or whatever, except
when you want to send an Enter: you have to use a special code that I found from a reference
to it in the kernel. You can find that in the paper if you want. Another problem with
emulating a keyboard is that the input buffer has a limit of 32 characters, so
the payload has to handle that by sending a special character before each command, so we can
actually send long commands if needed. Another thing is, since it's a bitmap and we want
to turn it into a scrollable output, we have to know what changed. For that, I
decided to inject a special character into the screen by using the direct output to screen
function. And the newline, I'll talk about that. So the main shellcode loop. The problem
is we are still hijacking the kernel loop, so everything we do is actually critical.
We need at the same time to stay in our loop and to restore the kernel loop. There
are multiple solutions for that, like moving the shellcode loop somewhere else, but in
my case I just got lucky. The first problem was receiving the stage
or the commands: if you directly call receive, it will actually lock the whole system,
because receive goes partially back into the kernel loop but keeps waiting for an event,
and this event will never arrive even if something is received on the socket, because you
are actually locking the whole system, so it cannot detect that something arrived on
the socket. A solution is to use ioctl to know if there is something to read on the
socket before calling receive. But since we are still in our loop and hijacking the
kernel one, we need from time to time to give control back to the kernel so it can handle
everything else, like all the services and the GUI. For the solution, actually, Renaud told
me that I could try this, so why not? I tried to use a select call with null arguments,
and the good thing is that when you do a select call, it gives full control back to the
kernel until the select call returns. So all you have to do in your loop is add a
select call with null arguments and it will fully hide the shellcode in memory. So
to summarize the shellcode: we resolve kernel symbols, we get the console screen information,
we do a select to give control back to the kernel, and at the end of the select the
kernel gives control back to us. We call update screen, which detects whether the
console buffer has changed. If so, we send everything to the socket. After that,
we check if there is something to read on the socket. If so, we read it
and inject all the data into the console screen, and we loop to see if there is anything
else to read. So I have a quick demo to actually show you that NetWare works. You can see
NetWare here and the exploit to get the console. I hope it will work. It's a bit slow because
it's Ruby. It will work better if I disable the firewall. So I can see the file.
Let's try again. Okay. Okay. We got a remote console shell on NetWare.
So at the prompt, you can try everything you want. There is a good help function in the console,
with lots of really, really useful commands. If you get a NetWare console, you can see
the equipment.
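
To make the console loop from the connect-back stage more concrete, here is a small user-space Python analogy, assuming nothing about the real kernel code. It shows two ideas from the talk: turning a flat screen buffer without newlines into lines and sending only what changed, and checking that the socket is readable before calling a blocking receive. The function names, the 4096-byte read size, and the `inject` callback (standing in for key injection) are all hypothetical, and `select` is used here only as the readability check, since giving control back to a kernel loop has no direct equivalent in user space.

```python
import select
import socket


def screen_to_lines(buffer, width):
    """Cut a flat screen buffer (no newlines) into rows and trim the padding."""
    return [buffer[i:i + width].rstrip() for i in range(0, len(buffer), width)]


def changed_lines(previous, current):
    """Return the non-empty rows that differ from the previous snapshot."""
    padded = previous + [""] * (len(current) - len(previous))
    return [row for old, row in zip(padded, current) if row and row != old]


def readable(sock, timeout=0.0):
    """Poll the socket so that recv() is only called when data is waiting."""
    ready, _, _ = select.select([sock], [], [], timeout)
    return bool(ready)


def pump_once(sock, previous, screen, width, inject):
    """One loop iteration: forward screen changes, then inject pending input."""
    current = screen_to_lines(screen, width)
    rows = changed_lines(previous, current)
    if rows:
        sock.sendall("".join(row + "\n" for row in rows).encode("ascii"))
    if readable(sock):
        inject(sock.recv(4096))  # only reached when data is already waiting
    return current
```

Calling `pump_once` repeatedly with the latest screen contents and the previous snapshot gives the scrollable-output behaviour: unchanged rows are never resent, and the receive call cannot block the loop.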
So the second part is: what can we do with it?
There is something to manage NetWare remotely.
It's the NetWare Remote Manager.
It's a web interface.
We can do everything with that.
And we can even have a console through the web browser.
So instead of getting a console
that we absolutely don't know how to manage,
a better idea would be to inject a new user into NetWare
and log on to this web interface to manage the server.
So that's the second stage, the add-user stage.
On Unix or Windows, it's pretty easy to do that
most of the time.
You just use adduser, useradd, or the net command
to create a new user.
On NetWare, it's really a problem
because there are no users at all on NetWare.
There were in versions before 5.0,
but since 5.0, there are no more local users.
So the thing is, there are no local users,
but NetWare is used to manage eDirectory.
So you have users in eDirectory, the LDAP tree.
So we cannot create local users,
but we may be able to add a user to the LDAP directory
and then manage the system through eDirectory.
So as I said, the payload just creates a user
in the eDirectory tree.
Contrary to the connect-back shellcode,
we cannot use the debug symbol hash table
to resolve the functions that manage eDirectory,
so to do that, we need to resolve
library function addresses.
Resolving them is pretty easy.
We just have to walk the module list,
which is exported by the kernel.
It is stored in internal module list pointer,
and as we know how to resolve kernel symbols,
we can resolve internal module list,
so we know the address of all modules.
Once we have the list address,
we must check each module in this list
and the exported symbol list of each module
to find the function we are looking for.
And here we need to match both the function name
and the module name, because there are lots of conflicts
between names in different modules.
Like in libc and in BSDsock, you have the same function name,
but in two different modules,
and you want to be sure you are calling the right one.
So basically, it's almost the same algorithm
as for the kernel symbols, except the hash is split
into two parts: two bytes for the module name
and two bytes for the function name.
Same thing as with kernel symbols:
the module name and the function name are actually encrypted,
so we have to use already-encrypted hashes in the payload.
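
A minimal Python sketch of that split-hash lookup, under stated assumptions: `hash16` is a made-up 16-bit rotate-and-XOR function standing in for the real one, the module list is modelled as plain tuples rather than kernel structures, and the module names, function names, and addresses are invented for the example.

```python
def hash16(name):
    """Hypothetical 16-bit rotate-and-XOR hash (not NetWare's real function)."""
    h = 0
    for byte in name.encode("ascii"):
        h = ((h << 3) | (h >> 13)) & 0xFFFF  # rotate left by 3 bits
        h ^= byte
    return h


def combined_hash(module, function):
    """Pack the two 16-bit hashes: module name in the high half, function in the low."""
    return (hash16(module) << 16) | hash16(function)


def resolve_export(module_list, wanted):
    """Walk every module, then that module's export list, matching both halves."""
    wanted_module = wanted >> 16
    wanted_function = wanted & 0xFFFF
    for module_name, exports in module_list:
        if hash16(module_name) != wanted_module:
            continue  # same function name in another module must not match
        for function_name, address in exports:
            if hash16(function_name) == wanted_function:
                return address
    return None
```

With two modules that both export a function called `recv`, the module half of the hash is what selects the intended one; a 16-bit hash can collide, so a real table of names would need to be checked for collisions when the constants are precomputed.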
So when we find the right hash
for the function name in the symbol list,
we know we have found the right function.
Now we have to create a new user.
To create a new user in eDirectory,
we must first connect to the eDirectory service,
add a new user object, and grant this object
supervisor rights, to be admin or root, whatever you want.
So step one, connecting to the eDirectory service,
could have been a complicated task,
but the thing is, in NetWare there is a really cool function
to log in as the server in the eDirectory tree.
So if you are local, you can call this function
to log on to the LDAP services,
and you will have full rights on the whole LDAP tree.
So you don't even have to authenticate
to the services locally.
The sequence to add a supervisor user is quite simple.
You create a NetWare eDirectory context,
you log in as the server, so no authentication.
You allocate a new buffer and you initialize this buffer.
In this buffer, you put an object class type,
you add a user field, which needs a surname.
You can add more information if you want,
but you just need those basic fields to add a user.
After that, you add this user object to eDirectory.
Now you need to give this user some rights to something.
So you create an ACL, and in the ACL
you give the user full supervisor rights to the root of the tree.
And when you have added those rights to the object,
you must generate a password for this user.
That can be achieved with generate object key pair.
It's easy, it's clean, but the problem is those functions
need access
to a CLIB context, which is a kind of libc context,
and this CLIB context is not available for all threads.
And it's actually too complex to add a CLIB context
directly in the main kernel loop,
because kernel threads absolutely do not have access
to a CLIB context; it would have been too easy.
So the solution is kind of like
injecting the payload into another process,
like you would in user space,
but in this case we inject it into another kernel thread,
one with a CLIB context.
To do that, we need to know the structure of a thread.
Everything here
is maybe not exactly right; it's based on
what I reversed in the kernel,
so I think it's almost exact, but I can't be sure.
The important parts are the wait state flag,
the stack pointer, the state and the suspend reason,
and the CLIB data, which tells us
whether the current thread has a CLIB context.
So we must first resolve the process list address,
which gives us access to all the threads and processes.
After that, we must walk down this list
to find a thread with a CLIB context,
which, as I said before, is the CLIB data field,
and the thread must resume shortly,
because if we choose to inject ourselves into a thread
and this thread never resumes, we will never add a user.
The first thing I tried was to get this stack pointer
and replace the return address with the address
of the payload, but it was not really a good idea,
because when your thread is suspended,
it does not resume directly in the thread itself
but in the kernel context, and the kernel context
will see there is a problem and will lock everything down.
So a better solution is actually to rely on the use of Java
in, I would say, 50% of NetWare processes
and services, and you can even find kernel drivers
that actually use Java. Why Java?
Because, I'm not sure why, but on NetWare
almost all Java processes generate a lot,
really a lot, of exceptions.
If you try, on 6.5,
to put a breakpoint on the page fault exception handler,
it will never stop, it's crazy.
So we need to find a thread with a CLIB context
which is a driver, because kernel threads
don't have a CLIB context.
I think, I'm not sure, this is defined by the type attribute
in the thread structure.
I'm not sure, but I think it's the right definition.
So the idea is to hook the page fault handler,
which is interrupt 14, in the IDT,
check the current thread type, and decide whether we can
execute the payload.
It's actually easy to hook the IDT.
The code saves the current interrupt 14 address,
replaces it with the payload address,
and gives execution control back to the kernel
by calling the worker thread function.
When a thread generates a page fault exception,
the new interrupt 14 handler, which is the payload,
is called.
The payload checks whether the current thread
has a CLIB context and is a driver.
It's pretty easy to get the current thread
with the current thread kernel function.
If the thread is not suitable,
we go back to the original page fault handler;
otherwise it's time to restore the original one,
execute our previous add-user code,
and give control back to interrupt 14.
So to summarize:
our add-user payload is called.
We first check whether there is a CLIB context in the thread.
If not, we go back to the original
interrupt 14.
Otherwise, if our thread is a driver, so type 3,
we continue; if not, we go back to the original one.
If everything is okay, we first restore
the page fault handler, we add a new supervisor user,
and we go back to the original interrupt 14.
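
The decision logic of that hook can be modelled in a few lines of Python. This is only a simulation of the control flow, assuming a dictionary in place of the IDT and a small `Thread` record in place of the reversed thread structure; the class names, field names, and the idea that a zero `clib_data` means "no CLIB context" are assumptions for the example. The driver type value 3 is the one mentioned in the talk.

```python
from dataclasses import dataclass

PAGE_FAULT = 14   # interrupt vector of the page fault handler
DRIVER_TYPE = 3   # thread type value given in the talk for drivers


@dataclass
class Thread:
    name: str
    clib_data: int    # assumed: zero means the thread has no CLIB context
    thread_type: int


class Machine:
    """Toy model: a one-entry interrupt table and a current thread."""

    def __init__(self, original_handler):
        self.idt = {PAGE_FAULT: original_handler}
        self.current_thread = None

    def page_fault(self, thread):
        self.current_thread = thread
        return self.idt[PAGE_FAULT](self)


def install_hook(machine, payload):
    """Save the original handler, then point the vector at the checking hook."""
    original = machine.idt[PAGE_FAULT]

    def hook(m):
        thread = m.current_thread
        if thread.clib_data == 0 or thread.thread_type != DRIVER_TYPE:
            return original(m)          # unsuitable thread: just chain
        m.idt[PAGE_FAULT] = original    # restore first, so the payload runs once
        payload(thread)
        return original(m)              # then let the real handler do its job

    machine.idt[PAGE_FAULT] = hook
```

Restoring the original handler before running the payload is what keeps it a one-shot hook: any page fault raised later, including one raised while the payload itself runs, goes straight to the original handler.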
So let me try to show you a demo.
So if we check in NetWare,
let me check the user first.
recon, recon, okay.
So if we try to log on as recon, recon,
it should not work.
Okay.
Oh, loading.
Okay.
I hope the previous exploit did not.
Okay, here we go.
So we cannot log on with recon, recon,
so we try the exploit.
Here we go.
So it should be pretty fast.
As I said, 6.5 is quick to generate a page fault,
but 6.0 is a bit slower,
so we have to wait like one or two seconds.
Okay.
But in this case, it should be okay.
And now we have added a new user,
recon with the password recon,
and this user has full supervisor rights
to the eDirectory tree.
So to conclude, full kernel exploitation
on NetWare is not too difficult.
It just takes time to understand everything
about how the kernel works.
In the case of NetWare, it's more reliable
to exploit everything in the kernel
than in userland, especially because of
return addresses, since it's really difficult
to find a stable one.
What needs to be done?
Create a complete framework for exploitation
with a bind stage and a command execution stage.
I'm not planning to do it right now.
I'm not sure it's really useful.
Another fun thing I tried to do:
if you remember, I said that
the kernel itself is loaded in memory by DOS.
So I tried to create a new stage
to actually inject myself into DOS,
which is still in memory somewhere.
It's pretty easy to find it.
I tried to do that and, at the same time,
change the build code, which can be found in memory too,
and then go back to real mode.
So we would have been totally hidden from the kernel.
At first it worked.
I was able to do that and go back to real mode,
but the problem is when you're in real mode,
it's kind of difficult to debug everything,
and I stopped at this point, because
after I went back to real mode in DOS,
with the build, I was never able to
come back into NetWare, actually.
So this was pretty useless.
So that's all, if you have any questions.
Thank you.