
Wednesday 27 July 2016

Kernel Tracing Qmax on Solaris – Part 1



Time to poke around the kernel and do a simple bit of reversing.

Whether you are a sysadmin, penetration tester, or reverse engineer, if you don't already know about Solaris DTrace you will want to. It allows low-overhead, dynamic instrumentation of a running system, including function boundary tracing (FBT) of pretty much any function in the kernel and in any application running on the server.
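For a quick taste, you can list the TCP-related kernel entry probes available to FBT with a one-liner (the exact set will vary by release):

# dtrace -ln 'fbt:ip:tcp_*:entry'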

People have been able to solve complex problems in applications on other platforms simply because the application also ran on Solaris and they could use DTrace on that port of the application.

You can inspect kernel and application memory in response to events, and much more. You can even get the kernel to lie to userland processes (with filtering on which processes it lies to).

But we are getting ahead of ourselves.

As an example (I'll cover only a very small part of it here), I had an application with multiple serious issues that even the vendor was struggling to solve.

The most powerful tool I wrote to identify the problem areas was a single DTrace script. It traced each individual request from the arrival of the initial connection and the three-way handshake using kernel FBT (including when Q = Qmax), through the system call (syscall tracing), on to the listening thread within the Java app, then the handover to the worker thread (userland FBT), all while keeping an eye on Java safepoints, and finally told me what was blocking the threads (filesystem issues; sometimes closing a file descriptor took several seconds). All of these stages had issues, and sometimes the Java safepoints aggravated the situation, but the picture the script painted was invaluable: we could prove the cause before fixing it, rather than just point at a probable cause, and we could do this directly on the live system without impacting the application.

To start, we will look at monitoring the incoming connection (the first SYN packet), and whilst we are inside the listener perimeter (mutual exclusion of the listener, in interrupt context) we will report on the values of q0/q/qmax. In other words, we know if a connection is ignored because we can prove q == qmax. Using ndd only gives you a point-in-time view; here we are provably showing what the values are at the moment that specific connection is evaluated within the kernel.

It helps to have the Solaris source code (or a version of it) – see https://github.com/illumos/illumos-gate.

Our first DTrace scripts

In the source file usr/src/stand/lib/tcp/tcp.c we have tcp_conn_request(). Its first parameter is a tcp_t, which is defined in /usr/include/inet/tcp.h. As reverse engineers we could figure this out without the source code, but I will leave that as an exercise.

The tcp_t has tcp_conn_req_cnt_q0, tcp_conn_req_cnt_q, and tcp_conn_req_max for Q0, Q and Qmax respectively.

Within this structure we also have a pointer to a struct conn_s (tcp_connp), which is defined in /usr/include/inet/ipclassifier.h. Within conn_s we have a union of structs, such that the local port is at u_port.tcpu_ports.tcpu_lport.
So, the script would be:

#!/usr/sbin/dtrace -Cs

#pragma D option quiet

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/ip.h>

fbt:ip:tcp_conn_request:entry
{
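        /* Assumption: arg0 is the listener's tcp_t (we test this below). */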
        self->tcpq = (tcp_t*)arg0;

        printf("tcpq: %p\n", self->tcpq );

        printf("q0/q/qmax: %d/%d/%d\n",
                self->tcpq->tcp_conn_req_cnt_q0,
                self->tcpq->tcp_conn_req_cnt_q,
                self->tcpq->tcp_conn_req_max
        );
        printf("conn_lport: %d\n", (uint32_t)self->tcpq->tcp_connp->u_port.tcpu_ports.tcpu_lport );
}

If we run the script and then create a new connection to 22/tcp (e.g. using nc or telnet to that port) we see this:

# ./tcpq_ex1.d
dtrace: error on enabled probe ID 1 (ID 29873: fbt:ip:tcp_conn_request:entry): invalid alignment (0x603184280103) in action #5 at DIF offset 32

So, what is wrong? First, let's comment out all the printfs except the one that prints the address of arg0, and re-try:
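The cut-down script (tcpq_ex2.d) would then be something like:

#!/usr/sbin/dtrace -Cs

#pragma D option quiet

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/ip.h>

fbt:ip:tcp_conn_request:entry
{
        self->tcpq = (tcp_t*)arg0;

        printf("tcpq: %p\n", self->tcpq );
}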

root@sol10-u9-t4# ./tcpq_ex2.d
tcpq: ffffffff879e9b00

If we then look for the address using ndd we see that this doesn't look right:

root@sol10-u9-t4# ndd /dev/tcp tcp_listen_hash | egrep '(TCP|00022)'
    TCP            zone IP addr         port  seqnum   backlog (q0/q/max)
022 ffffffff879e9d00 0 :: 00022 00000081 0/0/8

After a bit of diagnosis it appears that we may have made a false assumption. After all, we are basing the functionality on an open-source version of Solaris; a major version of Solaris sees many significant changes between updates, and I'm running update 9 (on Intel).

root@sol10-u9-t4# head -1 /etc/release
                    Oracle Solaris 10 9/10 s10x_u9wos_14a X86

If we look at the open source code we see that all versions have the following as the first test. It is therefore reasonable to expect the closed source version to perform that test first (or close to first).

if (tcp->tcp_conn_req_cnt_q >= tcp->tcp_conn_req_max) {

So, let's run up a kernel debugger and have a look at the assembly of tcp_conn_request(). Here is the truncated and annotated output.

root@sol10-u9-t4# mdb -k
Loading modules: ..
> ::dis tcp_conn_request
...
tcp_conn_request+0xc:           movq   %rdi,%r15           ; arg0 into r15
tcp_conn_request+0xf:           movq   %r12,-0x20(%rbp)
tcp_conn_request+0x13:          movq   0x28(%r15),%r12     ; arg0 offset 0x28 into r12
tcp_conn_request+0x39:          cmpl   $-0x3,0x20(%r12)     ; some other test
tcp_conn_request+0x3f:          je     +0x21    <tcp_conn_request+0x60>
tcp_conn_request+0x41:          call   +0x7e70c2f       <freemsg>
tcp_conn_request+0x74:          movl   0x1d0(%r12),%edx
tcp_conn_request+0x7c:          cmpl   %edx,0x1cc(%r12)     ; cmp q and qmax?
tcp_conn_request+0x84:          jge    +0x8fd   <tcp_conn_request+0x981>

It appears that arg0 may not be a tcp_t after all, but if we look at offset 0x28 from arg0 we find a pointer to one. There may or may not be a public structure that arg0 maps to; I will leave that as an exercise for the reader.
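As an aside, one way to sanity-check this reading of the disassembly is to ask mdb for the member offsets recorded in the kernel's own type data and compare them with the 0x1cc/0x1d0 used in the compare, with something like:

> ::offsetof tcp_t tcp_conn_req_cnt_q
> ::offsetof tcp_t tcp_conn_req_max

If those match, the tcp_t layout is as DTrace believes, and the problem is purely which argument points at it.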

Let's rewrite the setting of self->tcpq to the following and try again.

self->tcpq = *(tcp_t**)((char*)arg0+0x28);

This time it looks better:

root@sol10-u9-t4# ./tcpq_ex3.d
tcpq: ffffffff879e9d00
q0/q/qmax: 0/0/8
conn_lport: 0

So, Qmax etc. look good, but conn_lport doesn't.

We know from ndd that the tcp_t address is right, and from tcp_lookup_listener_ipv4() that we are looking in the right place for the local port. So it may be that the type metadata DTrace is using doesn't match the actual structure (something is different). Let's go back to the kernel debugger and see if we can find something that could be a “port 22” in the conn_s structure.

> ffffffff879e9d00::print -t tcp_t tcp_connp
struct conn_s *tcp_connp = 0xffffffff879e9b00
> 0xffffffff879e9b00/300B
0xffffffff879e9b00:             0       0       0       0       0       0       0       0       3       0       0       0      
                0       0       0       0       0       0       0       0       0       0       0       0       9       0      
                0       0       0       0       0       0       0       0       0       0       0       0       0       0      
                0       0       0       16      6       0       0       0       0       0       0       0       0       0

This is at offset 266 (0x10a), so let's change the printf to the following and re-run.

printf("conn_lport: %d\n", *(uint16_t*)((char*)self->tcpq->tcp_connp+0x10a) );

This time we get a value, but it doesn't look right:

root@sol10-u9-t4# ./tcpq_ex4.d
tcpq: ffffffff879e9d00
q0/q/qmax: 0/0/8
conn_lport: 5632

As we are on Intel and we are printing a network port, there is a good chance this is in network byte order: 22 is 0x0016, which byte-swapped is 0x1600, i.e. 5632. Simple to test; we pass it through htons().

printf("conn_lport: %d\n", htons(*(uint16_t*)((char*)self->tcpq->tcp_connp+0x10a)) );

This time all the values look right:

root@sol10-u9-t4# ./tcpq_ex5.d
tcpq: ffffffff879e9d00
q0/q/qmax: 0/0/8
conn_lport: 22
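For reference, pulling the pieces together, the full working script (tcpq_ex5.d) now looks like this:

#!/usr/sbin/dtrace -Cs

#pragma D option quiet

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/ip.h>

fbt:ip:tcp_conn_request:entry
{
        /* On this build, arg0 is not the tcp_t itself; the tcp_t pointer
           lives at offset 0x28 from arg0 (found via the disassembly). */
        self->tcpq = *(tcp_t**)((char*)arg0 + 0x28);

        printf("tcpq: %p\n", self->tcpq );

        printf("q0/q/qmax: %d/%d/%d\n",
                self->tcpq->tcp_conn_req_cnt_q0,
                self->tcpq->tcp_conn_req_cnt_q,
                self->tcpq->tcp_conn_req_max
        );

        /* conn_lport is at offset 0x10a into conn_s on this build, stored
           in network byte order (found via mdb). */
        printf("conn_lport: %d\n",
                htons(*(uint16_t*)((char*)self->tcpq->tcp_connp + 0x10a)) );
}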

As this is event-driven and we haven't added any predicates, we can just try another port to confirm things look OK, e.g. 111/tcp:

tcpq: ffffffff800d8240
q0/q/qmax: 0/0/64
conn_lport: 111

And a quick check with ndd confirms this:

root@sol10-u9-t4# ndd /dev/tcp tcp_listen_hash | egrep '(TCP|00111)'
    TCP            zone IP addr         port  seqnum   backlog (q0/q/max)
367 ffffffff800d8240 0 ::ffff:0.0.0.0 00111 00000010 0/0/64

A quick note on why q0 is not at least one: we are evaluating the counters on entry to the function that is about to update them.
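If you did want to see the counters after the update, a small (untested) addition would reuse the pointer we cached at entry and print on return:

fbt:ip:tcp_conn_request:return
/self->tcpq/
{
        printf("q0/q after: %d/%d\n",
                self->tcpq->tcp_conn_req_cnt_q0,
                self->tcpq->tcp_conn_req_cnt_q);
        self->tcpq = 0;         /* release the thread-local variable */
}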

If we take a step back and think about what we have done: we are now dynamically tracing incoming SYN packets, triggered by a network interrupt, whilst within the mutual exclusion of the listener perimeter in interrupt context, on a running (live) system. We are therefore certain of the values of q0/q/qmax at that point in time. How awesome is that?

In the next part we will update the script so we also report on the origin of the packet. We will also add some predicates to only look at a particular listener.
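As a small preview, a predicate restricting the trace to the 22/tcp listener could simply reuse the port expression we derived above (remember the offsets are specific to this build):

fbt:ip:tcp_conn_request:entry
/htons(*(uint16_t*)((char*)(*(tcp_t**)((char*)arg0 + 0x28))->tcp_connp + 0x10a)) == 22/
{
        printf("SYN arrived for 22/tcp\n");
}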

I would appreciate some feedback on my articles, at least let me know you are reading them. Ideally, let me know where I could improve the Blog.

Wednesday 20 July 2016

Samsung Self Signed Certs



With this blog entry I thought I would ask a question.

As you would expect, someone with my background has a home network that is somewhat different to that of the average home user. Part of that is the use of a web proxy server, so all devices need to go via it to access the outside world over HTTP-type protocols, including devices such as phones.

This can be very interesting: learning what devices actually want to talk to, which obey your proxy settings, and which attempt to go direct.

Take a Samsung Android phone, for example. I see invalid requests which show up as the following on the web proxy (they aren't valid proxy requests if you look at a packet capture):

NONE/400 4130 NONE error:invalid-request - HIER_NONE/- text/html

We also see the usual 'big data' connections you now get with large corporates, such as:

TCP_DENIED/403 3713 CONNECT ew1.reg.bigdata.ssp.samsung.com:80 - HIER_NONE/- text/html

What sparked my interest here is that it is an SSL connection over port 80. A quick nc and s_client connection proved it is SSL, and then showed that the certificate is self-signed.
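The checks were along these lines (exact flags may have differed):

$ nc -v ew1.reg.bigdata.ssp.samsung.com 80
$ openssl s_client -connect ew1.reg.bigdata.ssp.samsung.com:80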

CONNECTED(00000003)
depth=0 C = KR, ST = Gyeonggi-do, L = Suwon, O = "SAMSUNG ELECTRONICS CO., LTD", OU = "Platform dev team., MSC", CN = *.bigdata.ssp.samsung.com, emailAddress = bigjoe@samsung.com
verify error:num=18:self signed certificate
verify return:1
depth=0 C = KR, ST = Gyeonggi-do, L = Suwon, O = "SAMSUNG ELECTRONICS CO., LTD", OU = "Platform dev team., MSC", CN = *.bigdata.ssp.samsung.com, emailAddress = bigjoe@samsung.com
verify return:1

Now, in and of itself, self-signed may not be a problem. If the app already knows what to expect via other channels, it just means that we don't know about the trust relationship.

What is interesting is the depth: there is no certificate chain, and it uses a wildcard common name.

If we then look at the cert itself, things get even more intriguing.

$ openssl x509 -text -noout -in samsung-self-sign.crt
Certificate:
    Data:
...
    Signature Algorithm: sha1WithRSAEncryption
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (1024 bit)
            X509v3 Basic Constraints:
                CA:TRUE

Yes: the certificate on the remote server is a CA certificate with a weak (1024-bit) key.

Before going down the rabbit hole, I'm sure others have looked at this; not just the specific questions around Samsung Android devices and what they connect to, but the general question of production servers (i.e. those used by paying customers and/or holding live data) using such suspect setups. I certainly recall the issues with Samsung TVs not encrypting customer data.

Thoughts?