qcap Documentation

qcap is designed to read application data out of network traffic. Development has focused on TCP protocols (specifically HTTP, FTP, and SMTP), but mechanisms exist to read other protocols as well.

For any topics that are not discussed here, please post to the project mailing list.

Getting Started

Design and Layout

libqcap is designed to be similar to libpcap. Like libpcap, qcap uses a session object that is associated with each open trace. Also like libpcap, qcap uses callbacks to inform the application of events. To improve performance, qcap lets the application request notifications for specific events only, so that qcap can limit processing to the data the application cares about.

Unlike libpcap, qcap provides a number of types specific to stream-based protocols for handling data. Those are:

A TCP stream. Every stream being tracked has one of these structures. As new segments arrive in the stream, they are added to the stream. A qcap_tcpstr_t can be queried for its contents with qcap_tcpstr_pos_t.
A position within a TCP stream. Positions are used to track the start and end of fields within a stream.
Represents a protocol grammar. Streams can be parsed with grammars. Normally, applications should not need to deal directly with grammars, unless they are dealing with a new protocol.

qcap also provides a family of callbacks that may be installed into the protocol parsing engine, to detect events and modify how parsing occurs:

Called when a new packet arrives. No reordering is performed on the packet.
Called when a new TCP stream is detected. The handler controls how segments are treated: whether they should automatically be stored, which direction(s) of the stream should be monitored, etc.
Called for every packet a TCP stream receives. Unless specified otherwise when the callback is registered, packets are only passed to this handler in order.
Called during protocol parsing to indicate that new data has arrived, also indicating which protocol field the data falls into.

As a rule, any typedef that ends with "handler_t" is a callback.

Idioms in qcap

qcap has a fairly large API. To try and minimize the amount of thinking the programmer must do, there are a number of idioms that are followed fairly consistently throughout the project.

API Return Values

Functions take one of two forms: those that cannot fail are declared as void, while those that can fail are declared as int. Any function declared as int will return QCAP_ERR_OKAY when it succeeds, or some other value if it fails.
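The convention can be sketched as follows. The helpers demo_reset() and demo_open(), the code DEMO_ERR_BADARG, and the numeric value given to QCAP_ERR_OKAY are hypothetical stand-ins so that the sketch compiles without qcap; they are not part of the qcap API.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in error codes. Only the name QCAP_ERR_OKAY comes from this
 * document; its numeric value and DEMO_ERR_BADARG are assumptions. */
enum { QCAP_ERR_OKAY = 0, DEMO_ERR_BADARG = 1 };

/* Hypothetical operation that cannot fail, so it is declared void. */
void demo_reset(int *counter)
{
    *counter = 0;
}

/* Hypothetical operation that can fail, so it is declared int. It returns
 * QCAP_ERR_OKAY on success; on failure it returns another value and writes
 * a description into errbuf. */
int demo_open(const char *path, char *errbuf, size_t errbuf_len)
{
    if (path == NULL || path[0] == '\0') {
        snprintf(errbuf, errbuf_len, "no trace file given");
        return DEMO_ERR_BADARG;
    }
    return QCAP_ERR_OKAY;
}
```

Callers compare the result against QCAP_ERR_OKAY by name rather than testing for zero, since this document does not state the constant's numeric value.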

Function Naming

Functions are of the form qcap_type_op. type is the short form of one of the types described above, and should be considered the type that the function operates upon. op is the action being performed on the type.

Handler Control

Callbacks can disassociate themselves from repeating events with their return value. They do this by returning QCAP_CONTROL_DETACH. Alternatively, they can signal an error with QCAP_CONTROL_ERROR or indicate that they wish to receive future notifications with QCAP_CONTROL_CONTINUE.
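A minimal sketch of how a caller might honour these return values is shown here. The types demo_control_t and demo_handler_t, the functions demo_count_three() and demo_dispatch(), and the numeric values of the control codes are all hypothetical; qcap's own dispatch logic is not described in this document.

```c
#include <stddef.h>

/* Stand-in for the control codes named above; the values are assumptions. */
typedef enum {
    QCAP_CONTROL_CONTINUE,
    QCAP_CONTROL_DETACH,
    QCAP_CONTROL_ERROR
} demo_control_t;

typedef demo_control_t (*demo_handler_t)(int event, void *data);

/* Hypothetical handler: counts events and detaches after the third one. */
demo_control_t demo_count_three(int event, void *data)
{
    int *seen = data;
    (void)event;
    *seen += 1;
    return (*seen >= 3) ? QCAP_CONTROL_DETACH : QCAP_CONTROL_CONTINUE;
}

/* Hypothetical dispatcher: offers events 0..n_events-1 to the handler until
 * it detaches. Returns the number of events delivered, or -1 if the handler
 * reported an error. */
int demo_dispatch(demo_handler_t handler, void *data, int n_events)
{
    int delivered = 0;
    for (int ev = 0; ev < n_events && handler != NULL; ev++) {
        demo_control_t c = handler(ev, data);
        delivered++;
        if (c == QCAP_CONTROL_ERROR)
            return -1;
        if (c == QCAP_CONTROL_DETACH)
            handler = NULL;   /* no further notifications */
    }
    return delivered;
}
```

Calling demo_dispatch(demo_count_three, &seen, 10) with seen initialised to 0 delivers three events and then stops, because the handler detached itself.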

Downloading qcap

qcap is currently only available via SVN. See the SourceForge CVS page for qcap for instructions on how to download the source.

Generating Documentation

qcap uses Doxygen for code documentation. Assuming that you have Doxygen installed, you can generate this documentation by running make doc in lib/src/libqcap/.

Opening Traces

qcap currently only operates in an offline mode. A trace can be opened with

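#include <stdio.h>

/* Buffer that receives the error description. */
char ERRBUF[QCAP_ERRBUF_SIZE];

int main(int argc, char *argv[])
{
    qcap_t *qp = NULL;

    /* On failure, qcap_open_offline() writes a description into ERRBUF. */
    int r = qcap_open_offline(&qp, "trace.pcap", ERRBUF);
    if (r != QCAP_ERR_OKAY) {
        printf("Error: %s\n", ERRBUF);
        return -1;
    }

    // Do stuff

    qcap_close(qp);
    return 0;
}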

If all goes according to plan, qcap_open_offline() returns QCAP_ERR_OKAY, indicating that the file was properly opened, and the qcap_t returned through the first argument has been properly initialized. If any other value is returned, then the last argument will have a string written into it, describing the nature of the error.

After the trace has been read, the resources allocated for it must be released. This is done with a call to qcap_close().

Reading Streams

Following Individual Streams

qcap assumes that applications will not wish to treat all streams equally. The application may wish to track only certain streams, or may wish to handle different classes of stream in different ways. qcap supports this by splitting stream handling between two kinds of handler. When a new stream is detected, instances of qcap_stream_handler_t are called, and are given the ability to attach one or more qcap_segment_handler_ts to the stream. The segment handlers are called for each subsequent packet received in the stream.

The application must take three steps:

First, a stream handler must be registered. It is responsible for telling qcap which streams are interesting to the application, and how qcap should treat them. Any number of stream handlers may be assigned to a qcap instance.

Second, qcap_loop() is called. qcap_loop() reads packets and processes them. If one or more stream handlers have been registered, then qcap performs IP defragmentation and TCP reconstruction. For each new stream discovered, each stream handler is called.

Third, each stream handler may decide to track the newly found stream. It does so by registering a segment handler with the new stream. The segment handler will be called, in order, for every new segment ACK'd into the stream.

The process is somewhat involved. We shall illustrate this behaviour through an example.

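int main(int argc, char *argv[])
{
    qcap_t *qp = NULL;

    int r = qcap_open_offline(&qp, "trace.pcap", NULL);
    if (r != QCAP_ERR_OKAY) {
        return -1;
    }

    /* Register cb_stream() to be called for each new stream.
       ERRBUF is the buffer declared in the previous example. */
    r = qcap_stream_handler_add(qp, cb_stream, NULL, ERRBUF);
    if (r != QCAP_ERR_OKAY) {
        return -1;
    }

    /* Read and process all packets in the trace. */
    qcap_loop(qp, 0, NULL);

    qcap_close(qp);
    return 0;
}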

The code above opens a trace file, assigns the callback cb_stream to the qcap handle, and then tells qcap to read in all packets with qcap_loop(). At this point, cb_stream remains undefined. Let's define it:

qcap_stream_control_t cb_stream(struct qcap_packet *p, struct tcp_stream *a_tcp,
                                int isOriginalSyn, void *data)
{
    /* The TCP header follows the IP header, whose length (ip_hl) is given
       in 32-bit words. */
    struct tcphdr *hdr = (struct tcphdr *)(p->ip_packet +
        4 * ((struct ip *)p->ip_packet)->ip_hl);
    u_short dstPort = ntohs(hdr->th_dport);

    if (dstPort == 80) {
        /* Follow this stream: have cb_seg() called for its segments. */
        qcap_tcphdr_add_handler(p->qcap, a_tcp, cb_seg,
                                QCAP_SEGMENT_OPTS_ALL, NULL, NULL);
    }
    return QCAP_CONTROL_CONTINUE;
}

cb_stream() decides whether a stream is interesting by checking the server port. If the server port is 80, it associates the callback cb_seg() with the stream by calling qcap_tcphdr_add_handler(). From then on, cb_seg() will be triggered for every packet in the stream (including the one cb_stream() was called on).
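The port check relies on finding the TCP header inside the IP packet: the IPv4 header length field counts 32-bit words, so the TCP header starts 4 * ip_hl bytes into the packet. The sketch below shows the same arithmetic on a raw byte buffer, with bounds checks added. The function demo_tcp_dst_port() is hypothetical and is not part of qcap.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the TCP destination port of an IPv4 packet, or -1 if the buffer
 * is too short or does not hold IPv4 carrying TCP. */
int demo_tcp_dst_port(const uint8_t *ip_packet, size_t len)
{
    if (len < 20 || (ip_packet[0] >> 4) != 4)
        return -1;                        /* not a complete IPv4 header */

    /* The low nibble of the first byte is the header length in 32-bit words. */
    size_t ip_hdr_len = 4u * (ip_packet[0] & 0x0F);
    if (ip_hdr_len < 20 || ip_packet[9] != 6)   /* protocol 6 is TCP */
        return -1;
    if (len < ip_hdr_len + 4)
        return -1;                        /* need both TCP port fields */

    const uint8_t *tcp = ip_packet + ip_hdr_len;

    /* The destination port is bytes 2-3 of the TCP header, big-endian. */
    return (tcp[2] << 8) | tcp[3];
}
```

Assembling the port from individual bytes gives the same result as ntohs() on the header field, without depending on the host's byte order or on the system's struct ip and struct tcphdr definitions.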

cb_seg() can do anything at this point. For now, let's tell it to print the sequence numbers of the segments it receives.

qcap_control_t cb_seg(
    struct qcap_packet *p,
    struct tcp_stream *stream,
    struct half_stream *sender,
    qcap_direction_t dir,
    u_char *segment,
    u_int segmentLen,
    qcap_stream_state_t *state,
    void *data
)
{
    struct tcphdr *hdr = (struct tcphdr *)(p->ip_packet +
        4 * ((struct ip *)p->ip_packet)->ip_hl);

    /* Sequence numbers are unsigned 32-bit values. */
    u_int seq = ntohl(hdr->th_seq);

    printf("Got sequence number %x\n", seq);
    return QCAP_CONTROL_CONTINUE;
}

Getting Text Out of Streams

If an application wants to see the TCP payload of a stream, it has two options: it can reconstruct the payload by hand, or it can let qcap do the work. The first option is left as an exercise for the reader. The second option is described here.

In order to query a stream, qcap_tcpstr_pos_t structures must be used. Each structure contains a tuple of the packet identity, and the position within the segment. To query a stream for a substring, you must create two positions: one at the start, and one at the end of the substring. Then you can call qcap_tcpstr_pos_get_between_str() to pull the text out.
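The idea of a position as a (segment, offset) pair, and of extracting the text between two positions, can be illustrated with a small self-contained sketch. The types demo_segment_t and demo_pos_t and the function demo_get_between() are hypothetical stand-ins; they do not reflect the real layout of qcap_tcpstr_pos_t or the signature of the qcap query function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins: a received segment and a position within a stream. */
typedef struct { const char *data; size_t len; } demo_segment_t;
typedef struct { size_t seg; size_t off; } demo_pos_t;

/* Copies the bytes from start (inclusive) to end (exclusive) into out and
 * NUL-terminates them. Returns the number of bytes copied, or -1 if the
 * range is invalid, not yet received, or too large for out. */
long demo_get_between(const demo_segment_t *segs, size_t nsegs,
                      demo_pos_t start, demo_pos_t end,
                      char *out, size_t out_len)
{
    size_t written = 0;

    if (out_len == 0 || start.seg > end.seg ||
        (start.seg == end.seg && start.off > end.off))
        return -1;
    if (end.seg >= nsegs || end.off > segs[end.seg].len)
        return -1;                        /* end has not been received yet */

    for (size_t i = start.seg; i <= end.seg; i++) {
        size_t from = (i == start.seg) ? start.off : 0;
        size_t to   = (i == end.seg) ? end.off : segs[i].len;
        if (from > to)
            return -1;                    /* start lies beyond its segment */
        if (written + (to - from) >= out_len)
            return -1;                    /* no room for data plus NUL */
        memcpy(out + written, segs[i].data + from, to - from);
        written += to - from;
    }
    out[written] = '\0';
    return (long)written;
}
```

With the segments "GET /ind" and "ex.html ", a start position of (0, 4) and an end position of (1, 7) select the text "/index.html", which spans the boundary between the two segments.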

The following sample code will read the first 60 bytes out of a stream. It builds upon the examples shown above, by replacing the stream and segment handlers. The stream handler asks qcap to maintain a qcap_tcpstr_t for this stream, and creates two qcap_tcpstr_pos_t structures to point to the start and end of the portion of the stream that we're interested in. Subsequently, the segment handler checks to see if the qcap_tcpstr_t has accumulated enough data to provide us with our answer. If it has, it prints the first 60 bytes, and disassociates itself from the stream.

qcap_control_t cb_seg(
    struct qcap_packet *p,
    struct tcp_stream *stream,
    struct half_stream *sender,
    qcap_direction_t dir,
    u_char *segment,
    u_int segmentLen,
    qcap_stream_state_t *state,
    void *user
)
{
    struct tcphdr *hdr = (struct tcphdr *)(p->ip_packet +
        4 * ((struct ip *)p->ip_packet)->ip_hl);
    u_int seq = ntohl(hdr->th_seq);

    /* user points to an application-defined struct buffer that holds the
       start and end positions and an output buffer. Its definition is not
       shown here. */
    struct buffer *buf = user;

    printf("Got sequence number %x\n", seq);

    /* Assumption: any result other than QCAP_ERR_OKAY means the requested
       range is not available yet, so keep listening. */
    int r = qcap_tcpstr_pos_get_between(buf->start, buf->end, buf->data);
    if (r != QCAP_ERR_OKAY) {
        return QCAP_CONTROL_CONTINUE;
    }

    /* Assumption: buf->data is a char buffer. Print at most 60 bytes, then
       stop receiving segments for this stream. */
    printf("%.60s\n", buf->data);
    return QCAP_CONTROL_DETACH;
}

Reading Specific Fields from Specific Protocols


Yes, we know it's very easy to confuse "qcap" with "pcap". The HCI issues of the name were not considered at design time. We're sorry.