<sect1 id="npcap-internals">
<title>Npcap internals</title>
<sect1info>
<abstract>
<para>Describes the internal structure and interfaces of Npcap: the NPF
driver and Packet.dll</para>
</abstract>
</sect1info>
<para>This portion of the manual describes the internal structure and
interfaces of Npcap, starting from the lowest-level module. It is targeted
at people who need to extend or modify this software, or at those
interested in how it works. Developers who just want to use
Npcap in their software do not need to read it.</para>
<sect2 id="npcap-structure">
<title>Npcap structure</title>
<para>Npcap is an architecture for packet capture and network analysis for
Win32 platforms. It includes a kernel-level packet filter, a
low-level dynamic link library (packet.dll), and a high-level and
system-independent library (wpcap.dll).</para>
<para>Why do we use the term <wordasword>architecture</wordasword> rather
than <wordasword>library</wordasword>? Because packet capture is a low-level
mechanism that requires strict interaction with the network adapter and
with the operating system, in particular with its networking
implementation; a simple library is not sufficient.</para>
<!-- TODO: update this markup with a diagram
<para>The following figure shows the various components of Npcap:</para>
<p align="center">
<img src="internals-arch.gif" width="280" height="355">
<p align="center"> -->
<sect3>
<title>Main components of Npcap.</title>
<para>First, a capture system needs to bypass the operating system's
protocol stack in order to access the raw data transiting on the
network. This requires a portion running inside the OS kernel,
interacting directly with the network interface drivers. This portion
is very system dependent, and in our solution it is realized as a
device driver called Netgroup Packet Filter (NPF). This driver offers
basic features like packet capture and injection, as well as more
advanced ones like a programmable filtering system and a monitoring
engine. The filtering system can be used to restrict a capture session
to a subset of the network traffic (e.g. it is possible to capture only
the FTP traffic generated by a particular host); the monitoring engine
provides a powerful but simple-to-use mechanism to obtain statistics on
the traffic (e.g. it is possible to obtain the network load or the
amount of data exchanged between two hosts).</para>
<para>Second, the capture system must export an interface that user-level
applications will use to take advantage of the features provided by the
kernel driver. Npcap provides two different libraries:
<filename>packet.dll</filename> and
<filename>wpcap.dll</filename>.</para>
<para>Packet.dll offers a low-level API that can be used to directly
access the functions of the driver, with a programming interface
independent of the particular Microsoft OS version.</para>
<!-- TODO: Should we acknowledge libpcap more significantly here? Wpcap.dll *IS* libpcap, not just compatible. -->
<para>Wpcap.dll exports a more powerful set of high-level capture
primitives that are compatible with libpcap, the well-known Unix
capture library. These functions enable packet capture in a manner that
is independent of the underlying network hardware and operating
system.</para>
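<para>As a concrete illustration of the high-level interface, the following
minimal sketch uses the libpcap-compatible calls exported by wpcap.dll to
enumerate the adapters exposed by the NPF driver, open the first one and
capture a handful of packets. It is an example, not part of Npcap itself;
error handling is reduced to the minimum.</para>
<programlisting><![CDATA[
#include <stdio.h>
#include <pcap.h>

/* Callback invoked by pcap_loop() for every captured packet. */
static void packet_handler(u_char *user, const struct pcap_pkthdr *h,
                           const u_char *bytes)
{
    (void)user; (void)bytes;
    printf("captured %u bytes (%u on the wire)\n", h->caplen, h->len);
}

int main(void)
{
    char errbuf[PCAP_ERRBUF_SIZE];
    pcap_if_t *alldevs;
    pcap_t *handle;

    /* Enumerate the adapters exposed by the NPF driver. */
    if (pcap_findalldevs(&alldevs, errbuf) == -1 || alldevs == NULL) {
        fprintf(stderr, "pcap_findalldevs failed: %s\n", errbuf);
        return 1;
    }

    /* Open the first adapter: 65536-byte snap length, promiscuous mode,
       1000 ms read timeout. */
    handle = pcap_open_live(alldevs->name, 65536, 1, 1000, errbuf);
    if (handle == NULL) {
        fprintf(stderr, "pcap_open_live failed: %s\n", errbuf);
        pcap_freealldevs(alldevs);
        return 1;
    }

    pcap_loop(handle, 10, packet_handler, NULL);   /* capture 10 packets */

    pcap_close(handle);
    pcap_freealldevs(alldevs);
    return 0;
}
]]></programlisting>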
</sect3>
</sect2>
<sect2 id="npcap-internals-driver">
<title>Npcap driver internals</title>
<para>This section documents the internals of the Netgroup Packet Filter
(NPF), the kernel portion of Npcap. Normal users are probably interested
in how to use Npcap and not in its internal structure. Therefore the
information present in this module is destined mainly to Npcap developers
and maintainers, or to the people interested in how the driver works. In
particular, a good knowledge of OSes, networking and Windows kernel
programming and device drivers development is required to profitably read
this section.</para>
<para>NPF is the Npcap component that does the hard work, processing the
packets that transit on the network and exporting capture, injection and
analysis capabilities to user-level.</para>
<para>The following paragraphs will describe the interaction of NPF with
the OS and its basic structure.</para>
<sect3 id="npcap-internals-driver-ndis">
<title>NPF and NDIS</title>
<para>NDIS (Network Driver Interface Specification) is a standard that
defines the communication between a network adapter (or, better, the
driver that manages it) and the protocol drivers (which implement, for
example, TCP/IP). The main purpose of NDIS is to act as a wrapper that
allows protocol drivers to send and receive packets on a network (LAN or
WAN) without caring about either the particular adapter or the particular
Win32 operating system.</para>
<para>NDIS supports four types of network drivers:</para>
<orderedlist>
<listitem>
<para><emphasis>Miniport drivers</emphasis>. Miniport drivers
directly manage network interface cards, referred to as NICs. The
miniport drivers interface directly to the hardware at their lower
edge and at their upper edge present an interface to allow upper
layers to send packets on the network, to handle interrupts, to
reset the NIC, to halt the NIC and to query and set the operational
characteristics of the driver.</para>
<para>Miniport drivers implement only the hardware-specific
operations necessary to manage a NIC, including sending and
receiving data on the NIC. Operations common to all lowest-level
NIC drivers, such as synchronization, are provided by NDIS.
Miniports do not call operating system routines directly; their
interface to the operating system is NDIS.</para>
<para>A miniport does not keep track of bindings. It merely passes
packets up to NDIS and NDIS makes sure that these packets are
passed to the correct protocols.</para>
</listitem>
<listitem>
<para><emphasis>Intermediate drivers</emphasis>. Intermediate drivers
interface between an upper-level driver such as a protocol driver
and a miniport. To the upper-level driver, an intermediate driver
looks like a miniport. To a miniport, the intermediate driver looks
like a protocol driver. An intermediate protocol driver can layer
on top of another intermediate driver although such layering could
have a negative effect on system performance. A typical reason for
developing an intermediate driver is to perform media translation
between an existing legacy protocol driver and a miniport that
manages a NIC for a new media type unknown to the protocol driver.
For instance, an intermediate driver could translate from LAN
protocol to ATM protocol. An intermediate driver cannot communicate
with user-mode applications, but only with other NDIS drivers.</para>
</listitem>
<listitem>
<para><emphasis>Filter drivers</emphasis>. Filter drivers can monitor
and modify traffic between protocol drivers and miniport drivers
like an intermediate driver, but are much simpler. They have less
processing overhead than intermediate drivers.</para>
</listitem>
<listitem>
<para><emphasis>Transport drivers or protocol drivers</emphasis>. A
protocol driver implements a network protocol stack such as IPX/SPX
or TCP/IP, offering its services over one or more network interface
cards. A protocol driver services application-layer clients at its
upper edge and connects to one or more NIC driver(s) or
intermediate NDIS driver(s) at its lower edge.</para>
</listitem>
</orderedlist>
<para>NPF is implemented as a filter driver. In order to provide complete
access to the raw traffic and allow injection of packets, it is
registered as a modifying filter driver in the compression
<literal>FilterClass</literal>.</para>
<para>Notice that the various Windows operating systems have different
versions of NDIS: NPF is NDIS 6.0 compliant, and so requires a Windows
OS that supports NDIS 6.0: Windows Vista or later.</para>
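<para>For illustration only, the fragment below sketches how an NDIS 6.x
lightweight filter driver registers itself with NDIS from its DriverEntry
routine. The characteristics structure and the NdisFRegisterFilterDriver call
are the standard NDIS ones; the handler names, friendly name and service name
are hypothetical placeholders and do not reflect NPF's actual entry points or
registration details.</para>
<programlisting><![CDATA[
#include <ndis.h>

/* Hypothetical handler prototypes; their bodies would live elsewhere. */
FILTER_ATTACH                   ExampleAttach;
FILTER_DETACH                   ExampleDetach;
FILTER_RESTART                  ExampleRestart;
FILTER_PAUSE                    ExamplePause;
FILTER_SEND_NET_BUFFER_LISTS    ExampleSendNetBufferLists;
FILTER_RECEIVE_NET_BUFFER_LISTS ExampleReceiveNetBufferLists;

static NDIS_HANDLE FilterDriverHandle;

NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
{
    NDIS_FILTER_DRIVER_CHARACTERISTICS FChars;
    NDIS_STRING FriendlyName = NDIS_STRING_CONST("Example LWF (placeholder)");
    NDIS_STRING UniqueName   = NDIS_STRING_CONST("{placeholder-GUID}");
    NDIS_STRING ServiceName  = NDIS_STRING_CONST("ExampleLwf");

    UNREFERENCED_PARAMETER(RegistryPath);
    NdisZeroMemory(&FChars, sizeof(FChars));

    FChars.Header.Type      = NDIS_OBJECT_TYPE_FILTER_DRIVER_CHARACTERISTICS;
    FChars.Header.Revision  = NDIS_FILTER_CHARACTERISTICS_REVISION_1;
    FChars.Header.Size      = sizeof(FChars);
    FChars.MajorNdisVersion = 6;
    FChars.MinorNdisVersion = 0;
    FChars.FriendlyName     = FriendlyName;
    FChars.UniqueName       = UniqueName;
    FChars.ServiceName      = ServiceName;

    /* Per-module lifecycle callbacks. */
    FChars.AttachHandler  = ExampleAttach;
    FChars.DetachHandler  = ExampleDetach;
    FChars.RestartHandler = ExampleRestart;
    FChars.PauseHandler   = ExamplePause;

    /* Data-path callbacks: this is where a capture driver sees the traffic. */
    FChars.SendNetBufferListsHandler    = ExampleSendNetBufferLists;
    FChars.ReceiveNetBufferListsHandler = ExampleReceiveNetBufferLists;

    return NdisFRegisterFilterDriver(DriverObject, NULL, &FChars,
                                     &FilterDriverHandle);
}
]]></programlisting>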
<!-- TODO: Update this figure for Npcap
<para>Next figure shows the position of NPF inside the NDIS stack:</para>
<p align="center"><img border="0" src="npf-ndis.gif"></para>
<p align="center"><b>Figure 1: NPF inside NDIS.</b></para>
-->
<!-- TODO: Verify that this documentation is still correct for NDIS 6.0 LWF
<para>The interaction with the OS is normally asynchronous. This means
that the driver provides a set of callback functions that are invoked
by the system when some operation is required to NPF. NPF exports
callback functions for all the I/O operations of the applications:
open, close, read, write, ioctl, etc.</para>
<para>The interaction with NDIS is asynchronous as well: events like the
arrival of a new packet are notified to NPF through a callback function
(Packet_tap() in this case). Furthermore, the interaction with NDIS and
the NIC driver takes always place by means of non blocking functions:
when NPF invokes a NDIS function, the call returns immediately; when
the processing ends, NDIS invokes a specific NPF callback to inform
that the function has finished. The driver exports a callback for any
low-level operation, like sending packets, setting or requesting
parameters on the NIC, etc.</para>
-->
</sect3>
<sect3 id="npcap-internals-structure">
<title>NPF structure basics</title>
<!-- TODO: Update this figure
<para>Next figure shows the structure of Npcap, with particular reference
to the NPF driver.</para>
<p align="center"><img border="0" src="npf-npf.gif" width="500" height="412"></para>
<p align="center"><b>Figure 2: NPF device driver.</b>
-->
<para>NPF is able to perform a number of different operations: capture,
monitoring, dump to disk, packet injection. The following paragraphs
will briefly describe each of these operations.</para>
<sect4 id="npcap-internals-capture">
<title>Packet Capture</title>
<para>The most important operation of NPF is packet capture. During a
capture, the driver sniffs the packets using a network interface and
delivers them intact to the user-level applications.</para>
<para>The capture process relies on two main components:</para>
<itemizedlist>
<listitem><para>A packet filter that decides if an incoming packet
has to be accepted and copied to the listening application. Most
applications using NPF reject far more packets than they
accept, therefore a versatile and efficient packet filter is
critical for good overall performance. A packet filter is a
function with boolean output that is applied to a packet. If the
value of the function is true the capture driver copies the
packet to the application; if it is false the packet is
discarded. The NPF packet filter is a bit more complex, because it
determines not only whether the packet should be kept, but also the
number of bytes to keep. The filtering system adopted by NPF
derives from the <emphasis>BSD Packet Filter</emphasis> (BPF), a
virtual processor able to execute filtering programs expressed in
a pseudo-assembler and created at user level. The application
takes a user-defined filter (e.g. <quote>pick up all UDP
packets</quote>) and, using wpcap.dll, compiles it into a BPF
program (e.g. <quote>if the packet is IP and the
<literal>protocol type</literal> field is equal to 17, then
return true</quote>). Then, the application uses the
<literal>BIOCSETF</literal> IOCTL to inject the filter into the
kernel; a sketch of this step, as seen from user level, follows
this list. At this point, the program is executed for every incoming
packet, and only the conformant packets are accepted. Unlike
traditional solutions, NPF does not
<emphasis>interpret</emphasis> the filters, but
<emphasis>executes</emphasis> them. For performance reasons,
before using the filter NPF feeds it to a JIT compiler that
translates it into a native 80x86 function. When a packet is
captured, NPF calls this native function instead of invoking the
filter interpreter, and this makes the process very fast. The
concept behind this optimization is very similar to that of
Java JIT compilers.</para>
</listitem>
<listitem>
<para>A circular buffer to store the packets and avoid loss. A
packet is stored in the buffer with a header that maintains
information like the timestamp and the size of the packet.
Moreover, alignment padding is inserted between the packets in
order to speed up access to their data by the applications.
Groups of packets can be copied with a single operation from the
NPF buffer to the applications. This improves performance
because it minimizes the number of reads. If the buffer is full
when a new packet arrives, the packet is discarded and hence
lost. Both the kernel and user buffers can be resized at runtime for
maximum versatility: packet.dll and wpcap.dll provide functions
for this purpose.</para>
</listitem>
</itemizedlist>
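<para>As announced above, the filter injection step is normally hidden behind
wpcap.dll: the application passes a textual filter expression to
pcap_compile(), which produces the BPF program, and pcap_setfilter(), which
hands it to the driver (issuing the <literal>BIOCSETF</literal> IOCTL under
the hood). The following fragment is a minimal sketch of that sequence; the
function name and the assumption that the handle was opened with
pcap_open_live() are illustrative.</para>
<programlisting><![CDATA[
#include <stdio.h>
#include <pcap.h>

/* Restrict an already-open capture handle to UDP traffic only. */
int apply_udp_filter(pcap_t *handle)
{
    struct bpf_program fp;

    /* Compile the user-level expression into a BPF program. */
    if (pcap_compile(handle, &fp, "udp", 1 /* optimize */,
                     PCAP_NETMASK_UNKNOWN) == -1) {
        fprintf(stderr, "pcap_compile: %s\n", pcap_geterr(handle));
        return -1;
    }

    /* Hand the BPF program to the NPF driver. */
    if (pcap_setfilter(handle, &fp) == -1) {
        fprintf(stderr, "pcap_setfilter: %s\n", pcap_geterr(handle));
        pcap_freecode(&fp);
        return -1;
    }

    pcap_freecode(&fp);
    return 0;
}
]]></programlisting>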
<para>The size of the user buffer is very important because it determines
the <emphasis>maximum</emphasis> amount of data that can be copied from
kernel space to user space within a single system call. At the same time,
the <emphasis>minimum</emphasis> amount of data that can be copied in a
single call is also extremely important. With a large value for this
variable, the kernel waits for the arrival of several packets before
copying the data to the user. This guarantees a low number of system
calls, i.e. low processor usage, which is a good setting for applications
like sniffers. On the other hand, a small value means that the kernel will
copy the packets as soon as the application is ready to receive them. This
is excellent for real-time applications (like, for example, ARP
redirectors or bridges) that need the best possible responsiveness from
the kernel. From this point of view, NPF has configurable behavior that
allows users to choose between best efficiency and best responsiveness (or
any intermediate setting).</para>
<para>The wpcap library includes a couple of functions that can be
used to set both the timeout after which a read expires and the minimum
amount of data that can be transferred to the application. By default,
the read timeout is 1 second, and the minimum amount of data copied
between the kernel and the application is 16K.</para>
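<para>As a sketch of how these knobs are exposed at user level: the read
timeout is the fourth argument of pcap_open_live(), while the minimum copy
amount can be lowered with the pcap_setmintocopy() extension available on
Windows. The helper function below is illustrative and tuned for
responsiveness rather than efficiency.</para>
<programlisting><![CDATA[
#include <pcap.h>

/* Open an adapter favoring responsiveness over efficiency:
   short read timeout and a tiny kernel-to-user copy threshold. */
pcap_t *open_responsive(const char *device, char *errbuf)
{
    /* 100 ms read timeout instead of the default 1 second. */
    pcap_t *handle = pcap_open_live(device, 65536, 1, 100, errbuf);
    if (handle == NULL)
        return NULL;

#ifdef _WIN32
    /* WinPcap/Npcap extension: deliver packets as soon as at least one
       byte is available instead of waiting for the default 16K. */
    pcap_setmintocopy(handle, 1);
#endif
    return handle;
}
]]></programlisting>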
</sect4>
<sect4 id="npcap-internals-injection">
<title>Packet injection</title>
<para>NPF allows applications to write raw packets to the network. To send
data, a user-level application performs a WriteFile() system call on the NPF
device file. The data is sent to the network as is, without
encapsulating it in any protocol, therefore the application has
to build the various headers for each packet. The application usually
does not need to generate the FCS because it is calculated by the
network adapter hardware and attached automatically at the end of
the packet before it is sent to the network.</para>
<para>In normal situations, the sending rate of packets to the
network is not very high because a system call is needed for each
packet. For this reason, the ability to send a single packet more
than once with a single write system call has been added. The
user-level application can set, with an IOCTL call
(<literal>BIOCSWRITEREP</literal>), the number of times a single packet
will be repeated: for example, if this value is set to 1000, every raw
packet written by the application to the driver's device file will be
sent 1000 times. This feature can be used to generate high-speed
traffic for testing purposes: the overhead of context switches is no
longer present, so performance is remarkably better.</para>
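<para>From user level, the simplest way to exercise this path is
pcap_sendpacket(), which passes the supplied bytes to the driver unchanged.
The fragment below is a sketch that builds a dummy broadcast Ethernet frame;
the addresses and payload are arbitrary placeholder values.</para>
<programlisting><![CDATA[
#include <string.h>
#include <pcap.h>

/* Inject a single hand-built Ethernet frame. The driver sends the bytes
   exactly as supplied; no headers are added, and the FCS is normally
   appended by the NIC hardware. */
int send_raw_frame(pcap_t *handle)
{
    u_char frame[64];

    memset(frame, 0, sizeof(frame));
    memset(frame, 0xff, 6);               /* destination: broadcast       */
    memset(frame + 6, 0x02, 6);           /* source: locally administered */
    frame[12] = 0x08;                     /* EtherType 0x0800 (IPv4)      */
    frame[13] = 0x00;                     /* payload left as zeros        */

    return pcap_sendpacket(handle, frame, sizeof(frame));
}
]]></programlisting>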
</sect4>
<sect4 id="npcap-internals-monitoring">
<title>Network monitoring</title>
<para>Npcap offers a kernel-level programmable monitoring module, able to
calculate simple statistics on the network traffic. Statistics can be
gathered without the need to copy the packets to the application, which
simply receives and displays the results obtained from the monitoring
engine. This avoids a great part of the capture overhead in
terms of memory and CPU cycles.</para>
<para>The monitoring engine is made of a <emphasis>classifier</emphasis>
followed by a <emphasis>counter</emphasis>. The packets are classified
using the filtering engine of NPF, which provides a configurable way to
select a subset of the traffic. The data that pass the filter go to the
counter, which keeps variables like the number of packets and the
amount of bytes accepted by the filter and updates them with the data
of the incoming packets. These variables are passed to the user-level
application at regular intervals whose period can be configured by the
user. No buffers are allocated at either kernel or user level.</para>
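<para>At user level this engine is reachable through the statistics mode of
wpcap.dll: after switching the handle with pcap_setmode(), each read returns
the counters for the last interval instead of packet data. The sketch below
assumes an already-open handle and illustrates the commonly documented layout
of the returned sample (two 64-bit counters: accepted packets, then accepted
bytes); treat the details as an assumption to verify against the Npcap API
documentation.</para>
<programlisting><![CDATA[
#include <stdio.h>
#include <pcap.h>

/* Print per-interval traffic counters using kernel-level statistics mode. */
int run_stats_mode(pcap_t *handle)
{
    struct pcap_pkthdr *header;
    const u_char *data;
    int res;

#ifdef _WIN32
    /* Switch the handle from capture mode to statistical mode. */
    if (pcap_setmode(handle, MODE_STAT) < 0)
        return -1;
#endif

    /* Each sample carries two 64-bit counters for the last interval. */
    while ((res = pcap_next_ex(handle, &header, &data)) >= 0) {
        if (res == 0)
            continue;                       /* timeout, no sample yet */
        long long pkts  = *(const long long *)(data);
        long long bytes = *(const long long *)(data + 8);
        printf("%lld packets, %lld bytes in the last interval\n", pkts, bytes);
    }
    return 0;
}
]]></programlisting>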
</sect4>
<!-- This functionality does not work in Npcap and did not work in the latest WinPcap either.
<sect4 id="npcap-internals-dump">
<title>Dump to disk</title>
<para>The dump to disk capability can be used to save the network data to
disk directly from kernel mode.</para>
--><!-- TODO: update this figure
<p align="center"><img border="0" src="npf-dump.gif" width="400" height="187">
</para>
<p align="center"><b>Figure 3: packet capture versus kernel-level dump.</b>
</para>
--><!-- kernel dump doesn't work
<para>In traditional systems, every packet is copied several times, and
normally 4 buffers are allocated: the one of the capture driver, the
one in the application that keeps the captured data, the one of the
stdio functions (or similar) that are used by the application to write
on file, and finally the one of the file system.</para>
<para>When the kernel-level traffic logging feature of NPF is enabled,
the capture driver addresses the file system directly. Only two
buffers and a single copy are necessary, the number of system call is
drastically reduced, therefore the performance is considerably
better.</para>
<para>Current implementation dumps the to disk in the widely used libpcap
format. It gives also the possibility to filter the traffic before the
dump process in order to select the packet that will go to the disk.</para>
</sect4>
-->
</sect3>
</sect2>
<sect2 id="npcap-internals-references">
<title>Further reading</title>
<para>The structure of NPF and its filtering engine derive directly from
that of the BSD Packet Filter (BPF), so if you are interested in the
subject you can read the following papers:</para>
<itemizedlist>
<listitem><para>S. McCanne and V. Jacobson, <ulink
url="ftp://ftp.ee.lbl.gov/papers/bpf-usenix93.ps.Z">The BSD Packet
Filter: A New Architecture for User-level Packet Capture</ulink>.
Proceedings of the 1993 Winter USENIX Technical Conference (San
Diego, CA, Jan. 1993), USENIX.</para>
</listitem>
<listitem><para>A. Begel, S. McCanne, and S. L. Graham, <ulink
url="http://www.acm.org/pubs/articles/proceedings/comm/316188/p123-begel/p123-begel.pdf">BPF+:
Exploiting Global Data-flow Optimization in a Generalized Packet Filter
Architecture</ulink>. Proceedings of ACM SIGCOMM '99, pages 123-134,
Conference on Applications, technologies, architectures, and
protocols for computer communications, August 30 - September 3, 1999,
Cambridge, USA.</para>
</listitem>
</itemizedlist>
</sect2>
</sect1>