From: Daniel Miller <bonsaiviking () gmail com>
Date: Sun, 18 Dec 2022 10:06:24 -0600
Matt,

Thanks for your interest in Npcap! These are very good questions, and we hope to be able to improve Npcap's documentation to answer them soon. In the meantime, here are some answers that may help you:

> A recent survey of our log files from the field indicates that we are
> missing packets. Specifically, converting our log files to .pcapng and
> opening them in Wireshark, we see about 1% of the packets showing the
> [TCP Previous segment not captured] message. Due to the nature of this
> data, this 1% loss is unacceptable to our users. As expected, this loss
> gets more dramatic with additional network traffic. Testing in the lab
> shows that Wireshark v3.6.7 captures the packets from a stress test with
> no apparent packet loss, so I know the problem is on our end.
Wireshark's "TCP Previous segment not captured" message does not necessarily mean that Npcap or your application was unable to capture a packet that otherwise made it to the system, though the direct stress test you mention does make it more likely that is the case. It is also possible that the packets were dropped by some other participant in the data path, such as an upstream router, switch, or another component of the NDIS stack like a firewall. A better measurement is Npcap's own internal stats, which can be obtained with the pcap_stats() function. This will return a struct pcap_stat with the ps_recv member showing the number of packets which have been delivered on the adapter (regardless of whether they are captured by your application, due to BPF filtering or buffer size limitations, etc.), and the ps_drop member showing the number of packets which have been dropped by this capture handle, due usually to buffer size limits but also potentially due to memory allocation failures.
> - We call pcap_open() with a snap length of 65536, promiscuous mode
>   enabled, and a read timeout of 500ms.
The recommended functions to open a capture handle are pcap_create() and pcap_activate(), which allow finer-grained control over capture parameters via a number of pcap_set_*() functions. Modern systems with Receive Segment Coalescing (RSC) can indicate packets larger than the MTU/MSS of the adapter, so if your intent is to capture the entire packet, do not set a snaplen at all; the default is the maximum value. Promiscuous mode may not be supported on all adapters and, for most switched networks, will not necessarily result in more data captured. Review your application's needs to be sure this is appropriate. The read timeout can be tuned based on your application's needs, which may change depending on other changes you make based on this guidance.
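Here is a minimal sketch of the create-set-activate pattern (the specific parameter values mirror the ones you described and are illustrative only, not a recommendation):

    #include <pcap.h>
    #include <stdio.h>

    /* Open a capture handle with pcap_create()/pcap_activate() instead
     * of pcap_open(). Returns NULL on failure. */
    pcap_t *open_capture(const char *device)
    {
        char errbuf[PCAP_ERRBUF_SIZE];
        pcap_t *p = pcap_create(device, errbuf);
        if (p == NULL) {
            fprintf(stderr, "pcap_create: %s\n", errbuf);
            return NULL;
        }
        /* No pcap_set_snaplen() call: leave the default (maximum) so
         * coalesced segments larger than the MTU are not truncated. */
        pcap_set_promisc(p, 0);              /* enable only if you really need it */
        pcap_set_timeout(p, 500);            /* read timeout in ms; tune as needed */
        pcap_set_buffer_size(p, 16 * 1024 * 1024); /* kernel buffer limit, in bytes */
        if (pcap_activate(p) < 0) {
            fprintf(stderr, "pcap_activate: %s\n", pcap_geterr(p));
            pcap_close(p);
            return NULL;
        }
        return p;
    }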
> - We call pcap_setbuff() to increase the size of the kernel buffer to
>   16MB.
pcap_setbuff() is a WinPcap extension that should not be used in new programs. Use pcap_set_buffer_size() instead.
> - We then call pcap_next_ex() inside a for loop to get the next capture.
> - Upon successful return, we allocate a byte array using
>   pkt_header.caplen, copy the pkt_data into the byte array, and add the
>   byte array to a pre-allocated list.
> - We execute this for loop until the pre-allocated list is filled (to
>   avoid reallocation) or a predetermined timeout is exceeded on the
>   application side.
> - When either of these conditions is satisfied, we hand the
>   pre-allocated list off to another thread, allocate a new list, and do
>   the loop again.
>
> Here are my questions.
>
> 1. Is pcap_next_ex() the most efficient way of transferring captures to
>    the application? It looks like pcap_loop() or pcap_dispatch() might
>    allow multiple captures to be returned via a single callback. Is that
>    correct? And if so, would that be the recommended way to get the
>    captures in a high data rate environment?
The advantage of pcap_loop() or pcap_dispatch() is that they handle the looping and offer better control over when to stop processing packets. pcap_dispatch(), in particular, will process packets until it is time to issue another request for packet data to the kernel (the Npcap driver in this case). This can be combined with the Windows Event returned by the pcap_getevent() function, which is signaled when a batch of packets is "ready" for the application to process, as defined by parameters set via pcap_setmintocopy(), pcap_set_timeout(), pcap_set_immediate_mode(), etc. So an application will typically WaitForSingleObject (or another API function for synchronizing on an Event) until the event is signaled, then call pcap_dispatch() to run the callback on all received packets.
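A minimal sketch of that pattern (the callback body, the stop flag, and the 500ms wait are assumptions, not requirements):

    #define WIN32_LEAN_AND_MEAN
    #include <windows.h>
    #include <pcap.h>
    #include <stdio.h>

    /* Called once per packet by pcap_dispatch(). */
    static void packet_handler(u_char *user, const struct pcap_pkthdr *h,
                               const u_char *bytes)
    {
        /* Copy or process h->caplen bytes starting at `bytes` here. */
        (void)user; (void)h; (void)bytes;
    }

    /* Wait for the "packets ready" event, then drain whatever is buffered. */
    void capture_loop(pcap_t *p, volatile int *stop)
    {
        HANDLE ev = pcap_getevent(p);   /* signaled when a batch is ready */
        while (!*stop) {
            DWORD rc = WaitForSingleObject(ev, 500 /* ms */);
            if (rc == WAIT_FAILED)
                break;
            /* cnt of -1 means "process all packets in one buffer". */
            if (pcap_dispatch(p, -1, packet_handler, NULL) < 0) {
                fprintf(stderr, "pcap_dispatch: %s\n", pcap_geterr(p));
                break;
            }
        }
    }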
> 2. Our understanding is that the kernel buffer *IS* the ring buffer that
>    must be read from at least as fast as the data is received in order
>    to minimize/eliminate the occurrence of dropped packets. We
>    understand the size of the buffer won't prevent dropped packets if
>    the application can't keep up (it merely delays the moment when that
>    occurs). But a bigger ring buffer can accommodate data spikes,
>    allowing the application to catch up during data lulls.
This is correct.

> To this end, how big can we make the kernel buffer via pcap_setbuff()?
> Is there a practical or rule-of-thumb limit?
The kernel buffer space is allocated from the NonPagedPool, which is a very precious resource. On my laptop currently running Windows 11 with 4GB of RAM, the NonPagedPool is 768MB. Fortunately, since Npcap 1.00 the "kernel buffer size" is interpreted as a limit, not allocated all at once as it was in WinPcap. This means that setting a ridiculously large buffer size will not immediately crash the system, and as long as you continue to read from it, it will likely never attain the full size. However, it does open up the possibility of running out of resources later, especially if you stop processing packets without closing the handle.

> Is the ring buffer associated with each handle?
The size limit is tracked per-handle, referring to the amount of packet data that particular handle is waiting to retrieve. If multiple handles are waiting on the same data, each one counts the storage towards its own limit, but the data is not actually duplicated, and it is not freed until the last handle retrieves the associated packet.
> If we collect simultaneously with Npcap on multiple NICs, does the size
> of each ring buffer need to be limited in any way?
Each NIC (technically: each Npcap filter module, which is an instance of Npcap in a particular NDIS stack) stores packet data that a handle is waiting to retrieve. Multiple handles on the same NIC can share that data as described above, meaning that the actual amount of NonPagedPool used is most likely less than the sum of their buffer sizes. Handles on multiple NICs will not share data in this way, so it is more likely that they can consume the amount of NonPagedPool equal to the sum of their buffer sizes. Measurement is the best way to determine how much network data your application can process. Setting a small snaplen and putting as much filtering logic into the kernel BPF filter as possible are good ways to reduce the amount of kernel buffer that is needed.
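As a minimal sketch of the "filter in the kernel" suggestion (the filter expression is illustrative only; a reduced snaplen would be set with pcap_set_snaplen() before pcap_activate()):

    #include <pcap.h>
    #include <stdio.h>

    /* Install a kernel BPF filter so only relevant traffic is buffered. */
    int restrict_capture(pcap_t *p)
    {
        struct bpf_program prog;

        /* Example: only TCP traffic on port 5000; adjust to your application. */
        if (pcap_compile(p, &prog, "tcp port 5000", 1, PCAP_NETMASK_UNKNOWN) != 0) {
            fprintf(stderr, "pcap_compile: %s\n", pcap_geterr(p));
            return -1;
        }
        if (pcap_setfilter(p, &prog) != 0) {
            fprintf(stderr, "pcap_setfilter: %s\n", pcap_geterr(p));
            pcap_freecode(&prog);
            return -1;
        }
        pcap_freecode(&prog);
        return 0;
    }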
> Should we use pcap_set_buffer_size() instead of pcap_setbuff()? Can
> pcap_set_buffer_size() be called after pcap_open() like pcap_setbuff()
> can? Or do we need to use the create-set-activate pattern?
Both of these functions achieve the same result, but pcap_set_buffer_size() is preferred because it is standard libpcap API and will work on non-Npcap platforms. Note that standard libpcap only allows pcap_set_buffer_size() on a handle that has not yet been activated, so the create-set-activate pattern is the portable way to use it.
> 3. We are confused about the difference between the kernel buffer and
>    the user buffer. How does the user buffer work with pcap_next_ex()?
>    Since pcap_next_ex() only returns a single packet at a time, does the
>    user buffer even matter? Perhaps it comes into play with pcap_loop()
>    or pcap_dispatch() being able to return more data in the callback?
The "user buffer" is used to transfer packet data between the kernel (Npcap's NDIS filter driver) and userspace (wpcap.dll). Tuning this parameter may help reduce overhead, but other parameters should probably be adjusted first. pcap_next_ex() reads from this buffer until it is empty, then it issues a Read call to retrieve more packets.
> 4. How does pcap_mintocopy() work with pcap_next_ex()? Again, since
>    pcap_next_ex() only returns a single packet at a time, does this even
>    apply? Perhaps with pcap_loop() or pcap_dispatch()?
The same mechanism retrieves packets for all of these functions. The MinToCopy parameter serves to reduce the number of Read calls, each of which requires the user buffer to be mapped and locked. It works best when used with the Windows Event synchronization mechanism along with pcap_dispatch().
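A minimal sketch using the pcap_setmintocopy() extension (the 64KB threshold is an assumption; larger values trade latency for fewer, larger kernel-to-user transfers), intended to pair with the event/dispatch loop shown earlier:

    #include <pcap.h>

    /* Ask the driver not to signal the read event until at least 64KB of
     * packet data is buffered, reducing the number of Read calls. */
    void tune_batching(pcap_t *p)
    {
        pcap_setmintocopy(p, 64 * 1024);
    }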
> 5. Finally, can stats be enabled for the same handle that is capturing
>    the data? If we want to monitor, for example, the number of dropped
>    packets seen during a capture, how do we do that with the pcap_t
>    returned by pcap_open()? If we need two pcap_t handles, one for
>    capture and one for stats, does that imply a single ring buffer under
>    the hood for a given NIC?
"Statistics mode" using pcap_setmode() with MODE_STAT is a WinPcap extension, and we have not done much to alter or fix it. It is mutually exclusive with capture mode, so two handles are required, but it does not buffer packets, only counts them. The preferred way to check statistics is the pcap_stats() function mentioned earlier. This can be used on any open capture handle regardless of mode. Hopefully this response, though a bit late, will be helpful.
_______________________________________________
Sent through the dev mailing list
https://nmap.org/mailman/listinfo/dev
Archived at https://seclists.org/nmap-dev/