Every touchpad gesture passes through at least four distinct layers before it triggers an action on screen — and each layer can change what that gesture means. Getting the layers wrong is why some pads feel sluggish, why gestures vanish after a driver update, and why a web app can detect a scroll but not a pinch. Understanding the pipeline demystifies all of it.
Layer 1: the sensing hardware and firmware
A touchpad surface is covered in a grid of capacitive electrodes. When a finger touches or comes close to the surface, it disturbs the electric field between rows and columns of that grid, and the touchpad's controller chip measures those disturbances many times per second. The Windows Precision Touchpad specification calls for at least 100 reports per second, rising to 125 Hz for a single contact, which works out to one report every 8 to 10 ms. The chip converts the raw capacitance readings into a list of contact objects: each one gets a coordinate pair (x, y), an estimated contact area, and a contact ID that persists as long as that finger stays on the surface.
This firmware stage is entirely analog-to-digital translation. The chip does not know what a "swipe" or a "tap" is. It knows only that contact ID 2 is currently at (381, 204) and was at (376, 204) one scan ago. That raw stream is what travels up the USB or I2C bus to the operating system.
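To make the "contact ID 2 moved from (376, 204) to (381, 204)" example concrete, here is a minimal sketch of how consecutive scans could be compared by contact ID. The `Contact` shape and the `contactDeltas` helper are illustrative assumptions, not the layout of any real firmware report.

```typescript
// Hypothetical shape of one contact in a single firmware scan.
interface Contact {
  id: number;   // persists while the finger stays on the surface
  x: number;
  y: number;
  area: number; // estimated contact area, arbitrary units
}

interface ContactDelta {
  id: number;
  dx: number;
  dy: number;
}

// Match contacts across two consecutive scans by ID and report how far each moved.
// Contacts present in only one scan (a finger that just landed or lifted) are skipped.
function contactDeltas(previous: Contact[], current: Contact[]): ContactDelta[] {
  const previousById: Record<number, Contact | undefined> = {};
  for (const contact of previous) {
    previousById[contact.id] = contact;
  }

  const deltas: ContactDelta[] = [];
  for (const contact of current) {
    const before = previousById[contact.id];
    if (before !== undefined) {
      deltas.push({ id: contact.id, dx: contact.x - before.x, dy: contact.y - before.y });
    }
  }
  return deltas;
}

// The example from the text: contact 2 was at (376, 204) one scan ago and is now at (381, 204).
const exampleDeltas = contactDeltas(
  [{ id: 2, x: 376, y: 204, area: 40 }],
  [{ id: 2, x: 381, y: 204, area: 40 }]
);
// exampleDeltas is [{ id: 2, dx: 5, dy: 0 }]
```

Nothing in this stage says "swipe" or "tap"; the output is still only positions and movement per contact.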

Layer 2: the HID report and the OS gesture engine
On a Windows Precision Touchpad (PTP), the firmware packages each scan's contact list into a standardized HID (Human Interface Device) report and sends it to Windows. Windows receives the raw contacts — not interpreted events — and its own gesture engine (part of the input stack in hidclass.sys and the precision touchpad driver) does all the recognition work. This is the architectural split that makes Precision Touchpads behave consistently across brands: the interpretation lives in the OS, not in vendor firmware.
On a standard (legacy) touchpad, that split does not exist. The vendor driver — Synaptics, Elan, or ALPS — processes the raw capacitance data itself and hands Windows only finished mouse-style events: cursor-moved, button-down, scroll-delta. Windows never sees the individual contacts.
The OS gesture engine applies its classification rules to the raw contacts it receives:
- Tap detection: a contact that appears, moves less than a threshold distance, and disappears quickly is classified as a tap. The maximum movement before a contact is disqualified from being a tap or a long-press is a small distance in physical units; the OS maps this to screen pixels based on the pad's reported dimensions.
- Two-finger scroll: two contacts moving in the same direction at comparable speed produce scroll delta events, not pointer movement. The OS forwards these as wheel events to whichever window has focus.
- Three- and four-finger gestures: routed directly to Windows shell actions (Task View, virtual desktop switching) or intercepted by apps that register for them via the Windows Gesture API.
- Palm rejection: contacts that arrive from the edge zones of the pad, or that coincide with keyboard activity, are suppressed before any gesture logic even runs (covered separately under palm rejection).
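The tap and two-finger scroll rules above can be sketched as two small predicates. Every constant here is a placeholder chosen for illustration; the actual values Windows uses are not given in this article, and the names are hypothetical.

```typescript
// Placeholder thresholds, NOT the values Windows uses.
const TAP_MAX_TRAVEL_MM = 2;       // movement allowed before a contact stops being a tap
const TAP_MAX_DURATION_MS = 200;   // how quickly the contact must lift
const MIN_DIRECTION_COSINE = 0.9;  // how closely two contacts must agree on direction
const MIN_SPEED_RATIO = 0.5;       // slower contact must move at least half as fast as the faster one

interface ContactTrack {
  downTimeMs: number;
  upTimeMs: number;
  travelMm: number; // total distance moved while on the surface
}

// A tap: the contact barely moved and lifted quickly.
function isTap(track: ContactTrack): boolean {
  const durationMs = track.upTimeMs - track.downTimeMs;
  return track.travelMm <= TAP_MAX_TRAVEL_MM && durationMs <= TAP_MAX_DURATION_MS;
}

interface Motion {
  dx: number;
  dy: number;
}

// A two-finger scroll: both contacts move in roughly the same direction at comparable speed.
function isTwoFingerScroll(a: Motion, b: Motion): boolean {
  const speedA = Math.sqrt(a.dx * a.dx + a.dy * a.dy);
  const speedB = Math.sqrt(b.dx * b.dx + b.dy * b.dy);
  if (speedA === 0 || speedB === 0) {
    return false; // a stationary contact cannot be part of a scroll
  }
  const directionCosine = (a.dx * b.dx + a.dy * b.dy) / (speedA * speedB);
  const speedRatio = Math.min(speedA, speedB) / Math.max(speedA, speedB);
  return directionCosine >= MIN_DIRECTION_COSINE && speedRatio >= MIN_SPEED_RATIO;
}
```

Two fingers moving apart fail the direction test (their cosine is negative), which is what lets an engine built this way tell a pinch from a scroll.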
Layer 3: the browser's event model
By the time a gesture reaches a web page, it has already been processed twice — once by firmware and once by the OS. The browser receives only the events Windows chose to forward. For a Precision Touchpad, those are PointerEvents with a pointerType of "mouse" for single-contact movement, plus WheelEvents for two-finger scroll and pinch-to-zoom. The browser never sees the individual finger contacts from a desktop trackpad — that raw data stays inside the OS layer.
This means the gestures a web page can detect from a trackpad are structurally different from what a touchscreen sends. A touchscreen delivers individual finger contacts as pointerType: "touch" events, and a page can implement its own pinch logic by tracking the distance between two simultaneous pointers. A trackpad delivers a pre-computed zoom delta as a WheelEvent with ctrlKey: true — no two-pointer math required on the web side.
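As a rough sketch of what that looks like from the page's side, the function below sorts wheel events into scroll and pinch using only the `ctrlKey` flag described above. The `WheelLike` interface and the `classifyWheel` name are assumptions made so the example runs without a browser; a real `WheelEvent` has the same three fields. The sign convention for `zoomDelta` (negative `deltaY` treated as zooming in) is the commonly seen one, but treat it as an assumption and check it on your target browsers.

```typescript
// Only the WheelEvent fields this sketch reads; a real WheelEvent satisfies this shape.
interface WheelLike {
  deltaX: number;
  deltaY: number;
  ctrlKey: boolean;
}

type TrackpadGesture =
  | { kind: "pinch"; zoomDelta: number }
  | { kind: "scroll"; dx: number; dy: number };

function classifyWheel(event: WheelLike): TrackpadGesture {
  if (event.ctrlKey) {
    // Pre-computed zoom delta from the OS; no two-pointer math on the page.
    // Assumption: negative deltaY means "zoom in", so flip the sign.
    return { kind: "pinch", zoomDelta: -event.deltaY };
  }
  return { kind: "scroll", dx: event.deltaX, dy: event.deltaY };
}
```

In a page this would be called from a `wheel` listener. Note that holding the physical Ctrl key while turning a mouse wheel also sets `ctrlKey`, so this check cannot distinguish a trackpad pinch from Ctrl plus wheel.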
Layer 4: the application's own thresholds
Applications — including JavaScript running in a browser — apply a final layer of classification on top of the events they receive. The tester above illustrates this concretely: its gesture engine defines a TAP_THRESHOLD of 10 px — if a pointer travels more than 10 pixels between pointerdown and pointerup, it is disqualified from being a tap. A LONG_PRESS_THRESHOLD of 500 ms triggers if the contact is held that long without moving past the 10 px limit. A DOUBLE_TAP_THRESHOLD of 300 ms means two taps must land within 300 milliseconds of each other to count as a double-tap. Swipes require both a minimum distance of 50 px and a minimum velocity of 0.5 px/ms, so slow drags are not misclassified as swipes.
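The threshold rules described above can be written as one small classifier. This is a sketch under stated assumptions, not the tester's actual source: `TAP_THRESHOLD`, `LONG_PRESS_THRESHOLD`, and `DOUBLE_TAP_THRESHOLD` and all five values come from the description, while `SWIPE_MIN_DISTANCE`, `SWIPE_MIN_VELOCITY`, `PointerSample`, and `classifyGesture` are names invented here. It also classifies everything at pointerup and measures the double-tap window from the previous tap's pointerup, both of which are simplifications.

```typescript
const TAP_THRESHOLD = 10;          // px of travel allowed for a tap or long-press
const LONG_PRESS_THRESHOLD = 500;  // ms held without exceeding TAP_THRESHOLD
const DOUBLE_TAP_THRESHOLD = 300;  // ms allowed between two taps
const SWIPE_MIN_DISTANCE = 50;     // px (hypothetical name)
const SWIPE_MIN_VELOCITY = 0.5;    // px/ms (hypothetical name)

interface PointerSample {
  x: number;
  y: number;
  timeMs: number;
}

type Gesture = "tap" | "double-tap" | "long-press" | "swipe" | "none";

// Classify one pointerdown/pointerup pair. lastTapTimeMs is the pointerup time
// of the previous tap, or null if there was none.
function classifyGesture(
  down: PointerSample,
  up: PointerSample,
  lastTapTimeMs: number | null
): Gesture {
  const dx = up.x - down.x;
  const dy = up.y - down.y;
  const distance = Math.sqrt(dx * dx + dy * dy);
  const durationMs = up.timeMs - down.timeMs;

  if (distance <= TAP_THRESHOLD) {
    if (durationMs >= LONG_PRESS_THRESHOLD) {
      return "long-press";
    }
    if (lastTapTimeMs !== null && up.timeMs - lastTapTimeMs <= DOUBLE_TAP_THRESHOLD) {
      return "double-tap";
    }
    return "tap";
  }

  // Both conditions must hold, so a slow drag is not reported as a swipe.
  const velocity = durationMs > 0 ? distance / durationMs : 0;
  if (distance >= SWIPE_MIN_DISTANCE && velocity >= SWIPE_MIN_VELOCITY) {
    return "swipe";
  }
  return "none";
}
```

A 120 px movement completed in 100 ms (1.2 px/ms) classifies as a swipe, while the same 120 px spread over a full second (0.12 px/ms) falls through to "none".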
Thresholds like these exist at every layer of the pipeline. A gesture that survives the firmware's contact-area filter, passes the OS's movement-distance check, and produces the right event sequence at the browser level can still be rejected by the application if it falls outside the app's own timing or distance window. This stacking is why tuning one layer in isolation rarely fixes a pad that feels off, and why sensitivity can shift after a driver update that changes nothing visible to the user.
Why the layered model matters for diagnostics
When a gesture stops working, the layer model points you toward the right suspect. If no gestures work, not even a single tap, the problem is at layer 1 (hardware) or early in layer 2 (driver not loaded). If two-finger scroll works but three-finger swipe does not, the OS gesture engine is receiving contacts but a specific gesture rule has changed; check Windows Settings for disabled gestures. If scroll works in native apps but not on a specific website, the site's own event handlers are the issue, not the driver. Microsoft's Precision Touchpad tuning guidelines document the registry values that control the OS layer's thresholds, making it possible to adjust sensitivity without touching firmware.
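The same triage can be written down as a short decision function. This is only a restatement of the three rules above in code; the `Symptoms` fields, the `Suspect` labels, and the order in which overlapping symptoms are checked are assumptions made for the sketch.

```typescript
// Hypothetical checklist of what the user observes.
interface Symptoms {
  anyGestureWorks: boolean;        // including a plain single tap
  twoFingerScrollWorks: boolean;   // in native apps
  threeFingerSwipeWorks: boolean;
  scrollWorksOnSite: boolean;      // on the specific website being tested
}

type Suspect =
  | "hardware-or-driver"    // layer 1, or layer 2 driver not loaded
  | "os-gesture-settings"   // layer 2 rule disabled or changed
  | "site-event-handlers"   // layer 4, the page's own code
  | "undetermined";

function suspectLayer(s: Symptoms): Suspect {
  if (!s.anyGestureWorks) {
    return "hardware-or-driver";
  }
  if (s.twoFingerScrollWorks && !s.threeFingerSwipeWorks) {
    return "os-gesture-settings";
  }
  if (s.twoFingerScrollWorks && !s.scrollWorksOnSite) {
    return "site-event-handlers";
  }
  return "undetermined"; // symptoms not covered by the three rules
}
```

Anything that lands in "undetermined" needs the event log rather than a rule of thumb.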
Check yourself: the tool above logs every pointer and wheel event the browser receives, and its Gesture Detection panel applies its own threshold rules on top of those events. If the OS layer is forwarding events correctly, you will see taps, double-taps, and long-presses register in the panel, each one representing a gesture that survived every layer of the pipeline from capacitive grid to browser application.