Open Source Hand Gesture Controls for Maps in the Browser

3 Apr 2026 (Updated 10 May 2026)

Screen recording of the map gesture control demo showing an OpenLayers map with a small webcam preview in the corner. A user pans the map by making a fist and moving their hand, then zooms by spreading two open hands apart. All hand tracking and gesture detection runs in the browser using MediaPipe WASM with no backend.

Ever wanted to pan, zoom, rotate, and reset a map without touching anything? I built an open-source library that lets you control web maps using just your hands. No mouse, no touchscreen, no backend. Everything runs in the browser. It now supports OpenLayers, Google Maps, and Leaflet.

Why I built this

I've been into webcam-based interactions for a while now (some of you might know Eyebrow Tetris). The idea of controlling things with your body instead of a mouse or keyboard just clicks with me. So when I started thinking about maps in kiosk setups, museum exhibits, and accessibility scenarios, it felt like a natural next step.

What if you could just hold up your fist and drag a map around? Or raise your other fist to zoom in? That's exactly what map-gesture-controls does.

How it works

The library uses MediaPipe hand tracking, running as WASM entirely in the browser. Your webcam feed never leaves the device. Privacy is built in, not bolted on.

Here's the basic flow:

  1. Webcam capture: opens your camera and feeds each frame to the MediaPipe Hand Landmarker, which returns 21 3D landmarks per detected hand.
  2. Gesture classification: a fist or pinch with your left hand means pan. The same gesture with your right hand means zoom. Both hands tilting means rotate. Hands together in a prayer pose means reset. A state machine with dwell timers and grace periods makes sure brief tracking hiccups don't trigger accidental actions.
  3. Map library integration: translates the hand movement into actual map interactions, with dead-zone filtering and smooth transitions so it doesn't feel jittery.
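
To make step 3 a bit more concrete, here is a minimal sketch of dead-zone filtering followed by exponential smoothing. It is an illustration only, not the library's actual code: the function names, the normalized-coordinate convention, and any threshold or smoothing factor you pass in are assumptions made for this example.

```typescript
// Hypothetical sketch: names and conventions are illustrative, not the library's API.
interface Vec2 {
  x: number;
  y: number;
}

// Drop movements smaller than `threshold` (in normalized image units, 0..1)
// so that small hand tremors do not move the map.
function applyDeadZone(delta: Vec2, threshold: number): Vec2 {
  const magnitude = Math.hypot(delta.x, delta.y);
  return magnitude < threshold ? { x: 0, y: 0 } : delta;
}

// Exponential smoothing: move from the previous value toward the new sample.
// alpha = 1 means no smoothing; smaller values react more slowly.
function smooth(previous: Vec2, next: Vec2, alpha: number): Vec2 {
  return {
    x: previous.x + alpha * (next.x - previous.x),
    y: previous.y + alpha * (next.y - previous.y),
  };
}

// Scale a normalized hand delta to a pixel offset for a map viewport.
function handDeltaToPixels(
  delta: Vec2,
  viewport: { width: number; height: number },
): Vec2 {
  return { x: delta.x * viewport.width, y: delta.y * viewport.height };
}
```

A per-frame loop would typically chain these: filter the raw delta, smooth it against the previous output, then scale it to pixels before handing it to the map.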

There's a configurable dwell timer to prevent accidental triggers and a grace period that smooths out those moments when tracking briefly drops. It feels surprisingly natural once you try it.
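
The dwell timer and grace period can be pictured as a small state machine. The sketch below is one possible way to do it, written under my own assumptions: the class name, the `update` method, and the millisecond parameters are hypothetical and are not taken from the library.

```typescript
// Hypothetical sketch of a dwell/grace state machine; not the library's code.
type Gesture = "none" | "pan" | "zoom" | "rotate" | "reset";

class GestureStateMachine {
  private readonly dwellMs: number; // how long a gesture must be held before it activates
  private readonly graceMs: number; // how long an active gesture survives a tracking dropout
  private active: Gesture = "none"; // gesture currently driving the map
  private candidate: Gesture = "none"; // gesture waiting out its dwell time
  private candidateSince = 0; // timestamp when the candidate was first seen
  private lastSeenActive = 0; // timestamp when the active gesture was last detected

  constructor(dwellMs: number, graceMs: number) {
    this.dwellMs = dwellMs;
    this.graceMs = graceMs;
  }

  // Feed one detection per frame; returns the gesture that should be acted on.
  update(detected: Gesture, nowMs: number): Gesture {
    if (this.active !== "none") {
      if (detected === this.active) {
        this.lastSeenActive = nowMs;
        return this.active;
      }
      // Detection changed or dropped: keep the active gesture during the grace period.
      if (nowMs - this.lastSeenActive <= this.graceMs) {
        return this.active;
      }
      this.active = "none";
      this.candidate = "none";
    }

    if (detected === "none") {
      this.candidate = "none";
      return "none";
    }
    if (detected !== this.candidate) {
      // A new candidate starts its dwell timer from now.
      this.candidate = detected;
      this.candidateSince = nowMs;
    }
    if (nowMs - this.candidateSince >= this.dwellMs) {
      this.active = detected;
      this.lastSeenActive = nowMs;
      this.candidate = "none";
      return this.active;
    }
    return "none";
  }
}
```

With a dwell of 100 ms and a grace period of 150 ms, a fist held for 100 ms starts panning, and a dropout shorter than 150 ms does not interrupt it.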

The gestures

Demo of map-gesture-controls: a person uses hand gestures in front of a webcam to control an OpenLayers map. A small overlay in the corner shows the webcam feed and a legend listing the available gestures: pan, zoom, rotate, reset, and idle
  • Pan: Make a fist or pinch with your left hand, then move it around. The map follows.
  • Zoom: Make a fist or pinch with your right hand, then move it up to zoom in or down to zoom out.
  • Rotate: Hold up both hands (fist or pinch) and tilt your wrists. The map rotates with you.
  • Reset: Bring both hands together like a prayer and hold for one second. The map snaps back to where it started.

Fist or pinch both work for every gesture, so just do whatever feels natural to you.
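
The mapping from hands to actions described above can be sketched as a small pure function. Everything here is an assumption made for illustration: the type names, the idea of detecting the prayer pose from the distance between the two wrists, and the 0.15 default threshold are mine and not the library's. The one-second hold for reset is left out because that is a timing concern, not a classification one.

```typescript
// Hypothetical sketch: types, names, and the threshold are illustrative only.
type Handedness = "Left" | "Right";
type Pose = "fist" | "pinch" | "open";
type Action = "pan" | "zoom" | "rotate" | "reset" | "idle";

interface Point {
  x: number; // normalized 0..1
  y: number; // normalized 0..1
}

interface DetectedHand {
  handedness: Handedness;
  pose: Pose;
  wrist: Point;
}

function resolveAction(hands: DetectedHand[], togetherThreshold = 0.15): Action {
  // Fist and pinch are treated the same way for every gesture.
  const gripping = hands.filter((h) => h.pose === "fist" || h.pose === "pinch");

  if (gripping.length >= 2) return "rotate"; // both hands gripping
  if (gripping.some((h) => h.handedness === "Left")) return "pan";
  if (gripping.some((h) => h.handedness === "Right")) return "zoom";

  // No gripping hands: two open hands held close together count as the prayer pose.
  const a = hands[0];
  const b = hands[1];
  if (hands.length === 2 && a && b) {
    const distance = Math.hypot(a.wrist.x - b.wrist.x, a.wrist.y - b.wrist.y);
    if (distance < togetherThreshold) return "reset";
  }
  return "idle";
}
```

Keeping this step free of timing and map code makes it easy to test with hand-written inputs.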

Getting started

If you want to try it yourself, pick the section for your map library below.

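OpenLayers

Install @map-gesture-controls/ol from npm, then wire it up to your OpenLayers map: create a GestureMapController for the map and call start().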

That's it. A few lines and you've got gesture-controlled maps. The start() call needs to happen from a user interaction (like a button click) because of browser webcam permission rules.
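
Here is a minimal sketch of the "start from a user interaction" part. It deliberately does not import the library: `StartableController` and `ClickTarget` are hypothetical interfaces that only model a start() method and a click listener, and whether the real start() returns a promise is an assumption, so the sketch accepts both.

```typescript
// Hypothetical sketch: these interfaces only model what the text mentions.
// The real GestureMapController may expose more, and its constructor is not shown here.
interface StartableController {
  start(): Promise<void> | void;
}

interface ClickTarget {
  addEventListener(type: "click", listener: () => void): void;
}

// Start the controller on the first click, and allow a retry if starting fails
// (for example, when camera permission is denied).
function startOnFirstClick(
  button: ClickTarget,
  controller: StartableController,
  onError: (error: unknown) => void,
): void {
  let started = false;

  const fail = (error: unknown): void => {
    started = false; // let the next click try again
    onError(error);
  };

  button.addEventListener("click", () => {
    if (started) return;
    started = true;
    try {
      // Call start() synchronously inside the handler so it runs during the user gesture.
      Promise.resolve(controller.start()).catch(fail);
    } catch (error) {
      fail(error);
    }
  });
}
```

In a real page you would pass a button element and your controller instance, plus an error callback that tells the user the camera could not be opened.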

Google Maps

You need a Google Maps API key and a Map ID from the Google Cloud Console. Create the Map ID with the Vector map type to enable rotation support.

Leaflet

No API key needed. Leaflet uses OpenStreetMap tiles by default. Rotation is supported via CSS transforms on the map pane.

The API is the same across all three libraries. Pick the package that matches your map, wire up GestureMapController, and call start().

Use cases

This isn't just a fun demo (though it is fun). There are real situations where touchless map interaction makes a difference:

  • Kiosks and exhibits: museums, visitor centers, trade shows. Nobody wants to touch a shared screen.
  • Accessibility: for users with limited mobility who can move their hands but can't grip a mouse or use a touchscreen precisely.
  • Touchless environments: think medical settings, clean rooms, or food service areas where you don't want to touch shared devices.
  • Creative installations: interactive art, data visualizations, or any project where the UI itself is part of the experience.

What's under the hood

The project is split into four packages:

  • @map-gesture-controls/core: the gesture detection engine. It's map-agnostic, so you could hook it up to anything.
  • @map-gesture-controls/ol: the OpenLayers integration.
  • @map-gesture-controls/google-maps: the Google Maps integration.
  • @map-gesture-controls/leaflet: the Leaflet integration.

Most people just need the npm package for their map library, since each one re-exports everything from core. It's written in TypeScript, fully typed, and the bundle is small. The MediaPipe WASM model (~10 MB) loads on first start, and after that it's all local processing.
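
To illustrate what "map-agnostic" can mean in practice, here is a rough sketch of a gesture engine talking to a map through a thin adapter. None of these names come from @map-gesture-controls/core: `GestureEvent`, `MapAdapter`, `dispatchGesture`, and the in-memory adapter are all hypothetical and exist only to show the shape of the idea.

```typescript
// Hypothetical sketch: none of these names come from @map-gesture-controls/core.
type GestureEvent =
  | { kind: "pan"; dx: number; dy: number }
  | { kind: "zoom"; delta: number }
  | { kind: "rotate"; radians: number }
  | { kind: "reset" };

// The only surface a map integration would need to implement in this sketch.
interface MapAdapter {
  panBy(dx: number, dy: number): void;
  zoomBy(delta: number): void;
  rotateBy(radians: number): void;
  reset(): void;
}

// Route a gesture event to the matching adapter method.
function dispatchGesture(event: GestureEvent, adapter: MapAdapter): void {
  switch (event.kind) {
    case "pan":
      adapter.panBy(event.dx, event.dy);
      break;
    case "zoom":
      adapter.zoomBy(event.delta);
      break;
    case "rotate":
      adapter.rotateBy(event.radians);
      break;
    case "reset":
      adapter.reset();
      break;
  }
}

interface ViewState {
  centerX: number;
  centerY: number;
  zoom: number;
  rotation: number;
}

// A toy adapter that keeps the view in memory instead of driving a real map.
// Sign conventions for panning depend on the map library; this one just adds.
function createMemoryAdapter(initial: ViewState): { adapter: MapAdapter; view: ViewState } {
  const view: ViewState = { ...initial };
  const adapter: MapAdapter = {
    panBy(dx, dy) {
      view.centerX += dx;
      view.centerY += dy;
    },
    zoomBy(delta) {
      view.zoom += delta;
    },
    rotateBy(radians) {
      view.rotation += radians;
    },
    reset() {
      Object.assign(view, initial);
    },
  };
  return { adapter, view };
}
```

An OpenLayers, Google Maps, or Leaflet integration would then be another implementation of the same small interface.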

What changed

After the project hit 20 stars on GitHub, I figured there were enough people finding it useful to justify expanding beyond OpenLayers. I already expected that more map libraries would follow, so I set it up as a monorepo from the start. I built the Google Maps adapter first. Not long after, someone opened a GitHub issue asking for Leaflet support, so I built that too. All three integrations share the same gesture engine and the same API surface.

This project didn't stay the way I first shipped it. The original zoom gesture used two spread hands moving apart, like a pinch in the air. It worked, but people told me it felt tiring and unnatural. So I reworked the controls to use simpler fist and pinch gestures with single-hand movements. The reset gesture was also requested by the community. People wanted a quick way to snap back to the starting view without reloading.

The project now has 90+ stars, and all of this happened in the open on GitHub. It's open source and MIT-licensed. If something feels off, open an issue or send a PR.
