Methodology
Replacing averaged-Leq aircraft noise regulation — DNL, the FAA Part 150 framework, AEDT modeling — with an event-based noise standard requires field-grade measurement instrumentation that meets three criteria:
- Traceable to a primary acoustic reference — so absolute SPL claims survive technical challenge.
- Validated across aircraft categories and field conditions — so generalization claims are empirical, not modeled.
- Capable of per-event acoustic and psychoacoustic detail — so the dataset captures what averaged-Leq metrics by design discard.
This page documents how the TrueNoise methodology delivers all three. It also documents — transparently — a 2.9 dB drift we identified in our original reference meter on 1 June 2026, the chain-of-custody steps we took to characterize and correct it, and how we treat pre-correction observations in our analytics. We believe surfacing this kind of instrument hygiene is what regulatory-grade measurement looks like, and we expect future critique of the dataset will be measured against the rigor of this record rather than its absence.
Hardware
Microphone
- Model: PoP Voice Professional Lavalier omnidirectional condenser
- Frequency range: 20 Hz – 16 kHz
- Signal-to-noise ratio: 80 dB
- Connection: Wired (no Bluetooth codec compression or variable latency)
- Wind protection: Foam + fur windshield (per ISO 1996 guidance)
- Polar pattern: Omnidirectional
The frequency range fully covers the A-weighted measurement band: A-weighting attenuates by about 50 dB at 20 Hz and about 7 dB at 16 kHz, so content outside the microphone's 20 Hz – 16 kHz range contributes little to an A-weighted level. The omnidirectional pattern is deliberate: aircraft can fly in from any bearing, and equal sensitivity in all directions avoids the orientation bias a cardioid pattern would introduce.
This is a consumer-grade microphone, not an IEC 61672-certified measurement microphone. We address that directly in §6 Calibration and §8 Limitations.
Phone
- Device: Apple iPhone with A-series silicon
- Audio session mode: .measurement (minimizes system-supplied input processing; see §2)
- Calibration: Per-device offset stored locally and applied to every measurement
iOS audio capture
We use Apple's AVAudioEngine framework with the audio session category set to .playAndRecord and mode set to .measurement.
This mode is the critical detail. By default, iOS aggressively processes microphone input for voice calls and video: automatic gain control, noise suppression, voice isolation, and echo cancellation are all applied. None of these are acceptable for an instrument. .measurement mode instructs the system to minimize its own signal processing on the input, so the app receives uncompressed PCM with as little system-supplied processing as the platform allows.
- Tap buffer size: 1024 frames (~21 ms at 48 kHz)
- Sample format: 32-bit float
- Sample rate: Hardware-native (typically 48 kHz)
Critics sometimes assume "phone" implies "consumer audio pipeline." With .measurement mode, it does not.
A-weighting filter
We implement A-weighting per IEC 61672, the international standard for sound level meters.
The filter is constructed as three cascaded biquad sections derived via bilinear transform from the analog s-domain transfer function. The four canonical IEC pole frequencies are used directly:
- f₁: 20.598997 Hz
- f₂: 107.65265 Hz
- f₃: 737.86223 Hz
- f₄: 12194.217 Hz
Section 1 is a 2nd-order high-pass at f₁. Section 2 is a 2nd-order high-pass combining the f₂ and f₃ poles. Section 3 is a 2nd-order low-pass at f₄. The overall response is normalized so the gain at 1 kHz is exactly 0 dB — the IEC reference point.
These pole frequencies are defined by the standard, so every compliant sound level meter uses the same values. A-weighting approximates the frequency response of the human ear at moderate levels: the curve attenuates a 100 Hz tone by about 19 dB relative to 1 kHz, and by about 30 dB at 50 Hz. Measuring aircraft noise without A-weighting would dramatically overstate low-frequency content that humans don't perceive as loud.
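As a cross-check on the filter description, the analog A-weighting magnitude can be evaluated directly from the four pole frequencies listed above. The sketch below is not the app's biquad implementation; it uses the closed-form analog response and normalizes it to 0 dB at 1 kHz. The function names are illustrative.

```python
import math

# The four pole frequencies listed above, in Hz.
F1, F2, F3, F4 = 20.598997, 107.65265, 737.86223, 12194.217


def _raw_response(f_hz: float) -> float:
    """Unnormalized magnitude of the analog A-weighting transfer function."""
    f_sq = f_hz * f_hz
    numerator = (F4 ** 2) * f_sq * f_sq
    denominator = (
        (f_sq + F1 ** 2)
        * math.sqrt((f_sq + F2 ** 2) * (f_sq + F3 ** 2))
        * (f_sq + F4 ** 2)
    )
    return numerator / denominator


def a_weighting_db(f_hz: float) -> float:
    """A-weighting gain in dB, normalized so the gain at 1 kHz is 0 dB."""
    return 20.0 * math.log10(_raw_response(f_hz) / _raw_response(1000.0))


if __name__ == "__main__":
    for f in (20.0, 100.0, 1000.0, 16000.0):
        print(f"{f:8.0f} Hz  {a_weighting_db(f):+6.1f} dB")
```

A digital implementation derived by bilinear transform will deviate slightly from these analog values near the Nyquist frequency, so a comparison of this kind is most useful below a few kilohertz.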
FFT and ⅓-octave band analysis
A-weighted SPL is the headline number, but it collapses the entire audible spectrum into a single decibel value. For finer analysis, we compute a ⅓-octave spectrum on every measurement window.
- FFT engine: Apple Accelerate / vDSP, the same SIMD-optimized DSP library Apple uses in Logic Pro and that ships in every professional iOS audio plugin
- FFT size: 4096 samples (~85 ms window at 48 kHz)
- Window function: Hann window with energy normalization, so band levels are calibrated in absolute SPL, not arbitrary spectral units
- Frequency resolution: ~11.7 Hz per bin
- Bands: 28 standard ⅓-octave bands, 25 Hz – 12.5 kHz (ANSI S1.11 / IEC 61260 center frequencies)
- Band power: Summed across all FFT bins within each band's bounds (edge factor 2^(1/6))
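The band layout above can be sketched in a few lines. This is an illustration under stated assumptions, not the app's code: it assumes base-2 exact center frequencies (1000 · 2^(k/3) Hz), a 48 kHz sample rate, and that an FFT bin belongs to a band when its center frequency falls inside the band edges. The function names are hypothetical.

```python
import math

SAMPLE_RATE_HZ = 48_000                      # typical hardware-native rate
FFT_SIZE = 4096
BIN_WIDTH_HZ = SAMPLE_RATE_HZ / FFT_SIZE     # about 11.7 Hz per bin
EDGE_FACTOR = 2.0 ** (1.0 / 6.0)             # band edge factor from the list above


def third_octave_bands():
    """Return (center, lower_edge, upper_edge) in Hz for the 28 bands."""
    bands = []
    for k in range(-16, 12):                 # nominal 25 Hz ... 12.5 kHz
        center = 1000.0 * 2.0 ** (k / 3.0)
        bands.append((center, center / EDGE_FACTOR, center * EDGE_FACTOR))
    return bands


def bins_in_band(lower_hz, upper_hz):
    """FFT bin indices whose center frequency lies in [lower_hz, upper_hz)."""
    first = math.ceil(lower_hz / BIN_WIDTH_HZ)
    last = min(math.ceil(upper_hz / BIN_WIDTH_HZ) - 1, FFT_SIZE // 2)
    return range(first, last + 1)


if __name__ == "__main__":
    for center, lower, upper in third_octave_bands():
        count = len(bins_in_band(lower, upper))
        print(f"{center:9.1f} Hz  [{lower:9.1f}, {upper:9.1f})  {count:4d} bins")
```

With these assumed values the lowest bands map to very few FFT bins, which is worth keeping in mind when reading band levels below roughly 100 Hz.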
The A-series chip's vector and neural processing capabilities exceed what real-time spectral analysis requires by orders of magnitude. A 4096-point FFT executes in well under a millisecond. Doubts about "can a phone do FFT accurately" reflect assumptions about pre-2010 phone hardware.
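The energy normalization mentioned for the Hann window can be illustrated with a small self-contained check. The sketch assumes one common convention: divide the summed squared spectrum by N times the window's energy, so that the result is the mean-square value of the signal. Whether the app uses exactly this convention is an assumption; a naive DFT stands in for vDSP to keep the example dependency-free.

```python
import cmath
import math


def hann(n_points):
    """Periodic Hann window of length n_points."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_points) for n in range(n_points)]


def dft(x):
    """Naive O(N^2) DFT; adequate for a short demonstration."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_points) for n in range(n_points))
        for k in range(n_points)
    ]


def windowed_power(samples):
    """Mean-square estimate from the spectrum, compensated for window energy."""
    n_points = len(samples)
    window = hann(n_points)
    spectrum = dft([s * w for s, w in zip(samples, window)])
    window_energy = sum(w * w for w in window)
    return sum(abs(c) ** 2 for c in spectrum) / (n_points * window_energy)


N = 64
AMPLITUDE = 0.5
# Sine with exactly 8 cycles in the window, so there is no leakage ambiguity.
tone = [AMPLITUDE * math.sin(2.0 * math.pi * 8 * n / N) for n in range(N)]
estimated = windowed_power(tone)
expected = AMPLITUDE ** 2 / 2.0              # mean square of a sine

if __name__ == "__main__":
    print(estimated, expected)
```

Without the division by the window energy, the windowed spectrum would under-report power by a fixed factor, which is why the normalization matters for absolute SPL.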
Psychoacoustic analysis
A-weighted SPL is a 1936 model of how humans perceive loudness. It is the regulatory standard worldwide, but it is known to be incomplete — particularly for impulsive sounds, tonal components, and the spectral character that makes some noises feel sharper or more annoying than their dBA value would suggest. See The measurement gap for more information.
We compute, on top of A-weighting, four psychoacoustic metrics from the acoustics literature:
- Zwicker loudness (sone): Simplified ISO 532 B. Computes specific loudness per Bark critical band and integrates across the Bark scale. Uses ISO 226 threshold-in-quiet values per band.
- Loudness level (phon): Derived from total loudness in sones. Numerically equals the dB SPL of a 1 kHz tone perceived as equally loud.
- Sharpness (acum): Per DIN 45692. Weights high-frequency specific loudness to capture the perceptual difference between a low rumble and a piercing whine at the same dBA.
- Psychoacoustic annoyance: Zwicker's combined model synthesizing loudness and sharpness into a single annoyance index.
These metrics matter because two aircraft producing the same dBA can produce very different real-world annoyance — and annoyance is the dose-response variable that WHO uses for sleep disturbance and cardiovascular health endpoints. We provide psychoacoustic analysis in addition to A-weighted SPL, not instead of it.
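The relationship between the sone and phon entries above follows the standard rule that loudness doubles for every 10 phon above 40 phon. A minimal sketch of that conversion, restricted to loudness of 1 sone or more (the relation changes below that), with illustrative function names:

```python
import math


def sone_to_phon(loudness_sone: float) -> float:
    """Loudness level in phon for a total loudness of at least 1 sone."""
    if loudness_sone < 1.0:
        raise ValueError("this simple relation is only applied for N >= 1 sone")
    return 40.0 + 10.0 * math.log2(loudness_sone)


def phon_to_sone(level_phon: float) -> float:
    """Inverse relation, valid for loudness levels of at least 40 phon."""
    if level_phon < 40.0:
        raise ValueError("this simple relation is only applied for L_N >= 40 phon")
    return 2.0 ** ((level_phon - 40.0) / 10.0)


if __name__ == "__main__":
    for sone in (1.0, 2.0, 4.0, 16.0):
        print(f"{sone:5.1f} sone -> {sone_to_phon(sone):5.1f} phon")
```

The doubling rule is what makes sone a convenient scale for comparison: an event at 16 sone is perceived as roughly twice as loud as one at 8 sone, whereas the corresponding difference in phon is only 10.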
Empirical finding — loudness/dBA inversion in aft-departure overflights
The TrueNoise dataset contains a finding that directly demonstrates criterion 3 from the methodology intro — that per-event psychoacoustic detail captures what averaged-Leq metrics by design discard. In the aft portion of a departure overflight (aircraft receding, exhaust jet mixing noise dominant), dBA continues to fall while Zwicker loudness in sone temporarily rises or plateaus. The A-weighted level drops because A-weighting suppresses the low-frequency exhaust mixing energy as the aircraft moves away. Perceived loudness rises because the human auditory system — and the Zwicker model — gives more weight to that low-frequency content than A-weighting does.
This is not a measurement artifact. It is a real psychoacoustic phenomenon: the aft-departure sound is perceived as louder than dBA alone would predict. It is also the strongest in-house empirical evidence that a standard relying on dBA alone systematically undercounts the health-relevant acoustic load of jet departures. The C-A delta and sone metrics together capture the inversion; dBA alone cannot see it.
Calibration
Every iPhone unit, microphone unit, and microphone connection has slightly different acoustic gain. A measurement system that does not address this is not a measurement system.
We calibrate per device. The app stores a calibration offset (in dB) that converts the device's raw signal level to absolute SPL. Two offset profiles are supported:
- Internal mic offset: +108 dB. Averaged across multiple calibration sessions on the validation iPhone. NOT yet re-validated against the AZ8930-verified reference chain. The internal microphone is not used for primary field measurement; the external microphone is the reference for all field sessions.
- External mic offset: +96 dB (PoP Voice lavalier), corrected from +99 dB on 1 June 2026; see §7b
- Session-to-session variability: ±3 dB observed; the offset is averaged across multiple calibration sessions
- Reference instrument: Wintact SLM-30B IEC 61672 Class 2 sound level meter
The offset is determined by simultaneous measurement against a reference instrument and is stored locally in the app. This is the same approach used by any professional sound level meter: every unit is individually calibrated, and calibration is checked periodically against a reference.
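The offset mechanism can be sketched as follows. This is an illustrative outline rather than the app's code: it works on unweighted floating-point samples in the range −1 to 1, whereas the app applies A-weighting first, and the function names are hypothetical. At 1 kHz the A-weighting gain is 0 dB, so a 1 kHz tone gives the same result either way.

```python
import math

EXTERNAL_MIC_OFFSET_DB = 96.0   # external-mic offset quoted above


def level_dbfs(samples):
    """RMS level of float samples relative to full scale (1.0), in dB."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(max(mean_square, 1e-24))   # floor avoids log(0)


def to_spl_db(samples, offset_db):
    """Convert a raw digital level to absolute SPL using a stored offset."""
    return level_dbfs(samples) + offset_db


def derive_offset_db(samples, reference_spl_db):
    """Offset that makes the app agree with a simultaneous reference reading."""
    return reference_spl_db - level_dbfs(samples)


# Synthetic 1 kHz tone, amplitude 0.1, 100 ms at 48 kHz (illustrative only).
tone = [0.1 * math.sin(2.0 * math.pi * 1000.0 * n / 48_000) for n in range(4800)]

if __name__ == "__main__":
    print(f"raw level: {level_dbfs(tone):.2f} dBFS")
    print(f"SPL with stored offset: {to_spl_db(tone, EXTERNAL_MIC_OFFSET_DB):.2f} dB")
    print(f"offset implied by a 94.0 dB reference: {derive_offset_db(tone, 94.0):.2f} dB")
```

Because the offset is a single additive constant, any error in the reference reading used to derive it carries over one-for-one into every subsequent measurement.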
Windshield insertion loss characterization
Windshields attenuate real acoustic signal in addition to suppressing wind noise — more material in the signal path means more signal loss, particularly at high frequencies where wavelengths approach the windshield's structural dimensions. This attenuation has been characterized by simultaneous comparison against the TA657A Class 2 reference meter across multiple noise types (broadband, tonal, impulsive):
| Configuration | Measured insertion loss | Published industry range | Notes |
|---|---|---|---|
| Bare mic (no windshield) | 0.0 dBA | 0 dB reference | Matched SLM-30B across all noise types |
| Foam windshield | 0.6–0.8 dBA | 0.5–1.0 dBA | Within published band (Rycote, Brüel & Kjær) |
| Fur windshield (deadcat) | 1.8 dBA | 1.5–3.0 dBA | Used for all outdoor monitoring sessions |
The fact that insertion loss is consistent across all noise types confirms that windshield attenuation is broadly frequency-flat across the audible band, as expected from published specifications. Different source spectra all see similar dBA reductions — there are no frequency-dependent surprises in the measurement chain.
All outdoor monitoring sessions are conducted with the fur windshield installed. Published measurements are therefore conservative by a known 1.8 dBA: actual noise levels at the receptor are 1.8 dB higher than reported values. This correction will be applied automatically in future dataset versions once per-session windshield configuration is logged by the app. Researchers requiring the corrected values may add the 1.8 dBA offset to all current outdoor observations.
The bare-mic indoor test demonstrated zero insertion loss from the microphone capsule itself — establishing that any offset between the app and the reference meter could be attributed to the calibration coefficient, not the microphone's spectral response. That established methodology was what later allowed us to identify and correct the TA657A drift (see §7b). The post-correction +96 dB offset has been validated by aircraft comparison against the AZ8930-traceable Wintact SLM-30B at −0.1 dB agreement.
Two-operating-point validation
The calibrated system has been validated at two independent operating points, each verifying a different sub-chain of the measurement system:
| Scenario | Reference | Signal | Delta | What it validates |
|---|---|---|---|---|
| A — Calibrator | AZ8930 (IEC 60942 Class 2) · 94.0 dBA · 1 kHz | Pure tone · bare mic · no windshield · no correction applied | −0.3 dB | Microphone + +96 dB offset sub-chain only |
| B — Aircraft | Wintact SLM-30B (IEC 61672 Class 2, AZ8930-verified) · Boeing 767 overflight | Broadband aircraft noise · fur windshield · +1.8 dB insertion-loss correction applied | −0.1 dB | Full field chain: mic + offset + windshield correction |
The two scenarios are stronger together than either alone. Scenario A confirms the microphone and offset are correctly calibrated against a primary acoustic reference. Scenario B confirms the full field chain — including the fur windshield and its +1.8 dB insertion-loss correction — against an independently verified Class 2 reference meter on a real aircraft overflight. The system is calibrated within IEC 61672 Class 2 ±2 dB tolerance and shows agreement substantially tighter than that tolerance across both validated operating points.
Validation chain summary: AZ8930 calibrator (IEC 60942 Class 2, primary reference) → Wintact SLM-30B (IEC 61672 Class 2, 0.0 dB against AZ8930) → TrueNoise app + external mic + +96 dB offset + fur windshield + 1.8 dB correction (−0.3 dB Scenario A, −0.1 dB Scenario B). The TA657A reference meter has been retired following identification of frequency-dependent drift that could not be corrected by trimming.
Measurement position classification
Where the microphone is placed relative to the acoustic environment determines what the measurement represents. TrueNoise uses a five-category measurement type framework, implemented in the iOS app and serialized in every CSV row. The categories distinguish between measurements designed for policy comparability and measurements that characterize real-world residential exposure as actually experienced.
- Standardized Receptor: 1.2 m height above ground · 1.5 m from any reflective surface · open-field geometry. This is the WHO/ISO 1996 receptor position used in European noise mapping and epidemiological studies including HYENA and RANCH. Use for: regulatory comparison, cross-location comparability, policy submissions. The TrueNoise calibration validation comparisons against the Wintact SLM-30B were performed at this position.
- Community Receptor: Any outdoor residential position that does not meet Standardized Receptor geometry, such as a backyard, a front porch, a deck chair, or a child's play area. Represents actual residential exposure as experienced, not a regulated abstraction. Values at Community Receptor positions may differ from Standardized Receptor values at the same location due to reflective surfaces, terrain, and vegetation. Use for: community impact documentation, lived-experience characterization.
- Façade-Level: Microphone placed at or near an exterior building surface, such as a window frame, a wall face, or a balcony railing. Captures the sound level at the building envelope where transmission into interior spaces begins. Façade measurements are typically 2–6 dB higher than free-field measurements at the same location due to building reflection. Use for: indoor noise intrusion estimation, building envelope characterization.
- Field Characterization: Exploratory measurement at a non-residential location, such as a park, a school playground, a community green space, or a roadside. Used to characterize acoustic conditions at locations not covered by residential receptor categories. Position documented in the Position Description field.
- Hand-held: Microphone held by the observer rather than mounted or positioned on a fixed support. Introduces variability from hand position, body reflection, and movement. Results are indicative rather than calibration-grade. Flagged in the dataset; not used for threshold comparison or regulatory claims.
The Measurement Type field is present in every downloaded CSV row. When interpreting the dataset, filter to Standardized Receptor for policy-comparable absolute SPL claims. Community Receptor and Façade-Level measurements document real residential exposure and are appropriate for community impact characterization but should not be directly compared to Standardized Receptor values without noting the positional difference.
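Filtering by measurement type might look like the sketch below. The column names (`event_id`, `measurement_type`, `peak_dba`) and the sample rows are invented for illustration; the actual CSV header names should be taken from the downloaded file.

```python
import csv
import io

# Invented sample data; column names are assumptions, not the real schema.
SAMPLE_CSV = """event_id,measurement_type,peak_dba
1,Standardized Receptor,70.0
2,Community Receptor,72.0
3,Hand-held,68.0
4,Standardized Receptor,66.0
"""

POLICY_COMPARABLE_TYPE = "Standardized Receptor"


def policy_comparable_rows(csv_text, type_column="measurement_type"):
    """Keep only rows recorded at the Standardized Receptor position."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row[type_column] == POLICY_COMPARABLE_TYPE]


rows = policy_comparable_rows(SAMPLE_CSV)

if __name__ == "__main__":
    for row in rows:
        print(row["event_id"], row["peak_dba"])
```

Keeping the filter explicit, rather than dropping other categories at load time, leaves Community Receptor and Façade-Level rows available for impact characterization in the same analysis.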
Most consumer noise measurement apps record a level without documenting where or how the microphone was placed. The five-category Measurement Type framework — with explicit Standardized vs Community Receptor distinction — is policy-grade measurement discipline that treats geometry as a first-class variable rather than an afterthought.
Wind contamination screening — C-A Delta
A-weighting and C-weighting
Sound level meters apply a frequency-weighting filter to raw acoustic measurements. Two standardized weightings are relevant here, both defined in IEC 61672:
- A-weighting (dBA): Approximates how the human ear perceives loudness at typical environmental sound levels. Progressively attenuates frequencies below about 500 Hz, and heavily below 200 Hz. Used in nearly all environmental noise regulation, including FAA contours, WHO guidelines, and EPA standards.
- C-weighting (dBC): Much closer to a flat frequency response; includes far more low-frequency content than A-weighting. Used as a diagnostic complement to A-weighting.
Why the C-A delta detects wind
Wind noise on a microphone is not acoustic — it is turbulent air pressure directly on the diaphragm, overwhelmingly low-frequency. A-weighting heavily suppresses energy below 200 Hz; C-weighting does not. A large C-A delta during a loud event is therefore a strong indicator of wind contamination. Thresholds apply only when dBA ≥ 55 — during meaningful aircraft events:
- C-A Delta < 15 dB: Clean. Measurement dominated by genuine acoustic content
- C-A Delta 15–25 dB: Possible wind contamination. Flag for review; see aft-aspect note below
- C-A Delta ≥ 25 dB: Strong wind contamination. Exclude from SPL analysis by default
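The thresholds above translate into a simple screening function. This is a sketch of the numeric rule only; the category labels and function name are illustrative, and the contextual judgment described in the following subsections (ADS-B geometry, weather, timing) is deliberately left out.

```python
SPL_FLOOR_DBA = 55.0          # thresholds apply only at or above this level
REVIEW_THRESHOLD_DB = 15.0
EXCLUDE_THRESHOLD_DB = 25.0


def screen_ca_delta(dba: float, dbc: float) -> str:
    """Classify one observation by its C-A delta."""
    if dba < SPL_FLOOR_DBA:
        return "not_screened"             # below the SPL floor, no wind screening
    delta = dbc - dba
    if delta >= EXCLUDE_THRESHOLD_DB:
        return "exclude"                  # strong wind contamination
    if delta >= REVIEW_THRESHOLD_DB:
        return "review"                   # possible wind; needs context
    return "clean"


if __name__ == "__main__":
    for dba, dbc in ((70.0, 80.0), (70.0, 88.0), (60.0, 85.0), (50.0, 80.0)):
        print(dba, dbc, screen_ca_delta(dba, dbc))
```

A "review" result is a prompt for human inspection rather than a verdict, since the same delta range is also produced by genuine aircraft noise.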
Three documented patterns beyond wind contamination
1. Aft-aspect aircraft noise. When an aircraft is receding, jet exhaust mixing noise dominates — a deep, low-frequency rumble. C-A delta naturally climbs to 15–20 dB in the aft tail. This is real aircraft noise, not contamination. Distinguish from wind by timing (elevated delta appears after dBA peak, while aircraft confirmed receding via ADS-B) and by the presence of elevated sone at the same timestep.
2. Quiet ambient background. Below 55 dBA, the microphone captures the natural low-frequency character of the environment — HVAC, distant traffic, building hum. Large C-A delta at low dBA is ambient bass character, not contamination. The SPL floor prevents this from triggering false exclusions.
3. Atmospheric high-frequency absorption at distance. A fourth cause identified empirically: as an aircraft moves away, the atmosphere preferentially absorbs high-frequency content (ISO 9613-1). This produces a gradually rising C-A delta in track-event rows as slant range increases and dBA falls — a propagation effect, not a source or contamination effect. It should not be treated as grounds for exclusion.
Practical rules
- 15–20 dB, receding aircraft, calm day: Do not exclude; this is real aft-aspect aircraft noise
- 15–20 dB, breezy day, no specific overflight: Probably wind contamination; exclude
- ≥ 25 dB during any loud event: Likely wind; exclude
- Large delta, dBA < 55: Ambient bass character; not contamination, not excluded
All outdoor sessions use a fur windshield (characterized insertion loss: 1.8 dBA), which suppresses wind noise significantly. The iOS app's Review segment displays C-A delta alongside ADS-B bearing and slant range, giving the observer the context needed to apply these rules correctly. Observations marked excluded in the app are filtered before upload and do not appear in the public dataset.
Glossary
- A-weighting (dBA): A frequency filter applied to measurements to match human hearing sensitivity; used in nearly all environmental noise regulations.
- C-weighting (dBC): A nearly-flat frequency filter that includes more low-frequency content; used as a diagnostic complement to A-weighting.
- C-A Delta (dB): The difference between a C-weighted and A-weighted measurement of the same signal. Large values during loud measurements (≥55 dBA) indicate wind contamination, but context (ADS-B, weather, timing) is required to distinguish wind from aft-aspect aircraft noise.
- Aft-aspect noise: Jet exhaust mixing noise heard when an aircraft is receding. Naturally low-frequency dominated, producing a large C-A delta that is real aircraft noise, not a measurement artifact.
Psychoacoustic metrics and health evidence
A natural question about the TrueNoise measurement approach is why it captures psychoacoustic metrics — loudness in sone, sharpness in acum, annoyance index — rather than relying solely on dBA. The answer is grounded in the health evidence.
The health effects of aircraft noise are primarily mediated through the annoyance response — and annoyance is determined not by loudness alone, but by the spectral and temporal character of the sound. A longitudinal study found that nearly 66% of the effect on self-reported health was mediated by annoyance, not by the noise level directly (Cousson et al., 2024). The pathway from aircraft noise to cardiovascular disease runs through the subjective experience of the sound — which psychoacoustic metrics capture and dBA alone does not.
After controlling for loudness, sharpness and tonality independently predict annoyance to aircraft noise (McKinley et al., 2023; Caillet et al., 2016). Two sounds at identical dBA levels can produce substantially different annoyance depending on their spectral character. dBA misses this distinction entirely.
The psychoacoustic metrics TrueNoise captures are therefore not merely descriptive — they are the upstream acoustic drivers of the health outcomes documented in the HYENA and RANCH epidemiological studies.
References
- Fastl, H. & Zwicker, E. (2007). Psychoacoustics: Facts and Models (3rd ed.). Springer-Verlag.
- Cousson, P.Y. et al. (2024). Effects of aircraft noise exposure on self-reported health through aircraft noise annoyance. Environmental Research. PMC11349086
- McKinley, R. et al. (2023). Sound quality metric indicators of rotorcraft noise annoyance. JASA, 153(2), 867.
- Caillet, G. et al. (2016). Aircraft noise annoyance modelling. Applied Acoustics, 111, 253–263.
- Berglund, B. et al. (1995). Community Noise. WHO, Geneva.
- Munzel, T. et al. (2018). Adverse effects of environmental noise on oxidative stress and cardiovascular risk. Antioxidants & Redox Signaling, 28(9), 873–908.
- Dratva, J. et al. (2016). Cardiovascular and stress responses to short-term noise exposures. Environment International, 97, 224–233.
Validation and field calibration
Reference meter field calibration
The Wintact SLM-30B reference meter (IEC 61672 Class 2) is field-calibrated to 94.0 dBA at the start of each comparison session using an AZ8930 acoustic calibrator (IEC 60942:2018 Class 2). A pre-session calibration log is maintained. The calibration chain is independently traceable, verified per session, and documented for audit.
Overflight comparison validation
Following the 1 June 2026 calibration correction, the system was validated against the calibrator-verified Wintact SLM-30B. The B767 post-correction result is the cleanest single validation data point in the dataset:
| Instrument | Peak SPL | Notes |
|---|---|---|
| Wintact SLM-30B (Class 2, AZ8930-verified) | 76.6 dBA | Field-calibrated to 94.0 dBA before session · Boeing 767 takeoff overflight |
| TrueNoise app (external mic, +96 dB offset, fur windshield) | 76.5 dBA | PoP Voice lavalier · standardized receptor · same overflight |
| Delta | −0.1 dB | Within IEC 61672 Class 2 ±2 dB unit-to-unit tolerance ✓ |
This is one validated comparison against the SLM-30B post-correction. Ongoing simultaneous comparisons will be performed across different aircraft types and weather conditions. We will publish the running validation table as it grows.
Note: The 9 May 2026 A320 comparison was made against the TA657A before its +2.9 dB drift was identified. The reported agreement (−1.4 dB) was anchored to a drifted reference. Corrected interpretation: app read +1.5 dB high relative to true SPL. This reading does not support any calibration claim and is retained as a historical data point only. See Section 7b for the full calibration correction record.
Calibration history and the 2026-06-01 offset correction
On 1 June 2026, we verified our prior reference meter — a TA657A (IEC 61672 Class 2) — against the AZ8930 calibrator with a tight coupler seal. The TA657A read 96.9 dBA against the calibrator's 94.0 dBA tone: a +2.9 dB drift, outside its IEC 61672 Class 2 ±2 dB tolerance. The drift had propagated into the application's external-microphone calibration offset. After identifying the drift, we replaced the reference meter with the calibrator-verified Wintact SLM-30B and corrected the offset from +99 dB to +96 dB.
| Aircraft | App reading | Reference reading | Delta | Status |
|---|---|---|---|---|
| B767 takeoff | 76.5 dBA | 76.6 dBA | −0.1 dB ✓ | Post-correction |
| A321 takeoff | 79.4 dBA | 76.3 dBA | +3.1 dB | Pre-correction — confirmed offset magnitude |
| 737-700 takeoff | 78.4 dBA | 75.6 dBA | +2.8 dB | Pre-correction — confirmed offset magnitude |
This is a strength, not a weakness. Identifying and correcting a 2.9 dB drift by verifying against a primary acoustic reference is exactly the chain-of-custody discipline regulatory-grade measurement requires. Most regulatory-grade noise datasets never verify their reference meters against a primary calibrator. The corrected dataset is now traceable to an IEC 60942:2018 Class 2 acoustic calibrator.
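The figures in this section can be re-derived from the readings quoted above. The short sketch below recomputes the drift, the three comparison deltas, and the corrected interpretation of the 9 May 2026 comparison; the variable names are illustrative.

```python
CALIBRATOR_DBA = 94.0
TA657A_READING_DBA = 96.9

# Drift of the retired reference meter against the calibrator tone.
drift_db = round(TA657A_READING_DBA - CALIBRATOR_DBA, 1)

# (aircraft, app reading, reference reading) from the comparison table.
comparisons = [
    ("B767 takeoff", 76.5, 76.6),
    ("A321 takeoff", 79.4, 76.3),
    ("737-700 takeoff", 78.4, 75.6),
]
deltas = {name: round(app - ref, 1) for name, app, ref in comparisons}

# 9 May 2026: the app read 1.4 dB below a meter that was itself reading high
# by the drift amount, so relative to true SPL the app was high by the sum.
reported_delta_9_may = -1.4
corrected_delta_9_may = round(reported_delta_9_may + drift_db, 1)

if __name__ == "__main__":
    print("drift:", drift_db)
    for name, delta in deltas.items():
        print(f"{name}: {delta:+.1f} dB")
    print("9 May corrected:", corrected_delta_9_may)
```

The two pre-correction deltas bracket the measured drift, which is the sense in which the table describes them as confirming the offset magnitude.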
Chronological calibration record:
- 9 May 2026: A320 approach comparison against TA657A (internal mic, home patio). Reported delta −1.4 dB. Corrected interpretation: app +1.5 dB high relative to true SPL. Reference meter subsequently found drifted +2.9 dB. Retained as historical data point only; does not support any calibration claim.
- May 2026: External microphone calibration offset anchored to TA657A via indoor bare-microphone match test. Offset set to +99 dB. Drift inherited unknowingly into all May 2026 measurements.
- 1 June 2026: TA657A verified against AZ8930 with tight coupler seal. Reading: 96.9 dBA against 94.0 dBA reference. +2.9 dB drift confirmed, outside Class 2 ±2 dB tolerance. TA657A retired as reference instrument.
- 1 June 2026: Wintact SLM-30B verified against AZ8930. Reading: 94.0 dBA, confirmed clean. SLM-30B adopted as primary reference meter.
- 1 June 2026: Application external-mic offset corrected from +99 dB to +96 dB. Three-aircraft validation performed at standardized receptor. B767 post-correction delta: −0.1 dB. ✓
- 2 June 2026: External microphone placed directly in AZ8930 calibrator with snug coupler fit; windshield setting changed to bare/None in app. App reading: 93.7 dBA against 94.0 dBA reference. Delta: −0.3 dB. Independent confirmation of the +96 dB offset at 1 kHz pure tone. Validates the microphone + offset sub-chain in isolation from the windshield and aircraft signal characteristics.
Affected data range: Measurements taken before 1 June 2026 carry a known +3 dB systematic bias in absolute SPL terms. Derived psychoacoustic metrics inherit a corresponding bias, since they are computed from the same calibrated levels. See Data Treatment for how pre-correction data is handled in analytics.
Pre-correction data is not invalid. Relative patterns, spectral character, aft-aspect geometry, atmospheric HF absorption signatures, and event-to-event comparisons within a single session are preserved. What changes is the absolute calibration of peak SPL claims and health threshold exceedance counts, which now use post-correction data only.
Limitations
Honest disclosure is part of credibility. The following are known limitations of the current system:
Summary and data availability
TrueNoise measures community aircraft noise using a calibrated smartphone-based system anchored to a primary acoustic reference (IEC 60942:2018 Class 2 calibrator). The system is validated against a Class 2 reference meter that is itself field-verified against that calibrator before each comparison session. The calibration has been validated empirically on multiple aircraft overflights, including a post-correction agreement of −0.1 dB on a Boeing 767.
The system is not a substitute for a laboratory-grade measurement microphone. It is a measurement system whose empirical performance has been characterized against traceable references, whose limitations are documented, and whose data is published openly for independent evaluation.
All session data is available for download at truenoise.org/data.html. The dataset includes raw observations, psychoacoustic metrics, ADS-B attribution, calibration epoch tags, and windshield configuration for every recorded event. Pre-correction data is tagged pre_2026_06_01; post-correction data is tagged post_2026_06_01.