This step-by-step guide to the Android app Room Acoustics Meter, and more precisely to reading frequency response, phase, impulse response, and related quantities, is intended as an introduction to the field. As with any simple guide to a complex topic, counterexamples can sometimes be found. The goal is to help with measurements in a typical living room, to "get your foot in the door" in the first place. For smaller spaces such as car interiors (car hi-fi) or larger rooms such as event halls, the conditions assumed here do not apply. At the end, some measurement curves from private living rooms are interpreted for illustration.

Instructions for performing the measurement

The following sections assume that the data come from a correctly executed measurement. Before the measurement is started, the elementary rules for setting up loudspeakers should already have been followed.

Hifi apps guide you step by step through the measurement. You can decide how many microphone positions are measured. In an average living room, where the listening positions are no more than 2 m apart, two or three measurements often already give an overview of which measures make sense. The microphone should be placed where the listeners' ears would otherwise be.

What will be better with more measurements?

If possible, several setups of the speakers and listening positions should be compared. For each setup, at least 2 to 3 measurements should again be made. Sometimes changes of as little as 50 cm already bring significant improvements.

A better but more laborious way is a so-called sound field measurement: instead of averaging over adjacent frequencies within one measurement (smoothing), several (more than 10) measurements are made within a range of perhaps 50 cm, and the (less strongly smoothed) frequency responses are interpreted together. The result will certainly not overturn the results of other measurements, but it provides additional precision for fine-tuning a sound processor or equalizer.

More microphone positions naturally make the result more precise when more listening positions are involved. For rating the listening experience, measurements are generally taken only at (potential) listening positions. Measurements at other microphone positions have an indirect benefit: If there is more interest, a complete matrix of the room, e.g. at ear level with 1x1 m grid, can be recorded. In this way, one obtains a deeper understanding of the entire wave field in the room and can plan specific damping measures and the positioning of sound sources and listening positions.

Hifi-Apps are written so that a large number of measurements can be made with little effort: determining the microphone position and saving the results are automated. The Android device can easily be carried around the room, since neither an external power supply nor an external audio interface is needed.

When you start the app, you will be prompted to connect the Android device to the playback system via cable, if possible. Practical experiences for different types of connections:

Initially, it is also possible to work without a measurement microphone, see here. This introduction assumes that the measurement was made with an omnidirectional microphone. This has become established because different measurements would otherwise hardly be comparable. Those interested are of course free to experiment anyway, e.g. with so-called 3D measurement methods [Protheroe 2013]: for the app, it does not matter where the data come from. In principle, directional information is of course useful in practice: the search for interfering reflections can and should be supported by clapping and (directional) listening (see below).

In the following steps, the app guides you through dialogs until the measurement is started. The Farina algorithm used is relatively insensitive to ambient noise during the measurement. In practice, it is even used shortly before the start of a concert with a waiting audience. Nevertheless, it is a good idea to bring loudly rattling objects under control and repeat the measurement if necessary.

The measurement must be performed at moderate volume: The measurement signal puts considerably more strain on the ears and loudspeakers than music of the same volume. Some professional acousticians wear hearing protection for every measurement.

Setup settings (for advanced users)

In the setup menu, the duration of the log sweep and its cutoff frequencies can be set. Since the sweep is faded in and out slowly for technical reasons (Blackman windowing, 0.05 s lead-in, 0.005 s lead-out), the frequency range that can be used later is somewhat narrower. A start frequency of 20 Hz is sufficient to find resonances in rooms up to 8 m long. The upper cutoff frequency helps with various length measurements based on the sound propagation time; 20 kHz has proven itself in practice. Whether the loudspeaker covers a wider frequency range according to the manufacturer does not matter for the important part of the measured values.
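The 20 Hz start frequency can be checked against the lowest axial mode between two parallel walls, which lies at f = c / (2 L). The following is a minimal sketch, assuming a speed of sound of 343 m/s and idealized rigid walls; the function name is illustrative and not part of the app.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def lowest_axial_mode_hz(room_length_m, c=SPEED_OF_SOUND):
    """Frequency of the first axial mode between two parallel walls."""
    return c / (2.0 * room_length_m)


# Print the lowest axial mode for a few room lengths.
for length_m in (4.0, 6.0, 8.0):
    print(f"{length_m:.0f} m -> {lowest_axial_mode_hz(length_m):.1f} Hz")
```

For an 8 m room the result is still slightly above 20 Hz, which is consistent with the rule of thumb given above.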

The duration determines the precision of the measurement. It should be long enough to give interfering room resonances a chance to build up and thus be found in the results. To excite a room resonance, the sweep must dwell long enough in that resonance's frequency range. A good start is 2 seconds. Example for technicians: for a room mode with 300 ms rise time and 1/3 octave bandwidth, versus 10 octaves for the sweep, you get 300 ms * 1/(1/3) * 10 = 9 s, i.e. roughly 10 seconds. This is the highest value that can be set. If someone deliberately uses short sweeps to create optically nice straight frequency responses with the accompanying poor resolution, they should probably look around for another field of activity.

The sample rate can be specified in the setup menu. The app receives the offered values from the Android system of the device, as well as the information "native" for a certain value, usually 48,000 samples per second. This value should not be changed without concrete reasons (see below in the section "Impulse response"). High values ("a lot helps a lot") are not helpful per se.
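The estimate for technicians can be reproduced in a few lines. The sketch below assumes an ideal logarithmic sweep that spends the same time in every octave; the function names are illustrative only.

```python
import math


def sweep_span_octaves(f_start_hz, f_end_hz):
    """Number of octaves covered by a sweep from f_start_hz to f_end_hz."""
    return math.log2(f_end_hz / f_start_hz)


def required_sweep_duration_s(rise_time_s, mode_bandwidth_octaves, span_octaves):
    """Sweep length so that the sweep dwells for rise_time_s inside the mode's band.

    A logarithmic sweep spends duration * bandwidth / span seconds in a band,
    so the duration follows from solving that relation for the rise time.
    """
    return rise_time_s * span_octaves / mode_bandwidth_octaves


span = sweep_span_octaves(20.0, 20000.0)  # close to 10 octaves
print(f"span: {span:.2f} octaves")
print(f"duration: {required_sweep_duration_s(0.3, 1.0 / 3.0, span):.1f} s")
```

With the exact span of 20 Hz to 20 kHz the result is marginally below the rounded 10-octave figure used in the text.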

Evaluation of the frequency response

Here, "frequency response" (FR) or SPL (Sound Pressure Level) means the sum frequency response, i.e. the sound pressure with which a certain frequency arrives at the microphone at any later time after it has been fed into the system. It thus includes both the sound emitted directly by the loudspeaker and reflections from the walls, etc. This is a good start for getting an overview of the room properties.

Side thoughts on the sum frequency response and

The view of the sum frequency response is unusual at first: compared to the manufacturer's loudspeaker measurements, much larger fluctuations occur due to the room properties. Manufacturer's loudspeaker measurements usually show only the direct sound - the user's room is, of course, not known at the time of the measurement. Sometimes the measurement is really done outdoors. The sum frequency response in a listening room has hardly anything to do with the characteristics of the loudspeaker. Both frequency responses have their justification and are reflected in the listening impression: a balanced behavior in direct sound can be seen as a basic requirement, similar to the way the light source in a projector must provide white light. The Background knowledge section describes which sonic characteristics are to be expected with which room characteristics. Our sense of hearing is apparently able to separate the first direct sound from the loudspeaker and the reflections in such a way that not all these fluctuations are important.
The sum frequency response should follow a target curve that drops by a few dB toward higher frequencies [Møller 1974], [Toole 2015], [Web search "Target Curve spl" or "House curve spl"]. The app can display some established target curves.

Prerequisite for the interpretation are correctly performed measurements (see above). The results should resemble the curves shown here.

[Figure panels: unfiltered; 1/24 octave smoothing; 1/8 octave smoothing; 1/3 octave smoothing]
Sum frequency response of two mid-range floorstanding loudspeakers (Quadral Chromium Style 50) without sound correction in a 5 m x 7 m room. The microphone positions of all measurements are less than 20 cm apart, i.e. within the range of ear distance and head movements. The upper blue curves and lower red curves are from the left and right channels, respectively. The curves are shifted by 20 dB each. The measurement was made with a calibrated Audix TM-1 measurement microphone connected to a Samsung SM-T510 tablet via a Shure X2u XLR-to-USB adapter. The wide light blue curve is the preferred target curve of trained listeners according to [Toole 2015].
The same data, without the shift.

Left column - (almost) universal: For an intuition of where the data come from and what accuracy can be expected, it certainly doesn't hurt to look at a few unfiltered frequency responses at the beginning. Later this step can be skipped. The extreme fluctuations above about 300 Hz are caused by reflections. In open-air measurements and in the near field of the loudspeaker they are strongly reduced or not visible at all. The strength of this comb filter effect gives an idea of the respective uncertainty in the results: it is debatable to what extent the auditory sense follows the peaks or the averages. In the case of large fluctuations, a drop in treble after averaging can therefore be a consequence of the smoothing method and completely meaningless for the auditory impression. In the example measurements below, these raw data are therefore also attached.

"1/... Octave smoothing" columns - (almost) universal: Comparing first the different measurements of each stereo channel (in the lower diagrams), one sees that the values diverge from about 300-400 Hz upward (i.e. at wavelengths of about 1 m). This is plausible because for standing waves a maximum and a node are 1/4 of the wavelength apart, i.e. within the range over which the microphone was moved. The uncertainties are about 5 dB.
Between closely neighboring frequencies (≪ 1/3 octave) within each individual measurement, fluctuations of approximately 5 dB also occur. Evidently, above about 300 Hz, the wavelength-dependent gains and cancellations due to small changes in microphone position are very comparable to those due to small changes in frequency. In the range below 300 Hz the measurements are more similar. However, the individual measurement curves fluctuate more strongly at some points. Whether this is caused by reflections or room modes cannot always be seen in the sum frequency response.
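The quarter-wavelength argument can be checked numerically. The following sketch assumes a speed of sound of 343 m/s; the helper names are illustrative.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed


def wavelength_m(freq_hz, c=SPEED_OF_SOUND):
    """Wavelength of a sound wave in air."""
    return c / freq_hz


def maximum_to_node_m(freq_hz, c=SPEED_OF_SOUND):
    """Distance between a pressure maximum and the next node of a standing wave."""
    return wavelength_m(freq_hz, c) / 4.0


for f_hz in (100.0, 300.0, 400.0, 1000.0):
    print(f"{f_hz:6.0f} Hz: wavelength {wavelength_m(f_hz):.2f} m, "
          f"maximum to node {maximum_to_node_m(f_hz) * 100:.0f} cm")
```

Around 300 to 400 Hz the distance between maximum and node falls to roughly 20 to 30 cm, which is the order of magnitude of the microphone displacement in the example.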

These effects occur with practically all sum frequency responses of normal living rooms. With a higher direct sound component (see below), the fluctuation range can fall below 5 dB, which is desirable. With favorably placed (and driven) subwoofers or fewer reflections due to a more suitable floor plan, the peaks in the range below 300 Hz should be much more moderate. The additional effort for sound field measurements, i.e. averaging over many closely neighboring microphone positions, is only worthwhile in rare cases. Averaging over neighboring frequencies leads to very similar results in the upper frequency range, and in the lower frequency range the measurements are similar anyway. Some sound engineers, for example, use this method only for a final adjustment of the sound correction for a specific listening position.

Qualitatively generalizable: Both the 5 dB fluctuations of neighboring frequencies and the much more pronounced room modes or reflections in the bass range show that the room still has "some room for improvement" acoustically, but is not "catastrophic". As with many wave patterns in nature, it is the overall picture that counts; the individual fluctuations have little significance as long as they are evenly distributed. Basically, not every reflection is bad. However, the frequency response smoothed over 1/3 octave should definitely have fluctuations below 10 dB, or 15 dB at most. If they are evenly distributed, they are less of a problem. Setups in which a few frequencies stand out very prominently should be subjected to a listening test. A combination of loudspeaker positions, listening positions and damping that yields a balanced response will probably sound better.

The broader increases and decreases in the frequency response can be assigned to certain sound characters. This can contribute to the understanding of a listening test. If neutrality is not the goal, it can be incorporated into corrective measures. Search engine queries "sound character vs. frequency response" yield interesting findings, some of which are summarized here from the author's personal point of view:

All frequencies in Hz | 20-40 | 40-200 | 200-500 | 500-2k | 2k-6k | 6k-12k | 12k-20k
Important for | Feeling vibrations | 100: beginning of the deepest human voices; 150: warmth of sound, body of voices | 200: foundation of the human voice | Sensitive area of hearing. For guitars and voices | Most sensitive area of hearing. Clarity, proximity of voices | Brilliance, openness. Cymbals, strings | Sparkling, brilliant, transparent treble. Cymbals
Sound at +10 dB boost | Depending on listening experience and taste | 40-60: boomy; 100: bass muddy, but 60-150: boost for fullness in the bass range, fat sound; 125-250: "as if there's a blanket over it" | 300: potted... | 600: dull, honky, hollow; 500-2k: "old kitchen radio"; 1k: nasal, thin; 2k: thin, harsh | 2k-6k: "megaphone sound"; 5k: sharp, hissing | 7k: unpleasant "s" and "t" sounds, web search: "de-esser" | -
Sound at -10 dB cut | Too bad | 150: bass seems disconnected | 250: bass seems hollow; 450: voices seem hollow | 1k: lower by about 5 dB to "de-mulmerize" | 2k: lacks clarity; 3k: "attack" missing, cloying, romanticized; 4k: lacks finesse | - | -
Blauert bands | 300-600: front | 700-1800: behind | 2.2k-6k: front | 8k-11k: above | 11k-17k: behind

The sensations in rows 2-4 do not claim any scientific rigor; they are certainly not systematically reproducible. Where single frequencies are given, a range of 1 to 2 octaves FWHM is meant. The last row refers to the so-called Blauert bands: by raising certain frequency ranges, one can create the auditory impression that the sound comes from the front, above, behind or below [Blauert 1974, Wikipedia].

Side thoughts about smoothing

The question of which smoothing is most meaningful for which purpose fills many forum pages. Basically, one can either average over several measurements (sound field measurements, see above) or average along the frequency axis of one frequency response, for which there are again several methods. Before reading further, build your own intuition by switching between the settings (especially between 1/3 and 1/8 octave smoothing) and observing what each one shows. The raw data should also be viewed as a basis. Both the average sound pressure and the maximum sound pressure of each small frequency range contribute to the perceived loudness.
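To make the idea of averaging along the frequency axis concrete, here is a minimal sketch of one possible method: for every frequency, the power of all points within plus or minus half of a 1/N octave is averaged. This is an assumption for illustration; it is not claimed to be the method the app uses, and other methods (e.g. averaging dB values or weighting the window) give slightly different curves.

```python
import math


def octave_smooth(freqs_hz, levels_db, fraction=3):
    """Smooth a frequency response by power averaging over a 1/fraction octave band.

    freqs_hz and levels_db are equally long lists; fraction=3 means 1/3 octave.
    """
    half_band = 2.0 ** (1.0 / (2.0 * fraction))  # half the band width as a ratio
    smoothed = []
    for f_center in freqs_hz:
        lower, upper = f_center / half_band, f_center * half_band
        # Convert dB to power, average, and convert back to dB.
        powers = [10.0 ** (db / 10.0)
                  for f, db in zip(freqs_hz, levels_db) if lower <= f <= upper]
        smoothed.append(10.0 * math.log10(sum(powers) / len(powers)))
    return smoothed


# Synthetic example: a comb-like response alternating between 0 dB and -10 dB.
freqs = [100.0 * 2.0 ** (i / 48.0) for i in range(97)]  # 2 octaves, 48 points each
raw = [0.0 if i % 2 == 0 else -10.0 for i in range(97)]
third = octave_smooth(freqs, raw, fraction=3)
print(f"raw spread: {max(raw) - min(raw):.1f} dB, "
      f"smoothed spread: {max(third) - min(third):.1f} dB")
```

Because power rather than dB is averaged, the smoothed level of such a comb lies closer to the peaks than to the dips, which matches the remark that the peaks also contribute to perceived loudness.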

For the listening impression, the trends in the superposition of direct and reflected signals at the microphone or ear are important. Even in the smoothed data, the resulting individual peaks should not be overinterpreted: Their position also changes depending on the microphone location. In addition, listeners change their sitting position and have about 14 cm distance between the eardrums. It is important that this does not change the sound too much. There are two basic ways to realize this:

  • through a high proportion of direct sound. Studio technology uses near-field monitors for this. The listening distance is usually under 2 m. The more strongly the room is damped, the less critical the listening distance becomes. Horn loudspeakers also allow larger listening distances.
  • through ergodicity, i.e. good mixing, so that as far as possible all frequencies are reflected from the right directions at all times. Loudspeakers with omnidirectional or dipole-like radiation certainly make "a good start" if such a sound distribution is preferred.

Channel matching is rarely mentioned. It has at least the same priority as the frequency response: a system in which singers cannot be localized, or worse, are localized at a completely wrong position, cannot seriously be called "calibrated". Channel matching has to do with the left and right channels having equal frequency responses, but many other effects contribute as well. At the beginning, listening tests are certainly the better way to assess it.

Evaluation of phase, delay, group delay

While the frequency response shows how strongly a certain frequency is reproduced, these curves show, in various forms, when it is reproduced. Not all frequencies always arrive equally fast. An example from nature is the cracking of ice on a frozen lake: from some distance, the "crack" sound changes into a kind of "piu", because ice transmits the higher frequencies faster. When lightning strikes, the low-frequency thunder also often arrives later than the bang of the strike. In contrast to the breaking ice, whose behavior can be explained physically, many factors that are not precisely known interact here. With distant explosions, for example, the time course is completely different.

The behavior of sound in living rooms corresponds more to the second example. Only partial aspects sometimes seem explainable by concepts such as bass reflex ports. Without measures to improve the room acoustics, there is probably no need to deal with this. After such measures, certainly in the high-end segment, otherwise perhaps. A recipe for understanding the curves is already given elsewhere.

Evaluation of the impulse response

Roughly speaking, the impulse response (IR) is what you hear when a sharp bang is played. It characterizes echoes, reflections, reverberation, etc. in the listening room as well as the reaction of the loudspeaker cabinets including the associated electronics. Instead of the bang, Hifi-Apps play a more tolerable log sweep and reconstruct the data of the bang from it.
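For orientation, the following sketch generates an exponential (logarithmic) sine sweep with Blackman-shaped fades, using the lead-in and lead-out times mentioned in the setup section as defaults. It is a generic textbook formulation under these assumptions, not the app's implementation; the function and parameter names are illustrative.

```python
import math


def exponential_sweep(f_start_hz, f_end_hz, duration_s, sample_rate_hz,
                      fade_in_s=0.05, fade_out_s=0.005):
    """Exponential sine sweep whose frequency rises from f_start_hz to f_end_hz."""
    n = int(round(duration_s * sample_rate_hz))
    log_ratio = math.log(f_end_hz / f_start_hz)
    scale = 2.0 * math.pi * f_start_hz * duration_s / log_ratio
    # The phase grows exponentially, so every octave takes the same time.
    sweep = [math.sin(scale * (math.exp(i / n * log_ratio) - 1.0)) for i in range(n)]

    def blackman_rise(x):
        """Rising half of a Blackman window: 0 at x = 0, 1 at x = 1."""
        return 0.42 - 0.5 * math.cos(math.pi * x) + 0.08 * math.cos(2.0 * math.pi * x)

    n_in = int(round(fade_in_s * sample_rate_hz))
    n_out = int(round(fade_out_s * sample_rate_hz))
    for i in range(n_in):
        sweep[i] *= blackman_rise(i / n_in)          # fade in at the start
    for i in range(n_out):
        sweep[n - 1 - i] *= blackman_rise(i / n_out)  # fade out at the end
    return sweep


signal = exponential_sweep(20.0, 20000.0, 1.0, 48000)
print(len(signal), "samples, peak", round(max(abs(s) for s in signal), 3))
```

The impulse response is then obtained by deconvolving the recorded signal with the sweep, a step that is omitted here.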

Before dealing with the measured impulse response, it should be intuitively clear what it is all about. For this, you can clap your hands in different rooms and hear how the reverb decays. The ideal listening room must be neither "completely dead" nor reverberant like a bathroom. The reverb should also be evenly distributed in the room (e.g. not bounce back and forth between facing walls). With a little practice, this can be heard well. It can help to turn your head in different directions while clapping.

The measured impulse response shows the time course of the sound pressure or a closely related quantity. The x-axis always shows the time, in the case of hi-fi apps optionally also as a product with the speed of sound, in meters. The curve can be divided into three parts: direct sound, early reflections, and late reverberation.

While direct sound is a single event that can be well identified, the separation between early reflections and late reverberation is less obvious: they may overlap, or there may be a gap ("transition time") between the two [Defrance 2009]. The following figure always shows the same 4 impulse responses, or a selection of them, in different representations. A practical goal could be to bring the bad light blue/beige impulse response into the range of the good dark blue/red one by damping, placing diffusers, changing listening positions, etc.

[Figure panels: impulse response; zoom on the initial range; linear representation; associated frequency response]
Column 1: Comparison of the impulse responses of an empty square room and a damped listening room, left and right channel each.
Column 2: Detail from the first image, with the two most prominent peaks of a) marked with values.
Column 3: Linear representation of the same measurement; for clarity, only the left channels (blue).
Column 4: Associated frequency response of a). The red curves are calculated values: they show the sound intensity for the superposition of two waves delayed by the respective stated amount.
Column 1: As above (time axis extended to 500 ms), together with Schroeder curves and the values read from them for
- EDT (Early Decay Time according to ISO 3382), i.e. the duration of the decay from the dB value at t=0 to -10 dB, multiplied by 6
- T20, i.e. the duration of the decay from -5 dB to -20 dB, multiplied by 4.
The Schroeder curve starts at -10 ms here, so the value at t=0 is slightly below 0 dB. This does not change the results. All reverberation times refer uniformly to a 60 dB decay, i.e. the values read off for 20 dB or 15 dB have already been multiplied by 3 or 4, respectively.
Columns 2 and 3: Filtered impulse responses of 2 selected octave bands each, with Schroeder curves. Column 2: undamped room (light blue); column 3: damped (dark blue). The strong method dependence of the T60 values is evident: in the bass range of the undamped room, the fluctuations are so strong that specifying a reverberation time independent of the measurement method seems pointless. The rise at about 300 ms must certainly be judged quite differently psychoacoustically than the beginning of the curve. The decision to include it in the determination of T60 is therefore subjective. A comparison of the EDT values of both columns shows that outliers which probably do not reflect the listening experience can occur here too.
Column 4: The impulse responses filtered in third-octave steps, with Schroeder curves and T20 values. As explained for column 2, the values for the bass range should not be trusted naively.
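The Schroeder curve and the decay times read from it can be sketched as follows. The code uses backward integration of the squared impulse response and a simple two-point reading extrapolated to 60 dB; the evaluation ranges are parameters, and the names are illustrative. Standard-compliant evaluations typically fit a regression line instead, so this is only an approximation of the principle, not the app's algorithm.

```python
import math


def schroeder_curve_db(impulse_response):
    """Backward-integrated energy decay curve in dB, 0 dB at the first sample."""
    remaining = 0.0
    tail = []
    for x in reversed(impulse_response):
        remaining += x * x          # energy still to come after this sample
        tail.append(remaining)
    tail.reverse()
    total = tail[0]
    return [10.0 * math.log10(v / total) if v > 0.0 else float("-inf") for v in tail]


def decay_time_s(curve_db, sample_rate_hz, start_db, end_db):
    """Time for the curve to fall from start_db to end_db, extrapolated to 60 dB."""
    i_start = next(i for i, v in enumerate(curve_db) if v <= start_db)
    i_end = next(i for i, v in enumerate(curve_db) if v <= end_db)
    span_s = (i_end - i_start) / sample_rate_hz
    return span_s * 60.0 / (start_db - end_db)


# Synthetic impulse response with a known reverberation time of 0.5 s.
sample_rate = 8000
t60_true = 0.5
ir = [10.0 ** (-3.0 * (i / sample_rate) / t60_true) * (1.0 if i % 2 == 0 else -1.0)
      for i in range(sample_rate)]
curve = schroeder_curve_db(ir)
print("EDT:", round(decay_time_s(curve, sample_rate, 0.0, -10.0), 3), "s")
print("-5 to -20 dB:", round(decay_time_s(curve, sample_rate, -5.0, -20.0), 3), "s")
```

For a perfectly exponential decay all evaluation ranges give the same result; in real rooms they differ, which is the method dependence described in the caption.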

Light blue: As expected, the impulse response of the empty room is so poor that the details hardly need interpreting. Nevertheless, column 2: in the enlargement, the transient in the first 3 ms and the peaks after it stand out in particular. The first two peaks have been labeled with the sound propagation time. Evidently there are two sound-reflecting surfaces that reflect the sound from the left loudspeaker to the microphone with a detour of 2.84 m and 4.56 m, respectively. The acoustic treatment could begin with locating these surfaces using the string method and treating them. To spread out the initial transient somewhat, diffusers could be considered. On the other hand, the lower row of images shows that the reverberation time is much too long, which argues more for damping. Presumably any treatment would be an improvement.

Tolerances of 10 to 20 cm are normal for length specifications, and correspondingly more for large speakers, because the exact location of the sound source is often unclear due to the interaction of several drivers. Similar discrepancies between room size and measured delays should not cause too much concern either. It is probably enough to know that the acoustic size of a room can be up to 20% larger than its architectural size, especially if the walls are wood or plasterboard and resonate (web search "sound at media boundaries" for more).
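The relation between a reflection's detour, its delay and the resulting comb filter can be sketched as follows, assuming a speed of sound of 343 m/s and a single reflection with a freely chosen relative strength (reflection_gain is an illustrative assumption, not a measured value).

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def detour_to_delay_ms(detour_m, c=SPEED_OF_SOUND):
    """Delay of a reflection relative to the direct sound."""
    return 1000.0 * detour_m / c


def first_notch_hz(detour_m, c=SPEED_OF_SOUND):
    """Lowest frequency at which direct and reflected sound are in antiphase."""
    return c / (2.0 * detour_m)


def comb_level_db(freq_hz, detour_m, reflection_gain=0.5, c=SPEED_OF_SOUND):
    """Level of direct sound plus one delayed copy, relative to direct sound alone."""
    phase = 2.0 * math.pi * freq_hz * detour_m / c
    real = 1.0 + reflection_gain * math.cos(phase)
    imag = -reflection_gain * math.sin(phase)
    power = real * real + imag * imag
    return 10.0 * math.log10(power) if power > 0.0 else float("-inf")


for detour in (2.84, 4.56):
    print(f"detour {detour} m: delay {detour_to_delay_ms(detour):.2f} ms, "
          f"first notch at {first_notch_hz(detour):.1f} Hz")
```

Further notches follow at odd multiples of the first notch frequency, which produces the regular comb pattern in the frequency response.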

There are several options for the Y axis of the impulse response. As a first step, you can choose between linear and logarithmic representation. Since the sound pressure in the impulse response of a living room decays approximately exponentially over long stretches, the logarithmic representation is the almost self-evident choice: the decay appears as a straight line and deviations from it become recognizable. The linear representation is closer to the measured raw data. Accordingly, technical details such as the interaction of different loudspeaker drivers are easier to see. To view parts with small amplitude, the curve can be enlarged with two-finger gestures.

There is a fundamental problem with both representations: the energy arriving at the microphone for a time period is closely related to the area under the corresponding piece of curve, depending on the representation. Ideally, it is concentrated on a single narrow high peak. In the real world, however, its high-frequency and low-frequency components sooner or later diverge. The high-frequency parts, however, naturally remain limited to a short period of time. Consequently, they are represented by high narrow peaks. The low-frequency components cover a comparable area, but this area extends over a much longer period of time. Therefore they can hardly be seen, although they are much more important for the auditory sense. In forum discussions on room acoustics, a raw or slightly smoothed logarithmic representation is usually posted.

On the other hand, the peaks are a useful indicator to identify individual reflections as accurately as possible. They cannot simply be "smoothed away". So to understand the system, you should switch back and forth between the representations. The app offers the following options:

The row of buttons below it, [T=0] [FILTER]..., opens various settings that are hidden at startup to save space. Their documentation is shown in the app: when [HINTS] is activated, a short instruction appears for the most recently touched control. [FILTER] is described further below.


The "normal" sum frequency response shows how strongly each frequency is reproduced, regardless of when it arrives. In the waterfall diagram, as in the sum frequency response, amplitude is plotted against frequency. However, the arriving sound intensity is chopped into time segments and thus distributed over several curves. From here on these are called FR(t), to indicate that they are time-dependent frequency responses. The first FR(t) is thus, for example, the frequency response for the first 5 ms, the next one for the following 5 ms, and so on. This makes it easier to see whether individual irregularities in the sum frequency response come, for example, from short, intense reflections or from less intense but longer-lasting resonances. Both the audibility (i.e. whether countermeasures are needed at all) and the measures themselves depend on this.

The output is produced by computing the frequency response not from the entire impulse response, but from the segment that is to be examined for FR(t). Each FR(t) is thus derived from a particular segment of the impulse response around the time t.
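The principle can be sketched with a direct Fourier sum over windowed segments of the impulse response. The Hann window, the segment lengths and all names are assumptions for illustration; the app may use different windows and faster transforms.

```python
import cmath
import math


def slice_level_db(impulse_response, sample_rate_hz, start_s, window_s, freq_hz):
    """Level at freq_hz of the Hann-windowed segment starting at start_s."""
    i0 = int(round(start_s * sample_rate_hz))
    n = int(round(window_s * sample_rate_hz))
    segment = impulse_response[i0:i0 + n]
    acc = 0j
    for k, x in enumerate(segment):
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * k / n)  # Hann window
        acc += window * x * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
    return 20.0 * math.log10(abs(acc) + 1e-12)  # small offset avoids log(0)


def waterfall(impulse_response, sample_rate_hz, window_s, step_s, freqs_hz, n_slices):
    """List of FR(t) curves: one list of levels per time slice."""
    return [[slice_level_db(impulse_response, sample_rate_hz, s * step_s, window_s, f)
             for f in freqs_hz]
            for s in range(n_slices)]
```

A usage example with a synthetic 100 Hz resonance that decays with a time constant of 50 ms: each later slice shows a lower level at 100 Hz.

```python
sr = 4000
ir = [math.exp(-(i / sr) / 0.05) * math.sin(2.0 * math.pi * 100.0 * i / sr)
      for i in range(1600)]
slices = waterfall(ir, sr, 0.05, 0.05, [100.0, 1000.0], 4)
for index, levels in enumerate(slices):
    print(index, [round(v, 1) for v in levels])
```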

Even an ideal waterfall diagram in an absolutely reflection-free room would not be confined to FR(t=0): if, in the example, the width of the examined time segments were also set to 5 ms (= 1/200 Hz), frequencies below 200 Hz simply could not be defined, since no full period fits into this time segment. This Heisenberg limit is why there are no fast bass arias. It cannot be overcome by clever algebra.

For this reason, a compromise between time accuracy and frequency accuracy must always be chosen. There are two established methods for this [IRZU FFT WAVELETS]:
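The trade-off can be put into numbers with the simple rule that at least one full period must fit into the time segment. The sketch below uses exactly this rule of thumb; the function names are illustrative.

```python
def lowest_defined_frequency_hz(window_s):
    """Lowest frequency with one full period inside a window of window_s seconds."""
    return 1.0 / window_s


def window_needed_s(freq_hz, periods=1.0):
    """Window length that holds the given number of periods of freq_hz."""
    return periods / freq_hz


for window_ms in (5.0, 20.0, 50.0):
    print(f"{window_ms:4.0f} ms window -> "
          f"{lowest_defined_frequency_hz(window_ms / 1000.0):.0f} Hz and above")
print(f"20 Hz needs at least {window_needed_s(20.0) * 1000.0:.0f} ms")
```

A 5 ms segment therefore says nothing about the bass range, while a segment long enough for 20 Hz can no longer separate reflections that arrive a few milliseconds apart.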


The spectrogram is similar in content to the waterfall diagram. It uses different colors instead of different curves. In both diagrams, one axis is for frequency. In the spectrogram, the other axis is for time and the color is for intensity. In the waterfall diagram, the other axis is for intensity and different curves are shown for different times. With these differences in mind, the documentation is the same as for the waterfall diagram.


In the Evaluation of Impulse Response section, we explained how hi-fi apps measure reverberation time T60 and presented various filtered impulse responses with associated T60. The calculations here are identical, but go one step further: the reverberation time is calculated with third-octave or octave spacing for many frequencies and displayed as a function of frequency. This allows a direct comparison with common standards. These can be selected and displayed. To calculate the standard values, the room volume must be specified.
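The third-octave frequencies for such a display can be generated as follows. The sketch uses exact base-2 spacing around a 1 kHz reference, which is an assumption; the nominal values printed in standards and on equipment (e.g. 31.5 Hz, 63 Hz) are rounded versions of such series.

```python
import math


def third_octave_centres_hz(f_min_hz, f_max_hz, reference_hz=1000.0):
    """Exact third-octave centre frequencies within [f_min_hz, f_max_hz]."""
    n_low = math.ceil(3.0 * math.log2(f_min_hz / reference_hz))
    n_high = math.floor(3.0 * math.log2(f_max_hz / reference_hz))
    return [reference_hz * 2.0 ** (n / 3.0) for n in range(n_low, n_high + 1)]


def band_edges_hz(centre_hz, fraction=3):
    """Lower and upper edge of a 1/fraction octave band around centre_hz."""
    half_band = 2.0 ** (1.0 / (2.0 * fraction))
    return centre_hz / half_band, centre_hz * half_band


centres = third_octave_centres_hz(100.0, 4000.0)
print([round(f) for f in centres])
```

Each band is then filtered out of the impulse response and evaluated separately, as described above.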

The calculated reverberation times in the bass range must not be blindly trusted. As shown in the section "Evaluation of the impulse response", the filtered impulse response there is subject to considerable fluctuations of various periods. These result from a mixture of chaotic room acoustics (below the Schroeder frequency), which depend strongly on the microphone position, etc., and filter effects. Ultimately, a single T60 value can never fully describe the highly complex behavior of a listening room in this frequency range. Only if it remains stable across comparative measurements and different evaluation methods can it serve as a reference point. In [Zehner Ringversuch], measurements with 13 different acoustics software packages, operated by sound engineers with mostly far more than 10 years of professional experience, are compared.
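The Schroeder frequency mentioned here is commonly estimated as f ≈ 2000 * sqrt(T60 / V), with T60 in seconds and the room volume V in cubic meters. The sketch below applies this estimate; the room height of 2.5 m and the reverberation time of 0.5 s are assumed example values, not measurements from this guide.

```python
import math


def schroeder_frequency_hz(t60_s, volume_m3):
    """Common estimate of the transition between modal and diffuse behavior."""
    return 2000.0 * math.sqrt(t60_s / volume_m3)


# Assumed example: a 5 m x 7 m floor plan with 2.5 m ceiling height and T60 = 0.5 s.
volume = 5.0 * 7.0 * 2.5
print(f"volume {volume:.1f} m^3 -> "
      f"Schroeder frequency about {schroeder_frequency_hz(0.5, volume):.0f} Hz")
```

Below this frequency, individual room modes dominate and a single reverberation time is a correspondingly rough description.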

It is much easier to attenuate high frequencies than low ones. Low frequencies, however, play an important role in word understanding. Accordingly, the reverberation time in the lower frequency range must be kept under control as well as possible. When DIN 18041 was drawn up, compromises were made between acoustic quality and feasibility for the lower frequency range [Fuchs 2019]. In case of doubt, it is therefore better if the measured values lie further down in the range of the standard.

If the reverberation time in the low frequency range is too long, the cause can be investigated in the spectrogram or waterfall diagram. A room mode should be found in the respective frequency band. The elimination in an existing building is difficult. Some approaches can be found here.

Room Acoustics

This representation can be understood as a parameterized sound pressure map. Sound pressure maps show how loud it is at the individual listening positions. In private rooms, however, this is not very important; the volume is more or less the same everywhere. The problem is the resonances and reflections that can create a different auditory image at each place. What matters much more here is the sound pressure at a particular frequency. This is exactly where the representation comes in: the sound pressure map is supplemented by a slider with which the frequency can be tuned:

[Figure panels: raw data; 44 Hz (λ = 7.8 m); 140 Hz (λ = 2.5 m); 170 Hz (λ = 2.0 m)]
Column 1: Raw data from the frequency responses (1/3 octave smoothed).
 1: "All Frequencies" vs. "Room Modes" toggle: controls the behavior of the frequency slider next to it
 2: Setup for switching between display of the raw data, heatmap, etc.
 3: Channel selection.
Columns 2, 3, 4: Heatmap at different frequencies. The narrow stripe alternating between black and white at the right edge of the image represents the wavelength.

For the example, measurements were taken at 28 places. The graphic shows the floor plan of the room, the red areas are loud for the respective frequency, the green areas are quiet. At 44 Hz and 140 Hz, clear room resonances build up, once transversely and once longitudinally. The 44 Hz resonance does not behave "textbook": on the right side it is barely visible. The room has two thin glass doors there, which are obviously acoustically favorable. The next steps should be to investigate which other loudspeaker setups or drive units improve the picture. It is clearly recognizable that the green / blue area can hardly be improved by more power: For the required increase of the sound pressure by approx. 10 dB the power would have to be increased tenfold, which would make the bass in other room and frequency ranges (and outside the room) unbearable.
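The statement about power follows directly from the definition of the decibel. A minimal sketch (function names are illustrative):

```python
def power_ratio(level_change_db):
    """Factor by which the power must change for a given level change."""
    return 10.0 ** (level_change_db / 10.0)


def pressure_ratio(level_change_db):
    """Factor by which the sound pressure changes for a given level change."""
    return 10.0 ** (level_change_db / 20.0)


for change_db in (3.0, 6.0, 10.0):
    print(f"+{change_db:.0f} dB: power x{power_ratio(change_db):.1f}, "
          f"pressure x{pressure_ratio(change_db):.2f}")
```

Filling a 10 dB dip by equalization therefore demands ten times the amplifier power at that frequency, while all positions outside the dip receive the full boost.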

Measurement details

The measurement should extend over the entire floor space of the room in a grid at ear level. (Later during the evaluation, the diagram thus shows the floor plan of the room). The app automatically determines the microphone positions based on the sound propagation times. This makes the horizontal modes with their respective maxima visible. The more clearly the maximum is visible at a certain point, the more effective is the damping at exactly this point. This gives you concrete comparative values of what attenuation can do at a certain point and you are less dependent on intuition and guesswork.
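How a position can in principle be derived from sound propagation times is shown in the following sketch. It is an assumption for illustration, not the app's actual method: with two loudspeakers at known positions, each propagation time gives a distance, and the microphone lies on the intersection of two circles. All names and coordinates are hypothetical.

```python
import math


def microphone_positions(left_xy, right_xy, delay_left_s, delay_right_s, c=343.0):
    """Both intersection points of the two distance circles around the loudspeakers."""
    (x1, y1), (x2, y2) = left_xy, right_xy
    r1, r2 = c * delay_left_s, c * delay_right_s
    d = math.hypot(x2 - x1, y2 - y1)
    if d == 0.0 or d > r1 + r2 or d < abs(r1 - r2):
        raise ValueError("delays are inconsistent with the loudspeaker spacing")
    a = (r1 * r1 - r2 * r2 + d * d) / (2.0 * d)  # distance along the speaker axis
    h = math.sqrt(max(r1 * r1 - a * a, 0.0))     # distance perpendicular to it
    ux, uy = (x2 - x1) / d, (y2 - y1) / d        # unit vector from left to right
    px, py = x1 + a * ux, y1 + a * uy
    # The two solutions are mirror images with respect to the speaker axis;
    # which one lies inside the room must be decided by the caller.
    return (px - h * uy, py + h * ux), (px + h * uy, py - h * ux)


# Hypothetical setup: loudspeakers 2 m apart, microphone at x = 1.5 m, y = 3 m.
c = 343.0
t_left = math.hypot(1.5, 3.0) / c
t_right = math.hypot(0.5, 3.0) / c
print(microphone_positions((0.0, 0.0), (2.0, 0.0), t_left, t_right))
```

In practice the propagation times would be taken from the direct-sound peaks of the left and right impulse responses, and timing tolerances of the playback chain limit the accuracy.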

A grid spacing of 1 m has proven practical. Fluctuations on a smaller scale should be understood as trends, as described above, so you can save yourself the trouble of measuring them individually.

The measurement is performed separately for the right and left channels. If subwoofers are used, they should be set "as usual". Even in stereo recordings without a separate bass (LFE) channel, the bass range can be mixed (partially) in mono. This can cause reinforcement and cancellation between the two loudspeakers, which would be detected by further measurements in which several loudspeakers are driven at the same time. In my personal experience, however, one quickly runs into the "curse of many parameters" without a fixed procedure. A suggestion for such a procedure can be found in the documentation for the app "Subwoofer Optimizer".

Evaluation details

Above the diagram there is a slider for selecting the frequency. The "All Frequencies" vs. "Room Modes" toggle to its left controls the behavior of the slider. "All Frequencies" allows free selection of the frequency; "Room Modes" offers a few "especially suspicious" frequencies at which the level differs strongly depending on the position in the room.
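One simple way to find such "suspicious" frequencies is to rank them by the spread between the loudest and the quietest position. The following sketch only illustrates that idea; the data layout, the function name, and the ranking rule are assumptions and not taken from the app.

```python
def suspicious_frequencies(levels_db, count=3):
    """Return the frequencies with the largest level spread across positions.

    levels_db maps a frequency in Hz to a list of levels in dB, one per
    microphone position (hypothetical data layout).
    """
    spread = {freq: max(values) - min(values) for freq, values in levels_db.items()}
    ranked = sorted(spread, key=spread.get, reverse=True)
    return sorted(ranked[:count])


if __name__ == "__main__":
    # Invented example levels, not measured data.
    example = {
        44: [80.0, 62.0, 75.0],
        100: [70.0, 69.0, 71.0],
        140: [78.0, 60.0, 66.0],
        170: [72.0, 68.0, 70.0],
    }
    print(suspicious_frequencies(example, count=2))
```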

If several channels (R, L...) or several loudspeaker setups were measured, a corresponding selection appears in the lower area:

  • All speakers (max. SPL of ...) gives a first overview. Room modes are recognizable regardless of which loudspeaker in which position excites them. For each microphone position the app uses all measurements (right and left channel, and several loudspeaker setups where applicable) and displays the maximum measured sound pressure. (The measurements are automatically level-matched against each other.)
  • All positions of spkr. ... works like the first item, but is limited to the respective channel (right channel only or left channel only). This shows whether certain modes are excited by a single loudspeaker.
  • Single pos. of spkr. ... is finally limited to one setup of a particular loudspeaker. This shows whether certain modes are excited by a single loudspeaker in a particular setup.
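The "max. SPL" overview can be pictured as taking, for each microphone position, the highest level found in any of the selected measurement series. The sketch below is an illustration only. How the app matches the levels of the series is not described here, so aligning each series to a mean of 0 dB is an assumption, as are the function names.

```python
def level_match(levels_db):
    """Shift one measurement series so that its mean level is 0 dB."""
    mean = sum(levels_db) / len(levels_db)
    return [value - mean for value in levels_db]


def max_spl_per_position(series):
    """Combine several series (one level per microphone position each)
    by taking the maximum matched level at every position."""
    matched = [level_match(one_series) for one_series in series]
    return [max(values) for values in zip(*matched)]
```

Passing only the series of one channel, or of one setup, corresponds to the second and third selection.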

The impulse response - technical background

Frequency response and impulse response are Fourier transforms of each other. In the examples above, filters were used in both the frequency and time domains. The Fourier and Hilbert transforms of these filters produce artifacts that must not be confused with the physical properties of the system under study: a filter in the frequency domain shows up as a kind of transient in the time domain and vice versa. This is called filter ringing. It also occurs in linear systems, so technically it has a different origin than "normal ringing" due to nonlinearities.

The examples were generated in the app by replacing the measured microphone signal with the logsweep itself. This simulates an ideal system that reproduces the logsweep exactly. The impulse response is filtered in the [IMP RESP] view by activating [FILTER]. In some cases the logsweep was additionally edited with a wave editor (see tables).

The following table shows artifacts caused by different sample rates and limiting the frequency response at 20 kHz:

Impulse response | Zoom to initial region | Linear representation | Corresponding frequency response
Column 1: The filtering of the impulse response is done in the [IMP RESP] view by activating [FILTER].
Column 2: Impulse response for input = logsweep. Sections with enlarged time axis and different sample rates.
Column 3: Analogous picture at a lower frequency, using a filter in the app.
Column 4: Logarithmic plot of the same filtered data with reverberation time.

In the normal view the impulse response shows hardly recognizable deflections, but after zooming the time axis they become clear, with a period of approx. 0.05 ms, i.e. roughly 20 kHz. They do not depend on the sample rate; the image merely becomes clearer at 192 kHz. The unsmoothed Hilbert transform (ETC, for Energy Time Curve) forms the envelope.

For comparison, several filters with different slopes were set at 50 Hz. An analogous picture appears there on the correspondingly longer time scale. The visible artifact is therefore most likely due to the frequency response being limited at 20 kHz. An ideal filter that cuts off abruptly at the angular frequency $\omega_g$ would have the impulse response $\sin(\omega_g t)/t$, and its zero crossings would lie somewhat further out, at multiples of $\pi/\omega_g$, i.e. 10 ms for a 50 Hz cutoff (web search: si-filter, Küpfmüller lowpass). The filters used here have a roll-off of finite bandwidth and therefore decay faster. A roll-off bandwidth of 1/3 octave lets the impulse response decay to a non-interfering level after 5 to 10 periods. To cover this additional 1/3 octave, the measurement data would have to extend to 25 kHz.
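The numbers for the ideal filter can be reproduced with a short sketch. It evaluates the impulse response $\sin(\omega_g t)/(\pi t)$, which is the expression above with an added normalization factor of $1/\pi$, and derives the spacing of the zero crossings and the upper edge of a 1/3 octave roll-off. The function names are chosen for illustration only.

```python
import math


def ideal_lowpass_impulse(t_s, cutoff_hz):
    """Impulse response sin(w_g t) / (pi t) of an ideal low-pass filter."""
    w_g = 2.0 * math.pi * cutoff_hz
    if t_s == 0.0:
        return w_g / math.pi  # limit value for t -> 0
    return math.sin(w_g * t_s) / (math.pi * t_s)


def zero_crossing_spacing_s(cutoff_hz):
    """Spacing pi / w_g between neighbouring zero crossings."""
    return 1.0 / (2.0 * cutoff_hz)


def third_octave_above_hz(frequency_hz):
    """Frequency that lies 1/3 octave above the given frequency."""
    return frequency_hz * 2.0 ** (1.0 / 3.0)


if __name__ == "__main__":
    print(f"50 Hz cutoff: zero crossings every {zero_crossing_spacing_s(50.0) * 1e3:.1f} ms")
    print(f"20 kHz cutoff: ringing period {2e3 * zero_crossing_spacing_s(20e3):.3f} ms")
    print(f"1/3 octave above 20 kHz: {third_octave_above_hz(20e3) / 1e3:.1f} kHz")
```

Two zero-crossing intervals make up one ringing period, which for a 20 kHz cutoff gives the period of about 0.05 ms seen in the zoomed impulse response.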

Conclusion: Oscillations in the range of the roll-off frequency are artifacts. They are no indication of defects or inaccuracies in the equipment used, nor are they consequences of whimsical "graded reflective" objects. If they occur while examining reflections, one should classify them accordingly and not be bothered by them. Eliminating them would require a roll-off well above the 20 kHz range. Since the usual measurements provide no data there, this would only be possible with computational tricks.

The fourth image shows the same data logarithmically (in dB) together with the corresponding T60 value. Even though the impulse response used has a reverberation time close to zero, the reverberation time calculated after filtering is not zero. In the figure, T20 (i.e. T60 determined from a drop of 20 dB) is 150 ms at a bandwidth of 1 octave and a corner frequency of 50 Hz. At a bandwidth of 1/6 octave the value rises to 327 ms. In practice, if the filters are set too narrow, filter ringing can reach the order of magnitude of the useful signal [Goertz 2020]. This must be kept in mind when investigating narrow frequency bands.
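How a T20 value is obtained from an impulse response can be sketched as follows. This is not the app's implementation. It uses the common approach of Schroeder backward integration followed by a straight-line fit between -5 dB and -25 dB, and the function name is an assumption.

```python
import math


def t20_seconds(impulse, sample_rate_hz):
    """Estimate T60 from the -5 dB to -25 dB part of the decay curve (T20)."""
    # Schroeder backward integration of the squared impulse response.
    decay = []
    running = 0.0
    for sample in reversed(impulse):
        running += sample * sample
        decay.append(running)
    decay.reverse()
    total = decay[0]
    # Collect the points of the decay curve between -5 dB and -25 dB.
    points = []
    for index, value in enumerate(decay):
        if value <= 0.0:
            break
        level_db = 10.0 * math.log10(value / total)
        if -25.0 <= level_db <= -5.0:
            points.append((index / sample_rate_hz, level_db))
    if len(points) < 2:
        raise ValueError("decay curve too short for a T20 fit")
    # Least-squares line through the selected points.
    mean_t = sum(t for t, _ in points) / len(points)
    mean_l = sum(level for _, level in points) / len(points)
    slope = (sum((t - mean_t) * (level - mean_l) for t, level in points)
             / sum((t - mean_t) ** 2 for t, _ in points))
    return -60.0 / slope  # time for a 60 dB drop at the fitted slope
```

Applied to a band-filtered impulse response, such an estimate includes the ringing of the filter itself, which is the effect described above.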

Different speakers and rooms

  Description | Frequency and impulse response | Hifi-Apps result and comment
PC R5x5
Small PC speakers on a table in a nearly undamped square room (5 m x 5 m). 2019-08-29_INT.PCSPKR.WZ5X5 (German)
Small PC speakers in the office. 2019-08-13_TAB_PC_SPKR_CABLE_CONNECTED LIST_2019-08-13_TAB_PC_SPKR_BT_CONNECTED
Modular DIY speakers with plasma tweeters (Magnat MP 02), damping with acoustic curtains. LIST_2019-08-19_INT.GAUS.VORH.OFFEN LIST_2019-08-19_INT.GAUS.VORH.ZU LIST_2019-08-19_AUDIX.GAUS.VORH.OFFEN LIST_2019-08-19_AUDIX.GAUS.VORH.ZU.
Dipole speakers (Martin Logan ESL 9) 2019-09-22_ML_INT_XXX
Floorstanding speakers (Quadral Chromium Style 50) in a 5 m x 7 m room with average furnishings LIST_2019-06-11_AUDIX.WZ.STR-GTN LIST_2019-06-11_INT.WZ.STR-GTN.
Not a worst-case scenario: Quadral Chromium Style 30 in a bathroom. LIST_2019-10-17_INT_QUADRAL_BAD
QU30 R5x5
Artificially provoked floor reflections (right speaker tilted toward the floor) LIST_2019-10-17_AUDIX_W5_FB. LIST_2019-10-17_INTW5_FLOORBOUNCE.
Horn speaker MARTION Bullfrog, active LIST_2019-10-30_MARTION_BULLFROG


[Blauert 1974, Wikipedia] Wikipedia article "Blauertsche Bänder".

[Defrance 2009] G. Defrance, L. Daudet and J-D. Polack: Using Matching Pursuit for estimating mixing time within Room Impulse Responses. DOI:10.3813/AAA.918239

[Fuchs 2019] Helmut Fuchs: lecture, TU Berlin, 2019.

[Goertz 2020] Anselm Goertz: Seminar "Studioakustik und Monitorlautsprecher", TU Berlin, 2020.


[Møller 1974] Henning Møller: Relevant loudspeaker tests in studios, in Hi-Fi dealers' demo rooms, in the home etc. using 1/3 octave, pink-weighted, random noise. 47th Audio Engineering Society Convention, 1974-02-26/29, Copenhagen (Denmark).

[Toole 2015] Floyd E. Toole: The Measurement and Calibration of Sound Reproducing Systems. Journal of the Audio Engineering Society 63(7/8):512-541, August 2015.

[Usher 2010] John Usher: An improved method to determine the onset timings of reflections in an acoustic impulse response. The Journal of the Acoustical Society of America 127, EL172 (2010).

[Zehner Ringversuch] Markus Zehner: Ringversuch Nachhallzeit.

[Protheroe 2013] Daniel Protheroe, Bernard Guillemin: 3D impulse response measurements of spaces using an inexpensive microphone array. International Symposium on Room Acoustics 2013, Toronto, Canada.