Introduction


Smartphone audio capabilities play a significant role in the overall smartphone experience. Today, we increasingly use our smartphones to capture moments with friends and family and to shoot selfie videos, as well as to listen to music, watch videos, or play games. Across all of these uses, audio quality varies widely between devices for both recording and playback, yet consumers who care about audio quality have had little guidance or information available.

DXOMARK introduced its protocol in October 2019, aiming to provide a comprehensive evaluation of audio quality for each smartphone. The DXOMARK Audio score is composed of two main sub-scores: Playback and Recording. For each sub-score, we evaluate the most common use cases, which are themselves measured against a series of technical attributes.
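As a rough illustration of how two weighted sub-scores could combine, the sketch below uses the 70% (Playback) and 30% (Recording) figures shown in the section headings on this page. The weighted arithmetic mean and the function name `combined_audio_score` are assumptions made for illustration; the page does not state the formula DXOMARK actually applies.

```python
# Weights as shown in the "Playback 70%" and "Recording 30%" headings.
PLAYBACK_WEIGHT = 0.70
RECORDING_WEIGHT = 0.30


def combined_audio_score(playback: float, recording: float) -> float:
    """Hypothetical combination: a plain weighted arithmetic mean.

    This is an assumption for illustration only, not DXOMARK's
    published scoring formula.
    """
    return PLAYBACK_WEIGHT * playback + RECORDING_WEIGHT * recording


if __name__ == "__main__":
    # Made-up sub-scores, chosen only to show the arithmetic.
    print(round(combined_audio_score(80, 60), 1))
```

Under this assumption, a device that improves its Playback sub-score by one point gains more in the combined result than one that improves Recording by the same amount, which reflects the heavier Playback weighting.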

To keep up with the latest technology trends and uses, DXOMARK regularly updates its protocols so that its evaluations remain exhaustive and meaningful.

Audio Score Structure


30 hours of testing for each smartphone

20 hours of perceptual evaluation cross-validated with objective measurements


Playback 70%

We measure playback performance across the 3 most common use cases and a total of 5 technical attributes.

Technical attributes evaluated:

    • Timbre: Evaluates the ability to render the correct frequency output according to the use case and user’s expectations, looking at bass, midrange, and treble frequencies as well as the tonal balance among them. A good tonal balance consists of an even distribution of these frequencies.
    • Spatial: Evaluates the ability to render a virtual sound scene that is true to reality. Good spatial performance means being able to accurately place a sound source, such as an instrument or an explosion in a movie.
    • Dynamics: Covers a device’s ability to render loudness variations and to convey punch as well as clear attack and bass precision.
    • Volume: Evaluates the sound pressure levels at various volume settings to determine the minimum and the maximum volume as well as volume consistency.
    • Artifacts: Evaluates the presence of accidental or unwanted sounds resulting from a device’s design or tuning.

Use cases evaluated:

Watching Videos

We evaluate how well the device renders a scene in line with the original track across various types of movies. For this use case, our experts pay particular attention to the following attributes and device capabilities:

    • Timbre: a correct tonal balance with precise and sharp bass rendition, in line with the original track
    • Spatial: a wide stereo scene with centered voices

Listening to Music

We evaluate whether musical content is delivered in line with the master recording. For this use case, our experts pay particular attention to the following attributes and device capabilities:

    • Timbre: a correct tonal balance in line with the original track
    • Dynamics: a sharp and precise attack
    • Spatial: wideness in line with the original track

Gaming

We evaluate whether the device offers an immersive audio experience. For this use case, our experts pay particular attention to the following attributes and device capabilities:

    • Spatial: good spatiality, allowing users to recognize where sounds are coming from, which is crucial to the gaming experience
    • Timbre: a correct tonal balance
    • Artifacts: good management of artifacts, for example speakers that cannot be occluded by the hands while gaming

Recording 30%

We measure recording performance across the 5 most common use cases and a total of 6 technical attributes.

Technical attributes evaluated:

    • Timbre: Evaluates the ability to render the correct frequency output according to the use case and user’s expectations, looking at bass, midrange, and treble frequencies as well as the tonal balance among them. A good tonal balance consists of an even distribution of these frequencies.
    • Spatial: Evaluates the ability to render a virtual sound scene that is true to reality. Good spatial performance means being able to accurately place an instrument, an orchestra, or a specific action heard in a video.
    • Dynamics: Covers a device’s ability to render loudness variations and to convey punch as well as clear attack and bass precision.
    • Volume: Evaluates the ability of the device to record at the appropriate volume, with voices that can be heard, whatever the input acoustic level.
    • Artifacts: Evaluates whether the recording contains anomalous sounds that are not present in the original sound source.
    • Background: Evaluates the faithfulness of the background audio sources, their tonal balance and the presence of disturbances in the background audio rendition.

Use cases evaluated:

Friends & Family videos

We evaluate the audio recording when shooting video with the main camera in outdoor and indoor environments. For this use case, our experts pay particular attention to the following attributes and device capabilities:

    • Timbre: intelligible and natural voices
    • Spatial: good spatiality, allowing users to correctly position the audio sources (wideness, distance, and localizability), and efficient audio zoom (if available)
    • Background: background noise reduction
    • Artifacts: good management of artifacts, particularly the reduction of wind noise

Selfie videos

We evaluate recording performance with subjects in front of the selfie camera in outdoor and indoor environments. For this use case, our experts pay particular attention to the following attributes and device capabilities:

    • Timbre: clear and natural voice rendering
    • Artifacts: good management of artifacts, with reduction of background sounds or microphones that cannot be occluded

Concerts

We evaluate music recorded in high-volume, bass-heavy environments such as electronic music concerts. For this use case, our experts pay particular attention to the following attributes and device capabilities:

    • Timbre: a correct overall tonal balance
    • Artifacts: the capability to record very loud situations without audible distortion or compression artifacts

Meetings

We evaluate the handling of audio when several people are speaking in a room. For this use case, our experts pay particular attention to the following attributes and device capabilities:

    • Spatial: the ability to record several voices coming from different directions
    • Timbre: intelligible and localizable voices

Memos

We evaluate the audio recording of a single user talking in front of the smartphone in challenging environments. For this use case, our experts pay particular attention to the following attributes and device capabilities:

    • Timbre: intelligible voices
    • Dynamics: adequate noise reduction

What we also Test