After launching the intriguing Raven H smart speaker in 2017, which reportedly sold in disappointing numbers, Baidu went to the other extreme. In June 2018, the Chinese company released the Xiaodu, a low-cost smart speaker that went on to sell millions of units in only a few months. By the end of 2019, this growth had carried Baidu past Alibaba to the top of the Chinese smart speaker market, and past Google on the global scale to become Amazon’s main smart speaker rival.
In May 2020, Baidu rolled out a second edition of the Xiaodu, called Xiaodu Smart Speaker Ultimate Edition. While the updated version boasts an impressive list of connected features and voice recognition technologies, Baidu gives out little information about its audio playback abilities. In fact, it gives none at all, beyond stating that the very compact speaker integrates “modeling technology to ensure advanced acoustics” and is able to “automatically synthesize the parents’ tones to tell over 500 stories for children.”
To find out more about this best-seller’s audio performance, we put the Baidu Xiaodu Smart Speaker Ultimate Edition through our rigorous DXOMARK Wireless Speaker test suite. In this review, we will break down how it fared at audio playback in a variety of tests and several common use cases.
Key specifications include:
- Single upward-firing speaker
- Bluetooth 4.2
- No battery
- DuerOS assistant
- 300g
Test conditions:
- Tested with Motorola G8
- Communication protocol used: Bluetooth
- Firmware version: 2.11.2.20200520094
About DXOMARK Wireless Speaker tests: For scoring and analysis in our wireless speaker reviews, DXOMARK engineers perform a variety of objective tests and undertake more than 20 hours of perceptual evaluation under controlled lab conditions. This article highlights the most important results of our testing. Note that we evaluate playback using only the device’s built-in hardware. The Baidu Xiaodu Smart Speaker Ultimate Edition falls into the Essential category of devices in the DXOMARK Speaker rankings.
Test summary
With a global score of 48, the Baidu Xiaodu is among the lowest-scoring speakers we have tested so far. With the exception of the artifacts category, all of its sub-scores rank last in our database. Its overall playback performance is heavily hampered by a tonal imbalance resulting from absent bass, recessed treble, and a lack of low-midrange frequencies. Consequently, bass precision and punch performances are both particularly poor.
Since the Xiaodu is built with a single upward-firing speaker, wideness is essentially non-existent. To cap it all, Baidu’s low-cost speaker delivers insufficient maximum volume and inconsistent volume steps.
The good news is that undesirable artifacts are under control. Indeed, the Baidu Xiaodu produces fairly clean sound from soft to nominal volumes, and very few artifacts at maximum volumes (granting that its maximum volume is considerably lower than that of most of the other tested speakers). It also ensures good localizability, and realistic distance rendering for vocal content.
Sub-scores explained
The DXOMARK Speaker overall score of 48 for the Baidu Xiaodu Smart Speaker Ultimate Edition is derived from a range of sub-scores. In this section, we will take a closer look at these audio quality sub-scores and explain what they mean for the user, and we will show some comparison data from two of the Xiaodu’s principal competitors in the Essential category, the TMall Genie X5 and the Yandex Station.
Timbre
DXOMARK timbre tests measure how well a speaker reproduces sound across the audible tonal range and take into account bass, midrange, treble, tonal balance, and volume dependency.
Most notable in our timbre tests of the Baidu Xiaodu is the deficit of high- and low-end extension. In other words, bass is essentially absent, and treble is critically recessed.
Low-mids are also lacking; as shown in the graph below, the frequency response starts dropping below 300 Hz, and plummets below 200 Hz.
This means that the speaker reproduces a particularly narrow frequency range centered on the high-mids, resulting in an overall nasal sound that is poorly suited to listening to music of any genre or to watching movies. With such a timbre performance, it is safe to say that most children would hardly recognize their “synthesized parents’ tones.”
Dynamics
Our dynamics tests measure how well a device reproduces the energy level of a sound source, taking into account attack, bass precision, and punch.
The Xiaodu’s dynamics sub-score is also quite weak. While attack remains acceptable thanks to the prevalence of high-mids, the lack of low-mids impairs punch, and unsurprisingly, bass precision is severely affected by the near-absence of bass.
Spatial
Our spatial tests measure a speaker’s ability to reproduce stereo sound in all directions, taking into account localizability, balance, wideness, distance, and directivity.
In the spatial area as well, Baidu’s best-selling speaker leaves something to be desired. Its sub-score of 62 is heavily impaired by an absence of wideness and unrealistic distance rendering except for vocal content such as podcasts.
Because of the Xiaodu’s single upward-firing speaker design, its directivity performance is great, meaning that sound spreads out evenly through 360° around the speaker. Furthermore, its small form factor and narrow sound field ensure good localizability of the various sound sources.
Volume
Our volume tests measure both the maximum loudness a speaker is able to produce and how smoothly volume increases and decreases based on user input.
In volume testing, the Xiaodu performs poorly, with a maximum volume unable to reach the target SPL (sound pressure level) for our protocol’s party scenario. This is why you will see only two distortion (THD) curves in the artifacts section further below, since the loud and maximum volumes are identical. Additionally, as shown in the graph above, the volume steps are far from being consistent.
Here are the sound pressure levels (SPL) we measured when playing our test signals and sample recordings at maximum volume:
Speaker | Correlated Pink Noise | Uncorrelated Pink Noise | Hip-Hop | Classical | Latin | Asian Pop |
Baidu Xiaodu Smart Speaker Ultimate Edition | 67 dBA | 64.1 dBA | 65 dBA | 57.8 dBA | 66.7 dBA | 60.2 dBA |
TMall Genie X5 | 75.4 dBA | 73.8 dBA | 72.1 dBA | 67.1 dBA | 73.9 dBA | 66.3 dBA |
Yandex Station | 91.7 dBA | 89 dBA | 86.7 dBA | 79.4 dBA | 89.2 dBA | 80.2 dBA |
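To put these readings in perspective, the gap between two speakers can be expressed as a level difference in decibels and, under the usual 20·log10 convention for sound pressure, as a pressure ratio. The sketch below is only an illustration and not part of our test protocol: the function names are hypothetical, the readings are copied from the table, and applying the conversion to A-weighted levels is an approximation.

```python
# Maximum-volume readings in dBA, copied from the table above.
CONTENT = ["Correlated Pink Noise", "Uncorrelated Pink Noise",
           "Hip-Hop", "Classical", "Latin", "Asian Pop"]
SPL_DBA = {
    "Baidu Xiaodu Smart Speaker Ultimate Edition": [67.0, 64.1, 65.0, 57.8, 66.7, 60.2],
    "TMall Genie X5": [75.4, 73.8, 72.1, 67.1, 73.9, 66.3],
    "Yandex Station": [91.7, 89.0, 86.7, 79.4, 89.2, 80.2],
}


def level_gap_db(louder, quieter, content):
    """Level difference in dB between two speakers for one content type."""
    i = CONTENT.index(content)
    return SPL_DBA[louder][i] - SPL_DBA[quieter][i]


def pressure_ratio(gap_db):
    """Sound pressure ratio for a level difference (20*log10 convention)."""
    return 10 ** (gap_db / 20)


if __name__ == "__main__":
    xiaodu = "Baidu Xiaodu Smart Speaker Ultimate Edition"
    for rival in ("TMall Genie X5", "Yandex Station"):
        for content in CONTENT:
            gap = level_gap_db(rival, xiaodu, content)
            print(f"{rival} vs Xiaodu, {content}: "
                  f"{gap:.1f} dB louder (x{pressure_ratio(gap):.1f} pressure)")
```

For hip-hop, for instance, the Yandex Station plays 21.7 dB louder than the Xiaodu, which corresponds to roughly twelve times the sound pressure.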
Artifacts
Our artifacts tests measure how much source audio is distorted when played back, along with other sound artifacts such as noise, pumping effects, and clipping. Distortion and other artifacts can occur both because of sound processing and because of the quality of the speakers.
In light of its performance in most of our other attribute categories, the Xiaodu’s control of artifacts comes as a pleasant surprise: its sub-score is only two points away from that of the top-scoring speaker in this category, the Amazon Echo Studio. But let’s not forget that most artifacts are typically triggered at loud volumes, often by low- and high-end frequencies. Since the Xiaodu severely lacks bass and treble, and since its maximum volume is fairly weak, controlling undesirable artifacts is a considerably easier task for it.
Thus, from soft to nominal levels, no temporal artifacts are perceivable, and very few spectral artifacts can be heard. While compression and distortion certainly become more noticeable at loud and maximum volumes (especially in our bathroom, outdoor, and party use cases), they remain within an acceptable range.
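As a reminder of what a THD figure expresses, total harmonic distortion is commonly defined as the ratio of the combined RMS level of the harmonics to the RMS level of the fundamental. The sketch below illustrates that textbook definition only; it does not describe our measurement chain, and the function names and the amplitudes in the example are hypothetical.

```python
import math


def thd_ratio(fundamental_rms, harmonic_rms):
    """THD as a ratio: RMS sum of the harmonics over the fundamental."""
    if fundamental_rms <= 0:
        raise ValueError("fundamental_rms must be positive")
    return math.sqrt(sum(h * h for h in harmonic_rms)) / fundamental_rms


def thd_percent(fundamental_rms, harmonic_rms):
    """THD expressed as a percentage of the fundamental."""
    return 100.0 * thd_ratio(fundamental_rms, harmonic_rms)


def thd_db(fundamental_rms, harmonic_rms):
    """THD expressed in dB relative to the fundamental."""
    return 20.0 * math.log10(thd_ratio(fundamental_rms, harmonic_rms))


if __name__ == "__main__":
    # Hypothetical amplitudes: a fundamental of 1.0 with 2nd and 3rd
    # harmonics at 0.03 and 0.04 (arbitrary linear units).
    print(f"THD = {thd_percent(1.0, [0.03, 0.04]):.1f} %")
    print(f"THD = {thd_db(1.0, [0.03, 0.04]):.1f} dB")
```

With these made-up amplitudes the harmonics combine to 5 % of the fundamental, or about -26 dB; lower values indicate cleaner playback.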
Our testers perceived no user artifacts; however, the Bluetooth latency is quite significant, which makes the wireless connection unsuitable for watching videos, unless the player offers the possibility of adjusting the delay manually.
Conclusion
The Baidu Xiaodu’s single upward-firing speaker delivers a low-scoring performance, with significant tonal imbalance, a considerable lack of bass and treble, poor dynamics, absent wideness, and weak maximum volume. The only bright side is that both spectral and temporal artifacts are well under control. In light of all this, it is fair to say that Baidu’s best-selling smart speaker is more smart than speaker.
Pros
- Thanks to the single upward-firing design, sound is evenly distributed around the speaker.
- Undesirable sound artifacts are kept under control.
- Localizability of sound sources is decent.
Cons
- Timbre suffers from tonal imbalance, with bass, low-mids, and treble critically lacking.
- The produced sound field is particularly narrow.
- Overall dynamics performance is poor.
- Maximum volume is well below average and volume steps are inconsistent.