I ran across this little gem on Automaton, another in a long line of odd robotics projects to come out of Japan. According to the project’s website, researchers at Kagawa University are working towards the following:
- Construction of a talking robot which has mechanical vocal organs
- Realization of an autonomous vocalization learning based on an auditory feedback
- Execution of a singing performance
Maybe that’s what they’ve done, maybe not. Who am I to judge? Looks to me more like they hooked a soda bottle to some sort of crude sexual aid, set a tortured air pump blowing through the thing, and started squeezing it with model airplane servos so it’d wail like the cries of the damned. I have trouble looking past the monstrosity they’ve created to see the real research behind it. See for yourself.
As the title of this article indicates, I’m clearly not convinced it’s a good idea. I don’t mean to dismiss the talent of the students and researchers working on this thing, or even its potential to increase our overall knowledge of robotics, but the fact is, it’s creepy. I’m downright disturbed. A quick perusal of the comments on the various YouTube videos indicates that I’m not the only one feeling that way.
I’m not sure whether the goal was to make the device seem lifelike in addition to producing somewhat realistic sounds, but it seems to me they’ve accomplished it well enough to push deep into the uncanny valley. The dead, silicone lips are repulsive in a still photo, but they are a full-on horror show once they start to move. The loud whirring from the servos probably makes it better, because I can see people totally wigging out if it weren’t obviously being mechanically actuated.
Which brings up another point: I’m no expert in anatomy, but I’m pretty sure this isn’t how people create speech. Admittedly, we do have a resonance cavity (the big nose thing on it), we have some sort of lips (mine look less like a lamprey), we push air from our lungs (usually not soda bottles), and we have tongues to shape airflow (the diagram shows that this thing has one). Even so, the overall shape of our mouths and vocal cords is vastly different, and I’m fairly certain that no part of our speech process involves the sort of crushing demonstrated in this device.
From listening to the YouTube video above and a recording of their target song on Wikipedia (Kagome Kagome), it seems to me like they’re on the right track to demonstrate at least two of their goals. I am seeing a mechanical voice organ, of some sort, and I’m hearing something that seems vaguely like the song they’re after. Of course, I’m totally unfamiliar with the Japanese language, so I’m mostly useless as a judge of the enunciation the device is accomplishing.
To me, the truly impressive part would be how they accomplish the objective of creating “autonomous vocalization learning”. Maybe their meaning is getting lost in translation, but the diagram on their site indicates that they are using a system that adjusts the vocalization through something other than the operator listening and tweaking each phoneme. I’ve not been able to find more detail on this, but it seems intriguing and perhaps more valuable than the specific machinery they’ve chosen for this version of a speech organ. Anyway, once you’ve accepted the idea that it’s possible to create superior speech through a fake, flopping flesh-hole… you just have to balance it with something cool like an autonomous feedback system.
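Since I couldn’t find the details, here is my own guess at what the simplest possible auditory feedback loop might look like. To be clear, this is a toy sketch and not the researchers’ method: `toy_voice` is a made-up stand-in for the robot that pretends one servo position maps straight to a pitch, and `learn_pitch`, the gain, and the step count are all names and numbers I invented for illustration. The idea is just "make a sound, listen to it, nudge the servo toward the target, repeat."

```python
def toy_voice(servo_position: float) -> float:
    """Hypothetical stand-in for the robot: servo position (0..1) -> pitch in Hz."""
    return 100.0 + 300.0 * servo_position


def learn_pitch(target_hz: float, steps: int = 50, gain: float = 0.002) -> float:
    """Nudge the servo until the 'heard' pitch matches the target pitch."""
    position = 0.5  # arbitrary starting guess
    for _ in range(steps):
        heard_hz = toy_voice(position)   # "listen" to our own output
        error = target_hz - heard_hz     # how far off we are, in Hz
        position += gain * error         # small correction toward the target
        position = min(1.0, max(0.0, position))  # stay within servo limits
    return position


if __name__ == "__main__":
    position = learn_pitch(262.0)  # roughly middle C
    print(f"servo position: {position:.3f}, pitch: {toy_voice(position):.1f} Hz")
```

A real system would presumably have to juggle many actuators and compare far richer features than a single pitch, but the listen-and-adjust loop is the part that would replace an operator tweaking each phoneme by ear.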








