Researchers at MIT’s Computer Science and Artificial Intelligence Lab are teaching computers about the relationship between sound and vision. The team has created an artificial intelligence system that can not only predict what sounds are linked to certain images, but can mimic those sounds itself. Popular Science reports that they’ve created a deep-learning algorithm so skilled at re-creating sound that it can even fool humans, a kind of “Turing Test for sound,” as the researchers describe it.
To teach the computer about sound, researchers recorded 1000 videos of a drumstick hitting, scraping, and tapping different surfaces. All in all, the videos captured some 46,000 sounds. Using those videos, the computer taught itself which sounds matched up to specific images, for example, learning to distinguish between the sounds of a drumstick hitting a surface, splashing water, rustling leaves, and tapping a metal surface.
To test just how much the computer had learned, researchers presented it with a series of new videos, also of a drumstick tapping different surfaces, with the sound removed. Using the existing dataset of sounds, which researchers dubbed their “Greatest Hits,” the computer created new sounds for the new videos. The computer took tiny sound clips from the original videos and stitched them together to make totally new sound combinations.
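The clip-stitching step can be pictured with a small toy sketch. It is not the researchers' deep-learning system: it assumes each moment of a silent video has already been reduced to a made-up two-number visual feature, and it simply picks the closest match from a tiny hypothetical library of clips. The names `LIBRARY`, `nearest_clip`, and `stitch_sound` are invented for illustration.

```python
import math

# Hypothetical toy "library": each entry pairs a made-up visual feature
# vector with a short sound clip (a list of audio samples).
LIBRARY = [
    ((1.0, 0.0), [0.9, -0.9, 0.5]),  # e.g. a hard, metallic surface
    ((0.0, 1.0), [0.1, 0.05, 0.0]),  # e.g. a soft, leafy surface
]


def nearest_clip(feature, library):
    """Return the clip whose visual feature is closest to `feature`."""
    _, clip = min(library, key=lambda entry: math.dist(entry[0], feature))
    return clip


def stitch_sound(silent_video_features, library):
    """Concatenate the best-matching clip for each moment of a silent video."""
    track = []
    for feature in silent_video_features:
        track.extend(nearest_clip(feature, library))
    return track


# Two moments of a silent video: first a hard-looking surface, then a soft one.
silent_video = [(0.9, 0.1), (0.2, 0.8)]
track = stitch_sound(silent_video, LIBRARY)
```

Here the resulting track is the metallic clip followed by the leafy clip, a new combination that appears in neither original recording.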

When researchers presented human volunteers with the computer-generated sounds, they were, for the most part, unable to distinguish them from real sounds. In some cases, participants were even more likely to choose the computer’s fake sounds over real sounds.
Researchers believe that the technology they’ve created could one day be used to automatically produce sound effects for movies and TV. They also say it could help robots better understand the physical world, learning to distinguish between objects that are soft and hard, or rough and smooth, by the sounds they make.
“A robot could look at a sidewalk and instinctively know that the cement is hard and the grass is soft, and therefore know what would happen if they stepped on either of them,” researcher Andrew Owens explains. “Being able to predict sound is an important first step toward being able to predict the consequences of physical interactions with the world.”
[h/t Popular Science]