No, no, you're missing the point!
For a realistic model like Momo, the sound-generating speechbox would be at the back of the mouth, with the mouth providing all the labials, linguals, fricatives and other shapings that make the speech so much more realistic, probably with an air pump or fan to get the breath pressure right for things like "eff"s. This was probably the case with her old chassis, as well.
But what makes all the computerized things you've mentioned sound artificial isn't the sound, it's the lack of intelligence behind the words - the computer doesn't really understand what it's saying, it's just putting the words together according to an (albeit sophisticated) program. Even for Ebert and Hawking, they need to enter the words into the computer, and the computer puts them together. Even when Christopher Reeve was on a lung machine, his words flowed more naturally, despite the long pauses.
But in the AnPC case, it's an artificial intelligence, it knows and understands what it's saying. So the speech wouldn't be stilted, but rather would flow more naturally.
I really think we wouldn't be able to tell...
and impersonation could abound...