It sounds like a fun project, but practically speaking, why not just use an existing end-to-end encrypted VoIP protocol? The only practical use case I can see for your solution is running it over normal GSM calls when you don't have a data connection. Then it could have some value.
Why not just E2E-encapsulate a VoIP packet/connection?