True - the episodes where the Prophets feature heavily would lose the most without the visuals. On the other hand, DS9 episodes without a lot of action or SciFi/mystical moments could work well as audio dramas - "Duet" being a great example... and if you played them to a non-SciFi fan who is prejudiced against Science Fiction, they might not even realize it was Science Fiction without seeing the actors' alien makeup.
It's true that ENT was visually lacking for the first 2 seasons. The Season 3 Xindi arc has the ship in the Expanse, where we get to see it caught in a lot of anomalies - and it's not like in VOY, "oh look there's an anomaly" and the ship goes into a time warp or something, but you really get to see some weird things happening inside and outside the ship. There's also more space action, and space scenes in general - such as a look at the Spheres themselves. It helped that the show seems to have gotten better production values between season 2 and season 3 - and it gets a lot more interesting and impressive than in the first 2 seasons.
Besides, Andorians would just not be the same without their moving antennae.
I guess one thing that might help ENT is that people would probably take T'Pol more seriously if they didn't see her catsuit... But I think that there would be a problem with her and every Vulcan in a substantial role in Trek, since without the picture you would be less likely to discern their emotions. Sure, Vulcans usually don't show emotions as obviously on their faces as most other races, but most of them show them more subtly in their facial expressions, more so than in their voices, or at least I think so. I wonder how Spock-heavy TOS episodes come across without his characteristic eyebrow raise?
Which leads me to the question, which characters would work well on audio only, and which would not? I think Trek in general would do quite well as audio-only, since many actors were stage-trained and had resonant and expressive voices. Also, how would the perceptions of each character change if the audience didn't know what they looked like? I'm going to take a guess that Seven of Nine would be perceived very differently, and that most people wouldn't even realize that she was supposed to be attractive until another character mentioned it (Kim, EMH...); they'd just hear a robotic, socially awkward and almost asexual ex-Borg.
I admit that when I call ENT visually uninteresting, I'm likely thinking of the first half of the series.
A good point - a visual medium, especially one that's camera-based, will tend to include plenty of visual elements in the narrative and even the characterizations. I can see where Vulcans would present a problem. If the Trek series had been designed for audio only, how much would vocal cues have become part of their characters?
This whole discussion has reminded me of some unintentional hilarity in audio Trek. I mentioned previously that I used to listen to audio tapes of TOS' "The Conscience of the King." Recall the scene where Uhura is singing to Riley over the intercom. Lenore Karidian poisons his drink during the song, but there's no verbal indication of this occurring on-screen. In pure audio, it came across like this:
UHURA: [singing "Beyond Antares", finishes] How'd you like that, Riley?
RILEY: [gag] Help me.....please! [collapses]