From what I understand, you're on the right path. What I think you're missing is a grouping algorithm. Iterate over the table and collect like pixels into chunks, keyed by foreground/background color and character, so that each chunk can be drawn together. That should result in fewer GPU calls overall. I'm working on something similar.
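
To make the grouping idea concrete, here's a rough sketch in Python. I don't know what your table actually looks like, so the `Cell` type, the `group_cells` helper, and the sample grid are all made up for illustration. It just buckets positions by (character, foreground, background) so each bucket could go out in one batched draw.

```python
from collections import defaultdict
from typing import NamedTuple


class Cell(NamedTuple):
    # Hypothetical cell layout; adapt to whatever your table stores.
    char: str
    fg: str
    bg: str


def group_cells(grid):
    """Map each distinct (char, fg, bg) cell to the (row, col) positions that use it."""
    groups = defaultdict(list)
    for row_index, row in enumerate(grid):
        for col_index, cell in enumerate(row):
            groups[cell].append((row_index, col_index))
    return dict(groups)


# Made-up 2x3 table: six cells, but only three distinct appearances.
grid = [
    [Cell("#", "white", "black"), Cell("#", "white", "black"), Cell(".", "grey", "black")],
    [Cell(".", "grey", "black"), Cell("#", "white", "black"), Cell("@", "red", "black")],
]

batches = group_cells(grid)
for cell, positions in batches.items():
    # A real renderer would issue one batched draw per group here
    # instead of one call per cell.
    print(cell.char, cell.fg, cell.bg, positions)
```

With this sample grid you'd go from six per-cell draws to three per-group draws. How much you actually save depends on how repetitive your table is and on how your renderer batches geometry.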