That'd explain it: probably the conversion caught some pages that were already in UTF-8 and reëncoded them too.
different kinds of Chinese
My quick attempt last night at deëncoding using LibreOffice gave some pretty plausible‐looking UTF-8‐encoded Traditional Chinese (modulo anything encoded with byte 0x81, or presumably any other bytes unassigned in CP1252).
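For anyone who wants to try the same thing without LibreOffice, here is a rough Python sketch of the round trip. It assumes the damage was exactly "UTF-8 bytes read as CP1252", and that the bytes CP1252 leaves unassigned (0x81, 0x8D, 0x8F, 0x90, 0x9D) came through as the matching C1 control characters; I haven't checked either assumption against the actual pages. The function name `undo_cp1252_mojibake` is just something I made up for illustration.

```python
def undo_cp1252_mojibake(text: str) -> str:
    """Reverse a 'UTF-8 bytes misread as CP1252' conversion (sketch only)."""
    raw = bytearray()
    for ch in text:
        try:
            # Normal case: the character has a CP1252 byte.
            raw += ch.encode("cp1252")
        except UnicodeEncodeError:
            # Assumption: bytes unassigned in CP1252 (e.g. 0x81) were passed
            # through as the C1 control characters U+0081 etc., so map them
            # straight back to the byte with the same value.
            if ord(ch) < 0x100:
                raw.append(ord(ch))
            else:
                raise
    # Raises UnicodeDecodeError if the text was not doubly encoded after all.
    return raw.decode("utf-8")


if __name__ == "__main__":
    print(undo_cp1252_mojibake("ä¸\xad"))   # bytes E4 B8 AD
    print(undo_cp1252_mojibake("ä¸\x81"))   # bytes E4 B8 81, the 0x81 case
```

If the converter replaced the unassigned bytes with something else (a question mark, U+FFFD), those characters are lost and this will not bring them back. Pages that were never double-encoded should raise an error rather than being silently mangled, so it is reasonably safe to run over everything and skip the failures.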