I think the point that ipsum2 is trying to make is that Copilot's chat service and its code completion service could be using different models, which is not uncommon for coding assistants.
Continue[0], for example, can use up to three different models in a session: a large model such as GPT-4 Turbo for chat and code Q&A, a smaller low-latency model such as StarCoder2-3B for code completion, and a third model such as all-MiniLM-L6-v2 for generating embeddings.
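For a concrete picture, that split shows up directly in Continue's `config.json`, which has separate slots for each role — roughly like this (a sketch from memory of their docs; exact providers and field values here are illustrative, and the schema may have changed between versions):

```json
{
  "models": [
    { "title": "GPT-4 Turbo", "provider": "openai", "model": "gpt-4-turbo" }
  ],
  "tabAutocompleteModel": {
    "title": "StarCoder2-3B", "provider": "ollama", "model": "starcoder2:3b"
  },
  "embeddingsProvider": {
    "provider": "transformers.js", "model": "all-MiniLM-L6-v2"
  }
}
```

The point is the shape, not the specific values: chat, autocomplete, and embeddings are configured independently, so they can (and usually do) point at different models.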