
ChatGPT, the AI chatbot developed by OpenAI, has attracted significant public interest thanks to its remarkable capabilities, including summarizing complex subjects and holding lengthy conversations. As a result, other AI companies are rushing to launch their own large language models (LLMs), the technology that underpins chatbots like ChatGPT. Some of these LLMs will also be incorporated into search engines and other products.

ChatGPT-4 shows inconsistencies in the word game “Wordle”

The chatbot was put to the test on Wordle, the New York Times word game. Players have six tries to guess a five-letter word, and after each guess the game shows which letters are in the correct position, which are in the word but in the wrong position, and which are not in the word at all.
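The feedback rule can be written down in a few lines of code. What follows is a minimal sketch in Python, not the game's actual implementation: the function name `wordle_feedback` and the output symbols (`G` for a letter in the correct position, `Y` for a letter that is in the word but elsewhere, `-` for a letter that is not in the word) are assumptions made for illustration.

```python
from collections import Counter


def wordle_feedback(guess: str, answer: str) -> str:
    """Return one symbol per letter of the guess: G, Y, or -.

    Hypothetical helper for illustration; the symbols are an assumption,
    not the game's own notation.
    """
    guess, answer = guess.lower(), answer.lower()
    if len(guess) != 5 or len(answer) != 5:
        raise ValueError("both words must have exactly five letters")

    result = ["-"] * 5
    unmatched = Counter()  # answer letters not yet accounted for

    # First pass: mark letters that sit in the correct position.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"
        else:
            unmatched[a] += 1

    # Second pass: mark letters that are present but misplaced,
    # using each unmatched answer letter at most once.
    for i, g in enumerate(guess):
        if result[i] != "G" and unmatched[g] > 0:
            result[i] = "Y"
            unmatched[g] -= 1

    return "".join(result)


print(wordle_feedback("crane", "caper"))
```

The two passes matter for repeated letters: a letter in the guess is only flagged as misplaced if the answer still has an unmatched copy of it.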

The latest LLM, ChatGPT-4, was asked to solve these word puzzles, and its performance was surprisingly poor. LLMs such as ChatGPT-4 are trained on a vast amount of text, including public-domain books, scientific articles, Wikipedia, and web text, yet they still struggle with certain tasks.

The chatbot’s difficulties with Wordle provide insights into how LLMs work with words and where their limitations lie. When given specific letter patterns from the game, ChatGPT-4 responded inconsistently and often inaccurately, suggesting a hit-and-miss approach. It identified suitable solutions for some patterns but failed to propose correct options for others.
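For comparison, checking words against a letter pattern is a simple, deterministic task for a conventional program. The sketch below assumes a pattern notation in which `_` stands for an unknown letter. The function names and the short word list are hypothetical and chosen only for illustration.

```python
def matches_pattern(pattern: str, word: str) -> bool:
    """True if the word fits the pattern; "_" means "any letter" (assumed notation)."""
    if len(pattern) != len(word):
        return False
    return all(p == "_" or p == w for p, w in zip(pattern.lower(), word.lower()))


def find_candidates(pattern: str, words: list[str]) -> list[str]:
    """Return the words from a list that fit the pattern, in their original order."""
    return [w for w in words if matches_pattern(pattern, w)]


# Hypothetical word list, for illustration only.
sample_words = ["hello", "jelly", "world", "cells"]
print(find_candidates("_e_l_", sample_words))
```

Because the program compares letters position by position, it gives the same answer every time for the same pattern and word list.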

How does ChatGPT-4 work?

ChatGPT-4 runs on a deep neural network, which is a complex mathematical function that maps inputs to outputs. Because a neural network operates on numbers, the words the model works with must first be translated into numbers.

The bot uses a program called a tokenizer to do this translation, assigning a unique token ID to each word or letter sequence. Whenever a user enters a question, the words are converted into these IDs before the deep neural network processes them. For instance, the word “friend” has a token ID of 6756, while a word like “friendship” is broken into “friend” and “ship”, represented by the IDs 6756 and 6729. Because the network receives these IDs rather than the text itself, it has no direct view of the individual letters in a word, which makes reasoning about letters difficult.
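The effect of tokenization can be illustrated with a toy example. This is a sketch only: the two-entry vocabulary reuses the token IDs quoted above, and the greedy longest-match strategy and the function name `toy_tokenize` are assumptions for illustration, not a description of the tokenizer ChatGPT-4 actually uses.

```python
# Toy vocabulary built from the two IDs quoted in the text.
# A real tokenizer has a far larger vocabulary.
TOY_VOCAB = {"friend": 6756, "ship": 6729}


def toy_tokenize(text: str, vocab: dict[str, int] = TOY_VOCAB) -> list[int]:
    """Split text into known pieces, longest piece first, and return their IDs.

    Hypothetical greedy scheme for illustration only.
    """
    text = text.lower()
    ids = []
    pos = 0
    while pos < len(text):
        # Try the longest remaining substring first, then shorter ones.
        for end in range(len(text), pos, -1):
            piece = text[pos:end]
            if piece in vocab:
                ids.append(vocab[piece])
                pos = end
                break
        else:
            raise ValueError(f"no token covers {text[pos:]!r}")
    return ids


print(toy_tokenize("friendship"))  # [6756, 6729]
```

The network would receive only the list of numbers. Nothing in `[6756, 6729]` reveals that the word starts with “f” or contains ten letters, which is the kind of information a Wordle player relies on.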