by Tyron Devotta
For decades, psychology textbooks have painted a clear picture: machines can't perceive objects unless they see them whole. The human mind, they claimed, was far superior. We could recognize a pencil or a book from just a glimpse.
But this landscape is transforming. Generative AI, a branch of artificial intelligence focused on creating new data, is challenging our understanding of perception. These AI models can now identify objects from partial visuals, a feat once thought exclusive to humans.
This shift is clear in the capabilities of GPT-4V, a cutting-edge multimodal AI from OpenAI. The model can work with both images and text, allowing it not only to answer questions about visuals but also to decipher text embedded within images. That ability effectively dismantles CAPTCHA, a security measure designed to tell humans apart from bots by presenting puzzles that machines were assumed to find confusing.
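To make this concrete, here is a minimal sketch of how a developer might ask a vision-capable model to read the text in an image using OpenAI's Python SDK. The model name, prompt, and image URL are illustrative assumptions, not details drawn from this article.

```python
# Minimal sketch: asking a vision-capable model to read text in an image.
# Assumptions: the `openai` Python SDK is installed, OPENAI_API_KEY is set in
# the environment, "gpt-4o" stands in for any vision-capable model, and the
# image URL is a placeholder.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What text appears in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/street-sign.jpg"}},
            ],
        }
    ],
)

# The model's answer, e.g. a transcription of any visible text.
print(response.choices[0].message.content)
```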
The implications extend beyond basic object recognition. Google's new AI assistant showcases remarkable abilities – recognizing objects, interpreting complex code, and even offering creative marketing suggestions based on what it observes in its surroundings.
These advancements raise the question: could such technology revolutionise journalism? Imagine AI reporters capable of observing events, asking questions, recording footage, and analysing situations. They could even compare unfolding events with past ones and generate reports faster than their human counterparts.
However, a major hurdle exists – AI hallucinations. These are instances where AI generates false or misleading information presented as fact. Studies suggest chatbots powered by large language models can fabricate information in nearly a third of their responses. Mitigating these hallucinations is crucial before AI can be fully relied upon.
The future of perception is no longer a one-sided story. As AI capabilities evolve, the line between human and machine understanding continues to blur. While AI holds the potential to transform fields like journalism, addressing the issue of AI hallucinations remains paramount.