OpenAI has unveiled a new tool named GPTBot.
This revolutionary web crawler has been crafted to accumulate data from all corners of the internet, amplifying the precision and capabilities of AI models.
OpenAI says that granting GPTBot access to websites can play a pivotal role in refining AI models’ accuracy, increasing their overall potential, and enhancing safety measures. However, it has come to light that a substantial 15% of the world’s top 100 websites have opted to block GPTBot’s access.
GPTBot’s Impact and Adoption
Originality.AI has released data showing that within the first fortnight after the launch of GPTBot’s documentation, nearly 10% of the world’s top 1,000 websites chose to block GPTBot’s access.
Notable sites such as Amazon, Quora, wikiHow, and several international news outlets have taken measures to keep GPTBot off their platforms. This raises questions about how such blocks could limit the accuracy of ChatGPT.
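Sites generally block a crawler through their robots.txt file, and OpenAI documents `GPTBot` as the user-agent token to target. As a minimal sketch, Python's standard `urllib.robotparser` can check whether a given robots.txt shuts the crawler out. The sample file contents, the `is_blocked` helper, and the example.com URL are hypothetical and used for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks GPTBot but allows every other crawler.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/1"
    print("GPTBot blocked:", is_blocked(SAMPLE_ROBOTS_TXT, "GPTBot", url))
    print("Other bot blocked:", is_blocked(SAMPLE_ROBOTS_TXT, "SomeOtherBot", url))
```

A check like this only reports what the file declares. Whether a crawler honours it is up to the crawler's operator.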
The Mechanism Behind GPTBot
GPTBot operates through a structured process that starts with identifying potential data sources. This step involves web crawling, in which the tool scours the internet to pinpoint websites containing relevant information. Once an appropriate source is found, GPTBot extracts the data it needs from that website.
The collected information is then catalogued in a database and used to train AI models.
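The three steps described above (find sources, extract content, catalogue it) can be illustrated with a toy pipeline. This is a sketch under stated assumptions, not OpenAI's implementation: the in-memory `PAGES` dictionary stands in for the web so the example runs without network access, and every name in it is hypothetical.

```python
import sqlite3
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web" so the sketch runs without network access.
PAGES = {
    "/home": '<a href="/a">A</a> <a href="/b">B</a> <p>Welcome</p>',
    "/a": '<p>Article A</p> <a href="/b">B</a>',
    "/b": '<p>Article B</p> <a href="/home">Home</a>',
}


class LinkAndTextParser(HTMLParser):
    """Collects link targets and visible text from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


def crawl(start, pages):
    """Breadth-first crawl from start; returns {url: extracted_text}."""
    seen, queue, results = {start}, deque([start]), {}
    while queue:
        url = queue.popleft()
        parser = LinkAndTextParser()
        parser.feed(pages[url])
        parser.close()
        results[url] = " ".join(parser.text)
        # Step 1: newly discovered links become future data sources.
        for link in parser.links:
            if link in pages and link not in seen:
                seen.add(link)
                queue.append(link)
    return results


def store(results):
    """Catalogue the extracted text in an in-memory SQLite table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE documents (url TEXT PRIMARY KEY, body TEXT)")
    db.executemany("INSERT INTO documents VALUES (?, ?)", results.items())
    db.commit()
    return db


if __name__ == "__main__":
    db = store(crawl("/home", PAGES))
    for row in db.execute("SELECT url, body FROM documents ORDER BY url"):
        print(row)
```

A production crawler would also need politeness delays, robots.txt checks, and deduplication, all of which are omitted here for brevity.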
Versatility in Data Extraction
One of GPTBot’s standout attributes is its ability to extract data from an array of sources, spanning text, images, and code. In terms of textual content, GPTBot extracts information from websites, articles, books, and other documents.
Its reach also extends to image-based data, allowing it to recognise objects depicted in images and read any text they contain. GPTBot can even extract code from repositories hosted on GitHub, as well as from other code sources scattered across the internet.
The Nexus with AI Models
OpenAI’s flagship product, ChatGPT, and similar generative AI tools are trained on data gathered from websites. Elon Musk has previously moved to stop OpenAI from scraping data from the social media platform formerly known as Twitter.
The creation of GPTBot represents a leap forward in AI advancement. By capturing data from the expansive digital landscape, GPTBot is poised to usher in a new era of AI proficiency.
The decision of some top websites to bar GPTBot’s access highlights the complexities around data usage rights. As OpenAI continues its push toward AI excellence, the interplay between data, innovation, and legal considerations remains a central concern.