Enhancing Web-Based Applications and Generating Content with the GPT-3 Model

  • Student: Amaar Afzal
  • Academic supervisor: Dehan Kong
  • Industry supervisor: Konstantin Eletskiy
  • Company: ArcadeJolt

The OpenAI Generative Pre-trained Transformer 3 (GPT-3) model is currently among the most widely used language models for generating human-like text. Our research asks whether a machine can exhibit human-like thinking and creativity when producing automatically generated content with a GPT-3 model. The research has four main steps: creating the dataset, fine-tuning GPT-3, modeling the trained data, and generating unique content.
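The dataset-creation and fine-tuning steps can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: it converts hypothetical scraped text blocks into the prompt/completion JSONL format that OpenAI's GPT-3 fine-tuning endpoint historically expected. The `topic` prompt template and helper names are assumptions for illustration.

```python
import json

def build_finetune_records(blocks, topic):
    """Convert scraped informative text blocks into JSONL records in the
    prompt/completion format used by GPT-3 fine-tuning.
    `blocks` and `topic` are hypothetical inputs for illustration."""
    records = []
    for block in blocks:
        records.append({
            "prompt": f"Write about {topic}:\n\n",
            # Leading space and trailing newline follow common GPT-3
            # fine-tuning data-preparation guidance.
            "completion": " " + block.strip() + "\n",
        })
    # One JSON object per line (JSONL), ready to upload for fine-tuning.
    return "\n".join(json.dumps(r) for r in records)
```

The resulting file would then be passed to the fine-tuning API, after which the tuned model can be prompted to generate new on-topic content.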

We present a novel approach to creating the dataset for the GPT-3 model through web scraping. The web-scraping component addresses the problem of extracting information from repetitive blocks in the structure of web pages: it uses both visual features and semantic information to identify informative blocks, which are then fed into the generative model. This allows us to generate sentences and long-form articles that are on-topic and relevant. We then analyze the similarity of the generated output to existing content on related web pages to determine whether the generative model can produce unique, human-like content.
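A purely structural stand-in for the block-extraction idea can be sketched with the standard library: elements sharing the same (tag, class) signature that repeat across a page are treated as candidate informative blocks. The actual method combines visual and semantic features; this sketch assumes well-formed HTML and uses only tag structure.

```python
from collections import defaultdict
from html.parser import HTMLParser

class BlockCollector(HTMLParser):
    """Group element text by (tag, class) signature; signatures that
    repeat across the page are candidate informative blocks.
    Assumes well-formed HTML (every opened tag is closed)."""
    def __init__(self):
        super().__init__()
        self._stack = []                 # open elements: (signature, text parts)
        self.texts = defaultdict(list)   # signature -> list of block texts

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class", "")
        self._stack.append(((tag, cls), []))

    def handle_data(self, data):
        if self._stack and data.strip():
            self._stack[-1][1].append(data.strip())

    def handle_endtag(self, tag):
        if self._stack:
            sig, parts = self._stack.pop()
            if parts:
                self.texts[sig].append(" ".join(parts))

def informative_blocks(html, min_repeats=2):
    """Return only the signatures that occur at least `min_repeats` times."""
    parser = BlockCollector()
    parser.feed(html)
    return {sig: blocks for sig, blocks in parser.texts.items()
            if len(blocks) >= min_repeats}
```

On a page of repeated review cards, the repeated `("div", "item")` signature would surface as the informative block list while one-off navigation elements are filtered out.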

The findings show that the proposed model is capable of generating human-like content after being trained on relevant informative blocks. The research shows that this approach can be used in production systems to generate articles, summaries, and long-form unique content for web pages.
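The similarity analysis against existing web content can be approximated with a simple bag-of-words cosine similarity; the tokenizer and the idea of thresholding low scores as "unique" are illustrative assumptions, not the study's actual metric.

```python
import math
import re
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Bag-of-words cosine similarity in [0, 1]: a simple stand-in for
    comparing generated content against existing page content."""
    tokenize = lambda t: Counter(re.findall(r"[a-z0-9']+", t.lower()))
    a, b = tokenize(text_a), tokenize(text_b)
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0
```

Generated passages scoring well below 1.0 against every source page would be evidence that the output is unique rather than copied from the training content.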