I Tried Creating a Blog Generator Chatbot with 5 Different Open Source Transformers

Advances in Natural Language Processing (NLP) technology have led to remarkable developments in automated content creation. In this project, I explored how contemporary transformer-based Large Language Models (LLMs) perform in generating articles that are not only coherent but also contextually relevant. By comparing several prominent models, including GPT-2, T5, LLaMA, Mistral, and GPT-Neo, I evaluated their ability to produce high-quality content while focusing on reducing grammatical errors and ensuring novelty, as measured by several key evaluative metrics.

Figure: Methodology adopted for the proposed blog generation

Why Automated Blog Generation?

Using Artificial Intelligence (AI) to generate written content is becoming increasingly popular. AI-based content generation saves time, effort, and resources for companies and individuals who need large quantities of keyword-rich material for blogs, social media, and SEO. By leveraging large datasets and sophisticated language models, AI can now generate content that mirrors human writing, with applications ranging from article writing to technical documentation and creative storytelling.

Let’s Deep Dive into Understanding the Key Transformer Models

In my experiments, I focused on five open source models: GPT-2, T5, LLaMA, Mistral, and GPT-Neo. Each model was installed through Hugging Face and fine-tuned to excel in different areas of content generation. My goal was to compare their performance using the following metrics:

Grammar Errors: The fewer errors, the more polished the output.
Coherence: How well the content flows and remains on-topic.
BERTScore: A metric that evaluates semantic similarity between generated and reference text.
BLEU Score: A measure of n-gram overlap between generated and reference text, used as a proxy for accuracy in text generation.
RAVEN: A metric that evaluates novelty in generated content.
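
To make the overlap idea behind BLEU concrete, here is a minimal pure-Python sketch of a sentence-level score: clipped n-gram precision for n = 1 to 4, combined with a brevity penalty. It is an illustration only, not the implementation used for the results in this post. The function name `simple_bleu`, the whitespace tokenization, and the lack of smoothing are all assumptions made for the example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def simple_bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Simplified single-reference BLEU (hypothetical helper, no smoothing)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, one empty precision zeroes the score
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)
```

A generated sentence identical to its reference scores 1.0, and one sharing no words with it scores 0.0. Library implementations add smoothing and multi-reference support, so their numbers will differ from this sketch on short texts.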

Key Findings: A Comparative Study

I ran multiple prompts through these models, ranging from technical topics like “Advancements in AI” to more creative topics such as “Baking Sugary Goods.” Below are the findings:

Prompt 1: “Advancements in AI”

GPT-2 and Mistral outperformed the rest in minimizing grammatical errors: GPT-2 produced flawless output, and Mistral had only two minor errors.
Despite its grammatical accuracy, Mistral had the lowest coherence score (0.67).
All models scored highly on novelty (RAVEN 1.0), showing that they could generate fresh, non-redundant content.
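
This post does not spell out how the RAVEN value is computed. One common way to quantify novelty is the fraction of generated n-grams that never appear in a reference corpus, and the sketch below works under that assumption. The function name `ngram_novelty`, the choice of trigrams, and the whitespace tokenization are illustrative and not taken from the original system.

```python
from typing import Iterable, Set, Tuple


def ngram_set(text: str, n: int) -> Set[Tuple[str, ...]]:
    """Return the set of distinct n-grams in a whitespace-tokenized text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_novelty(generated: str, corpus: Iterable[str], n: int = 3) -> float:
    """Fraction of distinct n-grams in `generated` that are absent from `corpus`.

    1.0 means every n-gram is new; 0.0 means every n-gram was copied.
    (Hypothetical helper: an assumed stand-in for the novelty metric.)
    """
    generated_ngrams = ngram_set(generated, n)
    if not generated_ngrams:
        return 0.0
    seen = set()
    for document in corpus:
        seen |= ngram_set(document, n)
    return len(generated_ngrams - seen) / len(generated_ngrams)
```

Under this reading, a score of 1.0 means that none of the generated n-grams were found in the comparison texts.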

Prompt 2: “Saving Turtles from Plastic in the Ocean”

GPT-2 once again led with the highest coherence score of 0.98, producing contextually aligned and error-free content.
Mistral demonstrated the highest BERTScore (0.3221), meaning that while its coherence was lower (0.66), the generated text was still semantically relevant.
GPT-Neo struggled with coherence, scoring only 0.49, indicating difficulties in contextual understanding.

Prompt 3: “Baking Sugary Goods Boosts a Person’s Mood”

Mistral excelled in semantic quality, with the highest BERTScore (0.34). However, its novelty slightly decreased (RAVEN 0.9931), indicating a trade-off between semantic richness and original content.
GPT-2 remained consistent, with a good balance between coherence and readability (0.87).
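
For readers who want to try a run like the ones above, here is a minimal sketch of generating a draft for a single prompt with the Hugging Face `transformers` text-generation pipeline. The prompt wording, the `build_prompt` and `generate_blog` helpers, and the sampling settings are assumptions made for illustration and are not the configuration behind the reported scores. `"gpt2"` is the Hub ID of the base GPT-2 checkpoint; swap in another model ID to compare.

```python
def build_prompt(topic: str) -> str:
    """Turn a blog topic into a plain-text prompt (wording is illustrative)."""
    return f"Write a blog post about {topic.strip()}.\n\n"


def generate_blog(topic: str, model_name: str = "gpt2",
                  max_new_tokens: int = 200) -> str:
    """Generate one draft with a Hugging Face text-generation pipeline."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_name)
    outputs = generator(
        build_prompt(topic),
        max_new_tokens=max_new_tokens,
        do_sample=True,   # sample instead of greedy decoding
        top_p=0.95,       # nucleus sampling; value chosen for illustration
    )
    # The pipeline returns a list of dicts, one per generated sequence.
    return outputs[0]["generated_text"]


# Example (downloads the model on first use):
# print(generate_blog("Advancements in AI"))
```

Decoder-only models such as GPT-2, GPT-Neo, LLaMA, and Mistral fit the `"text-generation"` task. T5 is an encoder-decoder model and would use the `"text2text-generation"` task instead.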

The Role of Visual Content

In addition to text generation, the system I created integrated ‘Pexels’, a free stock photo service, to enhance the blog creation process. By analyzing the generated content, the system automatically suggests high-quality images that are relevant to the text. This visual component is crucial in attracting readers, ensuring that the final product is not only well-written but also visually appealing.
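
The image-suggestion step can be sketched roughly as follows: pick a few frequent words from the generated text, then query the Pexels search endpoint with them. The keyword heuristic, the stopword list, and the helper names are assumptions made for this example, since the post does not describe how the text is analyzed. The endpoint and the `Authorization` header follow the public Pexels API, and an API key is required.

```python
import json
import re
import urllib.parse
import urllib.request
from collections import Counter

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"this", "that", "with", "from", "have", "their", "about", "which"}


def top_keywords(text: str, k: int = 3) -> list:
    """Return the k most frequent words longer than 3 letters (naive heuristic)."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if len(w) > 3 and w not in STOPWORDS)
    return [word for word, _ in counts.most_common(k)]


def build_search_url(keywords, per_page: int = 3) -> str:
    """Build a Pexels photo-search URL for the given keywords."""
    query = urllib.parse.urlencode(
        {"query": " ".join(keywords), "per_page": per_page}
    )
    return f"https://api.pexels.com/v1/search?{query}"


def suggest_images(text: str, api_key: str, per_page: int = 3) -> list:
    """Return medium-size image URLs matching the text's top keywords."""
    request = urllib.request.Request(
        build_search_url(top_keywords(text), per_page),
        headers={
            "Authorization": api_key,
            # Custom User-Agent is an assumption; some hosts reject the default.
            "User-Agent": "blog-generator-sketch/0.1",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    return [photo["src"]["medium"] for photo in payload.get("photos", [])]
```

Word frequency is a crude proxy for relevance. Extracting noun phrases or reusing the blog title as the query would be reasonable alternatives.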

Figure: Working model using Mistral 7B

Conclusion: Which Open Source Model Worked the Best?

In conclusion, my comparison of LLaMA, T5, GPT-2, GPT-Neo, and Mistral highlighted the impressive capabilities of GPT-2, Mistral, and LLaMA in generating high-quality content with a strong balance of precision, coherence, and novelty. These models performed consistently well across the prompts, making them effective for general-purpose content creation. Further research could examine how each model performs on topic-specific content, as certain models may excel in niche areas. Extending the system to video and multimodal content could further enhance user engagement, offering richer, more interactive experiences. This opens up exciting possibilities for AI-driven content creation, especially as these models continue to evolve and integrate new capabilities.

Written by Priyanka Katariya
