Speech Emotion Recognition with SpeechBrain
August 1, 2024

A Practical Approach to Analyzing Emotions Over Time in Speech
With the upcoming release of advanced voice methods by OpenAI, we’d like to explore what open-source technology is currently available for speech emotion recognition. Specifically, we will explore the use
Fine-tuning Idefics2 VLM for Document VQA
June 10, 2024

An Open State-of-the-Art Vision-Language Foundation Model
Idefics2, the latest iteration, builds on the success of Idefics1 with enhanced Optical Character Recognition (OCR) abilities, improved architecture, and superior performance on Visual Question Answering (VQA) benchmarks. With 8 billion parameters and an
Introduction to Large Action Models – The Next AI Frontier
May 12, 2024

Large Action Models for Automation
Large Action Models (LAMs) extend the capabilities of conventional language models by incorporating mechanisms that enable direct interaction with digital and physical environments. Sectors like healthcare, finance, and customer service would find LAMs invaluable for navigating
Fine-Tuning Phi-3 with Hugging Face
May 2, 2024

A Highly Capable Language Model
Fine-tuning has emerged as an effective way to adapt Phi-3, a compact yet powerful AI model. It allows Phi-3 to specialize for specific tasks and domains. The Phi-3 series targets efficient AI on mobile and low-power devices.
Fine-tuning BLIP2 for Image Caption Generation with PEFT
April 26, 2024

Customizing BLIP2 with LoRA and HuggingFace on the Flickr30k Dataset
Fine-tuning BLIP2, a state-of-the-art open-source visual language model, can be a game-changer for various business applications. BLIP2 stands out as one of the most powerful models in its class. It
Fine-Tuning StarCoder to Customize a Coding Assistant
April 25, 2024

A Comprehensive Guide: Fine-tune a Code LLM on Private Code Using a Single GPU to Enhance Its Contextual Awareness
Recently, powerful language models like Codex, StarCoder, and Code Llama that can understand and generate code have been created. These models
Fine-Tuning LayoutLMv2 for Document Question Answering
April 20, 2024

A Step-by-Step Guide to Optimizing LayoutLMv2 for Enhanced Domain-Specific Document Question Answering Efficiency
Document question answering (DQA) plays a key role in various tasks, allowing us to efficiently retrieve information from documents. However, traditional DQA models, including those not leveraging
Automating Scientific Knowledge Retrieval with AI in Python
April 12, 2024

End-to-End Guide for Developing a Research Chatbot with OpenAI Functions Capable of Semantic Search Across arXiv
The vast number of scientific publications today challenges effective knowledge retrieval. Researchers, academics, and professionals need innovative methods to stay updated. AI and semantic search technologies
AI Image Analysis with BLIP, YOLOv9, ViT, and CLIP
March 13, 2024

Image Analysis with AI by Automating Captions, Detection and Similarity Search in Python
Recently, image analysis with AI has witnessed significant advancements, particularly driven by the rapid evolution of artificial computational technologies. As a result, these innovations enable a more
Cloning Yourself on WhatsApp with AI in Python
February 27, 2024

Integrating OpenAI and Twilio for Chatbot Interactions That Mimic Your Chatting Style and Understand Image Inputs
Have you ever wished you could clone yourself to keep up with all the WhatsApp messages from your friends, family, and colleagues? Well, in this
Narrating Videos with OpenAI Vision and Whisperer Automatically
January 19, 2024

Synchronizing Narration with Video Length, Tone and Frame Content
Contextually Rich and Dynamic Storytelling
This guide aims to present an end-to-end solution for automatically narrating videos with AI, leveraging the cutting-edge capabilities of OpenAI’s GPT-4 Vision and Text-to-Speech technology. OpenAI’s GPT-4
Real-Time Emotion Recognition in Python with OpenCV and FER
January 19, 2024

A Comprehensive Python Guide for the Detection, Capture, and Analytical Interpretation of Live Emotional Data
Emotion recognition technology presents an interesting intersection of psychology, AI and computer science. We harness the capabilities of OpenCV for video processing and the Facial
Interactive Data Analytics in Python with Microsoft LIDA
January 19, 2024

Automatically Going From Raw Data to Insight, Empowering Data-Driven Decisions Much Quicker
This article explores LIDA, Microsoft’s innovative tool for interactive data visualization. Uniquely, LIDA harnesses large language models, transforming complex datasets into insightful visual representations. LIDA excels with datasets that
Uniting LLMs with Knowledge Graphs for Fact-Based Chatbots
January 18, 2024

An In-Depth End-to-End Tutorial for Structuring Raw Data into Knowledge-Driven AI Chatbots
In this article, we skillfully combine the structured, relationship-focused architecture of knowledge graphs with the sophisticated language understanding abilities of LLMs. Knowledge graphs contribute a layer of structured