MiniGPT-4
Chat with your images, for real
Hi folks! This is The Prompt! We're the AI newsletter that's like a nice jog in the morning - refreshing, energizing, and a great way to start the day.
Lace up your sneakers and let's get it!
FEATURED
MiniGPT-4 - chat with your images
MiniGPT-4 is out of this world.
It is a chatbot that can answer questions about your images (a functionality promised by GPT-4, but still not released).
Some of the things it can do:
explain what is in a photo
find problems and suggest fixes (e.g., a dying plant and what to do about it)
write poems for photos
give you HTML/JS code for a sketch (I actually tried this and it didn't really work...)
write an advertisement for a given photo
write a recipe and shopping list for a photo of a meal
explain art
The model is open-sourced, and you can find the code here.
How does it work?
It is built on top of BLIP-2 (a model that understands images) and Vicuna, an open-source chatbot model fine-tuned from LLaMA that its authors say comes close to ChatGPT in quality.
It also uses the LAVIS library, one of the most comprehensive open-source libraries for multimodal language and vision intelligence.
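For the curious, here is a toy sketch of the idea as we understand it from the MiniGPT-4 authors' description: the image encoder and the language model stay frozen, and a small projection layer maps visual features into the language model's embedding space. Everything in the snippet (the tiny dimensions, the function names, the made-up feature values) is an assumption for illustration only, not the project's actual code.

```python
import random

# Toy dimensions; the real models use far larger ones (assumed values for illustration).
VISION_DIM = 4   # size of each visual feature vector from the frozen image encoder
LLM_DIM = 3      # size of each token embedding in the frozen language model


def make_projection(in_dim, out_dim, seed=0):
    """Create the weights of a linear layer, the only trainable part in this sketch."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def project(features, weights, bias):
    """Map each visual feature vector into the language model's embedding space."""
    return [
        [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]
        for vec in features
    ]


def build_llm_input(image_features, text_embeddings, weights, bias):
    """Prepend the projected image 'tokens' to the text prompt embeddings."""
    return project(image_features, weights, bias) + text_embeddings


# Hypothetical stand-ins for the outputs of the frozen image encoder and the tokenizer.
image_features = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]
text_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

weights, bias = make_projection(VISION_DIM, LLM_DIM)
llm_input = build_llm_input(image_features, text_embeddings, weights, bias)
print(len(llm_input), len(llm_input[0]))  # 5 3
```

The language model then sees two image "tokens" followed by three text tokens, all in the same embedding size, which is what lets a text-only chatbot "read" a picture.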
Extra info
As I was writing this newsletter, I found that there is another model similar to MiniGPT-4, called LLaVA. It does the same thing but is MUCH faster. Give it a go!
WHAT ELSE IS GOING ON
Elon wants to build a TruthGPT. He thinks that Microsoft effectively controls OpenAI right now and that we need another open alternative that is not controlled by anyone. How much will you charge for the truth, Elon? I guess $3?
A Stanford student built his own HealthGPT. He uploaded his Apple Health data and added a chatbot interface. The code is open source.
NVIDIA dropped a new text-to-video model. They also made it possible to personalize these videos using Dreambooth + Stable Diffusion. Still no public demo, but here's a stormtrooper vacuuming the beach.
RESOURCES
The best resources we came across lately that will help you become better at writing prompts & building AI apps.
LLM prompting for programming [Short & useful]
ThinkGPT - Python library for long memory [Code & explanations]
Can AI kill the greenscreen? [Short Vox video]
Beginner's guide to autonomous agents [A must-read]
TOOLBOX
The latest AI tools to use or get inspiration from.
MeetGeek - Email/Meetings summarizer
Human or Not - Social Turing game
ChatGPT2D - ChatGPT on a two-dimensional map
MyAskAI - Personalized ChatGPT for your website
WebScrapeAI - Scrape the web using AI and no-code
Blok - Tasks, notes, meetings on autopilot
Slait - AI tutor for American Sign Language (ASL)
PROMPT OF THE DAY
TOOL
Midjourney
PROMPT
Elon Musk in a poor street situation eating sitting on the sidewalk next to homeless people --v 5
RESULT
LATEST PAPERS
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers
SAM Fails to Segment Anything? -- SAM-Adapter: Adapting SAM in Underperformed Scenes: Camouflage, Shadow, and More
NeAI: A Pre-convoluted Representation for Plug-and-Play Neural Ambient Illumination
Solving Math Word Problems by Combining Language Models With Symbolic Solvers
Generative Disco: Text-to-Video Generation for Music Visualization