Building a Cross-Modal AI Experience: From Image Analysis to Immersive Audio
Exploring the intersection of computer vision and generative audio by creating an app that analyzes images and generates matching soundscapes using open-source AI models.
🌍 Product and Engineering Leader · ex-Microsoft & Netflix
I love to build AI products that cross borders, delight millions, and shape the future of technology. 🚀
This post shares high-level technical details on how I built, from scratch with open-source LLMs and vector search, an AI-powered search that understands human language, not just keywords.