My Explorations & Adventures (Visit synesis.in for more details)

Entropy & Stable Diffusion (Generative AI)

When I was in secondary school, I heard the term entropy in science class: randomness increases with time. I used to wonder about this phenomenon and its potential uses.

“The world is constantly moving towards disorder”

I later studied it formally as part of physics, in thermodynamics (the second law).

Voila, decades later I realised that the beauty of randomness lies in computer science (AI) as well.

Entropy plays a part in the taming/training of machines (ML) and in information science. It quantifies the uncertainty in data, an indication of how much additional information is required for more accurate predictions.
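To make "quantifies data uncertainty" concrete, here is a minimal sketch of Shannon entropy measured in bits. The function name `shannon_entropy` and the coin-flip samples are my own illustrations, not taken from any particular library.

```python
import math
from collections import Counter

def shannon_entropy(observations):
    """Return the Shannon entropy (in bits) of a sequence of observed symbols."""
    counts = Counter(observations)
    total = len(observations)
    probabilities = [count / total for count in counts.values()]
    # Each outcome contributes p * log2(1/p): rare outcomes carry more surprise.
    return sum(p * math.log2(1 / p) for p in probabilities)

fair = shannon_entropy(["H", "T", "H", "T"])     # two equally likely outcomes
biased = shannon_entropy(["H", "H", "H", "T"])   # one outcome dominates
certain = shannon_entropy(["H", "H", "H", "H"])  # nothing left to learn

print(fair, biased, certain)
```

The fair coin scores highest and the all-heads sample scores zero: the more predictable the data, the less additional information a model needs.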

History aside, concepts like diffusion and its latent counterpart, Stable Diffusion, are the brains behind current text-to-image and image-to-image generation. Generative AI innovations like OpenAI’s DALL-E, Midjourney and Stable Diffusion have revolutionized the way we interact with images.

Text-conditioned models can efficiently generate images based on a text description.

How textual input can generate a unique, unseen image is a real wonder.

Deciphering how random (Gaussian) noise makes all the difference is amazing.

Noise is not the only option; new advances and research are emerging.

Building imagination in machines seems to be a challenge, and it is unsettling the artist community!

The notion of iterative refinement is applied to train a diffusion model capable of turning noise into beautiful, unseen synthetic images.
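The noising half of that idea can be sketched in a few lines of NumPy. This assumes the DDPM-style formulation with a linear noise schedule (1000 steps, betas from 1e-4 to 0.02), which is one common choice and not necessarily what every diffusion model uses; the names `add_noise` and `alpha_bars` are mine.

```python
import numpy as np

rng = np.random.default_rng(42)

# Assumed schedule: 1000 steps, noise variance rising linearly per step.
num_steps = 1000
betas = np.linspace(1e-4, 0.02, num_steps)
# Fraction of the original signal (in variance terms) that survives up to step t.
alpha_bars = np.cumprod(1.0 - betas)

def add_noise(clean_image, t):
    """Jump straight to step t: blend the clean image with Gaussian noise."""
    noise = rng.standard_normal(clean_image.shape)
    noisy = np.sqrt(alpha_bars[t]) * clean_image + np.sqrt(1.0 - alpha_bars[t]) * noise
    return noisy, noise

clean_image = np.ones((8, 8))  # toy stand-in for a real image
for t in (0, 499, 999):
    noisy, noise = add_noise(clean_image, t)
    print(t, "signal weight:", float(np.sqrt(alpha_bars[t])))
```

In this formulation the model is trained to predict `noise` given `noisy` and `t`. Generation then runs the process in reverse, removing a little predicted noise at each step, which is the iterative refinement.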

Stable Diffusion, an open-source project, focusses on generating diverse, high-quality images through the diffusion process.

It combines multiple technologies. Its major components are:

1. The pretrained text encoder (OpenAI’s CLIP).

2. The UNet model as the noise predictor.

3. The decoder part of the autoencoder network.
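How the three components hand data to one another can be sketched with toy stand-ins. Everything here is hypothetical: `encode_text`, `predict_noise` and `decode_latent` are placeholders I made up, they contain no trained model, and the array shapes are the ones commonly cited for Stable Diffusion v1 (77 x 768 text embedding, 4 x 64 x 64 latent, 512 x 512 image), assumed rather than confirmed here.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_text(prompt):
    # Stand-in for the CLIP text encoder: one 768-number vector per token slot.
    # A real encoder produces meaningful embeddings; these are seeded random values.
    seed = sum(prompt.encode("utf-8"))
    return np.random.default_rng(seed).standard_normal((77, 768))

def predict_noise(latent, step, text_embedding):
    # Stand-in for the UNet: a real one uses the step and the text embedding
    # to estimate the noise; this toy just calls 10% of the latent "noise".
    return 0.1 * latent

def decode_latent(latent):
    # Stand-in for the autoencoder's decoder: keep 3 channels and
    # upsample 8x along each spatial axis, from 64 x 64 to 512 x 512.
    return latent[:3].repeat(8, axis=1).repeat(8, axis=2)

def generate(prompt, steps=20):
    text_embedding = encode_text(prompt)       # 1. text encoder
    latent = rng.standard_normal((4, 64, 64))  # start from pure Gaussian noise
    for step in reversed(range(steps)):        # 2. repeated noise prediction and removal
        latent = latent - predict_noise(latent, step, text_embedding)
    return decode_latent(latent)               # 3. decoder turns the latent into pixels

image = generate("a lighthouse at dawn")
print(image.shape)
```

The point is only the order of operations: text goes in once, the noise predictor runs many times on a small latent, and the decoder runs once at the end.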

Do not be scared by the mathematical derivations in the proofs; they will become clear as you progress.

Diffusion models have taken the throne as state-of-the-art generative models.

Seeing is believing: try Stable Diffusion at https://stability.ai/

For developers, have a look at Jay Alammar’s article at https://jalammar.github.io/illustrated-stable-diffusion. Jeremy Howard’s free FastAI course is a good place to start and can help clear the mist.

Hugging Face’s Diffusers library is one such implementation. It comes with pre-trained models for generating images, audio, and even 3D structures of molecules.

You can also try and play with KerasCV, another such pre-trained implementation.

Wishing you a fun-filled journey…

Contact me for further details or any clarifications.
