🎛️🎚️ Pixel Alchemist: Semantic image editing in real time with a multi-parameter interface for StyleCLIP global directions
Edit StyleGAN-generated images in real time with custom prompts and multiple parametric controls. Based on StyleCLIP: https://arxiv.org/abs/2103.17249
You need a (free) ngrok authtoken for this notebook: https://dashboard.ngrok.com/get-started/your-authtoken
Open the Google Colab notebook, make sure your runtime has a GPU, run all cells, and open the web interface from the last cell. The first time you run the notebook, the GUI will ask you to register for an ngrok account to get an authtoken, which you then paste into the corresponding cell.
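For reference, the authtoken cell boils down to something like the following minimal sketch (assuming the pyngrok package; the port number and variable names are hypothetical, not the notebook's actual code):

```python
from pyngrok import ngrok

# Paste the token from https://dashboard.ngrok.com/get-started/your-authtoken
ngrok.set_auth_token("YOUR_NGROK_AUTHTOKEN")

# Expose the local web interface through an ngrok tunnel;
# 8050 is a hypothetical port for the GUI server
tunnel = ngrok.connect(8050)
print("Open the GUI at:", tunnel.public_url)
```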
Enter any text prompt under each slider. Each model comes with a predetermined list of prompts that work well, but feel free to enter whatever you can think of. You can dynamically add and remove sliders with the '+' and '-' buttons. The position of each slider controls how strongly that text prompt should show up in the generated image; a negative value decreases the presence of whatever the prompt describes, so a slider at a negative value with the prompt 'Trees' will remove trees from the image. Lastly, the threshold knob above each slider determines how much of the image is affected by a change. A low threshold value changes almost everything in the image, while a high value only touches the most relevant parts. For example, the prompt 'red eyes' on ffhq will only change the color of the iris when a high threshold is applied, but might change the mouth, nose, and whole expression at a low one. If the resulting image turns black, the threshold is too high and has to be decreased.
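Under the hood, the slider and knob correspond to the α (strength) and β (disentanglement threshold) parameters of StyleCLIP's global directions. The sketch below is an illustrative numpy reconstruction of that idea, not the notebook's actual code; `fs3` is the channel-relevance matrix produced by preprocessing, and `clip_delta` is the normalized CLIP-space difference between the prompt embedding and a neutral text embedding:

```python
import numpy as np

def edit_direction(fs3, clip_delta, alpha, beta):
    """Sketch of a StyleCLIP-style global direction.

    fs3        -- (num_style_channels, 512) relevance matrix from preprocessing
    clip_delta -- (512,) normalized CLIP embedding difference for the prompt
    alpha      -- slider value: edit strength, negative values remove the concept
    beta       -- threshold knob: channels with |relevance| below beta are zeroed
    """
    relevance = fs3 @ clip_delta                # how much each style channel matters
    relevance[np.abs(relevance) < beta] = 0.0   # disentangle: keep only salient channels
    norm = np.linalg.norm(relevance)
    if norm == 0:
        # beta was too high: no channel survives, which is why the
        # rendered image collapses (appears black) in the GUI
        return np.zeros_like(relevance)
    return alpha * relevance / norm             # direction added to the style codes
```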
Refer to the StyleCLIP repository to learn how to preprocess your own StyleGAN model for global directions. You can use your model with the Pixel Alchemist notebook by uploading the "fs3", "W", "S", and "S_mean_std" files produced by the preprocessing, plus the .pkl file containing your StyleGAN weights, to two separate Google Drive folders. Finally, add the two folder links (formatted for gdown) to the cell responsible for downloading all models from Google Drive.
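The download cell then amounts to something like this sketch (assuming the gdown package; the folder URLs and output paths below are placeholders for your own links, not real identifiers):

```python
import gdown

# Hypothetical placeholders: replace with your own Google Drive folder links,
# formatted for gdown (https://drive.google.com/drive/folders/<folder-id>)
PREPROCESSING_FOLDER = "https://drive.google.com/drive/folders/<fs3-W-S-folder-id>"
WEIGHTS_FOLDER = "https://drive.google.com/drive/folders/<stylegan-pkl-folder-id>"

# Fetch the preprocessing outputs (fs3, W, S, S_mean_std) and the .pkl weights
gdown.download_folder(PREPROCESSING_FOLDER, output="models/my_model", quiet=False)
gdown.download_folder(WEIGHTS_FOLDER, output="weights/my_model", quiet=False)
```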
- Initial code release
- Info note for model bias & ethics
- Add MIDI control
- Add more datasets (ImageNet-512 StyleGAN-XL, Conditional WikiArt, StyleGAN-Human)
- Edit uploaded images with inversion
- Text-to-image feature
StyleCLIP: Patashnik, Or, Zongze Wu, Eli Shechtman, Daniel Cohen-Or, and Dani Lischinski. 2021. "StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery." In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2085–94.
Posters dataset: The images used for the graphic design dataset are courtesy of: typo/graphic posters, André Felipe Menezes, www.typo-graphicposters.com