Editing Photos with Stable Diffusion (Without the webui)

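Lately I've been hooked on Stable Diffusion. Most tutorials only cover generating images with the webui, but as a programmer, clicking through a UI doesn't satisfy my urge to understand what happens underneath.

This post walks through a simple example built on Diffusers and ROCm (AMD GPU).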

Installation

Note: these steps were run on Manjaro.

Create a new Python project

Poetry seems to be all the rage lately 🙂

mkdir sd_script
cd sd_script
poetry new .

# I like to keep the venv inside the project directory
poetry config virtualenvs.in-project true --local
poetry env use python3.10

# Install a pile of dependencies
poetry add diffusers transformers accelerate scipy safetensors omegaconf
# Reference: https://pytorch.org/get-started/locally/
poetry source add --priority=supplemental pytorch https://download.pytorch.org/whl/rocm5.4.2
poetry add --source pytorch torch torchvision torchaudio

# Enter the venv (fish shell)
source .venv/bin/activate.fish
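
Before writing the real script, it can be worth checking that the ROCm build of PyTorch actually sees the GPU. The following is a minimal sketch, not part of the original setup: the helper name describe_torch_backend is made up for illustration, and it assumes that a ROCm build sets torch.version.hip to a version string and reports the GPU through the torch.cuda namespace.

```python
import importlib.util


def describe_torch_backend() -> str:
    """Return a one-line summary of the GPU backend PyTorch was built for."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"

    import torch  # imported lazily so the check above can fail gracefully

    # Assumption: torch.version.hip is a version string on ROCm builds
    # and None (or missing) on other builds.
    hip_version = getattr(torch.version, "hip", None)
    backend = f"ROCm/HIP {hip_version}" if hip_version else "non-ROCm build"

    if not torch.cuda.is_available():
        return f"torch {torch.__version__} ({backend}): no GPU visible"

    # ROCm devices are reported through the torch.cuda namespace.
    name = torch.cuda.get_device_name(0)
    return f"torch {torch.__version__} ({backend}): {name}"


print(describe_torch_backend())
```

If this prints "no GPU visible", the usual suspects are a wheel that does not match the installed ROCm version or a GPU that ROCm does not officially support.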

Write some code

from diffusers import StableDiffusionImg2ImgPipeline, DPMSolverMultistepScheduler
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_single_file(
    # https://civitai.com/models/7240
    "./model/model/MeinaMix.v13.safetensors"
)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config, use_karras_sigmas=True
)

# ROCm doesn't get a device name of its own: PyTorch's ROCm build still uses "cuda"
pipe.to("cuda")

# It clearly isn't R18 ... so disable the safety checker
if pipe.safety_checker is not None:
    pipe.safety_checker = None

# The prompt incantation: heaven, earth, gods, and ancestors, please answer.
prompt = "masterpiece, best quality, ultra detail,(2girl), upper body,close up, smile, happy, open eye, with glasses"
nprompt = "NSFW,(worst quality, low quality:1.4), (bad_prompt_version2:1), muscular, greyscale, monochrome, lineart, 2koma, 3koma, 4koma, manga, 3D, 3Dcubism, pablo picasso, disney, marvel, mutanted breasts, mutanted nipple, cropped, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry,  artist name, lowres, trademark, watermark, title, text, deformed, bad anatomy, disfigured, mutated, extra limbs, ugly, missing limb, floating limbs, disconnected limbs, out of frame, mutated hands and fingers, poorly drawn hands, malformed hands, poorly drawn face, poorly drawn asymmetrical eyes, (blurry:1.4), duplicate, (worst quality, low quality:1.4)"

init_image = Image.open('./resources/xxxxx.jpg').convert("RGB")
init_image.thumbnail((512, 512))

image = pipe(prompt=prompt, 
             negative_prompt=nprompt, 
             image=init_image, 
             strength=0.45, 
             guidance_scale=7.5,
             num_inference_steps=100
             ).images[0]
image.save("output.png")
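
One detail in the call above is easy to miss: in img2img, strength also decides how much of the step schedule actually runs. As far as I understand the Diffusers img2img pipeline, the init image is noised part of the way and only the tail of the schedule is denoised. The sketch below mirrors that rule; effective_steps is a hypothetical helper written for this post, not a Diffusers API, and the formula is my reading of the pipeline rather than a documented guarantee.

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Approximate how many denoising steps an img2img run performs.

    Assumption: the pipeline keeps roughly
    min(int(num_inference_steps * strength), num_inference_steps) steps.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_inference_steps * strength), num_inference_steps)


# Compare a few strength values for the 100-step setting used above.
for strength in (0.25, 0.45, 0.75, 1.0):
    print(strength, effective_steps(100, strength))
```

Under this assumption, strength=0.45 with num_inference_steps=100 runs about 45 denoising steps, which is why the output keeps the composition of the original photo. A higher strength drifts further from the source image and takes longer.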

Results

Below is a photo taken at Houtan in Shanghai.

Original (faces cut out; two scruffy dudes 😄)

Output (two cute girls 😄)
