This is the beta version of MiaoBi (妙筆), a Chinese text-to-image model. It has the same architecture as the classic Stable Diffusion 1.5 and is compatible with existing mainstream plugins and their weights, such as LoRA, ControlNet, and T2I-Adapter.
- Clone the repository
git clone https://github.com/ShineChen1024/MiaoBi.git
- Create a conda environment and install the required packages
conda create -n MiaoBi-SD python==3.10
conda activate MiaoBi-SD
pip install torch==2.0.1 torchvision==0.15.2 numpy==1.25.1 diffusers==0.25.1 opencv-python==4.8.0 transformers==4.31.0 accelerate==0.21.0
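To confirm that the environment matches the pins above, a small check like the following may help. It is a minimal sketch: the helper names `installed_versions` and `find_mismatches` are made up for illustration, and prefix matching is an assumption chosen so that builds with a local tag (such as `2.0.1+cu117`) or a fourth version component still count as a match.

```python
from importlib import metadata

# Pins copied from the pip command above.
PINNED = {
    "torch": "2.0.1",
    "torchvision": "0.15.2",
    "numpy": "1.25.1",
    "diffusers": "0.25.1",
    "opencv-python": "4.8.0",
    "transformers": "4.31.0",
    "accelerate": "0.21.0",
}


def installed_versions(names):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def _matches(wanted, found):
    """True if `found` equals `wanted` or extends it with '.x' or '+tag'."""
    if found is None:
        return False
    return (
        found == wanted
        or found.startswith(wanted + ".")
        or found.startswith(wanted + "+")
    )


def find_mismatches(pinned, installed):
    """Return {package: (wanted, found)} for every pin that is not satisfied."""
    return {
        name: (wanted, installed.get(name))
        for name, wanted in pinned.items()
        if not _matches(wanted, installed.get(name))
    }


if __name__ == "__main__":
    problems = find_mismatches(PINNED, installed_versions(PINNED))
    for name, (wanted, found) in problems.items():
        print(f"{name}: wanted {wanted}, found {found or 'not installed'}")
    if not problems:
        print("All pinned packages match.")
```

Run it inside the activated `MiaoBi-SD` environment; any line it prints names a package to reinstall.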
- Download checkpoints
Download the weights from Hugging Face and put them in the `checkpoints` folder.
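Before loading the model, it can be useful to check that the download landed where the code expects it. The sketch below assumes the checkpoint follows the standard diffusers Stable Diffusion layout (a `model_index.json` file plus `unet`, `vae`, `text_encoder`, `tokenizer`, and `scheduler` subfolders). That layout is inferred from the SD 1.5 compatibility and is not confirmed by the repository; the helper name `missing_components` is hypothetical.

```python
from pathlib import Path

# Assumed layout of a diffusers Stable Diffusion checkpoint directory.
EXPECTED_FILES = ["model_index.json"]
EXPECTED_DIRS = ["unet", "vae", "text_encoder", "tokenizer", "scheduler"]


def missing_components(checkpoint_dir):
    """List the expected files and folders that are absent from checkpoint_dir.

    Folders are reported with a trailing slash. An empty list means the
    directory looks complete under the assumed layout.
    """
    root = Path(checkpoint_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing


if __name__ == "__main__":
    gaps = missing_components("checkpoints/miaobi_beta0.9")
    print("Checkpoint looks complete." if not gaps else f"Missing: {gaps}")
```

An empty result does not prove the weights are intact; it only shows that the folder structure is in place.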
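from diffusers import StableDiffusionPipeline
from transformers import AutoTokenizer

# Load the tokenizer shipped with the checkpoint and pass it to the pipeline,
# so that the tokenizer loaded here is the one actually used.
tokenizer = AutoTokenizer.from_pretrained("checkpoints/miaobi_beta0.9/tokenizer", trust_remote_code=True)
pipe = StableDiffusionPipeline.from_pretrained("checkpoints/miaobi_beta0.9", tokenizer=tokenizer)
pipe.to("cuda")  # requires a CUDA-capable GPU

prompt = "一只穿著鎧甲的貓"  # "A cat wearing armor"
image = pipe(prompt).images[0]
image.save("鎧甲貓.png")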
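The snippet above uses the pipeline defaults, so each run produces a different image. The following sketch shows one way to make results repeatable and to reduce memory use. It assumes half precision works for this checkpoint, which the source does not state. The function names, the 30-step default, and the guidance scale of 7.5 are illustrative choices, not settings recommended by the authors.

```python
def load_pipeline(checkpoint_dir="checkpoints/miaobi_beta0.9", device="cuda"):
    """Load the pipeline, using float16 on GPU (assumption) and float32 on CPU."""
    import torch
    from diffusers import StableDiffusionPipeline

    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionPipeline.from_pretrained(checkpoint_dir, torch_dtype=dtype)
    return pipe.to(device)


def seeded_generator(seed):
    """Create a CPU random generator with a fixed seed for repeatable noise."""
    import torch

    return torch.Generator(device="cpu").manual_seed(seed)


def generate(pipe, prompt, generator=None, steps=30, guidance_scale=7.5,
             negative_prompt=None):
    """Run one generation and return the first image.

    All keyword arguments are forwarded to the pipeline call unchanged.
    """
    result = pipe(
        prompt,
        negative_prompt=negative_prompt,
        num_inference_steps=steps,
        guidance_scale=guidance_scale,
        generator=generator,
    )
    return result.images[0]
```

With the weights in place, `generate(load_pipeline(), "一只穿著鎧甲的貓", generator=seeded_generator(42))` should return the same image on repeated runs on the same machine and library versions.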
- Python demo
python miaobi_generate.py
- ControlNet demo
python miaobi_controlnet.py
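The contents of `miaobi_controlnet.py` are not shown here, so the following is only a sketch of how a ControlNet could be attached through diffusers, not the repository's script. It assumes the publicly available Canny ControlNet for SD 1.5 (`lllyasviel/sd-controlnet-canny`) works with MiaoBi, as the compatibility claim suggests. The function names and the default Canny thresholds are illustrative.

```python
def make_canny_condition(image_path, low=100, high=200):
    """Turn an input picture into a 3-channel Canny edge map (PIL image)."""
    import cv2
    import numpy as np
    from PIL import Image

    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if gray is None:  # cv2.imread returns None instead of raising
        raise FileNotFoundError(image_path)
    edges = cv2.Canny(gray, low, high)
    # ControlNet expects an RGB-like image, so repeat the edge map 3 times.
    return Image.fromarray(np.stack([edges] * 3, axis=-1))


def load_controlnet_pipeline(base_dir="checkpoints/miaobi_beta0.9",
                             controlnet_id="lllyasviel/sd-controlnet-canny",
                             device="cuda"):
    """Load MiaoBi as the base model with a ControlNet attached (assumed compatible)."""
    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    dtype = torch.float16 if device == "cuda" else torch.float32
    controlnet = ControlNetModel.from_pretrained(controlnet_id, torch_dtype=dtype)
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        base_dir, controlnet=controlnet, torch_dtype=dtype
    )
    return pipe.to(device)


def generate_with_control(pipe, prompt, control_image, steps=30,
                          conditioning_scale=1.0):
    """Generate one image guided by both the prompt and the control image."""
    result = pipe(
        prompt,
        image=control_image,
        num_inference_steps=steps,
        controlnet_conditioning_scale=conditioning_scale,
    )
    return result.images[0]
```

A typical call would be `generate_with_control(load_controlnet_pipeline(), "一只穿著鎧甲的貓", make_canny_condition("input.png"))`, where `input.png` is a placeholder for your own reference image.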
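一只精致的陶瓷貓咪雕像,全身繪有精美的傳統花紋,眼睛仿佛會發光。 (An exquisite ceramic cat statue, covered in fine traditional patterns, with eyes that seem to glow.)
動漫風格的風景畫,有山脈、湖泊,也有繁華的小鎮子,色彩鮮艷,光影效果明顯。 (An anime-style landscape with mountains, lakes, and a bustling small town, in vivid colors with pronounced light and shadow.)
極具真實感的復雜農村的老人肖像,黑白。 (A highly realistic, intricately detailed portrait of an elderly person in the countryside, black and white.)
紅燒獅子頭 (Braised lion's head meatballs)
車水馬龍的上海街道,春節,舞龍舞獅。 (A busy Shanghai street packed with traffic, Spring Festival, dragon and lion dances.)
枯藤老樹昏鴉,小橋流水人家。水墨畫。 (Withered vines, old trees, crows at dusk; a small bridge, flowing water, a few homes. Ink wash painting.)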
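The sample prompts above can be rendered in one go. This is a minimal sketch that relies on the diffusers pipeline accepting a list of prompts and returning one image per prompt; the helper names, the batch size, and the `outputs` folder are assumptions for illustration.

```python
from pathlib import Path

# The sample prompts listed above, kept in Chinese because the model expects
# Chinese input.
PROMPTS = [
    "一只精致的陶瓷貓咪雕像,全身繪有精美的傳統花紋,眼睛仿佛會發光。",
    "動漫風格的風景畫,有山脈、湖泊,也有繁華的小鎮子,色彩鮮艷,光影效果明顯。",
    "極具真實感的復雜農村的老人肖像,黑白。",
    "紅燒獅子頭",
    "車水馬龍的上海街道,春節,舞龍舞獅。",
    "枯藤老樹昏鴉,小橋流水人家。水墨畫。",
]


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def generate_all(pipe, prompts, batch_size=2, out_dir="outputs"):
    """Generate one image per prompt, a few prompts at a time, and save them.

    Files are named by position (sample_00.png, sample_01.png, ...) so the
    output names stay ASCII-only. Returns the list of saved paths.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for batch in chunked(prompts, batch_size):
        images = pipe(batch).images  # one image per prompt in the batch
        for image in images:
            path = out / f"sample_{len(saved):02d}.png"
            image.save(path)
            saved.append(path)
    return saved
```

Given a loaded pipeline `pipe`, `generate_all(pipe, PROMPTS)` writes six files; lowering `batch_size` to 1 trades speed for lower GPU memory use.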
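MiaoBi's training data comprises the Chinese subset of Laion-5B (cleaned and filtered), open-source Midjourney-related data (with the English prompts translated into Chinese), and several hundred thousand captions that we collected ourselves. Because the dataset contains very few idioms and classical poems, the model's understanding of idioms and classical poetry may be off. The shortage of data on Chinese scenic spots and landmark buildings, together with the large share of English-to-Chinese translated data, may also cause some objects to be confused. If you have high-quality data of these kinds and would like to help improve this project, please contact us; we will train a new version with the data provided. MiaoBi Beta0.9 was trained on eight 4090 GPUs. We are expanding our machine resources to train an SDXL-based model for better results, so stay tuned.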
Due to limited computing power and the small size of available Chinese datasets, MiaoBi may currently perform worse than commercial models. We are expanding our computing resources and collecting larger-scale data to improve future versions of MiaoBi.