Counting 1,417 Big Data & Machine Learning Frameworks, Toolsets, and Examples...
Suggestion? Feedback? Tweet @stkim1


zi2zi: Master Chinese Calligraphy with Conditional Adversarial Networks



Learning eastern asian language typefaces with GAN. zi2zi(字到字, meaning from character to character) is an application and extension of the recent popular pix2pix model to Chinese characters.

Details could be found in this blog post.

Network Structure

alt network

The network structure is based off pix2pix with the addition of category embedding and two other losses, category loss and constant loss, from AC-GAN and DTN respectively.


Compare with Ground Truth


Brush Writing Fonts


Cursive Script (Requested by SNS audience)


Random Gaussian Style







animation animation

easter egg

How to Use

Step Zero

Download tons of fonts as you please


  • Python 2.7
  • CUDA
  • cudnn
  • Tensorflow >= 1.0.1
  • Pillow(PIL)
  • numpy >= 1.12.1
  • scipy >= 0.18.1
  • imageio


To avoid IO bottleneck, preprocessing is necessary to pickle your data into binary and persist in memory during training.

First run the below command to get the font images:

python --src_font=src.ttf

Four default charsets are offered: CN, CN_T(traditional), JP, KR. You can also point it to a one line file, it will generate the images of the characters in it. Note, filter option is highly recommended, it will pre sample some characters and filter all the images that have the same hash, usually indicating that character is missing. label indicating index in the category embeddings that this font associated with, default to 0.

After obtaining all images, run to pickle the images and their corresponding labels into binary format:

python --dir=image_directories

After running this, you will find two objects train.obj and val.obj under the save_dir for training and validation, respectively.

Experiment Layout

└── data
    ├── train.obj
    └── val.obj

Create a experiment directory under the root of the project, and a data directory within it to place the two binaries. Assuming a directory layout enforce bettet data isolation, especially if you have multiple experiments running.


To start training run the following command

python --experiment_dir=experiment 

schedule here means in between how many epochs, the learning rate will decay by half. The train command will create sample,logs,checkpoint directory under experiment_dir if non-existed, where you can check and manage the progress of your training.

Infer and Interpolate

After training is done, run the below command to infer test data:

python --model_dir=checkpoint_dir/ 
                --embedding_ids=label[s] of the font, separate by comma

Also you can do interpolation with this command:

python --model_dir= checkpoint_dir/ 
                --embedding_ids=label[s] of the font, separate by comma

It will run through all the pairs of fonts specified in embedding_ids and interpolate the number of steps as specified.

Pretrained Model

Pretained model can be downloaded here which is trained with 27 fonts, only generator is saved to reduce the model size. You can use encoder in the this pretrained model to accelerate the training process.


Code derived and rehashed from:


Apache 2.0