PocketFlow unofficial guide

PocketFlow is a model compression framework open-sourced by Tencent. After a close look, I find it a fairly handy tool. However, there are three caveats:

  1. The [tutorial] is not very helpful if you want to build your own model;
  2. The repo is not actively maintained (sadly, there seems to be no ongoing support from Tencent);
  3. It has compatibility problems with newer TensorFlow versions.

I am trying to address the above problems, and in this article I provide some useful guidance for compressing your own model.

Steps to build a training/compression pipeline

  1. Clone the repo

    git clone git@github.com:Tencent/PocketFlow.git
    cd PocketFlow
    

    From now on, the working directory is the repo folder.

  2. Set up data input pipeline

    a. Create path.conf from the template at ./path.conf.template. This file specifies the data path as data_dir_local_<dataset> = .... The name <dataset> is important and will be used in the steps below.

    b. Create a Dataset class for your data at ./datasets/<your dataset class>.py. The class name and the file name are not important. Inside the class, the data path is available as FLAGS.data_dir_local, without the <dataset> suffix. Importantly, the class needs to inherit from AbstractDataset and define the following functions and properties:

    • batch_size
    • dataset_fn: creates a tf Dataset object.
    • parse_fn: parses the dataset; it needs to accept an argument is_train to differentiate the training and testing pipelines.
  3. Create your model

    a. Create <model>_at_<dataset>.py under ./nets/. It is important to use the same <dataset> as in step 2. Defining the ModelHelper class is the critical part, and using the existing code as a template is a good start. At a minimum, you need to define your model in forward_fn and replace the dataset class with the one you defined above.

    b. Create <model>_at_<dataset>_run.py under ./nets/, again with the same <dataset> as in step 2. In this file, you basically only need to replace the ModelHelper with the one from the file you created in step 3a.
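
Since the steps above depend on the same <dataset> string showing up in three places, a small pre-flight check can catch a mismatch before a long run. The following is a minimal sketch and not part of PocketFlow: the function name check_layout is hypothetical, and it assumes that the model files live under ./nets/ and that path.conf holds one "key = value" pair per line.

```python
from pathlib import Path


def check_layout(repo_root, model, dataset):
    """Return a list of problems; an empty list means the names line up.

    Hypothetical helper, not part of PocketFlow. It only checks that the
    expected names exist, not what the files contain.
    """
    root = Path(repo_root)
    problems = []

    # Step 2a: path.conf must define data_dir_local_<dataset>.
    key = "data_dir_local_" + dataset
    conf = root / "path.conf"
    if not conf.is_file():
        problems.append("missing path.conf")
    else:
        keys = set()
        for line in conf.read_text().splitlines():
            if "=" in line:
                keys.add(line.split("=", 1)[0].strip())
        if key not in keys:
            problems.append("path.conf has no entry " + key)

    # Step 3: both model files must carry the same <dataset> name.
    for suffix in (".py", "_run.py"):
        name = "{}_at_{}{}".format(model, dataset, suffix)
        if not (root / "nets" / name).is_file():
            problems.append("missing nets/" + name)

    return problems
```

Calling check_layout(".", "<model>", "<dataset>") from the repo folder and printing the returned list is enough to see which of the three names is off.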

Training and compression

  • It is highly recommended to first run ./scripts/run_local.sh nets/<model>_at_<dataset>_run.py to test your code and make sure PocketFlow can train the model at full precision.
  • The compression algorithms are referred to as learners. You can find more information in the official documentation.
  • Optional arguments can be appended to the run command above. Unfortunately, the arguments are defined as FLAGS scattered throughout the source code. Stay tuned, I will provide further guidance next time.
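
Until then, one way to get an overview of the available arguments is to scan the source tree for flag definitions. The sketch below is my own helper, not something PocketFlow ships: list_flags is a hypothetical name, and it assumes the flags are declared with the TF 1.x style DEFINE_<type>('name', ...) calls, with the flag name given as a string literal in the first argument.

```python
import re
from pathlib import Path

# Matches e.g. tf.app.flags.DEFINE_integer('batch_size', 128, '...')
# and captures the flag type and the flag name.
FLAG_DEF = re.compile(r"DEFINE_(\w+)\(\s*['\"]([^'\"]+)['\"]")


def list_flags(repo_root):
    """Map each flag name to a (type, relative file path) tuple.

    Hypothetical helper: it only does a textual scan of *.py files,
    so flags built dynamically would be missed.
    """
    root = Path(repo_root)
    flags = {}
    for path in sorted(root.rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for flag_type, name in FLAG_DEF.findall(text):
            flags[name] = (flag_type, path.relative_to(root).as_posix())
    return flags
```

Run from the repo folder, sorted(list_flags(".").items()) gives a flat list of flag names together with the file that defines each one, which is a reasonable starting point for deciding what to append to the run command.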

Fix version issues with Docker

Unfortunately, PocketFlow has compatibility issues with newer TensorFlow versions. To be on the safe side, use Docker. Here is a brief overview of the solution:

  1. Create a ./Dockerfile as below

    FROM tensorflow/tensorflow:1.10.1-gpu-py3
    
    WORKDIR /tf
    # Copy the PocketFlow folder (the build context) into the image,
    # keeping its directory structure
    COPY . /tf/
    
    # /output: to save outputs, change your code / flags accordingly
    # /data: input path, don't put your data in the working folder
    VOLUME /output
    VOLUME /data
    
    CMD ./scripts/run_local.sh nets/<model>_at_<dataset>_run.py <additional flags> ...
  2. Build the Docker image with docker build -t <your image name> .

  3. To open a bash shell in the container, run docker run --gpus 0 --rm -it -v <your data folder>:/data -v <your output folder>:/output <your image name> bash. Note that Docker reads a bare number after --gpus as a GPU count; if the container does not see your GPU, try --gpus all or --gpus device=0 instead.
  4. To run your training script (indicated by CMD in the Dockerfile), use docker run --gpus 0 --rm -v <your data folder>:/data -v <your output folder>:/output <your image name>.
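
If you launch runs from a script, the two commands above can be assembled by one small function. This is only a sketch under my own naming (docker_run_command is hypothetical); the one thing it adds is converting the host folders to absolute paths, since Docker treats a non-absolute name in -v as a named volume rather than a folder.

```python
import os


def docker_run_command(image, data_dir, output_dir, gpus="0", interactive=False):
    """Build the `docker run` argument list for steps 3 and 4.

    Hypothetical helper. The gpus default mirrors the commands in the
    text; pass "all" or "device=0" if that suits your setup better.
    """
    cmd = ["docker", "run", "--gpus", gpus, "--rm"]
    if interactive:
        cmd.append("-it")
    cmd += [
        "-v", os.path.abspath(data_dir) + ":/data",
        "-v", os.path.abspath(output_dir) + ":/output",
        image,
    ]
    if interactive:
        # Step 3: override CMD with a shell instead of the training script.
        cmd.append("bash")
    return cmd
```

The returned list can be passed directly to subprocess.run(cmd, check=True), which avoids shell quoting problems when a folder name contains spaces.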
