PocketFlow unofficial guide

Jiayi (Jason) Liu

2019-12-14 00:20


PocketFlow is a model compression framework open-sourced by Tencent. After a close look, I find it a fairly handy tool. However, there are three caveats:

- The [tutorial] is not very helpful if you want to build your own model;
- The repo is not actively maintained (sadly, there appears to be no ongoing support from Tencent);
- There are compatibility problems with newer TensorFlow versions.

In this article, I try to address these problems and provide useful guidance for compressing your own model.

Steps to build a training/compression pipeline

Clone the repo
```
git clone git@github.com:Tencent/PocketFlow.git
cd PocketFlow
```
From now on, the working directory is the repo folder.
Set up data input pipeline

a. Create path.conf; a template is at ./path.conf.template. This file specifies the data path as data_dir_local_<dataset> = .... The name <dataset> is important and will be used again below.

b. Create a Dataset class for your data at ./datasets/<your dataset class>.py. The class name and the file name are not important. Inside the class, the data path is available as FLAGS.data_dir_local, without the <dataset> suffix. Importantly, your class needs to inherit from AbstractDataset and define the following functions and properties:
- batch_size
- dataset_fn: creates a tf Dataset object.
- parse_fn: parses the dataset; it needs to accept an argument is_train to differentiate the training and testing pipelines.
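
To make the shape of such a class concrete, here is a minimal sketch of the three members listed above. It is not PocketFlow code: the `AbstractDataset` defined here is a stand-in so the snippet runs on its own, and `MyDataset`, the module-level `parse_fn`, the `data_dir` argument and the default batch size are hypothetical names chosen for illustration. In a real dataset file you would import PocketFlow's own `AbstractDataset`, read the path from `FLAGS.data_dir_local`, and have `dataset_fn` return a tf Dataset object instead of a plain list.

```python
class AbstractDataset:
    """Stand-in for PocketFlow's AbstractDataset so this sketch runs on its own."""

    def __init__(self, is_train):
        self.is_train = is_train


def parse_fn(record, is_train):
    """Hypothetical per-record parser; a real one would decode and preprocess."""
    # Branch on is_train, e.g. to apply data augmentation only during training.
    return {"record": record, "augmented": is_train}


class MyDataset(AbstractDataset):
    """Hypothetical dataset class defining batch_size, dataset_fn and parse_fn."""

    def __init__(self, is_train, data_dir, batch_size=32):
        super().__init__(is_train)
        # Assumption: in PocketFlow, data_dir would come from FLAGS.data_dir_local.
        self.data_dir = data_dir
        self.batch_size = batch_size
        # Assumption: real code would build a tf Dataset object from the files here.
        self.dataset_fn = lambda files: list(files)
        # Bind is_train so the pipeline can call parse_fn with a single argument.
        self.parse_fn = lambda record: parse_fn(record, is_train=is_train)
```

Binding `is_train` in the constructor means the training and evaluation pipelines share one parsing function and differ only in the flag.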
Create your model

a. Create <model>_at_<dataset>.py. It is important to use the same <dataset> name as in path.conf. Defining the ModelHelper class is the critical part, and using the existing code as a template is a good start. At a minimum, you need to define your model in forward_fn and replace the dataset class with the one you defined above.

b. Create <model>_at_<dataset>_run.py. Again, use the same <dataset> name as in path.conf. In this file, you basically only need to swap in the ModelHelper class from the file you created in (a).
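
The naming convention is easier to see in code. The sketch below is my own illustration of how the <dataset> part of the script name ties back to the data_dir_local_<dataset> entry in path.conf; it is not PocketFlow's actual lookup code. The function names, the example names `lenet` and `mydata`, and the handling of `#` comments are all assumptions.

```python
import re


def dataset_from_script(script_path):
    """Extract <dataset> from a path like nets/<model>_at_<dataset>_run.py."""
    match = re.search(r"_at_(\w+?)_run\.py$", script_path)
    if match is None:
        raise ValueError("expected <model>_at_<dataset>_run.py, got " + script_path)
    return match.group(1)


def lookup_data_dir(conf_text, dataset):
    """Return the value of data_dir_local_<dataset> from path.conf-style text."""
    key = "data_dir_local_" + dataset
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if "=" not in line:
            continue
        name, value = (part.strip() for part in line.split("=", 1))
        if name == key:
            return value
    raise KeyError(key + " not found in path.conf")


# Hypothetical example: a model "lenet" trained on a dataset named "mydata".
conf = "data_dir_local_mydata = /data/mydata\n"
dataset = dataset_from_script("nets/lenet_at_mydata_run.py")
data_dir = lookup_data_dir(conf, dataset)
```

If the two names do not match, a lookup like this fails, which is why the same <dataset> string has to appear in both places.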

Training and compression

It is highly recommended to first run ./scripts/run_local.sh nets/<model>_at_<dataset>_run.py to test your code and make sure PocketFlow can train the model at full precision.
The compression algorithms are called learners. See the official documentation for more information.
Optional arguments can be appended to the command above. Unfortunately, the arguments are defined via FLAGS all over the source code. Stay tuned; I will provide further guidance next time.
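
Until then, one way to get an overview of the scattered arguments is to scan the source tree for flag definitions. The helper below is a rough sketch of that idea, not part of PocketFlow. It assumes flags are declared with TensorFlow's `DEFINE_*` functions (for example `tf.app.flags.DEFINE_string`) and that the flag name is a string literal passed as the first argument; `find_flags` and `FLAG_RE` are names I made up.

```python
import os
import re

# Matches calls such as tf.app.flags.DEFINE_string('name', ...); the flag name
# is assumed to be a string literal given as the first positional argument.
FLAG_RE = re.compile(r"DEFINE_(\w+)\(\s*['\"]([^'\"]+)['\"]")


def find_flags(root):
    """Yield (flag_name, flag_type, file_path) for each DEFINE_* call under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            if not filename.endswith(".py"):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="ignore") as handle:
                for flag_type, flag_name in FLAG_RE.findall(handle.read()):
                    yield flag_name, flag_type, path
```

Calling `sorted(find_flags("."))` from the repo folder should give a list of flag names with their types and defining files. Flags whose names are built dynamically would be missed by this simple pattern.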

Fix version issues with Docker

Unfortunately, PocketFlow has compatibility issues with newer TensorFlow versions. To be on the safe side, use Docker. Here is a brief overview of the solution:

Create a ./Dockerfile as below:

```
FROM tensorflow/tensorflow:1.10.1-gpu-py3

WORKDIR /tf
# Copy the PocketFlow folder (the build context) into the image.
# COPY . keeps the directory layout, whereas ADD * would flatten subfolders.
COPY . /tf/

# /output: to save outputs, change your code / flags accordingly.
VOLUME /output
# /data: input path, don't put your data in the working folder.
VOLUME /data

CMD ./scripts/run_local.sh nets/<model>_at_<dataset>_run.py <additional flags> ...
```

Build the image with docker build -t <your image name> .
To open a bash shell in the container, run docker run --gpus 0 --rm -it -v <your data folder>:/data -v <your output folder>:/output <your image name> bash
To run your training script (specified by CMD in the Dockerfile), run docker run --gpus 0 --rm -v <your data folder>:/data -v <your output folder>:/output <your image name>.