PocketFlow unofficial guide
PocketFlow is a model compression framework open-sourced by Tencent. After a close look, I find it to be a fairly handy tool. However, there are three caveats:
- The [tutorial] is not very helpful if you want to build your own model;
- The repo is not actively maintained (sadly, there seems to be no support from Tencent);
- There are some problems with TF versions.
I am trying to address the above problems, and in this article I provide some useful guidance on compressing your own model.
Steps to build a training/compression pipeline¶
- Clone the repo

      git clone git@github.com:Tencent/PocketFlow.git
      cd PocketFlow

  From now on, the working directory is the repo folder.
- Set up the data input pipeline

  a. Create `path.conf`; a template is at `./path.conf.template`. This file specifies the data path as `data_dir_local_<dataset> = ...`. The name `<dataset>` is important and is used again in the steps below.

  b. Create a `Dataset` class for your data at `./datasets/<your dataset class>.py`. The class name and the file name are not important. Inside the class, the data path is available as `FLAGS.data_dir_local`, without the `<dataset>` suffix. Importantly, the class must inherit from `AbstractDataset` and define the following functions and properties:

    - `batch_size`
    - `dataset_fn`: creates a tf Dataset object.
    - `parse_fn`: parses the dataset; it needs to accept an argument `is_train` to differentiate the training and testing pipelines.
- Create your model

  a. Create `<model>_at_<dataset>.py` in the `./nets` folder. It is important to use the same `<dataset>` name as in `path.conf`. Defining the `ModelHelper` class is critical, and the existing code is a good template to start from. At a minimum, you need to define your model in `forward_fn` and replace the dataset class with the one you defined above.

  b. Create `<model>_at_<dataset>_run.py` in the same folder, again with the same `<dataset>` name as in `path.conf`. In this file, you basically only need to replace the imported `ModelHelper` with the one from the file you just created.
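To make the data-side contract more concrete, here is a minimal sketch of how the `path.conf` entry and the dataset class fit together. It is not PocketFlow's code: `AbstractDataset` below is a stand-in so the snippet runs without TensorFlow, and `lookup_data_dir`, `read_records`, `parse_record`, `MyDataset`, and the `train.csv`/`test.csv` file names are all hypothetical. Only the attribute names `batch_size`, `dataset_fn`, and `parse_fn`, and the `data_dir_local_<dataset>` key, come from the steps above.

```python
import os
from abc import ABC


def lookup_data_dir(conf_text, dataset):
    """Return the value of 'data_dir_local_<dataset>' from path.conf-style text."""
    key = "data_dir_local_" + dataset
    for line in conf_text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    raise KeyError(key)


class AbstractDataset(ABC):
    """Stand-in for PocketFlow's base class (assumption, not the real code)."""

    def __init__(self, is_train):
        self.is_train = is_train
        self.batch_size = None
        self.dataset_fn = None
        self.parse_fn = None


def read_records(path):
    """Hypothetical reader; a real dataset_fn would create a tf Dataset object."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def parse_record(record, is_train):
    """Hypothetical parser for 'f1,f2,...,label' text records."""
    *features, label = record.split(",")
    features = [float(x) for x in features]
    if is_train:
        pass  # training-only augmentation would go here
    return features, int(label)


class MyDataset(AbstractDataset):
    def __init__(self, is_train, data_dir, batch_size=32):
        super().__init__(is_train)
        # In PocketFlow, data_dir would come from FLAGS.data_dir_local.
        split = "train.csv" if is_train else "test.csv"
        self.file_pattern = os.path.join(data_dir, split)
        self.batch_size = batch_size
        self.dataset_fn = read_records
        self.parse_fn = lambda record: parse_record(record, is_train=is_train)
```

In a real dataset class you would swap `read_records` for something that returns a `tf.data` dataset and read the batch size from `FLAGS`, using one of the existing files in `./datasets` as the template.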
Training and compression¶
- It is highly recommended to first run `./scripts/run_local.sh nets/<model>_at_<dataset>_run.py` to test your code and make sure PocketFlow can train your model in full precision.
- The compression algorithms go under the name `learner`. Find more information in the official documentation.
- Optional arguments can be appended to the run command above. Unfortunately, the arguments are defined with `FLAGS` all over the source code. Stay tuned; I will provide further guidance next time.
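If you launch runs from Python, the naming convention and the optional arguments can be sketched as below. This is only an illustration under assumptions: `dataset_from_script` and `build_run_command` are hypothetical helpers, the `runner` path is a parameter you should point at the run script in your checkout, and I assume the learner is selected with a `--learner` flag; check the official documentation for the actual flag names and values.

```python
import re


def dataset_from_script(script_path):
    """Extract <dataset> from a '<model>_at_<dataset>_run.py' path."""
    name = script_path.replace("\\", "/").rsplit("/", 1)[-1]
    match = re.fullmatch(r"(.+)_at_(.+)_run\.py", name)
    if match is None:
        raise ValueError("expected <model>_at_<dataset>_run.py, got %r" % name)
    return match.group(2)


def build_run_command(model, dataset, learner=None, extra_flags=(),
                      runner="./scripts/run_local.sh"):
    """Assemble the run-script invocation as an argument list."""
    cmd = [runner, "nets/%s_at_%s_run.py" % (model, dataset)]
    if learner is not None:
        # Assumed flag name; verify against the official documentation.
        cmd += ["--learner", learner]
    cmd += list(extra_flags)
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repo folder. Deriving the dataset name from the script name is a quick way to check that it matches the `data_dir_local_<dataset>` key in `path.conf`.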
Fix version issues with Docker¶
Unfortunately, PocketFlow has compatibility issues with newer TensorFlow versions. To be on the safe side, use Docker. Here is a brief overview of the solution:
- Create a `./Dockerfile` as below:

      FROM tensorflow/tensorflow:1.10.1-gpu-py3
      WORKDIR /tf
      # Copy the PocketFlow folder (the build context), keeping its sub-folders.
      COPY . /tf/
      # To save outputs; change your code / flags accordingly.
      VOLUME /output
      # Input path; don't put your data in the working folder.
      VOLUME /data
      CMD ./scripts/run_local.sh nets/<model>_at_<dataset>_run.py <additional flags> ...
- Build the Docker image with

      docker build -t <your image name> .
- To start a container with `bash`, run

      docker run --gpus 0 --rm -it -v <your data folder>:/data -v <your output folder>:/output <your image name> bash

- To run your training script (indicated by `CMD` in the `Dockerfile`), use

      docker run --gpus 0 --rm -v <your data folder>:/data -v <your output folder>:/output <your image name>
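The two `docker run` variants differ only in the `-it` flag and the trailing `bash`, which can be sketched as one small helper. `build_docker_run` is a hypothetical function of mine, not part of PocketFlow or Docker; the `--gpus` value is simply passed through as given. Host folders are converted to absolute paths because Docker treats a relative `-v` source as a volume name rather than a folder.

```python
import os


def build_docker_run(image, data_dir, output_dir, gpus="0", interactive=False):
    """Assemble the 'docker run' argument list for either mode."""
    cmd = ["docker", "run", "--gpus", gpus, "--rm"]
    if interactive:
        cmd.append("-it")
    cmd += [
        "-v", "%s:/data" % os.path.abspath(data_dir),
        "-v", "%s:/output" % os.path.abspath(output_dir),
        image,
    ]
    if interactive:
        cmd.append("bash")  # replaces the image's CMD with a shell
    return cmd
```

With `interactive=False`, the container runs whatever `CMD` specifies in the `Dockerfile`.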