# Trusted FL (Federated Learning)
Federated Learning is an emerging approach in PPML (Privacy Preserving Machine Learning) that empowers multiple parties to build a unified model without compromising privacy, even when those parties hold different datasets or features. During the FL training stage, sensitive data are kept locally, and only intermediate gradients or weights are securely aggregated by a trusted third party. In our design, this trusted third party is fully protected by Intel SGX.
A number of FL tools and frameworks have been proposed to enable FL in different areas, e.g., OpenFL, TensorFlow Federated, FATE, Flower and PySyft. However, none of them is designed for Big Data scenarios. To enable FL in the big data ecosystem, BigDL PPML provides an SGX-based end-to-end trusted FL platform. With this platform, data scientists and developers can easily set up FL applications on distributed large-scale datasets with a few clicks. To achieve this goal, we provide the following features:
- ID & feature alignment: figure out the portions of local data that will participate in the training stage
- Horizontal FL: training across multiple parties that hold the same features but different entities
- Vertical FL: training across multiple parties that hold the same entities but different features
To ensure sensitive data are fully protected in the training and inference stages, we make sure:
- Sensitive data and weights are kept local; only intermediate gradients or weights are securely aggregated by a trusted third party
- The trusted third party, i.e., the FL Server, is protected by SGX enclaves
- The local training environment is protected by SGX enclaves (recommended but not enforced)
- Network communication and storage (e.g., data and model) are protected by encryption and [Transport Layer Security (TLS)](https://en.wikipedia.org/wiki/Transport_Layer_Security)
That is, even when the program runs in an untrusted cloud environment, all data and models are protected on disk and over the network (e.g., using encryption), while compute and memory are protected by SGX enclaves.
## Prerequisite
Please ensure SGX is properly enabled and the SGX driver is installed. If not, please refer to the Install SGX Driver guide.
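As a quick check (a minimal sketch, not part of BigDL; the device node name depends on the driver generation — the in-kernel driver exposes `/dev/sgx_enclave`, the DCAP driver `/dev/sgx/enclave`, and the legacy out-of-tree driver `/dev/isgx`), you can look for an SGX device node:

```shell
# Report the first SGX device node found, if any.
check_sgx_device() {
  for dev in /dev/sgx_enclave /dev/sgx/enclave /dev/isgx; do
    if [ -e "$dev" ]; then
      echo "SGX device node found: $dev"
      return 0
    fi
  done
  echo "No SGX device node found; please install the SGX driver first"
  return 1
}
check_sgx_device || true
```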
## Prepare Keys & Dataset
Generate the signing key for SGX enclaves using the command below and keep it safe; it is used for future remote attestation and for starting SGX enclaves more securely. This will create a file `enclave-key.pem` (the enclave key) in the current working directory. To store the key elsewhere, modify the output file path.

```bash
cd scripts/
openssl genrsa -3 -out enclave-key.pem 3072
cd ..
```
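Optionally, you can sanity-check the key's properties: SGX enclave signing requires a 3072-bit RSA key with public exponent 3, which is exactly what `genrsa -3 ... 3072` produces. The sketch below generates a throwaway key with the equivalent `genpkey` invocation (in a temporary directory, so it does not touch your real key) and inspects it:

```shell
# Generate a throwaway key (equivalent to `openssl genrsa -3 ... 3072`):
tmp=$(mktemp -d)
openssl genpkey -algorithm RSA \
  -pkeyopt rsa_keygen_bits:3072 -pkeyopt rsa_keygen_pubexp:3 \
  -out "$tmp/enclave-key.pem" 2>/dev/null

# Expect a 3072-bit key with publicExponent 3:
openssl rsa -in "$tmp/enclave-key.pem" -noout -text | grep -E 'Private-Key|publicExponent'
rm -rf "$tmp"
```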
Then modify `ENCLAVE_KEY_PATH` in `deploy_fl_container.sh` with your path to `enclave-key.pem`.
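This edit can also be scripted. The sketch below is hypothetical: it assumes `deploy_fl_container.sh` assigns the variable on a plain `ENCLAVE_KEY_PATH=...` line, and it is demonstrated on a stand-in copy so the snippet is self-contained:

```shell
# Stand-in copy of deploy_fl_container.sh; on a real setup, run the
# sed command on the actual script instead.
tmp=$(mktemp -d)
printf 'ENCLAVE_KEY_PATH=/default/enclave-key.pem\n' > "$tmp/deploy_fl_container.sh"

# Point ENCLAVE_KEY_PATH at your generated key (path is an example):
sed -i "s|^ENCLAVE_KEY_PATH=.*|ENCLAVE_KEY_PATH=$HOME/scripts/enclave-key.pem|" \
  "$tmp/deploy_fl_container.sh"

grep ENCLAVE_KEY_PATH "$tmp/deploy_fl_container.sh"
rm -rf "$tmp"
```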
Prepare keys for TLS with root permission (test only; you need to input a security password for the keys). Please also install JDK/OpenJDK and add Java to your environment path so that `keytool` is available.

```bash
cd scripts/
./generate-keys.sh
cd ..
```
When prompted for a passphrase or password, you can choose your own; the same passwords can also be used in the next step when generating other passwords. A password should be longer than 6 characters and contain both numbers and letters; one sample password is "3456abcd". These passwords will be used for future remote attestation and for starting SGX enclaves more securely. The script generates 6 files in the `./ppml/scripts/keys` directory (you can replace them with your own TLS keys):

```
keystore.jks
keystore.pkcs12
server.crt
server.csr
server.key
server.pem
```
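Optionally, you can inspect the generated server certificate with `openssl x509`. The sketch below creates a throwaway self-signed key/certificate pair (with a hypothetical CN, in a temporary directory) and inspects it the same way you can inspect `./ppml/scripts/keys/server.crt`:

```shell
# Create a throwaway self-signed key + certificate pair:
tmp=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -keyout "$tmp/server.key" \
  -out "$tmp/server.crt" -days 365 -nodes -subj "/CN=flserver.example" 2>/dev/null

# Show the certificate subject and validity period:
openssl x509 -in "$tmp/server.crt" -noout -subject -dates
rm -rf "$tmp"
```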
If running in a container, modify `KEYS_PATH` in `deploy_fl_container.sh` to the `keys/` directory you generated in the last step. This directory will be mounted to the container's `/ppml/trusted-big-data-ml/work/keys`; then set `privateKeyFilePath` and `certChainFilePath` in `ppml-conf.yaml` to the container's absolute paths. If not running in a container, just set `privateKeyFilePath` and `certChainFilePath` in `ppml-conf.yaml` to your local paths. If you don't want to build a TLS channel with certificates, simply delete `privateKeyFilePath` and `certChainFilePath` from `ppml-conf.yaml`.
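For illustration, the TLS entries in `ppml-conf.yaml` might look like the following sketch (the file names and paths are examples; point them at the key and certificate files you actually generated):

```yaml
# TLS settings in ppml-conf.yaml (illustrative paths, container layout)
privateKeyFilePath: /ppml/trusted-big-data-ml/work/keys/server.pem
certChainFilePath: /ppml/trusted-big-data-ml/work/keys/server.crt
```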
Prepare the dataset for FL training. For demo purposes, we have added a public dataset in BigDL PPML Demo data; please download it to your local machine. Then, in `deploy_fl_container.sh`, set `DATA_PATH` to the absolute path of `./data` on your machine and fill in your local IP. The `./data` path will be mounted to the container's `/ppml/trusted-big-data-ml/work/data`, so if you don't run in a container, you need to modify the data path in `runH_VflClient1_2.sh`.
## Prepare Docker Image
Pull the image from Docker Hub:

```bash
docker pull intelanalytics/bigdl-ppml-trusted-fl-graphene:2.1.0-SNAPSHOT
```
If Docker Hub is not accessible, you can build the Docker image yourself. Modify `http_proxy` in `build-image.sh`, then run:

```bash
./build-image.sh
```
## Start FLServer
Before starting any local training client or worker, we need to start a trusted third party, i.e., the FL Server, for secure aggregation. In our design, this FL Server runs in SGX with the help of Graphene or Occlum. Local workers/clients can verify its integrity with SGX remote attestation.
Running the following commands will start a Docker container and initialize the SGX environment:

```bash
bash deploy_fl_container.sh
sudo docker exec -it flDemo bash
./init.sh
```
In the container, run:

```bash
./runFlServer.sh
```
The FL Server starts and listens on port 8980. Both the horizontal and the vertical FL demos need two clients. You can change the listening port and the number of clients by editing `serverPort` and `clientNum` in `BigDL/scala/ppml/demo/ppml-conf.yaml`.
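The demo defaults above correspond to a `ppml-conf.yaml` fragment like the following (illustrative sketch; only these two keys are shown):

```yaml
# FL Server settings in BigDL/scala/ppml/demo/ppml-conf.yaml
serverPort: 8980   # port the FL Server listens on
clientNum: 2       # number of clients the server waits for
```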
Note that we skip ID & feature alignment to simplify the demo. In practice, before starting federated learning, we need to align IDs & features and figure out the portions of local data that will participate in the later training stages. In horizontal FL, feature alignment is required to ensure each party is training on the same features. In vertical FL, both ID and feature alignment are required to ensure each party is training on different features of the same records.
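As a toy illustration (plain shell, not BigDL's actual alignment implementation), ID alignment for vertical FL boils down to intersecting the parties' record-ID sets:

```shell
# Two hypothetical parties with partially overlapping, sorted record IDs:
tmp=$(mktemp -d)
printf 'id1\nid2\nid3\n' > "$tmp/party_a_ids.txt"
printf 'id2\nid3\nid4\n' > "$tmp/party_b_ids.txt"

# comm -12 keeps only lines present in both sorted files:
comm -12 "$tmp/party_a_ids.txt" "$tmp/party_b_ids.txt"   # prints the shared IDs: id2, id3
rm -rf "$tmp"
```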
## HFL Logistic Regression
Open two new terminals. In each, run:

```bash
sudo docker exec -it flDemo bash
```

to enter the container. Then, in one terminal, run:

```bash
./runHflClient1.sh
```

and in the other terminal run:

```bash
./runHflClient2.sh
```

This starts two horizontal FL clients that cooperate to train a model.
## VFL Logistic Regression
Open two new terminals. In each, run:

```bash
sudo docker exec -it flDemo bash
```

to enter the container. Then, in one terminal, run:

```bash
./runVflClient1.sh
```

and in the other terminal run:

```bash
./runVflClient2.sh
```

This starts two vertical FL clients that cooperate to train a model.
## References
Qiang Yang, Yang Liu, Tianjian Chen, and Yongxin Tong. 2019. Federated Machine Learning: Concept and Applications. ACM Trans. Intell. Syst. Technol. 10, 2, Article 12 (February 2019), 19 pages. DOI:https://doi.org/10.1145/3298981