SkyPilot: Setting up reverse SSH tunneling to run Sky jobs
A reverse SSH tunnel is a technique used to securely connect to a client machine (usually behind a firewall or NAT) from a server. It establishes a connection from the client to the server, allowing the server to initiate communication with the client.
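As a rough illustration of the idea, the sketch below assembles an OpenSSH reverse-tunnel command without running it. The function name reverse_tunnel_cmd, the ports, the user, the host name, and the key file name are all hypothetical placeholders chosen for this example; only the ssh options themselves (-N, -T, -R, -i) are standard OpenSSH flags.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): prints an OpenSSH reverse-tunnel
# command instead of running it, so the pieces are easy to inspect.
# Arguments: remote_port local_port user host key_file
reverse_tunnel_cmd() {
    remote_port=$1
    local_port=$2
    user=$3
    host=$4
    key_file=$5
    # -N : do not run a remote command, only forward ports
    # -T : do not allocate a pseudo-terminal
    # -R remote_port:localhost:local_port :
    #      listen on remote_port on the remote host and forward each
    #      connection back to local_port on the machine that ran ssh
    # -i : identity (private key) file used to authenticate
    printf 'ssh -NT -R %s:localhost:%s -i %s %s@%s\n' \
        "$remote_port" "$local_port" "$key_file" "$user" "$host"
}

# Example with placeholder values.
reverse_tunnel_cmd 9000 8080 user server.example.com id_key
# prints: ssh -NT -R 9000:localhost:8080 -i id_key user@server.example.com
```

The machine that runs this ssh command is the one that is otherwise unreachable; after the tunnel is up, a process on the remote host can reach it through the forwarded port on its own localhost.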
If the Sky cluster cannot reach the DKubeX cluster, SkyPilot cannot run jobs such as ingestion or LLM evaluation as is. In these scenarios, setting up a reverse SSH tunnel between the two clusters becomes necessary.
For example, suppose you run a data ingestion job with the following command and get a Weaviate error:
d3x dataset ingest -d demodataset --sky-cluster=ingestion-demodataset --dkubex-url https://123.45.67.89:32443/ --dkubex-apikey ${DKUBEX_APIKEY} --config /home/data/ingest.yaml --remote-sky
The error occurs because the Sky cluster cannot reach back to your DKubeX cluster. In this case, you need to set up a reverse SSH tunnel.
Follow the steps provided below to set up a reverse SSH tunnel between the DKubeX and the Sky cluster.
Run the following command and note down the Head IP of the Sky cluster. You will need it later.
d3x sky status -ra
On your local terminal, log (SSH) into your DKubeX installer node/DKubeX backend CLI.
Exec into the Sky-Controller pod in the DKubeX backend by running the following command:
kubectl exec -it -n d3x sky-controller-v2-0 -- bash
Get the SSH private key for the Sky cluster and save it as a file named key-ssh-sky by running the following command.
cat /root/.ssh/sky-key > key-ssh-sky
Restrict the key-ssh-sky file to owner read-only permission by running the following command.
chmod 400 key-ssh-sky
Run cat key-ssh-sky to show the contents of the file, then copy the entire contents.
Run exit to leave the Sky-Controller pod. Back on the installer node, run vim key-ssh-sky to create the same file that you created in the Sky-Controller pod, paste the contents you copied in the previous step, and save the file.
Start the reverse SSH tunnel by running the following command. Replace <sky-cluster-ip> with the Head IP of the Sky cluster that you noted earlier. This command keeps running; do not stop it or close this terminal.
ssh -NT -R 32443:localhost:32443 ubuntu@<sky-cluster-ip> -i key-ssh-sky
For example:
ssh -NT -R 32443:localhost:32443 ubuntu@98.76.54.32 -i key-ssh-sky
Open another local terminal and log (SSH) into your DKubeX installer node/DKubeX backend CLI.
Log (SSH) into your Sky cluster using the key-ssh-sky file created earlier by running the following command. Replace <sky-cluster-ip> with the Head IP of the Sky cluster that you noted earlier.
ssh ubuntu@<sky-cluster-ip> -i key-ssh-sky
For example:
ssh ubuntu@98.76.54.32 -i key-ssh-sky
Verify the reverse proxy and port mapping by running the following command:
curl -k https://localhost:32443/mlflow
If the response is similar to the following, the reverse tunnel is working.
<html> <head><title>302 Found</title></head> <body> <center><h1>302 Found</h1></center> <hr><center>nginx</center> </body> </html>
Go back to your DKubeX workspace CLI and run the same command you used to run the job earlier with one change:
In the --dkubex-url flag, pass https://localhost:32443 instead of your DKubeX setup URL.
For example:
d3x dataset ingest -d demodataset --sky-cluster=ingestion-demodataset --dkubex-url https://localhost:32443 --dkubex-apikey ${DKUBEX_APIKEY} --config /home/data/ingest.yaml --remote-sky
Now your Sky job will run properly.
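If you keep the original job command in a script or in shell history, the one change described above can be applied mechanically. The following is a minimal sketch, assuming the command is available as a single string and the URL contains no spaces; the function name use_tunnel_url is hypothetical and is not part of d3x.

```shell
#!/bin/sh
# Hypothetical helper (not part of d3x): takes a job command as one string
# and prints it with the value of --dkubex-url replaced by the tunnel URL.
# Handles both "--dkubex-url <value>" and "--dkubex-url=<value>".
use_tunnel_url() {
    printf '%s\n' "$1" |
        sed -E 's#--dkubex-url[ =][^ ]+#--dkubex-url https://localhost:32443#'
}

# Single quotes keep ${DKUBEX_APIKEY} literal so it is expanded only when
# the rewritten command is actually run.
use_tunnel_url 'd3x dataset ingest -d demodataset --sky-cluster=ingestion-demodataset --dkubex-url https://123.45.67.89:32443/ --dkubex-apikey ${DKUBEX_APIKEY} --config /home/data/ingest.yaml --remote-sky'
```

The helper only prints the rewritten command, so you can review it before running it.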
Attention
You need to set up the reverse SSH tunnel again in the following scenarios:
If the Sky cluster is downed/autodowned.
If you are creating a new Sky cluster and want to run jobs on it.
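Because the tunnel has to be recreated in these scenarios, it can help to check from the Sky cluster whether the forwarded port still answers before submitting a job. The sketch below is one possible way to do that; the function names tunnel_status_ok and check_tunnel are hypothetical, and treating any 2xx or 3xx status as "tunnel up" is an assumption based on the 302 response shown in the verification step.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when an HTTP status code suggests that
# something answered through the tunnel. Assumption: any 2xx or 3xx status
# counts as "up" (the verification step above shows a 302).
tunnel_status_ok() {
    case "$1" in
        2[0-9][0-9]|3[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper, intended to be run on the Sky cluster.
# Usage: check_tunnel [url]
check_tunnel() {
    url=${1:-https://localhost:32443/mlflow}
    # -k skips certificate verification, -s hides progress output,
    # -o /dev/null discards the body, -w prints only the status code,
    # --max-time bounds the wait. curl reports 000 when it cannot connect.
    code=$(curl -k -s -o /dev/null --max-time 10 -w '%{http_code}' "$url")
    if tunnel_status_ok "$code"; then
        echo "tunnel up (HTTP $code)"
    else
        echo "tunnel down or unreachable (HTTP $code)"
        return 1
    fi
}
```

Running check_tunnel on the Sky cluster and getting "tunnel down or unreachable" suggests that the reverse SSH tunnel needs to be set up again.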