
Dynamic SSL Proxy for Jupyter Notebook

We are creating Docker containers for students to run a Jupyter Notebook and then embedding that notebook on Learn.co in an iframe. Because Learn.co uses SSL, the connection with the Jupyter Notebook must use SSL as well, otherwise…


This is happening because when the user's page loads, it makes a request to connect to a Jupyter Notebook. This request goes through a GeoDNS server, then a location-based load balancer (where the SSL is terminated), until it finally hits an instance of our app (Phoeyonce), which creates a Docker container running a Jupyter Notebook for the user. The app then sends back the server and port that the user's Docker container is running on so that the user's webpage can connect directly to the Jupyter Notebook.

As you can see from the diagram though, our app was in charge of terminating the SSL cert. Now, we no longer have a secure connection when the client connects directly to its Jupyter Notebook container.

No big deal, why not just…


Why not just follow the same path to the server as you did the first time?

The problem with trying to connect through the GeoDNS and load balancer again is that we cannot be sure that we'll end up at the same server. We need to get back to the same server and port because that is where our Jupyter Notebook is already running, but it's not the load balancer's job to send us back to the same place; its job is to send us to the server with the most available resources. Right from the start, it was clear that this solution wasn't going to work.
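To make that concrete, here is a minimal sketch of a "most available resources" policy. The server names, the load numbers, and the pick_server helper are all made up for illustration; this is not how our actual load balancer is implemented.

```python
def pick_server(load_by_server):
    """Return the server with the lowest current load (a hypothetical policy)."""
    return min(load_by_server, key=load_by_server.get)


# Hypothetical server names and load figures, for illustration only.
load = {"nyc-01": 2, "nyc-02": 3}

first = pick_server(load)   # the notebook container gets created on this server
load[first] += 2            # running the notebook adds load to that server
second = pick_server(load)  # a follow-up request is routed independently
```

Under these assumptions the second request lands on a different server than the one running the container, which is exactly the situation we need to avoid.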

Why not just have the app that sets up the Jupyter Notebook terminate the SSL cert?

Doing this would require that each Docker container have a copy of the cert. This would end up exposing the cert to any user that knew to dig around and look in the right place. That's definitely not something we want to allow.

Our Solution

Our solution was to set up a proxy server that would terminate the SSL and then forward the connection to the correct server. For this to work, we needed every request sent to the Jupyter Notebook to go through our ide-proxy.ide.learn.co proxy server and to contain the address of the server that the user's container was already running on, along with the port. This setup looks something like this:

This seemed like a great job for query params! We tried out doing something like ide-proxy.ide.learn.co/notebook?server=nyc-01&port=6578. Unfortunately, we ran into quite a large problem when the notebook started loading on the page and started requesting more assets from the container. When the notebook tried to request additional assets, it did not know that it needed to add these particular query params to the end of each request. After trying for some time to add these query params to all requests made by the notebook (or to the header of all requests made by the notebook), we realized this was not a viable solution.

The solution that we landed on was using a particularly formatted sub-domain to act as the server and port identifier! We constructed the URL sent back from the initial request to look something like servername-port.ide-proxy.ide.learn.co (e.g. nyc-01-6578.ide-proxy.ide.learn.co). Because the nyc-01-6578 is a subdomain, the request still came into the proxy server we set up at ide-proxy.ide.learn.co. We were able to accomplish this by making the new GeoDNS a Wildcard DNS. Now all requests made by Jupyter Notebook will, in some way, contain the info to find the exact server and port that it's running on.

The next step was to extract this oddly constructed subdomain at our proxy server to forward the request to the right place. To do this, we configured our nginx.conf to look as follows:

http {
  server {
    listen 443 ssl;
    # generic ssl cert settings...

    # Named captures pull the server name and port out of the subdomain
    server_name ~^(?<phoeyonce_host>(.*?))-(?<jupyter_port>\d+)\.ide-proxy\.ide\.learn\.co$;

    location / {
      # Forward to the server and port captured above
      proxy_pass http://$phoeyonce_host.ide.learn.co:$jupyter_port;
      proxy_read_timeout 300s;
      proxy_set_header Host $host;
      proxy_set_header X-Real-IP $remote_addr;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
  }
}
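A pattern like the one in server_name can be sanity-checked outside nginx. The sketch below is a Python approximation, not part of our nginx setup: Python spells named groups as (?P<name>...), and the helper names build_host and upstream_for are made up for illustration. It assumes the capture groups are named phoeyonce_host and jupyter_port, matching the variables used in proxy_pass.

```python
import re

# Python equivalent of the server_name pattern. Python writes named groups as
# (?P<name>...); the group names mirror the variables used in proxy_pass.
SERVER_NAME = re.compile(
    r"^(?P<phoeyonce_host>.*?)-(?P<jupyter_port>\d+)\.ide-proxy\.ide\.learn\.co$"
)


def build_host(server, port):
    """Build the proxy hostname for a container (hypothetical helper)."""
    return f"{server}-{port}.ide-proxy.ide.learn.co"


def upstream_for(host):
    """Return the address the proxy would forward to, or None if no match."""
    match = SERVER_NAME.match(host)
    if match is None:
        return None
    server = match.group("phoeyonce_host")
    port = match.group("jupyter_port")
    return f"http://{server}.ide.learn.co:{port}"


host = build_host("nyc-01", 6578)
upstream = upstream_for(host)
```

Note that the server name itself contains a hyphen (nyc-01). The lazy .*? still works because the pattern requires the digits to be followed directly by .ide-proxy.ide.learn.co, so only the last hyphen-separated number can be the port.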

I think the trickiest bit here is the regex at server_name, ~^(?<phoeyonce_host>(.*?))-(?<jupyter_port>\d+)\.ide-proxy\.ide\.learn\.co$, which uses named capture groups to assign phoeyonce_host as nyc-01 and jupyter_port as 6578. We can then use those variables in proxy_pass to forward all traffic to the correct location! This setup allows the ide-proxy.ide.learn.co proxy server to terminate the SSL, forward traffic to the correct server and port for a user's Jupyter container, and allow all subsequent requests to securely follow the same path!
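One way to see why the subdomain survives subsequent requests while query params do not is to resolve an asset path against each kind of page URL, roughly the way a browser would. This is only a sketch using Python's urllib.parse; the asset path /static/style.css and the /notebook path on the subdomain URL are assumptions made for the example.

```python
from urllib.parse import urljoin, urlsplit

# The two addressing schemes we tried for the notebook page.
query_page = "https://ide-proxy.ide.learn.co/notebook?server=nyc-01&port=6578"
subdomain_page = "https://nyc-01-6578.ide-proxy.ide.learn.co/notebook"

# Hypothetical asset path requested by the notebook once it loads.
asset = "/static/style.css"

# Resolving the asset against the page URL keeps the host but not the query.
query_asset = urljoin(query_page, asset)
subdomain_asset = urljoin(subdomain_page, asset)

lost_query = urlsplit(query_asset).query          # server/port info is gone
kept_host = urlsplit(subdomain_asset).hostname    # server/port info survives
```

With query params, the asset request carries no trace of the server and port. With the subdomain, every request inherits the hostname, so the proxy can always route it to the right container.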

Disclaimer: The information in this blog is current as of May 25, 2018. Current policies, offerings, procedures, and programs may differ.

Flatiron School
