- Type: Task
- Status: In Progress
- Priority: Minor
- Resolution: Unresolved
- Affects Version/s: None
- Fix Version/s: QualifiedToSchedule
- Component/s: Web Common
As explained in our documentation, a good reverse proxy can significantly improve the performance and scalability of a Nuxeo deployment.
However, we currently lack a blueprint, with the associated documentation, of what this reverse proxy should look like.
The best way to define this blueprint is probably to implement the best reverse proxy we can for Nuxeo Cloud and then document and publish the configuration items.
Requirements
Here are the high-level requirements for this reverse proxy.
Throttling
We want to be able to limit the HTTP throughput in order to:
- avoid DoS
- prevent one user from hammering the server
- maintain a balance between users or services
This basically means being able to:
- limit global throughput
- limit throughput on a per-URL-pattern basis
- limit throughput on a per-user basis (session, IP?)
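To make the three levels concrete, here is a minimal sketch of a token-bucket throttle that combines a global limit, per-URL-prefix limits and a per-client limit. It only illustrates the logic a reverse proxy would apply; the class names, the `(rate, capacity)` limit tuples and the URL prefix used below are assumptions made for the example, not existing Nuxeo or proxy settings.

```python
import time


class TokenBucket:
    """Allows `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, now):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = now

    def has_token(self, now):
        # Refill according to the time elapsed since the last check.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        return self.tokens >= 1.0

    def take(self):
        self.tokens -= 1.0


class Throttle:
    """Hypothetical three-level throttle: global, per URL prefix, per client."""

    def __init__(self, global_limit, pattern_limits, client_limit):
        # Each limit is a (requests_per_second, burst_capacity) tuple.
        self.global_limit = global_limit
        self.pattern_limits = pattern_limits  # {url_prefix: (rate, capacity)}
        self.client_limit = client_limit
        self.buckets = {}

    def _bucket(self, key, limit, now):
        if key not in self.buckets:
            self.buckets[key] = TokenBucket(limit[0], limit[1], now)
        return self.buckets[key]

    def allow(self, path, client, now=None):
        now = time.monotonic() if now is None else now
        selected = [self._bucket(("global",), self.global_limit, now)]
        for prefix, limit in self.pattern_limits.items():
            if path.startswith(prefix):
                selected.append(self._bucket(("pattern", prefix), limit, now))
        selected.append(self._bucket(("client", client), self.client_limit, now))
        # Consume tokens only when every applicable limit accepts the request.
        if all(bucket.has_token(now) for bucket in selected):
            for bucket in selected:
                bucket.take()
            return True
        return False
```

The client key could be a session identifier or an IP address, which is the open question raised in the list above. A rejected request consumes no token, so a client blocked by one limit does not eat into the others.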
Upload/Download buffering
We want to avoid wasting Nuxeo threads for managing slow uploads or downloads.
The reverse proxy in front of Nuxeo should:
- buffer uploads so that:
  - the user first uploads to the reverse proxy (potentially slow connection)
  - then, when all the data has been uploaded by the user, the reverse proxy uploads to Nuxeo (LAN connection)
- buffer downloads so that:
  - the reverse proxy downloads from Nuxeo as fast as possible to free the Nuxeo server
  - the reverse proxy handles the download to the end user (potentially slow connection)
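As a rough illustration of why buffering matters, the short calculation below compares how long a Nuxeo thread stays busy with and without a buffering proxy. The file size and link speeds are made-up values chosen for the example, not measurements.

```python
def transfer_seconds(size_bytes, bits_per_second):
    """Time needed to move `size_bytes` over a link of the given speed."""
    return size_bytes * 8 / bits_per_second


# Illustrative assumptions: a 100 MB upload, a 1 Mbit/s client link, a 1 Gbit/s LAN.
upload_size = 100_000_000
slow_client_link = 1_000_000
lan_link = 1_000_000_000

# Without buffering, the Nuxeo thread is held for the whole client upload.
unbuffered = transfer_seconds(upload_size, slow_client_link)
# With buffering, the Nuxeo thread only sees the proxy-to-Nuxeo transfer.
buffered = transfer_seconds(upload_size, lan_link)
```

Under these assumptions the thread is held for about 800 seconds without buffering and under one second with it. The same reasoning applies to downloads in the other direction.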
Caching
HTTP caching can be critical to efficient resource management on the server side.
Nuxeo provides caching directives, but ideally the reverse proxy should cache all the resources that these directives mark as cacheable.
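The decision a shared cache has to make can be sketched as follows, assuming the caching directives arrive as a standard `Cache-Control` response header. The function name is hypothetical and the logic is deliberately simplified: it ignores `Vary`, `Expires`, revalidation and request-side directives.

```python
def shared_cache_ttl(cache_control):
    """Return how many seconds a shared cache may keep a response (0 = do not cache)."""
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')

    # Responses marked this way must not be served from a shared cache as-is.
    if any(name in directives for name in ("no-store", "private", "no-cache")):
        return 0

    # For shared caches, s-maxage takes precedence over max-age.
    for name in ("s-maxage", "max-age"):
        if name in directives:
            try:
                return max(0, int(directives[name]))
            except ValueError:
                return 0
    return 0
```

For example, `shared_cache_ttl("public, max-age=3600")` allows caching for an hour, while any response carrying `private` is never cached by the proxy.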
High Availability
Being the entry point of the Nuxeo deployment, the reverse proxy needs to be HA.
Traffic segregation
The reverse proxy can also be a good place to do traffic segregation between different types of usage:
- web vs WebDAV vs Nuxeo Drive
- internal users vs external users
- monitoring traffic vs application traffic
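A possible classification rule is sketched below. Every matching rule in it is an assumption made for illustration: the monitoring and WebDAV path prefixes, the `Nuxeo-Drive` user-agent marker, the internal network range and the pool names would all have to be checked against the real deployment.

```python
import ipaddress

# Hypothetical values, to be replaced by the ones of the actual deployment.
INTERNAL_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]
MONITORING_PREFIX = "/nuxeo/runningstatus"
WEBDAV_PREFIX = "/nuxeo/site/dav"
DRIVE_AGENT_MARKER = "Nuxeo-Drive"


def classify(path, user_agent, client_ip):
    """Return the name of the backend pool that should handle the request."""
    if path.startswith(MONITORING_PREFIX):
        return "monitoring"
    if DRIVE_AGENT_MARKER in user_agent:
        return "drive"
    if path.startswith(WEBDAV_PREFIX):
        return "webdav"
    address = ipaddress.ip_address(client_ip)
    if any(address in network for network in INTERNAL_NETWORKS):
        return "web-internal"
    return "web-external"
```

Each returned pool name could then map to a dedicated connector or backend, so that one type of traffic cannot starve the others.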
Building blocks
Tomcat
Inside Tomcat we can define different connectors, listening on different ports and managing different thread pools.
Apache
Apache is often installed on the Nuxeo box and can handle:
- SSL/TLS encryption
- routing between Tomcat connectors (e.g. rewrite rules based on URLs or headers)
Dedicated proxy - NGINX
NGINX provides technical solutions for:
- throttling
- upload buffering
  - my understanding is that this is the default behavior (proxy_request_buffering)
- download buffering
  - this also seems to be the default (proxy_buffering)
Dedicated Proxy - HAProxy
It provides the same features as NGINX (throttling, buffering by default) but seems more oriented towards proxying (hence its name). As a result, its logging and analysis are far richer than NGINX's poor status pages.
It also provides a slow-start mechanism that sends only a few requests to a backend that is starting up, rather than the full load immediately.
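The idea behind slow start can be sketched as a weight that ramps up over a warm-up period. This is a simplified linear ramp written for illustration, not HAProxy's actual implementation; the function and parameter names are assumptions.

```python
def effective_weight(weight, seconds_up, slowstart):
    """Share of load a backend receives while it is still warming up.

    weight:     configured weight of the backend once fully warmed up
    seconds_up: time elapsed since the backend came back up
    slowstart:  length of the warm-up period in seconds (0 disables the ramp)
    """
    if slowstart <= 0 or seconds_up >= slowstart:
        return weight
    ratio = max(0.0, seconds_up) / slowstart
    # Never drop to 0 so that the backend still receives a few requests.
    return max(1, int(weight * ratio))
```

With a weight of 100 and a 60-second warm-up, the backend would receive roughly a quarter of its normal share after 15 seconds and its full share after a minute.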