High availability with COTG

What are the options for high availability when using COTG?

In this instance the server is virtual.

These are the options as I see them:

The COTG repository is hosted in the cloud, so I’m assuming that Objectif Lune has redundancy built into that piece.

  • Reinstall from backups, reactivate, and reconfigure, which is hardly High Availability

    • Considering that everything is reconfigured, COTG would be the same
  • If on a virtual server, keep a clone snapshot and load the snapshot in the event of failure, which is better

    • Because this is a snapshot, COTG would be the same
    • The clone can’t be running at the same time as the original because both use the same serial number
  • Purchase and configure a backup license, and manually switch to the backup server in the event of failure

    • Because this would be manually switched, COTG would be the same
    • Both can be running at the same time due to separate license, but production can’t run through backup unless the production server goes down
    • Could switch the backup server's IP address to the production address in the event of failure
  • Use a Load Balancer configured to route to a backup-license server in the event of a failure

    • Both can be running at the same time due to separate license, but production can’t run through backup unless the production server goes down
    • Not sure how COTG would work with a Load Balancer
    • Assuming that their Load Balancer can be configured this way
  • Purchase another full license and use a Load Balancer that detects when a server is down and routes to the other server

    • Not sure how COTG would work with a Load Balancer
    • Assuming that their Load Balancer can be configured this way

The load balancer with multiple licenses is the way to go. COTG would be unaware of the presence of such a load balancer; it simply submits its jobs to a URL.

I can’t tell you how to configure the load balancer - not my area of expertise - but I do know that we have customers set up that way.

The load balancer can be set up so that it forwards specific clients to specific servers (e.g. Server 1 handles East Coast, Server 2 handles West Coast) or so that it distributes all requests across all servers (e.g. both Servers may handle traffic from both Coasts).

The former is easier to set up because you know which server handles which clients. Each server can therefore have its own Connect database, as there is no risk of having West Coast data on Server 1, and vice versa.

However, the latter may require the servers to share a single Connect database since a request could be coming into any of the servers while the COTG document would have been generated from a different server. Depending on the type of solution being implemented, this may or may not be an issue.
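
To make the two routing modes concrete, here is a minimal Python sketch. It is not how any particular load balancer is configured, and nothing in it comes from Connect itself: the server names, the region map, and the function names are all hypothetical, made up purely for illustration.

```python
from itertools import cycle

# Hypothetical server names (assumption, not taken from any real setup).
SERVERS = ["server1", "server2"]

# Mode 1: pin each client group to one server.
# Each server only ever sees its own clients, so it can keep its own database.
REGION_TO_SERVER = {"east": "server1", "west": "server2"}


def route_by_region(region: str) -> str:
    """Always send a given region to the same server."""
    return REGION_TO_SERVER[region]


# Mode 2: spread all requests across all servers in turn.
_round_robin = cycle(SERVERS)


def route_round_robin() -> str:
    """Send each new request to the next server in the rotation."""
    return next(_round_robin)


def needs_shared_database(created_on: str, served_by: str) -> bool:
    """A document created on one server but requested through another
    can only be found if both servers read the same database."""
    return created_on != served_by
```

With `route_by_region`, the creating and serving server are always the same, so `needs_shared_database` never returns `True`. With `route_round_robin`, consecutive requests land on different servers, which is exactly the case where a shared database becomes necessary.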

I’m not sure whether the request is about load or more about backup in the event of server failure.

So the best practice for High Availability under server load would be to add a full license and place a Load Balancer in front of the servers.

  • Try to segregate the load so that both COTG creation and COTG document serving remain on the same server.
  • If this is not possible, then use the same Connect database for both Connect servers.

I have not used the same Connect database for multiple Connect servers before.

  • Do you simply connect two servers to the same database using Connect Server Configuration?

  • Is it correct that both servers can utilize the same COTG repository?
    Probably with additional user licenses though.

For High Availability in case of server failure, the Load Balancer would route all new jobs to the remaining server. However, I believe existing jobs that were already pointed to the server that is down would be stuck.

  • Is there a way to recover these jobs?

Yes, you can connect both servers to the same database through the Database Connection setting shown in your screen capture. And yes, they should both use the same COTG Repository.

When one server goes down, there is no way to recover pending jobs that have already been sent to that server. One possible way to avoid some of this would be to direct most jobs to a network share to which both servers have access, but that’s not feasible for HTTP inputs since the client application would still be waiting for a reply from the host that went down.
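
As a rough illustration of that behaviour, the sketch below simulates a failover decision in Python. Everything in it is an assumption made for the example: the server names, the job IDs, the simulated health table, and the helper functions are hypothetical and do not correspond to any Connect or load balancer API.

```python
from typing import Callable, Dict, List, Optional


def pick_server(servers: List[str], is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first server that passes the health check, or None if all are down."""
    for server in servers:
        if is_healthy(server):
            return server
    return None


def jobs_lost(pending: Dict[str, List[str]], down_server: str) -> List[str]:
    """Jobs already handed to the failed server stay there; they are not re-routed."""
    return list(pending.get(down_server, []))


# Simulated state (hypothetical): server1 has just gone down.
health = {"server1": False, "server2": True}
pending = {"server1": ["job-17", "job-18"], "server2": ["job-19"]}

target = pick_server(["server1", "server2"], lambda name: health[name])
stuck = jobs_lost(pending, "server1")

print("new jobs go to:", target)
print("jobs stuck on the failed server:", stuck)
```

New requests end up on the surviving server, while the jobs that were already queued on the failed one are simply reported as stuck, since the balancer has no copy of them to resend.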

I believe that you can use Windows Clusters to automatically switch from one Server to another (with all IP addresses being re-routed to the failover server), but that’s beyond my level of comprehension… 😛

Are you aware of any use cases that have an F5 Load Balancer where the input point to PlanetPress Connect is a NodeJS Server Input? I have cases with an F5 Load Balancer where the input point to PlanetPress Connect is an LPD queue.

I’m not familiar with Load Balancers as a whole, so I can’t say. But using one in front of a NodeJS input should not be a problem at all, NodeJS being so ubiquitous these days.