
Plan for and Deploy Triaster Servers

This article is suitable for an IT Administrator

Ref: 20091117
Last edited 29th September 2011

Introduction

The purpose of this article is to explain the deployment options for the Triaster Server, and the most important factors to take into account when planning a deployment.

From 2010 onwards, the Triaster server products previously referred to as The Triaster Publication Server (for Sharing processes), The Triaster Browser Toolkit Server (for Using processes) and The Triaster Improvement Workbench Server (for Improving processes) are combined into a single Triaster Server. The minimum system requirements for a Triaster Server are available in the following article: What are the Minimum System Requirements?

In smaller deployments, a single Triaster Server operating on a single CPU is sufficient. As demand on the physical server hardware increases, however, it is beneficial to "spread the load" or "scale" the Triaster solution so that more people can use it without loss of performance.
 
Furthermore, to ensure high levels of up-time and resilience, it is helpful to plan in advance for general classes of system failure, to provide a failover mechanism for when things go wrong, and to provide test environments so that the live environment is never contaminated with maps, software or OS changes that cause failure.
 

Planning Factors

The main planning factors are:
  1. Resilience: This includes reliability, up-time and continuity of service
  2. Performance: The end-user experience of the system
  3. Security: The safeguards to prevent unauthorised updates or access
The first factor, Resilience, is concerned with ensuring that the live environment accessed by end-users is always available, and that safeguards are in place to detect problems well before they can spread to the live environment.
 
The second factor, Performance, is concerned with the end-user experience. The end-user is the person who accesses the process library to find information to help them perform their tasks. Generally, the end-user experience is highly dependent on the following:
  • Time taken to find what they are looking for
  • Accuracy of returned information
  • The ease with which the returned information can be understood
  • How long it takes for an update to the system to pull through to the library
Most end-users will be dissatisfied if the wait for a process map to load, or for a search to return a result, is noticeable. Ideally the wait is a fraction of a second; a couple of seconds is acceptable from time to time, and up to 10 seconds only very rarely.
 
The planning strategy then is to ensure that the end-user experience is comfortably within the page load times described above.
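The page load guidance above can be turned into a simple check on measured load times. The following is a minimal sketch only: the band names, the function names and the exact cut-off values (1, 2 and 10 seconds) are assumptions chosen to mirror the wording above, not settings of the Triaster Server.

```python
from typing import Dict, Iterable

# Thresholds in seconds. These mirror the guidance in the text;
# the exact cut-offs and band names are illustrative assumptions.
IDEAL_MAX = 1.0        # "fractions of a second ideally"
OCCASIONAL_MAX = 2.0   # "a couple of seconds from time to time"
RARE_MAX = 10.0        # "very rarely up to 10 seconds"


def classify_load_time(seconds: float) -> str:
    """Place one measured page load time into a band."""
    if seconds < IDEAL_MAX:
        return "ideal"
    if seconds <= OCCASIONAL_MAX:
        return "occasional"
    if seconds <= RARE_MAX:
        return "rare"
    return "unacceptable"


def summarise(samples: Iterable[float]) -> Dict[str, int]:
    """Count how many measured load times fall into each band."""
    counts = {"ideal": 0, "occasional": 0, "rare": 0, "unacceptable": 0}
    for sample in samples:
        counts[classify_load_time(sample)] += 1
    return counts
```

If most samples fall outside the "ideal" band, or any fall into "unacceptable", that suggests the deployment is not comfortably within the target page load times.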
 
Regarding Security, organisations frequently need to separate their web content from their back-end systems. This is a complex topic, very much dependent on individual customer circumstances, and is not covered in this article.

Resilience

This section is concerned with planning for and enabling Resilience, i.e. high levels of reliability, up-time and continuity of service. The Triaster Server is a single point of failure, so planning to avoid failures, and to enable rapid recovery when they occur, is important.
 
The typical non-resilient deployment strategy is shown below:
 
 
 
 
 
Process Authors on the left create and modify processes which are then stored on a File Server. The Triaster Server then publishes and serves these maps to the end-user population via HTTP.
  
There are generally two classes of problem that can cause the Triaster solution not to work as intended:
  1. Something in the maps themselves (the source Visio, Excel or XML files) causes a problem. Over the years, the following issues of this kind have been encountered (there are others too):
    • Compatibility clash with other Visio solutions
    • File truncation or corruption caused by an incomplete save operation
    • The file name and page name combination, when converted to an HTML file, generates a file name that is too long for a browser to display
    • Large files (> 25 MB) can sometimes take so long to load that the system appears to have 'crashed'
  2. Something in the server environment causes a problem. Over the years, the following issues of this kind have been encountered (there are others too):
    • Any one of the issues in the maps can cause a "halt" state on the server. This normally happens because Visio itself, while running in a non-interactive service mode, enters a state requiring user input. For example, if Visio tries to publish a corrupt file in service mode, the state of the Visio session is not predictable, but we have seen cases where it simply stops.
    • A known issue with Visio: if hundreds of files are opened and closed in sequence (exactly as happens during a publish), Visio can enter a 'hung' state in which the Visio session is live but unresponsive.
    • OS changes or updates have been known to adversely affect the operation of the Triaster Server, especially those relating to IIS.
    • Hardware failure
It is therefore advised that safeguards be implemented to prevent any of these types of issue reaching the Triaster Server that is responsible for delivering content to the end-users.
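Some of the map-level issues listed above can be screened for before a publish is attempted. The sketch below is illustrative only and is not part of the Triaster software: the function name, the 200-character name limit and the way the HTML file name is modelled ("<file stem>_<page name>.html") are all assumptions made for the example. Only the 25 MB figure comes from the list above.

```python
from pathlib import PurePath
from typing import List

MAX_MAP_BYTES = 25 * 1024 * 1024   # "Large files (> 25 MB)" from the list above
MAX_HTML_NAME_CHARS = 200          # ASSUMPTION: placeholder limit, not a Triaster figure


def check_map(file_name: str, size_bytes: int, page_names: List[str],
              max_name_chars: int = MAX_HTML_NAME_CHARS) -> List[str]:
    """Return a list of warnings for one source map file."""
    warnings = []
    if size_bytes == 0:
        warnings.append(f"{file_name}: empty file, possibly a truncated save")
    if size_bytes > MAX_MAP_BYTES:
        warnings.append(f"{file_name}: larger than 25 MB, may be slow to load")
    stem = PurePath(file_name).stem
    for page in page_names:
        # ASSUMPTION: the published name is modelled as "<file stem>_<page name>.html"
        html_name = f"{stem}_{page}.html"
        if len(html_name) > max_name_chars:
            warnings.append(
                f"{file_name}: page '{page}' gives a "
                f"{len(html_name)}-character HTML file name"
            )
    return warnings
```

A check of this kind would naturally run against the maps before they are published, so that problem files are flagged before they reach the live environment.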
 

Resilience Step 1 - Organise so that the Triaster Server is Easily Recoverable

We recommend:
 
 
  1. Recommendation: No master copy of user data is stored on the Triaster Server. This includes the source process maps and any documents that are linked to. Copies of process maps on the Triaster Server are fine (to enable faster publishing, for example), and copies of documents that are linked to are fine too.
     Reason: So that over time, the only changes that take place on the Triaster Server are those associated with software updates and the output of a publish.
  2. Recommendation: The Triaster Server is virtualised and backed up.
     Reason: In conjunction with recommendation 1, any situation that leads to a failure on the Triaster Server can be remedied by restoring from the last working back-up.
 
 

Resilience Step 2 - Implement a Test Server

A large class of problems can be avoided if a Test Server is in place as the first safeguard against issues being introduced into the live system. All Triaster customers with a Trusted Partner Agreement are entitled to use a Test Server as a free benefit of that agreement.
 
A Test Server can accomplish all of the following:
  1. Test new builds of Triaster software on your maps
  2. Test configuration changes
  3. Test OS updates
  4. Test map updates and upgrades
  5. Test new sets of maps
A Test Server need not replicate the live server to the nth degree; it merely needs to trap the large class of potential problems that can be spotted by publishing, upgrading or applying OS updates.
 
A Test Server can also be a virtual machine.
 
The diagram below shows how a Test Server can be brought into the server deployment strategy.
 
 
 
 
 
Two (completely independent) servers are used. The Test Server runs alongside the Live Server, and any configuration changes or software updates are applied to it first. Publishes can also take place on the Test Server before performing publishes on the Live Server.
 
Note that the Test Server should not be confused with test sites or test libraries. For example, there may be 6 libraries on the Live Server, each with a Sandpit, Pre-Live and Live site within it. These should all be replicated on the Test Server. The purpose of the Sandpit, Pre-Live and Live sites is to manage and approve content, not to test configuration changes or software updates.
 

Resilience Step 3 - Implement a Failover Server

The purpose of the Failover Server is to provide continuity of service if, for whatever reason (though generally hardware), the Live Server fails.
 
Since virtualising the Triaster Server and backing it up removes the need for a failover, we no longer recommend a dedicated Failover Server be created unless there are unusually strong availability requirements.

Performance

Scaling is the process of maintaining satisfactory levels of performance whilst at the same time increasing the end-user population and the total amount of content in the libraries.
 
The recommendation regarding scaling is to first implement Resilience as described above. The scaling steps are then simply the introduction of a cluster of Live Triaster Servers that each take specialist roles (role partitioning), service a sub-set of all the content (content partitioning), or share the processing of long tasks between them (load balancing).
 
Our recommended start-point for all customers is to implement a resilient two server cluster with role partitioning as described below.
 

Scaling Step 1 - A Resilient Two Server Cluster with Role Partitioning

Publishing maps to HTML and cloning libraries or sites are CPU-intensive tasks. When a publish is triggered or a site is cloned, the Triaster Server CPU can be used very heavily for several hours at a time, and disk activity is intensive. If end-users try to access content in a library while a publish or a library clone is taking place, page load times will increase. The first step in scaling is therefore to create a two server cluster with role partitioning, dedicating one server to the role of publishing the source maps to HTML and one to the role of serving HTML to the end-users.
 
This is the standard recommended implementation approach that Triaster perform for all but the smallest of organisations (up to low hundreds of desktops).
 
By isolating and dedicating one Triaster Server to the role of publishing the maps (and cloning libraries) and one to the role of serving the HTML, publishing and cloning operations will not impact on page load times or the end-user experience.
 
Furthermore, the HTML server is the only Triaster Server that faces end-users and is accessed through HTTP. If this server is isolated, in a DMZ for example, the back-end processes associated with cloning and publishing can still continue.
 
 
 

Scaling Step 2 - A Resilient Three Server Cluster with Role Partitioning and Content Partitioning

Eventually, as the number of libraries grows, publishing times will increase to such a degree that the delay between requesting a publish and the site being published becomes too long.

The 2nd scaling step therefore is to add an additional Triaster Server to create a resilient three server cluster as shown below.

 
 
Two of the servers are dedicated to publishing the process maps (and cloning libraries), and a single server is devoted to serving the resultant HTML. In version 10.1 of the Triaster Solution, content partitioning should be used so that each Triaster Server performing a publish role is dedicated to publishing a specific set of libraries (as opposed to load balancing, wherein both servers publish parts of the same library at the same time). In version 11.1 of the Triaster Solution and later, load balancing as well as content partitioning is available.
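One way to decide which libraries each publish server should own under content partitioning is to balance the expected publish time across the servers. The sketch below is a planning aid only, under stated assumptions: the function name, the server names and the per-library publish estimates are hypothetical, and the greedy "longest first, least-loaded server" rule is one simple heuristic rather than anything prescribed by Triaster.

```python
from typing import Dict, List


def partition_libraries(publish_minutes: Dict[str, float],
                        servers: List[str]) -> Dict[str, List[str]]:
    """Assign each library, whole, to exactly one publish server."""
    if not servers:
        raise ValueError("at least one publish server is required")
    assignment: Dict[str, List[str]] = {server: [] for server in servers}
    load = {server: 0.0 for server in servers}
    # Take the longest publishes first and give each one to the server
    # that currently has the least total publish time assigned.
    ordered = sorted(publish_minutes.items(), key=lambda item: (-item[1], item[0]))
    for library, minutes in ordered:
        target = min(servers, key=lambda server: load[server])
        assignment[target].append(library)
        load[target] += minutes
    return assignment
```

For example, with hypothetical estimates of 60, 45, 30 and 20 minutes for HR, Finance, IT and Sales libraries and two publish servers, the first server would take HR and Sales and the second would take Finance and IT.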
 
The important point to note is that a Triaster Server can perform any role (test, failover, publish, HTML). So the specific needs of the project determine precisely how a 3-server cluster would be configured. 
 
For example, suppose a University has process libraries which are staff facing only, and others which are student facing. A natural partition strategy may be to dedicate two Triaster Servers to the HTML role, one for staff and one for students, both serviced by a single Triaster Server dedicated to publishing and cloning.
 
As a second example of two Triaster Servers in the HTML role, suppose a large organisation has many process libraries on a single Triaster Server, and it is now about to process map a highly secure process, secure even to internal staff. In this case, the highly secure process could be isolated onto its own Triaster Server and subject to different lock-down and security measures.
 
However, for the bulk of existing Triaster customers, adding publication capacity to the server cluster is the most natural scaling step.
 

Scaling Step 3 - An n-Server Cluster

The 3rd scaling step is to add more servers into the cluster, each of which takes a specialist role, or services the needs of a specific set (or partition) of libraries.
 
Suppose a global organisation requires process libraries to be delivered into several different countries. Rather than end-users in each country having to load the libraries from a Triaster Server on a different continent, the end-users' experience will be much better if they can load them over a LAN. In this example, a Triaster Server would be installed on a server in each country.
 

Scaling Step 4 - Network Load Balanced Clusters

Several HTML Servers can form a Network Load Balanced Cluster (NLBC). This covers the case where a specific library, already isolated on a dedicated HTML Server as part of a partitioning strategy, is under such heavy usage that the end-user experience is still poor.
 
For example, suppose there are 10,000 users of an HR library, and each year during the appraisal month the library comes under very heavy usage and the server is not sufficiently powerful to maintain page load times at a reasonable level. By clustering 2 or more HTML Servers into an NLBC, the load is split across the servers (essentially, each request to the server is allocated to the next available server) and in this way page load times can be improved.
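The idea of allocating each request to the next available server can be illustrated with a simple round-robin model. This is a deliberately simplified sketch and does not describe how any particular network load balancing product distributes traffic; the class name and server names are hypothetical.

```python
from typing import List, Set


class RoundRobinBalancer:
    """Hand each request to the next server in turn, skipping unavailable ones."""

    def __init__(self, servers: List[str]) -> None:
        if not servers:
            raise ValueError("at least one HTML server is required")
        self._servers = list(servers)
        self._next = 0
        self.unavailable: Set[str] = set()

    def allocate(self) -> str:
        """Return the server that should handle the next request."""
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server not in self.unavailable:
                return server
        raise RuntimeError("no HTML server is available")
```

With three servers, successive requests go to the first, second and third server and then back to the first. If a server is added to `unavailable`, it is skipped and its share of requests passes to the remaining servers.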

Usage Profiles and Server Numbers

It is impossible to be precise regarding the number of servers required to meet the needs of different user populations. One person accessing the library once a day places exactly the same demand on the server in an organisation of 10 as in an organisation of 10,000.
 
To help with planning decisions, consider the following KPIs in respect of the performance standards you expect from the Triaster solution. What are the Typical Expected and Worst Case values you require for your project? If you find that performance is low in a particular area, add server capacity as appropriate to address the specific bottleneck.
 
Performance KPI: Time to publish
  Typical Expected Performance: For example 1 hour
  Worst Case (very rarely encountered): 3 hours?
  Comment: Everything from the click of the Publish button to the receipt of the email confirming the publish has completed
  How to improve actual performance: Add a Triaster Server and dedicate it to the task of publishing. This can reduce publish times by up to 50%.

Performance KPI: Time to load a map in a browser
  Typical Expected Performance: For example 1 second
  Worst Case (very rarely encountered): 10 seconds?
  Comment: For example, having performed a search, this is the time taken for the map to load when the search result is selected
  How to improve actual performance: Add a Triaster Server and dedicate it to the HTML Role

Performance KPI: Time to perform a search
  Typical Expected Performance: For example 1 second
  Worst Case (very rarely encountered): 10 seconds?
  Comment: The time taken between clicking the search button and the results being displayed in the browser window
  How to improve actual performance: Add a Triaster Server and dedicate it to the HTML Role
 
 
Triaster work on the following rules of thumb based on the total number of people that will use the content in the library anywhere from once a day to once a year. This is based on how existing customers use the solution and the publish and page load times they are happy with as at June 2011. We will update this table regularly as we learn more from each implementation.
 
Number of Users    Number of Libraries    Cluster Size
< 100              < 10                   1
100-4000           < 10                   2
4000-10,000        < 20                   3
10,000+            20+                    3+
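The rules of thumb in the table above can be expressed as a small lookup. This is a sketch under two assumptions that the table itself does not state: when the user count and the library count point to different rows, the larger cluster size is taken, and a value that sits on a boundary (for example exactly 4000 users) is treated as belonging to the higher band. The function name is hypothetical.

```python
def suggested_cluster_size(users: int, libraries: int) -> str:
    """Return the rule-of-thumb cluster size for a given usage profile."""
    # ASSUMPTION: the larger of the two implied sizes wins, and boundary
    # values fall into the higher band.
    if users >= 10_000 or libraries >= 20:
        return "3+"
    if users >= 4_000 or libraries >= 10:
        return "3"
    if users >= 100:
        return "2"
    return "1"
```

For example, 500 users with 8 libraries gives a suggested cluster size of 2, while 50 users with 12 libraries gives 3 because the library count falls outside the first two rows. The result is a starting point for planning, to be checked against the KPIs measured on the actual deployment.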

Conclusion

The resilience and scaling steps outlined above provide a way for any organisation to scale the Triaster solution up to hundreds of thousands of end-users across many different countries. A combination of partitioning and clustering provides a straightforward way to respond to increased end-user demand for the content in process libraries.


Need further help? Contact the Triaster Support team by e-mailing support@triaster.co.uk or by calling us on +44 (0)870 402 1234.
 
Do you have any feedback or suggestions that you would like to share with Triaster? We would love to hear from you! Please e-mail feedback@triaster.co.uk 