What is an SLA?

A service level agreement (SLA) is a performance and quality target for a service, agreed to by the provider and the client. The SLA attributes and KPIs (Key Performance Indicators) to be measured for any service depend on the value perceived by each of the stakeholders. Stakeholders include business owners, business managers, IT operations and development, administrators, executives, and also customers and partners.

A well-defined, managed collection of SLAs allows the business to provide higher-value and higher-margin services to customers and preferred business channels. This applies to traditional business, internet commerce, and intra-enterprise activities. SLA-driven services can be monitored specifically to provide transaction, billing and penalty attributes that help the business track and manage overall performance. A collection of SLAs can also be regarded as a policy – a business policy, a compliance policy, a corporate policy, etc.

Business owners and executives typically receive the least information and support in the critical task of managing information technology. sense rebalances the equation and provides a clear way to assign business policies that are applied dynamically at runtime to enforce SLAs. sense business SLAs enable the business to specify a performance level requirement for each of the service elements (and applications) participating in the cloud. sense maintains the status of, and manages, every component in the federation. sense and the sense-managed nodes select the best component in the cloud to respond to an incoming request. This dynamic runtime evaluation, decision and action execution allows the system to auto-tune to the circumstances at the moment of the request. Scaling for additional load, bypassing and rerouting out-of-band services, instantiating new resources from the cloud to support higher-value services, and other autonomic behaviors are simple examples of the dynamic capabilities of sense.

These capabilities increase the guarantee level of a business service to match the desired business SLA. Extensive administrative and operational intervention is not needed, and costs can be limited and ‘throttled’ according to business policy limits. Evaluations and actions are made transparently visible to the business users (if desired).
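
The runtime selection described above can be illustrated with a minimal sketch. It is not the sense implementation: the `Component` type, its fields, and the `select_component` function are hypothetical names chosen for illustration, and the SLA is assumed to be a single maximum response time in milliseconds.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Component:
    """Hypothetical view of one managed service element."""
    name: str
    available: bool
    response_ms: float  # most recently measured response time


def select_component(
    components: Sequence[Component], max_response_ms: float
) -> Optional[Component]:
    """Return the fastest available component that meets the SLA, or None."""
    candidates = [
        c for c in components
        if c.available and c.response_ms <= max_response_ms
    ]
    # min() with default=None handles the case where nothing qualifies.
    return min(candidates, key=lambda c: c.response_ms, default=None)


nodes = [
    Component("node-a", True, 180.0),
    Component("node-b", True, 95.0),
    Component("node-c", False, 40.0),  # fastest, but currently unavailable
]
best = select_component(nodes, max_response_ms=200.0)
print(best.name if best else "no component meets the SLA")  # prints node-b
```

Because the choice is recomputed for every request, a component that becomes slow or unavailable simply stops being selected, which is the auto-tuning behavior in miniature.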

Logical approach to SLAs in sense

Sensible Cloud has adopted a logical and consistent approach to SLAs in sense. Implementation and extension of the patterns described within the European Union (EU) research project SLA@SOI are currently underway. The ‘bottom up’ approach adopts the following principal steps:

  • Analyze a business service and define its KPIs, noting information on the various entities and their interactions. Then measure and ensure that some level of service is provided to the requester
    • together with business stakeholders
  • Define one fact (or occasionally more) for each KPI – definite and measurable – for use in developing the core logic of the SLA within the embedded Business Rules Engine
    • in dialogue between a business analyst and the SLA rules implementer
  • Decide on a number of threshold levels for each fact/KPI – each of the 5 levels (plus a null level) can be defined:
    • HIGH – service is operating within normal bands and can easily accept more instances;
    • MEDIUM – service is loaded but can host additional instances;
    • LOW – service is very busy and it is appropriate to instruct a scaling strategy;
    • CRITICAL – service could stop functioning at any moment; it is strongly recommended to provide an alternative service provider for the response or to assist the stressed service provider in some way;
    • NO_SERVICE – service is not accepting new instances because no more computational space is available or the service or network is down. A mock strategy is activated when the service is declared with this status. This lets the service, at a minimum, respond normally to the request (avoiding a domino effect). sense can trigger self-healing strategies and send notifications to any nominated services and message managers. See here for more on mock strategy.
    • (NONE) – null threshold

This activity is conducted with the business stakeholder and/or analyst.

  • Design actions at each fact/KPI threshold so that the service is managed according to business policy and business deliverables. Actions may scale incrementally with SLA stress, from early notification of a low-level increase, to automated provisioning of cloud resources, to applications beginning secure transfer of data and compute cycles to a disaster recovery partner.
    • available provisioning actions will depend on the technology base of the services – in this case elastic IaaS provisioning will help. A set of provisioning steps can be described and developed in sense. Also, a higher-level instruction set can be passed to an external provisioning environment
    • sense contains a simple sequence and an orchestrating construct that enables more complex processes to be managed directly
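
The steps above (one measurable fact per KPI, threshold levels, and an action per threshold) can be sketched in a few lines. This is an illustration only, written in plain Python rather than in the embedded Business Rules Engine: the fact (percentage of capacity in use), the boundary values, and the action names are all assumptions, not values taken from sense.

```python
from enum import Enum
from typing import Dict, List, Optional, Tuple


class Level(Enum):
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"
    CRITICAL = "CRITICAL"
    NO_SERVICE = "NO_SERVICE"


# Assumed boundaries for a hypothetical fact: percentage of capacity in use.
# Each entry is (exclusive upper bound, level); values are illustrative.
THRESHOLDS: List[Tuple[float, Level]] = [
    (50.0, Level.HIGH),
    (70.0, Level.MEDIUM),
    (85.0, Level.LOW),
    (100.0, Level.CRITICAL),
]

# Assumed action per level, escalating with SLA stress.
ACTIONS: Dict[Level, str] = {
    Level.HIGH: "none",
    Level.MEDIUM: "notify operations",
    Level.LOW: "provision additional resources",
    Level.CRITICAL: "reroute to an alternative provider",
    Level.NO_SERVICE: "activate mock strategy",
}


def classify(used_pct: Optional[float]) -> Optional[Level]:
    """Map a measured fact to a level; None stands for the null threshold."""
    if used_pct is None:
        return None  # no measurement available
    for upper, level in THRESHOLDS:
        if used_pct < upper:
            return level
    return Level.NO_SERVICE  # capacity exhausted


level = classify(80.0)
print(level.value, "->", ACTIONS[level])  # prints LOW -> provision additional resources
```

Keeping the boundaries and actions in data rather than in code reflects the division of labor described above: the business stakeholder and analyst agree on the table, and the rules implementer only has to evaluate it.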

A logical view of this approach is shown here: