APM (Application Performance Management) spans both production performance management, which leverages synthetic transactions, and quality assurance, which relies on comprehensive performance testing and analysis. APM is not a substitute for another commonly misused buzzword, BAM (Business Activity Management). APM initiatives focus on providing processes and technologies that support proactive performance monitoring of the production environment, rigorous testing of application performance before acceptance for production operation, and proactive determination of system scalability.
The primary objective of APM is predictable assurance of application performance and scalability through proactive production performance monitoring and optimization, rigorous end-to-end performance assurance for systems and applications, and methods for proactively determining system scalability.
CMS Application Performance Management
by: California State University
REVISION CONTROL
Document Title: | CMS Application Performance Management Process |
Author: | CMS Performance Action Team |
File Reference: | CMS Application Performance Management Process_20070924.doc |
Date | By | Action | Pages |
09/24/07 | Y. Hepperle | Release of New Document | All |
10/04/07 | Y. Hepperle | Added Benchmark and replication clarification | 1,3,4 |
10/15/07 | G. Mansoor | Updated PSPing clarification (pg 5) | 5 |
10/22/07 | G. Mansoor | Added new CTIs that will facilitate description posting. | 5-8 |
Review/Approval History
Date | By | Action | Pages |
9/24/07 | CSU Performance Action Team | Review document and approve for initial draft posting. | All |
10/15/07 | CSU Performance Action Team | Review document and approve for posting | All |
10/22/07 | CSU Performance Action Team | Review document and approve for posting | All |
Table of Contents
1.0 Purpose
2.0 Objectives
3.0 Benchmarks Defined
4.0 Performance Issue Categorization
Category 1: General Performance Issue
Category 2: On-line Transaction Performance Issue
Category 3: Batch Performance Issue
Module Identification – Remedy Item
Process Type Identification – Remedy Item
5.0 Reporting a Performance Problem
Category 1: General Performance Category
Category 2: On-line Transaction Performance Issue
Category 3: Batch Performance Issue
5.1 Incomplete Tickets
1.0 Purpose
Ensuring that CMS applications perform to a high standard is an ongoing challenge and a priority for CMS and the CSU. Resolving periodic performance problems and optimizing the performance of current and future releases require the cooperation and collaboration of all involved in the CMS service model: Providers (Unisys and Oracle), Central CMS, and each campus. Successfully resolving problems, defining performance metrics, and conducting periodic performance measurement and analysis will require an organized approach and a concerted effort by all concerned.
2.0 Objectives
- Identify an efficient and expeditious process for managing and resolving performance-related issues that impact the stability and usability of the CMS applications.
- Clearly define roles and responsibilities to alleviate any confusion.
- Establish expectations for how reported performance issues will be managed.
3.0 Benchmarks Defined
The term “benchmark” is used several times in this document. There is presently no accurate benchmark for every possible process in the system under every type of scenario that exists at the campuses. In this context, a benchmark is defined as one of the following:
- Obvious performance change – a process that always ran in 30 seconds is now running for 16 hours
- Historical comparison – the batch process scheduler shows the process always took 5 minutes, and it now takes 20 minutes
- Comparison against baseline – the campus process is compared against the baseline
- Comparison against published standards – these will be established as we move forward (in progress)
- General – PS Ping data for the performance issue is compared against the campus “benchmark” PS Ping data
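The obvious-change and historical-comparison cases both reduce to a ratio of current run time to usual run time. The following is a minimal sketch in Python of that arithmetic, using the figures quoted in the definitions. The function name `slowdown_ratio` is hypothetical, and the procedure itself does not define a numeric threshold for when a ratio counts as a problem.

```python
def slowdown_ratio(usual_seconds: float, current_seconds: float) -> float:
    """Return how many times slower the current run is than the usual run.

    Hypothetical helper for illustration; not part of the CMS procedure.
    """
    if usual_seconds <= 0:
        raise ValueError("usual_seconds must be positive")
    return current_seconds / usual_seconds


# Historical comparison from the text: usually 5 minutes, now 20 minutes.
historical = slowdown_ratio(5 * 60, 20 * 60)

# Obvious performance change from the text: usually 30 seconds, now 16 hours.
obvious = slowdown_ratio(30, 16 * 60 * 60)

print(f"historical comparison: {historical:.0f}x slower")  # 4x slower
print(f"obvious change: {obvious:.0f}x slower")            # 1920x slower
```

Stating the ratio alongside the raw times in a ticket gives the reviewer the basis for the report at a glance.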
4.0 Performance Issue Categorization
Reviewing the types of currently reported or otherwise known performance issues, CMS has determined that the following issue categorization will facilitate the management of performance problems.
Category 1: General Performance Issue
This type of performance issue is based on reports of overall system slowness that, when first reported, does not appear to be associated with any specific process or module. Indicators of overall slowness may include slow login and navigation as well as degradation in overall performance metrics. Campuses reporting this sort of problem will be required to provide data on overall performance metrics. The Remedy Category/Type/Item of CMS/Performance/General should be used.
Category 2: On-line Transaction Performance Issue
This type of performance issue is based on reports that a specific transaction is slow based on benchmarks or previous norms. These transactions may also be “timing out” and generating a white page response but not generating a functional error. Campuses reporting this sort of problem will be required to provide specific component information including basis for reporting a problem. The Remedy/Category/Item/Type of CMS/Performance/<MODULE> should be used.
Category 3: Batch Performance Issue
This type of performance issue is based on reports that a batch process is slow based on benchmarks or previous norms. The process may also be “timing out” but is not generating a functional error. This category does not include report repository posting or other output distribution-type issues. This category only covers the processing time of the process. Campuses reporting this sort of problem will be required to provide specific process or job name as well as basis for reporting a problem. The Remedy/Category/Item/Type of CMS/Performance/<MODULE> should be used.
Module Identification – Remedy Item
When reporting online or batch transactions, users must identify a module area when submitting a problem ticket. The module area helps route the ticket to the appropriate support team more quickly. The following Modules/Items have been identified:
HR – This item is to be used for any application component or process that is related to the HR functional area. This item should also be used for any Baseline HR query run via the Query Viewer where the query results are slow to produce.
Finance – This item is to be used for any application component or process that is related to the Finance functional area. This item should also be used for any Baseline Finance query run via the Query Viewer where the query results are slow to produce.
SA – This item is to be used for any application component or process that is related to the Student Administration functional area. This item should also be used for any Baseline Student Administration query run via the Query Viewer where the query results are slow to produce.
Tools – This item to be used for any application component or process that is not related to any functional area of the application. Examples would include Security Administration or Tree Manager. Additionally, Query Manager/Query Viewer issues where a performance issue is with the tool itself (Component is slow to open) should be reported under this category. If the issue is with running a specific query, then the appropriate functional application module area should be used.
Campus – This item will be used to track campus modifications that are identified by CMS Central as a modification that appears to be impacting performance and should be reviewed by the campus. CMS will initiate a ticket with this C/T/I. This ticket will be assigned to the campus for action.
Process Type Identification – Remedy Item
When reporting a problem in a module area, Remedy items have been created to distinguish between batch or online processes. When selecting the Remedy item, please ensure that the correct suffix-type is selected. This will allow Remedy to default a form in the description field requesting the basic information elements required for that problem type.
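The category, module area, and process-type suffix combine into a single C/T/I value. The following is a minimal sketch in Python of that combination, using the value formats shown in the ticket guidelines of section 5.0. The helper `build_cti` is hypothetical and is not a Remedy API, and the exact spelling of the items as they appear in Remedy is an assumption.

```python
from typing import Optional

# Module areas named in section 5.0 for module-specific tickets.
MODULES = ("HR", "SA", "Finance")
# Suffix types that distinguish online from batch processes.
PROCESS_TYPES = ("On-line", "Batch")


def build_cti(module: Optional[str] = None,
              process_type: Optional[str] = None) -> str:
    """Build a C/T/I string (hypothetical helper, not a Remedy API)."""
    if module is None:
        # Category 1: no specific module or process is implicated.
        return "CMS/Performance/General"
    if module not in MODULES:
        raise ValueError(f"unknown module area: {module!r}")
    if process_type not in PROCESS_TYPES:
        raise ValueError(f"unknown process type: {process_type!r}")
    # Categories 2 and 3: module area plus the process-type suffix.
    return f"CMS/Performance/{module} – {process_type}"


print(build_cti())                    # Category 1
print(build_cti("HR", "On-line"))     # Category 2
print(build_cti("Finance", "Batch"))  # Category 3
```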
5.0 Reporting a Performance Problem
This section describes the procedure campuses will follow to report a performance problem.
Category 1: General Performance Category
Under Category 1, two types of issues are identified.
A) Issue is occurring now
This problem should generally be reported as an urgent severity if it is impacting production. Severity for development environments will be based on current severity standards documented in the IT Service Center Problem Management Procedures. Campuses are expected to have validated the problem by reviewing current general performance metrics generated by the PeopleSoft PS Ping utility against previously submitted benchmark data. See the CMS Performance Management page at http://cms.calstate.edu/06_Projects-Initiatives/06E_Performance/CMS_06E_00_PROJ_Performance.asp for additional information on the PSPing utility.
PSPing data submitted for this purpose should be run for a minimum of 10 minutes with a repeat time interval of 5 seconds. If system performance is degraded to the point that campus support personnel cannot generate PS Ping data, please note that in the Business Justification field.
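A run of at least 10 minutes at a 5-second interval yields roughly 120 samples, which can then be compared against the benchmark run. The following is a minimal sketch in Python of both checks. It assumes the timings have already been extracted into plain lists of milliseconds; it does not read any PS Ping output format, and the function names and sample values are made up for illustration.

```python
from statistics import mean
from typing import Sequence

MIN_DURATION_SECONDS = 10 * 60  # minimum run length stated in the procedure
INTERVAL_SECONDS = 5            # repeat interval stated in the procedure


def minimum_sample_count(duration_seconds: int = MIN_DURATION_SECONDS,
                         interval_seconds: int = INTERVAL_SECONDS) -> int:
    """Number of samples a run of the given length should produce."""
    return duration_seconds // interval_seconds


def degradation_ratio(benchmark_ms: Sequence[float],
                      current_ms: Sequence[float]) -> float:
    """Mean current timing divided by mean benchmark timing."""
    return mean(current_ms) / mean(benchmark_ms)


print(minimum_sample_count())  # 120

# Made-up timings in milliseconds, for illustration only.
benchmark = [200, 210, 190]
current = [400, 420, 380]
print(degradation_ratio(benchmark, current))  # 2.0
```

A ratio well above 1.0 supports the "analysis of degradation against benchmark" requested in the Business Justification field; what counts as significant is left to the reviewer.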
Campuses will open a ticket in Remedy following these guidelines
C/T/I | CMS/Performance/General |
Severity | Urgent if production; for development based on impact to business operations (see IT Service Center Problem Management Procedures available at http://cms.calstate.edu/11_OSS/OSS_ITServiceCenter-Home.asp) |
Summary | <DBNAME> General performance issues |
Description | <DBNAME> General performance issues |
Business Justification* | Review of PS Ping data and analysis of degradation against benchmark. |
Attachments | PS Ping data in Excel format. |
Ticket handling
Production issues will be handled as urgent cases and worked until they are resolved or identified as not being a general performance problem. Issues will be tracked as appropriate after resolution.
B) Issue was occurring (no longer current)
This problem should generally be reported as an information ticket.
Campuses will open a ticket in Remedy following these guidelines
C/T/I | CMS/Performance/General |
Severity | INFORMATION |
Summary | <DBNAME> General performance issues |
Description | <DBNAME> General performance issues. Duration and impact of event, including the number of users reporting the problem and the earliest and last known times of occurrence. Note: If campus support personnel were able to generate a PS Ping snapshot of the issue, please attach it to the ticket. |
Ticket handling
Tickets will be reviewed and placed on trending report but will be closed after review.
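The severity guidance for the two Category 1 cases can be summarized as a small decision function. The following is a minimal sketch in Python; the function name, the environment labels, and the returned strings are illustrative assumptions, and the procedure says "generally", so this reflects the default rather than a hard rule.

```python
def category1_severity(occurring_now: bool, environment: str) -> str:
    """Default severity for a Category 1 ticket (hypothetical helper)."""
    if not occurring_now:
        # Case B: the issue is no longer current.
        return "INFORMATION"
    if environment == "production":
        # Case A in production: generally urgent.
        return "Urgent"
    # Case A in development: follow the IT Service Center severity standards.
    return "Based on impact to business operations"


print(category1_severity(True, "production"))
print(category1_severity(True, "development"))
print(category1_severity(False, "production"))
```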
Category 2: On-line Transaction Performance Issue
Category 2 problems should be reported with a severity based on the impact to campus business operations (see IT Service Center Problem Management Procedures available at http://cms.calstate.edu/11_OSS/OSS_ITServiceCenter-Home.asp). Campuses will be required to provide the specific component name in the ticket as well as other request information listed below. A key component to the timely resolution of this problem type is being able to replicate the problem. Campuses experiencing this sort of issue in production should replicate the problem in a current copy of production and submit trace files from both production and development.
Issue Replication:
– Campuses are required to replicate any Category 2 (specific online) or Category 3 (specific batch) issues they report.
– The campus will need to clone a current copy of production into a campus development instance.
– This is required for CMS Central to troubleshoot and test any possible solutions.
– Please note that the campus is NOT required to demonstrate the same issue in the development copy. If the process performs as designed in development then that is a valid replication.
– CMS Central will work a ticket that cannot be recreated as long as the campus sets up the same scenario in development. This provides insight into the issue and will guide our troubleshooting methodology.
Campuses will open a ticket in Remedy following these guidelines
C/T/I | CMS/Performance/<MODULE> – On-line Where <MODULE> is application module area of HR, SA or Finance |
Severity | Based on impact to business operations (see IT Service Center Problem Management Procedures available at http://cms.calstate.edu/11_OSS/OSS_ITServiceCenter-Home.asp) |
Summary | <DBNAME> <COMPONENT NAME> |
Description | In the description, the campus must complete the information requested below. This basic form will be defaulted in the Remedy description field and must be completed. [BEGIN BLOCK] Navigation to component: When was this last working: What technical changes since last working: What CMS application changes since last working: What COMR application changes since last working: What application setup changes since last working: What steps have been performed to troubleshoot problem: Development environment replicated problem in: Userids reporting problem: Relevant data for component (search values, etc): Are all users of this component impacted: YES or NO What error message (if any) is displayed: Any additional information: [END BLOCK] |
Business Justification | In this field, the campus should provide the basis for identifying this component as poorly performing. Information here should include the current transaction time as well as the previously known good transaction time. |
Attachments | Trace files from production and development. Screen shots if applicable. |
Ticket handling
The ticket will be reviewed by the CMS Performance Action Team to ensure that the ticket contains enough information to begin an analysis of the problem.
If the performance issue reported for a component is significantly different from the established benchmarks, then the analyst will work with the campus, CMS Application team, Technical Services and the Service Center to obtain the following information:
- Detailed information on the changes to the campus database since the last time the component was working within the established benchmark
- Replication of the issue in a campus cloned database
- Creation of a trace file if necessary from the campus database
- Technical analysis for the component – which may include the counts of the main tables in the component, index information, etc
- Verification on whether the problem has been reported by other campuses
If the performance issue reported for a component is within the established benchmarks but is deemed unacceptable by the campus, then the analyst will put the ticket into a pending status for future consideration.
Category 3: Batch Performance Issue
This problem should be reported with a severity based on the impact to campus business operations (see IT Service Center Problem Management Procedures available at http://cms.calstate.edu/11_OSS/OSS_ITServiceCenter-Home.asp). Campuses will be required to provide the specific process or job name in the ticket as well as other request information listed below. A key component to the timely resolution of this problem type is being able to replicate the problem. Campuses experiencing this sort of issue in production should replicate the problem in a current copy of production and submit trace files from both production and development.
Issue Replication:
– Campuses are required to replicate any Category 2 (specific online) or Category 3 (specific batch) issues they report.
– The campus will need to clone a current copy of production into a campus development instance.
– This is required for CMS Central to troubleshoot and test any possible solutions.
– Please note that the campus is NOT required to demonstrate the same issue in the development copy. If the process performs as designed in development then that is a valid replication.
– CMS Central will work a ticket that cannot be recreated as long as the campus sets up the same scenario in development. This provides insight into the issue and will guide our troubleshooting methodology.
Campuses will open a ticket in Remedy following these guidelines
C/T/I | CMS/Performance/<MODULE> – Batch Where <MODULE> is application module area of HR, SA or Finance |
Severity | Based on impact to business operations (see IT Service Center Problem Management Procedures available at http://cms.calstate.edu/11_OSS/OSS_ITServiceCenter-Home.asp) |
Summary | <DBNAME> <PROCESS/JOB NAME> |
Description | In the description, campus must complete the information requested below. This basic form will be defaulted in the Remedy description field and must be completed.[BEGIN BLOCK] Navigation to run control: When was this last working: What technical changes since last working: What CMS application changes since last working: What COMR application changes since last working: What application setup changes since last working: What steps have been performed to troubleshoot problem: Development environment replicated problem in: Relevant data for run control: Process ID: Last 3 execution times with volume metrics: Any additional information: [END BLOCK] |
Business Justification | In this field, the campus should provide the basis for identifying this process as poorly performing. |
Attachments | Trace files from production and development. Screen shots of run control pages. |
Ticket handling
The ticket will be reviewed by the CMS Performance Action Team to ensure that the ticket contains enough information to begin an analysis of the problem.
If the performance issue reported for a batch process is significantly different from the established benchmarks, then the analyst will work with the campus, CMS Application team, Technical Services and the Service Center to obtain the following information:
- Information on the changes to the campus database since the last time the batch object was working within the established benchmark
- Replication of the issue in a campus cloned database
- Creation of a trace file if necessary from the campus cloned database
- Technical analysis for the batch process – which may include the counts of the main tables in the batch process, previous run history for the batch process, etc
- Verification on whether the problem has been reported by other campuses
If the performance issue reported for the batch process is within the established benchmarks but is deemed unacceptable by the campus, then the analyst will put the ticket into a pending status for future consideration.
5.1 Incomplete Tickets
Performance tickets opened by the campuses that are incomplete will be reassigned to the appropriate campus contact in the “CMS TUG Group” assignment group established in Remedy. The work log will contain a list of items that need to be completed before the ticket can be assigned to the CMS Performance Team. Once the ticket has been updated with all the required information, the CMS TUG representative should assign the ticket to the CMS Performance Team.
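A campus can reduce the chance of a ticket being returned by checking the description form for blank fields before submitting. The following is a minimal sketch in Python of such a check. It assumes the form is entered with one "Field: value" pair per line, which the source does not confirm; the helper `missing_fields`, the choice of required fields, and the sample values are all hypothetical.

```python
from typing import Dict, List


def missing_fields(description: str, required: List[str]) -> List[str]:
    """Return the required field names that are absent or left blank."""
    filled: Dict[str, str] = {}
    for line in description.splitlines():
        # Split on the first colon only, so values may contain colons.
        name, sep, value = line.partition(":")
        if sep:
            filled[name.strip()] = value.strip()
    return [field for field in required if not filled.get(field)]


# A subset of the on-line form fields, chosen for illustration.
required = [
    "Navigation to component",
    "When was this last working",
    "What technical changes since last working",
    "Userids reporting problem",
]

# Made-up ticket text with two fields left blank.
description = """Navigation to component: Main Menu > Example > Page
When was this last working: 10/01/07
What technical changes since last working:
Userids reporting problem: """

for field in missing_fields(description, required):
    print(f"Incomplete: {field}")
```

With the sample text above, the check reports the technical-changes and userids fields as incomplete.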