Implementing SIF Identity Management – Where to Begin

Overview

This document is a guide for school organizations that want to start a SIF implementation with the goal of implementing Identity Management across a number of schools, including data from:

  • elementary schools
  • secondary schools
  • secondary school consortiums (where the MIS suppliers may or may not be the same)
  • special education schools (those who would have an MIS-like application that would keep track of its learners)
  • schools with good quality data and those with bad quality data

This guide assumes the use of Visual Software products. Although some parts of this guide are generic (such as the material on SIF agents and the Zone Integration Server (ZIS)), other parts, such as the use of Veracity and Envoy, are presently only available from this supplier.

High Level View

At a very high level, the steps are as follows:

  1. Install the Zone Integration Server (ZIServer) into the environment
  2. Set up agents for the MIS systems for the schools that will be participating
  3. Set up logical instances of Veracity for the schools that will be participating
  4. Create a set of Veracity rules that:
    1. match the data quality policies of the top level organization
    2. illustrate where Envoy will be needed (show where the duplicate records will be)
    3. (talk about this)
  5. Have schools clean up data
  6. Set up Envoy and verify business rules with organization
    1. Learners (after this, the IdM can start to be set up)
    2. Teachers (after this, the VLE can start to be set up)
    3. Contacts
  7. Continue setting up other agents

 

1) Installing the Zone Integration Server

Before installing, you will first need to decide whether you need redundancy. This can mean several things, because the ZIS is made up of a web part and a database part, and each of these parts offers different redundancy options. For a complete description of these options and which are supported by our ZIServer product, see ZIServer at the Enterprise Level.

If the ZIServer is to be installed on a single server or on a single web server and a shared SQL server, then the install procedure is much more direct.

Single Server Installation (non-redundant)

This is the simplest configuration. IIS and SQL Server are both installed on the same server and the ZIServer.MSI is typically executed with all of the default options (unless you would like to change the default location or web site port number).

NOTE: When you first install SQL Server, you should choose “mixed mode authentication”, or if this is an existing server, you may change it through the SQL Server properties.

Two Server Installation (non-redundant)

If you are using a two-server configuration, you should run the ZIServer.MSI twice: on the database server machine first, then on the web server. Both times, choose to install a custom configuration. When installing on the database server, only choose the database parts and when installing on the web server, choose to not install the database parts.

When you install the web parts, you will still be required to specify the name of the database server – it will need to know this so it can connect the two parts together.

High-Availability Installation – Database Part

SQL Server allows its users to “scale up” or (with a lot of work) “scale out”. We highly recommend staying with the “scaling up” model (use a server that allows for growth – extra processor slots), but using one of the following two techniques to provide redundancy:

  • SQL database mirroring, high safety mode
  • SQL database clustering

We prefer SQL database mirroring over clustering for two reasons:

  1. The most important reason is that with database mirroring, there is no single point of failure. With SQL clustering, there is only one copy of the databases, which is shared between the two machines. With database mirroring, if a block becomes corrupted (and cannot be repaired through the RAID mechanisms), it is automatically repaired from the copy of that block held on the mirror. As stated in the SQL Server documentation, “A failover cluster does not protect against disk failure. You can use failover clustering to reduce system downtime and provide higher application availability.”
  2. Since ZIServer has two databases, we recommend assigning one to each of two servers as its principal server. In this way, we achieve some of the performance benefits of “scaling out” not normally associated with SQL Server.

We do not recommend any high-availability database techniques that involve log shipping, replication or scalable shared databases. The problem with these methods is that, especially in busy systems, the copy can be several seconds or more out of date relative to the original. If messages are lost in that gap, some SIF agents have a very difficult time recovering.

High-Availability Installation – Web Server Part

In choosing the high-availability method for the web part, you will have a few choices as well – some will depend on your network and the type of connections (HTTP or HTTPS) you need to make. Acceptable choices include:

  • Windows Network Load Balancing – this functionality is built into the Windows Server operating system. If you are considering this, we recommend using Windows Server 2008 R2 over one of the earlier versions (it is much easier to set up, it no longer requires special hardware and we’ve found it much more reliable as well).
  • Windows Clustering – this is sometimes referred to as an “active-passive” setup where one server is continually backing up the other and both servers are sharing a common set of disk drives. There is a version of this known as “geographically dispersed multi-site clustering” used for disaster recovery scenarios that uses multiple data stores, but most of the time, Windows clustering suffers from the disk being a single point of failure.
  • Use of hardware Load Balancing Devices – these devices typically handle switching traffic between multiple machines but can also sometimes handle HTTPS encoding as well. These roughly serve the same purpose as the first option, except that the operating system does not need to spend any of its time dealing with the routing of messages. On the other hand, the operating system has no knowledge of how the routing is being done and is sometimes at a disadvantage when making certain types of decisions – there are tradeoffs.

Side note: Do not assume that putting a Zone Integration Server on a high-availability platform automatically makes it a high-availability ZIS. ZIServer has been designed and programmed for these environments. For example, it is likely that any ZIS will correctly receive messages from both SIF pull mode and push mode agents. Likewise, most will probably handle sending messages to SIF pull mode agents. But correctly sending messages to SIF push mode agents, especially when a server goes down and then comes back up again, is where most will fail, and assistance from hardware or the operating system will not help at all.

 

2) Set Up MIS (SIS) Agents

The SIF agent for the MIS (Management Information System, or Student Information System (SIS) as it is called in the US) should be set up before the SIF agents for any of the other systems. This is because the MIS is typically the primary feeder for most other systems in the zone.

Ideally, a SIF agent would be available from the MIS (SIS) supplier and could be purchased or otherwise obtained from them. The SIF agent serves as an adapter between the application and the rest of the infrastructure: it packages the information from the MIS and makes it available to the other applications that are connected to it. Normally, the MIS would send the information through its own SIF agent whenever it changes, but if no SIF agent is available directly from the supplier, we offer two options:

  • Mimic – this is a SIF agent wizard that is intended for those applications that can create extract files for objects such as schools, learners, teachers, contacts, classrooms, courses, etc. in CSV format and deposit them into a directory on a regular basis (such as once an hour). The Mimic SIF agent regularly compares each current file with the version it saw the last time it looked, and generates SIF events corresponding to the differences. It will also respond to SIF requests from other agents in the zone. For more information, see Mimic.

  • ZIAgent – this is our fully functional, configurable SIF agent made for existing applications. Unlike SIF Agent Development Kits (ADKs), ZIAgent allows its user to configure a SIF agent in a matter of hours to days without programming. The resulting agent has sophisticated features such as audit trails, automated code discovery, dynamic record matching, business rules, regression testing and much more. For more information, see ZIAgent.
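The file comparison described for Mimic can be illustrated with a minimal sketch. This is not Mimic's implementation; the column names, the key field and the function names are hypothetical, and a real agent would wrap each difference in a proper SIF event message rather than a tuple.

```python
import csv
import io


def load_snapshot(csv_text, key_field):
    """Parse one CSV extract into a dict keyed by the record identifier."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_field]: row for row in reader}


def diff_snapshots(previous, current):
    """Compare two snapshots and return (event_type, key, record) tuples."""
    events = []
    for key, record in current.items():
        if key not in previous:
            events.append(("Add", key, record))
        elif record != previous[key]:
            events.append(("Change", key, record))
    for key, record in previous.items():
        if key not in current:
            events.append(("Delete", key, record))
    return events


# Hypothetical extracts; the column names are illustrative only.
previous_csv = "learner_id,family_name,given_name\n1,Smith,Ann\n2,Jones,Bob\n"
current_csv = "learner_id,family_name,given_name\n1,Smith,Anne\n3,Patel,Raj\n"

events = diff_snapshots(
    load_snapshot(previous_csv, "learner_id"),
    load_snapshot(current_csv, "learner_id"),
)
for event_type, key, _record in events:
    print(event_type, key)
```

Keying each snapshot by a stable identifier is what lets the comparison tell a changed record apart from a deleted record followed by a new one.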

Connections to these agents should be set up in the ZIS first and the Access Control Lists (ACLs) for the agents should be set up to control the information that they will publish. These lists can be set up in two ways: at the object level and at the element level.

 

Object Level Control

Object level control gives the administrator control over the objects that a SIF agent is allowed to publish and over those to which it is allowed to subscribe. The screen in the ZIServer administration tool that allows control over object permissions looks like this:

[Image: ZIServer administration tool, object permissions screen]

The permissions for each object include Provide, Request, Respond, Subscribe, Publish Add, Publish Change and Publish Delete. In this example, the “TestProvider” agent was allowed to Provide, Respond and Publish all types of events for all types of objects in our test environment. In a production environment, it is best to determine which objects will be needed by subscribers in the zone and only publish those objects.

NOTE: We also recommend shutting off the unneeded objects in the SIF agent software itself, so that it doesn’t continually try to do things that it is not allowed to do by policy.
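As a rough illustration of object level control, the sketch below models an ACL as a lookup from agent and object to a set of granted permissions. It is an assumption-laden sketch, not ZIServer's data model: the dictionary layout, the “TestSubscriber” agent and the `is_allowed` function are hypothetical, and the object names are only examples.

```python
# Permission names taken from the text, spelled here without spaces.
PERMISSIONS = {
    "Provide", "Request", "Respond", "Subscribe",
    "PublishAdd", "PublishChange", "PublishDelete",
}

# Hypothetical ACL: (agent name, object name) -> set of granted permissions.
acl = {
    ("TestProvider", "SchoolInfo"): {
        "Provide", "Respond", "PublishAdd", "PublishChange", "PublishDelete",
    },
    ("TestSubscriber", "SchoolInfo"): {"Request", "Subscribe"},
}


def is_allowed(acl, agent, sif_object, permission):
    """Return True only if the permission was explicitly granted."""
    if permission not in PERMISSIONS:
        raise ValueError(f"unknown permission: {permission}")
    return permission in acl.get((agent, sif_object), set())


print(is_allowed(acl, "TestProvider", "SchoolInfo", "PublishChange"))
print(is_allowed(acl, "TestSubscriber", "SchoolInfo", "PublishChange"))
# No entry exists for this object, so nothing is permitted by default.
print(is_allowed(acl, "TestProvider", "LearnerPersonal", "Provide"))
```

Denying anything that has no entry matches the production advice above: only the objects that subscribers actually need are opened up.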

Element Level Control

Element level control has been recently added to the SIF specification. It allows the administrator to further restrict the information being sent to subscribing applications at the element (or attribute) level. In the ZIServer product, certain objects and elements within those objects are pre-selected as “those that would be likely candidates for having restrictions” (this makes the user interface easier to use and runtime processing faster). In the ZIServer ACL interface (as shown above), the names of objects that are likely candidates for element level filtering are shown as hyperlinks. When one is selected, another screen is displayed that looks like this:

[Image: ZIServer administration tool, element level filtering screen]

On the left side of the screen are the names of the elements (or attributes) that can be restricted from being delivered to this agent. If the box is selected, then messages that would normally be delivered to this agent will have these elements removed from them before they are delivered.

NOTE: In ZIServer’s audits, you may notice that a single event message appears one time for each agent it is delivered to – this is because there is a potential that each copy of the message may have slightly different contents because the element level filtering may be set differently for the different agents.
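The effect of element level filtering can be sketched as follows. This is an illustration only, assuming a simplified message and hypothetical agent names (“LibraryAgent”, “IdentityAgent”); it filters by element name and does not handle attributes, and it is not how ZIServer is implemented.

```python
import xml.etree.ElementTree as ET


def filter_for_agent(message_xml, restricted_tags):
    """Return a copy of the message with the restricted elements removed."""
    root = ET.fromstring(message_xml)
    # Materialize the list first so removals do not disturb the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag in restricted_tags:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


# Simplified, illustrative payload; not a complete SIF object.
message = (
    '<LearnerPersonal RefId="EXAMPLE">'
    "<Name><FamilyName>Smith</FamilyName></Name>"
    "<BirthDate>2001-05-04</BirthDate>"
    "</LearnerPersonal>"
)

# Hypothetical per-agent restrictions.
restrictions = {
    "LibraryAgent": {"BirthDate"},
    "IdentityAgent": set(),
}

# Each agent receives its own copy, which may differ from the others.
copies = {
    agent: filter_for_agent(message, tags)
    for agent, tags in restrictions.items()
}
for agent, xml_text in copies.items():
    print(agent, xml_text)
```

Because each subscribing agent can end up with different contents, one copy per agent is produced, which is consistent with the audit behaviour described in the note above.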

3) Set Up Logical Instances of Veracity

This is where our (Visual Software’s) recommendations and most everyone else’s diverge. Our recommendation is that before you connect an automatic feed to subscribing applications, you know what the data is going to look like and how well it will meet the organization’s standard of quality. This may simply be an exercise to verify that everyone has been meticulously entering all data, but more often than not, it is proof of what everyone has known all along – the data needs some work.

Setting up Veracity for this purpose is quite simple – at a later point, we will get more elegant with setting up additional user accounts, but at this stage we simply want to get the data in good enough shape so that we can connect the other applications (including identity management) and have a good chance of success.

Setting up Veracity also includes defining business rules that reflect data quality standards for the organization. These can include checks such as:

  • make sure all learners have reasonable birth dates
  • make sure all learners have home addresses specified in their demographic records
  • make sure all learners have a value for the gender field that is ‘M’ or ‘F’ (not missing or ‘U’ (unknown) – although allowed by the SIF specification, this value may not be acceptable for this organization)
  • make sure all learners have a given name and a family name
  • check for learners who are enrolled in an “English as a Second Language” class whose primary language is English
  • check for invalid phone number formats

Rules like these may come from local standards or may be the result of requirements of state or census reporting. Furthermore, each check may be labeled as “Error”, “Warning” or “Information Only” to indicate the level of severity attached to it.
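To make the idea of severity-labeled checks concrete, here is a minimal sketch of a few of the rules above. It does not use Veracity's rule syntax, which is not shown in this guide; the field names, the age range of 3 to 21 and the severity assigned to each rule are all assumptions chosen for illustration.

```python
from datetime import date


def age_on(birth_date_text, today):
    """Age in whole years; raises ValueError for a malformed date."""
    born = date.fromisoformat(birth_date_text)
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))


def has_names(learner, today):
    return bool(learner.get("given_name")) and bool(learner.get("family_name"))


def has_known_gender(learner, today):
    return learner.get("gender") in ("M", "F")


def has_reasonable_birth_date(learner, today):
    try:
        # Assumed range; an organization would set its own limits.
        return 3 <= age_on(learner.get("birth_date") or "", today) <= 21
    except ValueError:
        return False


# (severity, description, check); the severities are illustrative.
RULES = [
    ("Error", "given and family name present", has_names),
    ("Error", "gender is M or F", has_known_gender),
    ("Warning", "birth date is reasonable", has_reasonable_birth_date),
]


def run_rules(learners, today):
    """Return (learner id, severity, description) for every failed check."""
    findings = []
    for learner in learners:
        for severity, description, check in RULES:
            if not check(learner, today):
                findings.append((learner["id"], severity, description))
    return findings


learners = [
    {"id": "1", "given_name": "Ann", "family_name": "Smith",
     "gender": "F", "birth_date": "2000-03-15"},
    {"id": "2", "given_name": "Bob", "family_name": "",
     "gender": "U", "birth_date": "1950-01-01"},
]
findings = run_rules(learners, today=date(2010, 9, 1))
for finding in findings:
    print(finding)
```

Keeping the rules in a plain list makes it easy to start with a short list and extend it later.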

4) Set up Veracity Rules

The rules that will be applied against the school data should, at a minimum, look for the errors in the data that would inhibit the assigning of learner, teacher and contact identifiers.

It might be a good idea to start with a limited set of rules because, from our experience, it is very disheartening for school users to see that they need to correct 300,000 errors in their data before it can be used (we’ve seen worse than that…). So, we find that it’s best to start with the rules that you need and then add new ones over time, once the users have had a chance to experience some of the benefits of the software.

5) Have Schools Clean Up Data

This will most likely require some training, some encouragement, and a way to help school users see the benefit of doing the work to clean up the data. In the training, it will be necessary to show school users that as soon as they make a correction in the MIS, the error drops off the Veracity screen. It might also be good to include a demonstration of SIF in action during the training, so that the users sense that the work is being done for a purpose other than “the boss says so”.

Expect that this process could be one of the longest parts of the project, depending on the initial quality of the data, the aggressiveness of the set of rules and the time that the school is willing to put into correcting the errors.

6) Verify Envoy Rules

Envoy has a set of business rules for every SIF object. These business rules define procedures for:

  • Detecting duplicate records (within an organization, if two records are received for the same object)
  • Choosing an “authoritative” copy between the duplicates (if there are duplicates, which one is more reliable?)
  • Understanding all security and any special routing requirements

For example, a simple object is SchoolInfo – it contains information about a school. Let’s say we are creating a shared zone that is fed information from several MIS systems in several schools. This may sound very simple on the surface, but…

  • In each school’s MIS, there may be SchoolInfo records for dozens to hundreds of schools (including, of course, its own)
  • We can look for duplicates by looking at the school’s URN (most reliable), then if that is not available, use a combination of LA number and Establishment ID (if non-zero), then…
  • Every MIS’s RefIds are independent of those of every other MIS (this is defined by the SIF specification)
  • It is safe to assume that there are hundreds of SchoolInfo records being shared in this shared zone, all with different RefIds for every school. If you requested all SchoolInfo records from all schools, you would end up having all copies of all of the versions for all of the schools.

Which one is the most reliable?

Our default business rule is: Assume that a school has the most reliable information about itself. Choose a school’s own SchoolInfo record as “the authoritative” record and have Envoy map all other references from any of the other schools to it from now on.
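The SchoolInfo matching and the default “authoritative record” rule can be sketched as below. This is not Envoy's implementation: the field names, the short RefId strings and the function names are hypothetical. Each record is grouped under the first key it has available (URN, then LA number plus Establishment ID); the further fallbacks that the text leaves open are not filled in, and matching a record without a URN against one that has a URN is omitted for brevity.

```python
from collections import defaultdict


def match_key(record):
    """First available matching key, in the order given in the text."""
    if record.get("urn"):
        return ("urn", record["urn"])
    if record.get("la_number") and record.get("establishment_id"):
        return ("la_estab", record["la_number"], record["establishment_id"])
    return None  # later fallbacks are not specified here


def build_refid_map(records):
    """Map every RefId to the RefId of the authoritative duplicate."""
    groups = defaultdict(list)
    for record in records:
        key = match_key(record)
        if key is not None:
            groups[key].append(record)

    refid_map = {}
    for group in groups.values():
        # Default rule: a school's own record about itself is authoritative.
        own = [r for r in group if r.get("urn") and r["urn"] == r["source_urn"]]
        if not own:
            continue  # no self-reported copy; left unresolved in this sketch
        for record in group:
            refid_map[record["ref_id"]] = own[0]["ref_id"]
    return refid_map


# Two schools, each holding a record for itself and for the other school.
# "source_urn" is the URN of the school whose MIS supplied the record.
records = [
    {"source_urn": "100001", "ref_id": "A-1", "urn": "100001"},
    {"source_urn": "100001", "ref_id": "A-2", "urn": "100002"},
    {"source_urn": "100002", "ref_id": "B-1", "urn": "100001"},
    {"source_urn": "100002", "ref_id": "B-2", "urn": "100002"},
]
refid_map = build_refid_map(records)
for ref_id, authoritative in sorted(refid_map.items()):
    print(ref_id, "->", authoritative)
```

In this example, every reference to a school resolves to the RefId that the school's own MIS assigned to it.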

This is a simple example. But, for every object used in Envoy, there are business rules like this. Before installing this product, we need to go over these rules with the client and make sure that everyone agrees on how records are being matched and how authoritative records are being chosen.

7) Connect Other Applications

Once Envoy has been installed, virtual zones should be created for each group of schools where a subscriber will be registered. If installed at a Regional Broadband Consortium (RBC), this would likely include:

  • one zone for the RBC itself – this would be needed for any applications hosted by and shared at the RBC level where the application maintains a single database for all schools in the RBC and uses a field to keep records separate. This would most likely be where the Identity Management application would connect.
  • one zone for each Local Authority (LA) – these would be used to connect applications hosted at the LA level that use a single database to hold learners from all schools.

Once it has been installed, you might also want to connect a copy of Veracity to each of these zones and create rules that:

  • make sure that the school’s data privacy settings have been set correctly (data isn’t “leaking” to places where it shouldn’t go);
  • verify that the matching and “authoritative record choice” business rules established for Envoy are working the way everyone wanted.
