Identity and Directory Management (and how SIF can help)
Overview
This document describes two challenges and how SIF can be used to address them:
- Identity Management
- Directory Management
Identity Management can vary widely in scope and meaning depending on country and other factors. In general, it means assigning a single identifier to each person in a population, and it usually includes some form of matching to handle cases where a single person appears in more than one data set.
The sophistication of this matching, and the extent to which records are matched, is where implementations differ.
Directory Management is the creation and management of user accounts in a system such as Microsoft Active Directory. How these accounts are grouped and where they are managed (at the school, at the local authority or district, or in a consolidated location such as a broadband consortium) can vary widely from one installation to the next, so flexible tools to manage them are an absolute requirement.
The SIF Environment
The ZIAD SIF Agent is made up of two, or potentially three, parts:
- The SIF agent: this is the part that communicates with the SIF infrastructure, holds the business rules that control what gets done and when, and calls the web services.
- The web service: this is installed wherever there is a root of a forest. It carries out functions such as creating the account, adding the account to AD groups, creating a home directory, setting home directory permissions and copying default files into a home directory.
- In installations where learners attend more than one school during a given day, the Envoy product may be used to implement Managed Virtual Zones. This provides record matching capabilities for learners, teachers and contacts.
First, we will discuss a simple Active Directory environment. Then we will consider an environment with many schools, where learners, teachers and contacts may appear in the student management systems (sometimes also referred to as SIS or MIS) at more than one location. Moreover, when they appear in these different locations, they may appear with different SIF IDs.
Creating the ID (or IDs)
One of the duties of the SIF agent described above is to create one or more identifiers for a user. These identifiers may include:
- Active Directory account name
- Shibboleth ID
- Email address
The SIF agent can either be set up to create and manage these identifiers or to subscribe to the SIF Identity object that is published by a separate identity management provider. If it is managing identities, it will publish the SIF Identity object.

If it is configured to manage Shibboleth IDs, it will maintain an IdP “shibpid” table as it receives SIF messages for learners, teachers and contacts. This is the table that applications would consult if they wanted to know whether a user who is attempting to access an application has already authenticated. For more information about how this is accomplished, look here.
Since Shibboleth IDs, email names and Active Directory sAMAccountNames have similar naming restrictions (on allowable characters and on length before the ‘@’ symbol), ZIAD business rules create a common “stem” for all three.
Once it has created these IDs, it checks whether they have already been used. By default, this is done by checking the current IdP “shibpid” table to make sure the ID is not already there. However, there is no reason it couldn’t be extended to also call a web service to make sure the ID is not included in some larger set of “already used IDs”.
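As a rough illustration of the stem-and-retry idea, the Python sketch below builds a stem from a person's name and appends a numeric suffix until it finds one that is not in use. The function names, the 20-character limit, the first-initial-plus-surname pattern and the in-memory set standing in for the “shibpid” table are all assumptions made for illustration; they are not ZIAD's actual business rules.

```python
import re
import unicodedata

MAX_STEM_LENGTH = 20  # assumed limit, chosen to fit the tightest of the three formats


def make_stem(given_name: str, family_name: str) -> str:
    """Build a lowercase ASCII stem such as 'jsmith' from a person's name."""
    raw = f"{given_name[:1]}{family_name}"
    # Strip accents, then drop anything that is not a letter or digit.
    ascii_text = unicodedata.normalize("NFKD", raw).encode("ascii", "ignore").decode("ascii")
    stem = re.sub(r"[^a-z0-9]", "", ascii_text.lower())
    return stem[:MAX_STEM_LENGTH]


def choose_unused_id(stem: str, used_ids: set[str]) -> str:
    """Return the stem, or the stem plus a numeric suffix, that is not yet in use."""
    if stem not in used_ids:
        return stem
    suffix = 2
    while True:
        tail = str(suffix)
        # Trim the stem if needed so the suffixed ID still fits the length limit.
        candidate = stem[: MAX_STEM_LENGTH - len(tail)] + tail
        if candidate not in used_ids:
            return candidate
        suffix += 1


used = {"jsmith", "jsmith2"}  # hypothetical stand-in for the IdP "shibpid" table
new_id = choose_unused_id(make_stem("José", "Smith"), used)
used.add(new_id)
print(new_id)  # prints "jsmith3"
```

A wider check, such as the web service lookup mentioned above, could be added by replacing the set membership test with a call to that service.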
Lastly, after it has verified that the ID has not been used before, it publishes a SIF Identity object to notify others in the zone of the newly chosen identity. This object may optionally also contain an encrypted password based on either:
- a fixed value (one that needs to be changed immediately, for example)
- something to do with the environment that has nothing to do with received data
- something that depends on received data (such as a learner’s birth date, for example)
Creating Accounts and Directories
The SIF agent is built using Visual Software’s configurable SIF agent, ZIAgent™. In the UK, this SIF agent subscribes to the following objects:
- LAInfo
- SchoolInfo
- LearnerPersonal
- LearnerSchoolEnrolment
- WorkforcePersonal
- ContactPersonal
- LearnerContact
- LearnerGroupEnrolment
…and any other objects where the user would need to attach business rules. For example, a business rule could be attached to the TeachingGroup SIF object that would add all teachers in this group to an Active Directory group.
This SIF agent publishes one object: the Identity SIF object.
Depending on which business rules are activated, the SIF agent can perform a number of functions:
- It collects information from LAInfo, SchoolInfo, LearnerPersonal and LearnerSchoolEnrolment and uses this information to:
  - create a new ID (or IDs) – it also checks to make sure that this ID has not already been used, and if it already has, it goes through a sequence to try again
  - use this new ID to create an Active Directory account for the learner
  - use the same ID to create a home directory for the learner (this may be located on a central SAN or on a server at the school)
  - add the new user to appropriate AD groups depending on the school enrollment and other characteristics found in the information already collected
  - set up the new user’s home directory by copying default files (if so configured) and setting up default permissions
- It collects WorkforcePersonal information and does similar actions for teachers
- It collects ContactPersonal and LearnerContact information and does similar actions for contacts
- When it receives LearnerGroupEnrolment messages, it adds learners to the Active Directory groups that may have been established for the associated courses
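The sequence for a new learner can be sketched as an ordered plan of actions. The Python below is only an illustration: the `Learner` fields, the action names and the group naming pattern are hypothetical, and a real agent would hand each step to the web service rather than print it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Learner:
    account_id: str   # the ID chosen for this learner
    school_code: str  # derived from SchoolInfo / LearnerSchoolEnrolment
    year_group: str


def plan_new_learner(learner: Learner, copy_default_files: bool = True) -> list[tuple[str, str]]:
    """Return the ordered (action, target) steps for a newly enrolled learner."""
    # Hypothetical group naming rule based on school and year group.
    groups = [
        f"{learner.school_code}-learners",
        f"{learner.school_code}-year-{learner.year_group}",
    ]
    steps = [
        ("create_account", learner.account_id),
        ("create_home_directory", learner.account_id),
    ]
    steps += [("add_to_group", group) for group in groups]
    if copy_default_files:  # "if so configured"
        steps.append(("copy_default_files", learner.account_id))
    steps.append(("set_home_permissions", learner.account_id))
    return steps


for step in plan_new_learner(Learner("jsmith3", "CDES", "7")):
    print(step)
```

Keeping the plan as data, separate from the code that executes it, is one way to let different business rules be switched on or off per installation.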
What about Learner Moves?
A learner moving from one school to another is a very common occurrence within a school system and ends up being a significant amount of tedious effort for an IT department when totaled up over the course of a school year.
The SIF agent, when it sees what looks like a learner school-to-school move, does what an IT department would likely do:
- Move the learner account from the old OU (belonging to the old school) to the new OU (the one belonging to the new school)
- Remove the learner from any old AD groups and add the account to new AD groups according to the rules that apply to the new school
- Move the contents of the home directory from the old school’s server to the new school’s server and change permissions on all the files to reflect those that would be reasonable for the new school
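The move steps above can be sketched in the same way. In this hypothetical Python example the group changes are computed as a set difference, so that a group shared by both schools is left alone; that choice, like the action names and group names, is an assumption rather than a description of the agent's actual behavior.

```python
def plan_learner_move(
    account_id: str,
    old_school: str,
    new_school: str,
    old_groups: set[str],
    new_groups: set[str],
) -> list[tuple[str, str]]:
    """Return ordered (action, detail) steps for a school-to-school move."""
    steps = [("move_account_to_ou", f"{account_id}: {old_school} -> {new_school}")]
    # Groups common to both schools (an LA-wide group, say) are left untouched.
    steps += [("remove_from_group", g) for g in sorted(old_groups - new_groups)]
    steps += [("add_to_group", g) for g in sorted(new_groups - old_groups)]
    steps.append(("move_home_directory", f"{old_school} -> {new_school}"))
    steps.append(("reset_home_permissions", new_school))
    return steps


moves = plan_learner_move(
    "jsmith3",
    "OldSchool",
    "NewSchool",
    old_groups={"la-learners", "OldSchool-learners"},
    new_groups={"la-learners", "NewSchool-learners"},
)
for step in moves:
    print(step)
```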
Record Matching
Once an implementation grows beyond a few schools, record matching begins to complicate it, and the need for a more sophisticated form of record matching arises.
Since most student information in the US is managed at “one level up” from the schools (at the district level), this need hasn’t been quite as evident, but in the UK, since learner data is typically maintained at the school level, things get very complex quickly.
For this reason, Visual Software introduced the concept of “Managed Virtual Zones” and a product, Envoy™, that implements it.
Managed Virtual Zones
We have found that, in many of the larger UK installations, there is a mix of MIS systems installed at the different schools. If each of these were to have a SIF agent, each would have different SIF identifiers for the things it keeps track of, even when they refer to the same real-world entity.
For example, each of these schools might store the name and other basic information of every school in the LA (Local Authority). That information would be stored in a SIF object called SchoolInfo. That SchoolInfo object would have a SIF identifier called a RefId. Since these MIS systems don’t confer with each other every time they do something (that would really slow them down), they each keep their own independent set of identifiers.
The result is that each school keeps a SchoolInfo record for each school in the LA and all of the RefIds are “out of sync”. So, if there is a “Charles Dickens Elementary School” and there are 30 schools in the LA, there will be 30 different RefIds for it, one in each MIS.
Move on, one more step…
There are instances where people’s records may appear in more than one system as well:
- In the UK, learners participating in the 14–19 program will have records appearing in more than one MIS. Depending on the software that is being used in the LA, the LearnerPersonal RefIds may or may not be synchronized. They might be if two schools use MISs from the same supplier; they won’t be if the two MISs are from different suppliers.
- In the UK or in the US, many children in special education programs attend more than one school during the course of a single day. Their records will appear in more than one system.
- Teachers who teach at more than one school will have records in more than one system.
- Parents or other contacts who have children attending classes at more than one school will have records in more than one system.
The big problem comes when a subscribing application needs to be shared by more than one school (such as an identity provider). Issues it needs to deal with are:
- It receives two records for the same learner. They appear to be the same child, but is it the same child? How can it be certain? What if some of the information is missing? What if it has a “close match”?
- What if it has to publish back information to the MIS? Which zone does it publish it to? What if it was one of these learners that attended more than one school? Which set of RefIds does it use to publish?
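To make the “same child?” question concrete, here is a minimal Python sketch of a three-way classification: a definite match, a close match that might be routed to a person for review, and no match. The record fields, the 0.85 threshold and the rules themselves are assumptions for illustration; a real deployment would follow the local authority's own matching policy and would likely compare more fields.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional

CLOSE_MATCH_THRESHOLD = 0.85  # assumed value; a local authority would tune this


@dataclass(frozen=True)
class PersonRecord:
    ref_id: str
    full_name: str
    birth_date: Optional[str] = None  # ISO date, or None when missing


def classify(a: PersonRecord, b: PersonRecord) -> str:
    """Return 'match', 'close_match' or 'no_match' for two incoming records."""
    # Normalize case and whitespace before comparing names.
    name_a = " ".join(a.full_name.lower().split())
    name_b = " ".join(b.full_name.lower().split())
    both_dates = a.birth_date is not None and b.birth_date is not None
    if both_dates and a.birth_date != b.birth_date:
        return "no_match"
    if name_a == name_b and both_dates:
        return "match"
    # Similar names, or identical names with a missing birth date, need review.
    if SequenceMatcher(None, name_a, name_b).ratio() >= CLOSE_MATCH_THRESHOLD:
        return "close_match"
    return "no_match"


school_a = PersonRecord("A1", "Jon Smith", "2001-05-04")
print(classify(school_a, PersonRecord("B7", "jon  smith", "2001-05-04")))  # match
print(classify(school_a, PersonRecord("B8", "John Smith", "2001-05-04")))  # close_match
print(classify(school_a, PersonRecord("B9", "Jon Smith", None)))           # close_match
```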
There are actually five types of records that need to be matched between systems in UK SIF 1.1:
- LAInfo
- SchoolInfo
- LearnerPersonal
- WorkforcePersonal
- ContactPersonal
…and two philosophies on how best to do this:
- Multi-zoning: this has been around since the early days of SIF. With this type, the result is (for every subscribing application):
  - every application is responsible for maintaining multiple sets of RefIds, one for every MIS that sends it data
  - the record matching logic must be present in every application that shares data from more than one school and should represent the local authority’s policies
  - if the application publishes data, it needs to keep track of the correct zone (or zones) to publish it to and make sure to use the correct set of RefIds for that zone
- Managed Virtual Zones, as implemented by Envoy:
  - every application is only responsible for one set of RefIds
  - the record matching logic is handled by a central manager for all applications and is configurable so that it can match the local authority’s policies
For more information on Managed Virtual Zones, see Scalability and Reliability.

The diagram above is an illustration of how Managed Virtual Zones can be set up. Any group of school zones can be logically grouped together to form a virtual zone. These virtual zones can be used to represent:
- Local Authorities
- Regional Broadband Consortia
- 14–19 partnerships
- Groups of schools that feed a common special education school
Applications that register, for example, as a subscriber in one of these virtual zones will receive events from all the member schools in that zone. The events will have a synchronized, pre-matched set of records. This means that if a learner attends two schools, there will only be one set of “already matched up” records, as if there were a single MIS. These records will contain information from both schools (such as attendance and assessment records), but the RefIds will be consistent.
What is the difference between the multi-zone approach and the Managed Virtual Zone approach?
- With the multi-zone approach, the learner would have two different sets of RefIds for everything: SchoolInfo, LearnerPersonal, attendance, assessment, etc. The application would need to apply some internal logic to look through the student records to determine if two records were for the same learner. Beginning of story…
- With the Managed Virtual Zone approach, the application would have only received one record in the first place. End of story…
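One way to picture the single set of RefIds is as a cross-reference table kept by a central manager: each school's local RefId maps to one canonical RefId, and the mapping can be reversed when something must be published back to a particular school zone. The Python sketch below is purely illustrative. The class, its methods and the zone names are hypothetical and say nothing about how Envoy is actually built; the 32-character hexadecimal identifier format is likewise an assumption.

```python
import uuid
from typing import Optional


class RefIdRegistry:
    """Toy cross-reference between per-zone RefIds and one canonical RefId."""

    def __init__(self) -> None:
        self._canonical: dict[tuple[str, str], str] = {}  # (zone, local RefId) -> canonical
        self._local: dict[tuple[str, str], str] = {}      # (canonical, zone) -> local RefId

    def register(self, zone: str, local_ref_id: str, same_as: Optional[str] = None) -> str:
        """Record a local RefId; pass same_as when matching found an existing person."""
        key = (zone, local_ref_id)
        if key in self._canonical:
            return self._canonical[key]
        canonical = same_as or uuid.uuid4().hex.upper()
        self._canonical[key] = canonical
        self._local[(canonical, zone)] = local_ref_id
        return canonical

    def to_local(self, canonical: str, zone: str) -> Optional[str]:
        """Translate back to the RefId a given school zone knows, if any."""
        return self._local.get((canonical, zone))


registry = RefIdRegistry()
learner = registry.register("school-a", "AAA111")
# Record matching decided the school-b record is the same learner.
registry.register("school-b", "BBB222", same_as=learner)
print(registry.to_local(learner, "school-b"))  # prints "BBB222"
```

Subscribing applications would only ever see the canonical value, which is the point of the comparison above.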
As discussed in the Reliability article, the entire Managed Virtual Zones concept began as an attempt to make SIF implementations more reliable. Along the way, it ended up being a very reliable way of addressing the IdP issue as well.
Although it took us eight completely different product designs before we landed on an Envoy design we could be happy with, it also ended up being a very efficient way to implement identity management on a large scale.
If you would like more information, please feel free to contact us at +1 (215) 493-8210.




