Crash!
September 29th, 2009
Awesome – so I fell victim to the oldest problem in the IT handbook – an inexplicable corruption of my blog repository and apparently insufficient means to restore. I think I have a couple of friends that have the RSS feed in Outlook, so I’ll see if I can get some of the old posts from them, but it’s time to start blogging again anyway. Everything I’ve been working toward for the past year is coming to a head when Medicare season hits full swing on 7 October, so my schedule is opening up. Bear with me – it’s been an amazing year, and I have a lot to share about where I believe the future of contact centers lies.
End the silence!
March 6th, 2009
After a very long hiatus from blogging, I’m back online. Going forward, I would expect this blog to have a very similar tone to its previous incarnation. That is, technically focused with a business bent. I have studied and have a passion for both technology and business, but above all, I desire to bridge the divide between the two camps.
I blog for several reasons. First and foremost, I use blogs to solve many of my day-to-day issues. This is a way for me to give back to the community and perhaps help solve problems for another user of blogs. Secondly, I find some validation in blogging. When I am working in a particularly obscure problem space, or one that doesn’t have any local validation, there is always someone on the Internet ready and willing to validate (assuming the post was valid). Finally, blogging connects me to other like-minded (or at least similarly-tasked) professionals, which is always good for fleshing out thought patterns.
Status Update
A number of people have asked me what ever came of the software we were working on last year. In a nutshell, we had a significant decision to make: move incrementally from Cisco Unified Contact Center Express (UCCX) to Unified Contact Center Enterprise (UCCE) or take a huge leap to Microsoft Office Communications Server (OCS). We eventually made the (heart-stopping) decision to move to OCS, for a number of reasons:
- The primary motivator in our decision to work with OCS was the fact that we have a number of talented .NET developers on staff. Doing custom development work against other contact center systems frequently requires specialized training; doing custom development work in the OCS space was something we were able to pick up without any training.
- The capital expenditure for UCCE was in the millions; the capex for OCS was 10% of that. The per-employee cost to scale UCCE was thousands of dollars; for OCS it was 20% of that.
- Maintaining UCCE requires specialized hardware and staffing needs. Our OCS deployment uses Dell blade servers that we’re free to upgrade without consequence. We are able to use our existing network operations staff to maintain the OCS environment.
- UCCE made it difficult at the time to hang users off the Internet without specialized connections (VPN and/or QoS). OCS did not require a VPN, and Microsoft recommends that OCS implementers not implement QoS unless it becomes necessary.
- Microsoft unquestionably has the commitment to invest huge amounts of resources going forward. We knew there would be some pain points with OCS 2007, and there were. However, the release of OCS 2007 R2 has already improved some of those pain points, and has added some very interesting new features.
The end result is that we were enormously successful. To be sure, there were a number of excruciatingly painful things we found – but most of our real pain points were with Speech Server 2007, which we used as an IVR. Specifically, we found that our IVR would frequently throw exceptions in internal code, which would cause unpredictable results in the workflow. We have patched our way through most of these issues, but we consider them to be workarounds, not solutions. We have high hopes for UCMA 2.0 improving significantly in these areas.
The other tremendously painful discovery was that audio conferences do not support DTMF or transfers. Our call workflow would add a recorder endpoint to a conversation as it was transferred to an internal employee, making the conversation a conference. This made it impossible for us to bring in an external party that required DTMF input (for instance, another IVR) or fully transfer that call to an external party. Rather, the employee would add the external party and drop off the line, with the end effect that the recorder remained on the line for the duration of the external party’s discussion. Early in the season, we didn’t realize (nor is it well documented) that if there are no internal parties still involved in the conference, the call will be terminated after x minutes. This makes sense as a security feature, but we wound up dropping a number of calls that we had transferred to one of our partners. It was messy.
Still, I would reiterate that we were enormously successful. Those areas of OCS that weren’t pain points worked phenomenally. We got into a scalable, manageable enterprise voice solution for a small capex. And most importantly, we wrote Complemax.
Complemax
When I was blogging last year, I expressed frustration with a number of existing contact center matching algorithms. There are three characteristics of traditional contact center matching algorithms that I really struggled with. First, most of the algorithms out there have a queue-based approach. I hated that, because there are simply some calls that are more important than others. The queue-based solution to this is to create a unique queue that is shorter than other queues, but you still don’t achieve infinite flexibility – rather, you have a queue permutation problem – how many queues are you willing to support? Secondly, most of the algorithms use a skill-based mechanism. For the most part, I support skill-based mechanisms, but many implementations of skill-based routing have the same limitations in scoring – scoring happens in buckets rather than on a fine scale. Finally, and most importantly, all of the algorithms I looked at (as far as I could tell) considered scoring only from the point of view of the call. It’s hard to explain this in words, so let me try to present a couple of pictures.
[Disclaimer: Please take the following with a grain of salt and do your own research. I’m not a contact center expert, and I detest people who misrepresent something without qualification. I don’t believe I’m misrepresenting the predominant algorithms in use in contact centers today, but I could be wrong.]
Skills-based routing (fully queued)
This is, in my opinion, almost the worst possible form of routing. The only worse form of routing would be straight queue-based routing that doesn’t even take skills into account. In the case of skills-based routing, agents are dumped into skills queues as they become available. If an agent has multiple skills, they can theoretically be put into multiple queues and an additional algorithm will run behind the scenes to determine which call an agent should get if (s)he is first in line in multiple queues with calls waiting.
Calls have a corresponding skill queue. When a call is waiting in its skill queue and an agent is available in the corresponding skill queue, both are popped out of the queue and connected. The deficiencies of not utilizing game theory are even more apparent here. If agent 2 in the above picture is much more capable of skill 1 than agent 1, nothing in the algorithm will allow agent 2 to take the call instead.
Skills-based percentage routing (partially queued)
In this form of routing calls, an agent registers with a skills “queue” as with straight skills-based routing. Unlike the former scenario, however, when a call is waiting, the algorithm will match the call to the agent with the highest score or percentage in that skill. This means that the agent half of the skill queue is not so much a queue as it is a grouping of available agents.
This form of routing calls clearly has some advantages over skills-based routing. First and foremost, the caller is likely to get someone more experienced in their skill. The agent is more likely to get a call (s)he can deal with. Other agents, who may be stronger in other skills, may be more likely to get calls they can deal with. Agents are encouraged to improve their skills – if they don’t, they don’t get calls.
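The difference between the two skills-based schemes can be sketched in a few lines. This is only an illustration, not code from any ACD product; the agent names and scores are invented, and a single skill is assumed.

```python
from collections import deque

# Hypothetical agents and their proficiency (0-100) in one skill.
# Names and numbers are invented for illustration only.
SKILL_1_SCORES = {"agent1": 55, "agent2": 90, "agent3": 70}


def route_fully_queued(available):
    """Fully queued routing: the agent who has waited longest gets the call."""
    return available.popleft()


def route_by_percentage(available, scores):
    """Percentage routing: the highest-scoring available agent gets the call."""
    best = max(available, key=lambda agent: scores[agent])
    available.remove(best)
    return best


# agent1 became available first, then agent2, then agent3.
fifo_pick = route_fully_queued(deque(["agent1", "agent2", "agent3"]))
score_pick = route_by_percentage(
    deque(["agent1", "agent2", "agent3"]), SKILL_1_SCORES
)
print(fifo_pick, score_pick)  # agent1 agent2
```

In the fully queued case the weaker agent takes the call simply because they were first in line; in the percentage case the agent half of the "queue" behaves like a pool.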
However, this algorithm still has some major deficiencies. For one thing, it’s really hard to represent a multiplicity of skills needed to handle a call. In our environment, we have scenarios where we consider many factors – whether there is a language match, a licensing match, whether the caller has spoken to the agent before, what role the caller needs to talk to, etc. In the worst case, skills-based percentage routing would need one skills queue for every combination of skills we support. That means if we support 2 languages and 50 state licenses, we’re already up to 100 queues. More importantly, this algorithm still doesn’t attempt to maximize the efficacy of the contact center as a whole. Just because an agent is the best match for a call does not mean the call is the best match for the agent. If that doesn’t make sense, make sure you keep reading.
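The queue count in that worst case is just the product of the sizes of each skill dimension. A quick sketch of the arithmetic, where the language names, license labels, and roles are placeholders I made up:

```python
from itertools import product

# Hypothetical skill dimensions; only the counts (2 and 50) come from the text.
languages = ["English", "Spanish"]
state_licenses = [f"state_{n:02d}" for n in range(1, 51)]

# One queue per (language, license) combination.
queues = list(product(languages, state_licenses))
print(len(queues))  # 100

# Every extra factor multiplies the total, e.g. three hypothetical roles.
roles = ["sales", "service", "enrollment"]
queues_with_roles = len(languages) * len(state_licenses) * len(roles)
print(queues_with_roles)  # 300
```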
Complemax routing
Because of my frustrations with the existing algorithms in contact center routing solutions, and because we were working on deploying OCS (which doesn’t have an automatic call distributor or ACD), we seized the opportunity to fix what we perceived as a problem. I believe that most, if not all, routing algorithms in current usage take the wrong approach. Before you write that off as arrogant, let me explain.
The first point to understand is that we want to use the term matching as opposed to routing. Complemax looks for tight matches between agents and calls. That matching relationship can be built of any number of factors. We currently have about twenty different factors running in our environment, and we pretty consistently get solid matches. Where we don’t, we try to understand which factor is adversely affecting the matches or what needs to be added or removed to make better matches. Once we create those matches, we build a representation of a grid and start to analyze that grid for “best matches.”
This is where things get really complex. I’m going to skip a lot of the details here not only because this intellectual property belongs to my company, but because I’d lose the interest of nearly everyone who has actually read this far. Suffice it to say that the problem is considered NP-complete – that is, there is no known algorithm that can consistently solve the grid in polynomial time. To compensate for the NP-complete barrier, we used game theory and Pareto frontiers to narrow the set of possible solutions.
The end result of Complemax is best understood by looking at the diagram above. In that diagram, we can clearly see that the best match for caller 1 is agent 2 and that the best call for agent 2 is caller 1. So why shouldn’t we match agent 2 to caller 1? Any human can tell you, if they look at the chart for a bit, that the most efficacious course for the contact center is not to match agent 2 to caller 1, but to match agent 2 to caller 2. This is because agent 4, while not quite as good a match for caller 1, is quite capable of handling the call – nearly as capable as agent 1. Simply stated, we look at a full mesh of agents to calls and their “match scores”, and we try to maximize the sum of the match scores.
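To make the "maximize the sum" idea concrete, here is a minimal sketch. It is not Complemax: it brute-forces every assignment, which only works for a tiny grid, and it does not model the game theory, the Pareto frontiers, or any of the other factors that make the production problem hard. The scores are invented to mirror the situation described above, since the original diagram is not reproduced here.

```python
from itertools import permutations

# Hypothetical match scores (0-100), indexed as SCORES[caller][agent].
SCORES = {
    "caller1": {"agent1": 40, "agent2": 95, "agent3": 30, "agent4": 90},
    "caller2": {"agent1": 20, "agent2": 90, "agent3": 35, "agent4": 25},
}


def total_score(scores, matching):
    """Sum of match scores for a caller -> agent mapping."""
    return sum(scores[caller][agent] for caller, agent in matching.items())


def greedy_match(scores):
    """Call-centric view: each caller, in order, takes the best free agent."""
    free = set(next(iter(scores.values())))
    matching = {}
    for caller, row in scores.items():
        best = max(free, key=lambda agent: row[agent])
        matching[caller] = best
        free.remove(best)
    return matching


def max_sum_match(scores):
    """Whole-center view: try every assignment and keep the highest total."""
    callers = list(scores)
    agents = list(scores[callers[0]])
    candidates = (
        dict(zip(callers, chosen))
        for chosen in permutations(agents, len(callers))
    )
    return max(candidates, key=lambda m: total_score(scores, m))


greedy = greedy_match(SCORES)
optimal = max_sum_match(SCORES)
print(greedy, total_score(SCORES, greedy))
# {'caller1': 'agent2', 'caller2': 'agent3'} 130
print(optimal, total_score(SCORES, optimal))
# {'caller1': 'agent4', 'caller2': 'agent2'} 180
```

With these made-up numbers, giving caller 1 their single best agent leaves caller 2 with a poor match, while the whole-center view hands agent 2 to caller 2 and still gives caller 1 a strong match.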
I will give details in a future post about how well this solution worked for us, but the preview is that we handled almost 20x the calls we did the previous year, and we’re still in business. It worked.
Going Forward
I have been working a lot recently on catching up on some sorely missed professional development. I consider blogging to be part of that professional development, and I do plan to start blogging regularly again. I hope this will be a venue where new ideas and new theories can be thought out, exposed, and hopefully receive some criticism eventually. We have many plans now past Complemax, and I’ll start to talk about some of what we’re doing now as I get time to write.
Office Communications Server Deployment, Day 9
May 30th, 2008
Note: Sorry this wasn’t posted sooner, there was a bit of a shake-up internally as we tried to decide what all was appropriate to post. I’ve had this post ready for a few days now and have just been waiting for definitive answers from my management. This post represents nearly complete OCS deployment. By the time it ends, we have Enterprise Voice complete. The remaining things we will deploy are the archiving server, the QoE monitoring role, and edge servers.
1:07 PM : Creating UM Dial Plan
Note: there are three important things here. The first is the dial plan name. You’ll see when I create the location profile in OCS that the name is slcutloc.extendhealth.com. That must match. Second is the URI type – it must be SipName for OCS integration. The last thing is VoIP security, which should be Secured for OCS. (Secured > SipSecured)
Have to add the dial plans to the UM servers – both mail1 and mail2.
1:20 PM : Running ExchUCUtil.ps1
Verified IP gateways. If there were more, I’d have to disable them.
1:31 PM : Creating Location Profiles
I’m not going to comment on this much as there is a lot to say. Screen caps should be sufficient to let you know what I’m doing.
2:07 PM : Running OcsUMUtil.exe
The last step is to integrate from the OCS side by running OcsUMUtil, which creates OCS objects for the auto attendant and subscriber access numbers in Exchange UM. This facilitates access to these numbers from Communicator.
2:10 PM : Assigning a Default Location to the Pool
2:15 PM : Configuring Mediation Servers
2:22 PM : Configuring Policies and Phone Usages
Office Communications Server Deployment, Day 8
May 22nd, 2008
8:08 AM : Loopback Fix
I’ve been here for a while, catching up on some of my non-blog communication, MBA coursework, etc. About ten minutes ago, I started testing a probable fix for the validation error I had last night. Just as a reminder, that validation error looked like this:
The fix is recorded in Appendix D of the Office Communications Server 2007 Enterprise Edition and Communicator 2007 Deployment Guide. In a nutshell, you need to add a multi-string value to HKLM\SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0. The value should be named BackConnectionHostNames and should contain your pool’s FQDN. What this does is allow IIS to treat certain FQDNs as valid for loopback. You’ll want to remove this value when you’re not validating, and more detail is available by reading the referenced guide.
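If you would rather script the registry change than click through regedit, something like the following should work. Treat it as a sketch: it shells out to the standard `reg add` and `reg delete` commands, the helper names are my own, and I have only shown the pool FQDN used in this deployment. Nothing is written unless you call `apply()` yourself on the Windows server, from an elevated prompt.

```python
import subprocess
import sys

LSA_KEY = r"HKLM\SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0"
VALUE_NAME = "BackConnectionHostNames"


def build_add_command(hostnames):
    """Return the reg.exe arguments that create the multi-string value."""
    # reg.exe expects REG_MULTI_SZ entries separated by the literal text \0.
    data = r"\0".join(hostnames)
    return ["reg", "add", LSA_KEY, "/v", VALUE_NAME,
            "/t", "REG_MULTI_SZ", "/d", data, "/f"]


def build_delete_command():
    """Return the reg.exe arguments that remove the value after validation."""
    return ["reg", "delete", LSA_KEY, "/v", VALUE_NAME, "/f"]


def apply(command):
    """Run a command built above; refuses to do anything off Windows."""
    if sys.platform != "win32":
        raise RuntimeError("reg.exe is only available on Windows")
    subprocess.run(command, check=True)


add_cmd = build_add_command(["ocspool.extendhealth.com"])
print(" ".join(add_cmd))
print(" ".join(build_delete_command()))
```

Keeping the delete command next to the add command is deliberate, since the value should come back out once validation is done.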
When I followed the instructions for the fix, the validation wizard for the remaining steps executed properly.
8:16 AM : Validation Wizards
(Yes, that’s a different validation wizard.)
(Yes again.)
8:23 AM : Validation Results
So the current state of our deployment is that there are two validation warnings, neither of which I care about because I haven’t deployed Enterprise Voice or edge access yet.
From the Validate Front End Server Configuration wizard, we have:
From the Validate Web Components Server Functionality wizard, we have:
8:27 AM : Internal Deployment Complete
Aside from the above validation warnings, it seems that internal deployment is complete. I do have one more warning in my Communicator client regarding Exchange Web Services, but the Exchange deployment on this domain isn’t complete yet, so it’s also expected. The ramification at this point is that Communicator can’t automatically set my status to “In a Meeting” if I have a meeting scheduled in Outlook.
Next step is external user access, meaning I’ll be bringing up a scaled single-site edge topology. I’ll try to explain that in more detail, but there will probably be some downtime here as I test Communicator internally and prep another couple of servers to be edge servers. (I have to install Server 2003 at least.)
1:53 PM : Enterprise Voice
1:56 PM : Activating Mediation Server
2:00 PM : Assigning Certificates
3:16 PM : Enterprise Voice Prep
I’ve been reading (and will continue to read through) the Microsoft Office Communications Server 2007 Enterprise Voice Planning and Deployment Guide. This will probably take the rest of the day and will ensure that I make minimal mistakes when deploying Enterprise Voice. I have a good idea of what it is that I need to do, but I want to be certain.
Office Communications Server Deployment, Day 7.5
May 22nd, 2008
All of these steps and screenshots were performed late last night. I’ll fill in commentary now (morning of Day 8).
Back Story
I was crushingly disappointed when Microsoft told me that I’d have to reinstall my entire PKI because the hashing algorithms I used were for a Cryptography Next Generation (CNG) CSP, not a CryptoAPI Version 1 CSP. Knowing what I know now, I can see some allusions to that on pp. 158-159 of Brian Komar’s book. Before I left work yesterday, I e-mailed Brian and explained my situation and that I was on a support call with Microsoft. I then e-mailed him their response (“it’s not supported”) and the fact that they were closing the support case.
He sent this response:
Mark,
There is a security update that will allow XP and 2003 clients to validate certificates that implement SHA-2 signatures.
The update is included in Windows XP service pack 3.
Per the release notes for service pack 3:
Microsoft Cryptographic Module
Implements and supports the SHA2 hashing algorithms (SHA256, SHA384, and SHA512) in X.509 certificate validation. This has been added to the crypto module rsaenh.dll. XP SP2 crypto modules Rsaenh.dll/Dssenh.dll/Fips.sys had been certified according to FIPS 140-1 specifications. The Federal Information Processing Standard (FIPS) 140-1 standard has been replaced by FIPS 140-2, and these modules have been validated and certified according to this standard. For more information, see the Microsoft Kernel Mode Cryptographic Module.
You cannot create these certs in 2k3, but you would be able to validate them.
Brian
Based upon that hope, I went out and did some strategic searching and came across this KB: http://support.microsoft.com/kb/938397. After an hour of waiting on hold while some (nice enough) tech researched the history on my support case, I was finally given a link to download the hotfix. Note that there is a link there to register for the hotfix also, which I did, but was told that it would take up to 24 hours. It actually took about two hours.
Hotfix in hand, I patched the server and all the certificates looked great! There were still a couple of strange artifacts with how I had to request certificates, but I was able to do it without incident.
Now that the back story is complete, I’ll try to recreate the timeline as best I can based upon the timestamps in my screencaps. Thanks, OneNote!
8:50 PM : Assigning the Certificate to IIS
This is where things went awry yesterday. If you want to know what to do to get to this point, read that post.
8:52 PM : Starting Services
I’m deviating here from the norm of not including the wizard starts in the screen captures. The final screen of a wizard generally has useful information (like success, hopefully), but the start of a wizard usually just says what it is you’re doing. Since I generally label what it is that I’m doing already, I had been skipping the first screen for the wizards. At this point, however, the wizards start to blur together, especially in the validation phases. Therefore, I’m going to include some wizard start screens if I can to differentiate the wizards. (That said, I think I noticed last night that all the validation wizards start with the same screen anyway.)
9:29 PM : Server/Pool Validation
[Delay reason: had to put my son to bed.]
Oops… in order to validate the server and pool functionality, I need a couple of user accounts to be enabled for Office Communications Server. The trick to this is that you have to use Active Directory Users and Computers to enable the users, but you also have to have the OCS Administrative Tools installed on that computer. Because my domain controller is Server 2008, I can’t install the OCS Administrative Tools there (and be supported). In this case, I just opened an MMC on ocsfe1, added the Active Directory Users and Computers snapin, and connected to the extendhealth.com domain. Right-clicking on users now exposes the following option:
Now that the users are enabled, I can see them if I open the Office Communications Server snapin (Start > All Programs > Administrative Tools > Office Communications Server 2007).
9:36 PM : Back to Validation
Note that I didn’t check test connectivity of federated users because I don’t have external access yet.
This was the only warning I had. Since I haven’t deployed Enterprise Voice yet, I’m not concerned about this warning.
11:15 PM : More Validation
I think I took some time before this screenshot to correct some previous validation errors, but I can’t recall very clearly. I do want to note that I ran into some validation errors last night, as the following screenshot shows:
I believe this particular screenshot is an artifact of a known issue with IIS loopback, so I’ll try to fix it this morning. I didn’t think it was important last night since I recalled how to deal with it (although not the specific steps) and since the server and pool validated okay.
11:23 PM : The Payoff
Enough said.
Office Communications Server Deployment, Day 7
May 21st, 2008
8:33 AM : Picking Up Where We Left Off
As you may recall, I ran into an issue last night just before I left because I didn’t have the SQL client tools necessary (specifically the SQL 2005 Backwards Compatibility Pack and the SQL Native Client) installed on my front end server ocsfe1. I did try installing the tools this morning to no avail – unfortunately I wasn’t even getting a good quality error message, just “Pool backend discovery failed” – the same message I posted yesterday.
I’m pursuing a workaround at this point for two reasons:
- I need to keep the ball rolling. I have to get the internal deployment completed today.
- I’m planning to move the database to an official cluster anyway, per the directions in the Admin guide for moving the backend database for an Enterprise pool.
Primarily because of reason two, I don’t feel bad about installing SQL locally for a short time period (<1 month) until our cluster is ready to support the Enterprise pool. As with other cautions I’ve offered, this isn’t recommended. For me, it’s just real life. To achieve the goal I want, I’ve created a CNAME (alias) in DNS to tell my computer that dbcluster1 is currently the same as ocsfe1. I’ve also installed SQL Server 2005 Standard Edition SP2 32-bit locally.
8:39 AM : Creating the Enterprise Pool
Two notes here:
- We specified a different internal web farm FQDN because we may eventually move to an expanded configuration, and having a different FQDN may facilitate that transition.
- The planning documentation states that if you don’t specify an external web farm FQDN at this point, you’ll need to use the command line utility later. Usefulness of command line utilities notwithstanding, I’d rather specify it now since I know what it is.
Another note: our database files will be going onto a SAN with the transition to the database cluster. If you aren’t storing your database files on a SAN, you’ll want to make sure the database and log files are on different spindles (different physical volumes). This is basic database optimization, not an OCS thing.
I didn’t enable meeting archiving yet as it probably requires the Archiving and CDR role, which doesn’t exist yet in my infrastructure. I’m quite certain you can enable this later, so I’ll skip it for now. I have put the path in, however, so that you can see what I would be using if I were to enable it right now.
Archiving is not enabled for the same reason listed above.
Ugh. I made a mistake early on in the wizard – my pool is named ocspool.extendhealth.com, not pool.extendhealth.com. I think I can probably fix this later, so I’ll keep going for now. There were no other warnings in the log.
8:59 AM : Configuring Enterprise Pool
There’s the wrong pool name I mentioned above.
| | Pros | Cons |
| DNAT | > 65,000 users | Increased difficulty of configuration |
| SNAT | Easy configuration | < 65,000 users |
Note: Only one pool or server can authenticate automatic logon requests.
I’ll definitely be configuring external user access, but two things are stopping me from doing it right now:
- I want the edge deployment to be distinct from the pool deployment for my own sanity and anyone’s sanity following along with this thread.
- I think the only way you can configure your edge topology right now is if you’re migrating from LCS 2005 R2? and already have an edge topology deployed. I’m not certain on that, I just think that’s what I recall.
9:10 AM : Adding Ocsfe1 to Pool
So far so good this morning – everything seems to be turning out okay aside from my dumb mistake with the pool name and the issues with the pool backend. I’m now ready to add ocsfe1 to the pool as the first front-end server.
(Takes a while. Lots of time for screen captures.)
Apparently Microsoft thinks it’s funny to continually remind me of my mistakes.
Yes, the password really is that long. As a reminder (I think for the third time), I use WinGuides Password Generator to generate passwords for service accounts.
Same warnings as before:
Aside from that error being in the logs about 20 times, there were no other errors. I think I’m still okay.
9:30 AM : Fixing the Pool FQDN
Before I proceed any further, I want to correct the pool FQDN. I’ve been warned sufficiently. As part of installing the Front End role, the administrative tools for OCS were installed. I’m opening them from Start > All Programs > Administrative Tools > Office Communications Server 2007.
9:36 AM : ???
Wow … http://forums.microsoft.com/unifiedcommunications/ShowPost.aspx?PostID=2931495&SiteID=57
Apparently I’ll be removing the pool and creating it all over again. Hope that goes okay.
Lesson learned: get the pool name right in the first place.
9:44 AM : Configuring Certificates
Well, at least it didn’t take too long to get back on track. For this next step, please note that there are two distinct steps. The Web Components role requires its certificate to be manually configured in IIS. The rest of the Front End roles have a wizard. I’ll deal with the wizard first, then IIS.
Because I have a PKI deployed, I can opt to send the request to an online certification authority (Active Directory will help me locate one).
In this case, we don’t care if the cert is exportable, but I left the box checked anyway. We also don’t care about client EKU – the only place that matters is for the certificate assigned to the external interface for the Access Edge role.
I chose to include the local machine name in the SAN here. If you’re configuring automatic client logon, the SAN must also contain sip.<domain>. In my case, it was automatically populated because of the choices I made in earlier wizards to enable automatic client logon.
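Since a missing SAN entry is easy to overlook in the wizard, a tiny check like this can help before you submit the request. It is only a sketch: the function name is mine, and the machine FQDN ocsfe1.extendhealth.com is my assumption based on the server and domain names used in this series.

```python
def missing_sip_sans(san_entries, sip_domains):
    """Return the sip.<domain> names that automatic logon needs but the SAN lacks."""
    present = {entry.lower() for entry in san_entries}
    required = [f"sip.{domain.lower()}" for domain in sip_domains]
    return [name for name in required if name not in present]


# Hypothetical SAN list: pool FQDN plus the local machine name.
san = ["ocspool.extendhealth.com", "ocsfe1.extendhealth.com"]
print(missing_sip_sans(san, ["extendhealth.com"]))
# ['sip.extendhealth.com']

san.append("sip.extendhealth.com")
print(missing_sip_sans(san, ["extendhealth.com"]))
# []
```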
… and … I accidentally clicked through the next screen, so I think it succeeded but I’m not 100% certain.
Well, I got that far before realizing that the prior wizard had actually failed. It has something to do with Server 2003 not recognizing the authenticity of the certificate chain. My PKI is completely implemented with Server 2008, so I guess it’s time to go research what to do.
3:22 PM : Square 1
As if there weren’t enough blocks already…
I just got off the phone with Microsoft support. The certificate issue is “by design”. In this case, I interpret “by design” to mean, “We knew about the problem but haven’t taken the initiative to fix it.” The specific issue is that Server 2003 and Windows XP don’t support certificate chains signed with hash algorithms stronger than SHA1. Since my root CA certificate was signed with SHA512, and my other CA certificates were signed with SHA256 (per NIST guidelines), Server 2003 barfed.
Generally speaking I’m very happy with Microsoft. Today, I’m not. Off to rebuild the PKI from scratch…
Office Communications Server, Day 6
May 20th, 2008
I spent the entire day yesterday dealing with administrative and management issues. As such, there was nothing to report.
5:35 AM : Amber Alert (Ex post facto)
This morning, I arrived at our data center to finish up some final issues remaining from the previous day. Installing all of this new equipment has caused heartburn, to say the least. The IP KVM we have (by Avocent) is not particularly incredible and has been on the fritz since Sunday, meaning that I couldn’t remote control any computers to install them from the office. That said, the plan this morning was to bypass the IP KVM, install a couple of servers with Windows Server 2003, and head back to the office to actually start on the OCS deployment steps past planning complete. Upon arrival, however, I immediately noticed that I didn’t get an IP address from our DHCP server there. The second thing I noticed was that all of our slave switches in the enclosures appeared dead. The third thing I noticed is that the consoles on the front of the blade enclosures were amber. In case you’re not a network admin (which I’m not any more, but experience has taught me), amber = bad.
It turned out that overnight, our data center had a significant A/C failure, which caused lots of problems. This isn’t a small data center; it’s enterprise class. A failure like this hasn’t happened in the entire history of the facility. Of course it would have to happen while I’m trying to deploy OCS: administrator’s law.
12:00 PM : Amber Remediated (Ex post facto)
By noon, we had the issues straightened out at the data center. I should note here that Dell wasn’t particularly well trained on our equipment, which is brand new (in the sense of recently released to manufacturing). It turned out that our Cisco switches had overheated and shut themselves down as a protective measure. Reseating the switches finally resolved most of our problems there. On the plus side, the work with fixing the amber alerts also somehow fixed the IP KVM.
Back at the office, I was finally able to deploy Windows Server 2008 (for an Exchange deployment) and Windows Server 2003 to servers. The current deployment toolset is using Microsoft Deployment as I was never able to get Configuration Manager 2007 running properly.
2:28 PM : Windows Server 2003 R2 with SP2 Deployment Complete
After working through several minor driver issues, I was just able to finish deploying Windows Server 2003 R2 (with SP2) via Microsoft Deployment. There were actually two different Broadcom drivers necessary, and I had to be sneaky about where I put one of them. If you happen to run into issues with a similar situation and need help, you can submit a comment here, but I don’t feel the need to detail what I did – it’s time to get into OCS, finally!
2:40 PM : Planning Recap
Since there were some final adjustments to several IPs internally, I’ll repost the planning table I posted last week with the updated IPs. If you can’t see it all, just copy and paste it into Excel.
Edit: Removed planning table
2:50 PM : Created A Records
I just created the A records for ocspool, ocsmeetings, and ocsmeetingsext. Note that certain parts of the planning documentation are pretty picky about whether these are A or CNAME records. I was also under the impression that I needed to create a sip.extendhealth.com A record, but can’t find mention of it in the planning docs for now, so I’ll skip it until it becomes a problem.
2:54 PM : Crashed MMC 3.0
It might be just me, but the MMC 3.0 seems particularly unstable. I just tried to add the SRV record for automatic configuration (_sipinternaltls._tcp.extendhealth.com) and the MMC crashed.
2:57 PM : Created SRV Record for Client Automatic Configuration
Note: this record gets created in the Forward Lookup Zones/<domain>/_tcp node.
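For anyone building a similar checklist, here is a minimal Python sketch that just assembles the record names I created above so they can be ticked off one by one. It assumes all of the records live in the extendhealth.com zone (the posts only confirm that for the SRV record), and the function names are my own invention, not anything from the OCS tooling.

```python
# Assumption: every OCS record lives in the extendhealth.com zone.
DOMAIN = "extendhealth.com"
A_RECORD_HOSTS = ["ocspool", "ocsmeetings", "ocsmeetingsext"]


def a_record_fqdns(domain, hosts):
    """Return the fully qualified names the A records should answer for."""
    return [f"{host}.{domain}" for host in hosts]


def srv_record_name(domain, service="_sipinternaltls", protocol="_tcp"):
    """Return the SRV owner name used for client automatic configuration."""
    return f"{service}.{protocol}.{domain}"


def dns_checklist(domain, hosts):
    """Return (record type, name) pairs to tick off while working in DNS."""
    records = [("A", fqdn) for fqdn in a_record_fqdns(domain, hosts)]
    records.append(("SRV", srv_record_name(domain)))
    return records


# Print the checklist, one record per line.
for record_type, name in dns_checklist(DOMAIN, A_RECORD_HOSTS):
    print(f"{record_type:<4}{name}")
```

The sketch only builds names; it does not query DNS, so it is safe to run anywhere.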
2:59 PM : Finishing Updates
The ocsfe1 server will be the first server to come up (be added to the pool). It’s currently finishing some updates, which is why I’ve been picking away at DNS requirements. I should also note (if you didn’t read the posts from last week) that I have a PKI infrastructure in place to deal with the certificate requirements.
The one other critical thing I should highlight while I wait is that we expect some load balancers within two weeks. The VIPs referenced above would normally be assigned to the load balancer. For now, since we’re still missing this hardware, I plan to proceed with deployment as if they already existed. In order to (hopefully) fool OCS, I plan to assign the IP address that will be assigned to the VIP to ocsfe1 (temporarily). That means that ocsfe1 will currently have the following three IPs: 10.10.3.1, 10.10.3.51, 10.10.3.53. Please note that this is almost certainly not the recommended course of action, and I’m only ignoring my own advice out of necessity. When the load balancer comes in, I’ll assign the VIP IP to it, remove it from the server, and rerun the validation wizard and the best practices analyzer.
3:08 PM : Creating File Shares
Another thing you need to do before deploying OCS is set up some file shares that will store (mostly) Live Meeting related files. I have set up four shared folders on my file server: OCS\AddressBook, OCS\MeetingArchive*, OCS\MeetingContent, and OCS\MeetingMetadata.
* Optional; I’ll only need this if Archiving and CDR is configured to archive meetings.
3:20 PM : Installed IIS
Since I will be deploying an OCS Enterprise Pool, Consolidated Configuration, I installed IIS from the Add Role wizard. I didn’t enable ASP.NET as I don’t think OCS uses ASP.NET. (The planning documentation says you need ASP, however.)
3:30 PM : Opening the Setup Wizard
I think I’ve completed all the prerequisite steps for OCS installation and am opening the setup wizard for the first time. I’ll try to take as many screenshots as are relevant through the installation process.
3:32 PM : Preparing Active Directory
(Snipped for some semblance of brevity.)
(This wizard happened too fast to even grab a screen cap of the process.)
3:45 PM : Active Directory Prepared
Everything went flawlessly (or at least apparently so) in the Active Directory preparation phase. I’m now ready to create the Enterprise Pool. The one thing I think I might need here is user accounts that I haven’t created yet. I create my passwords from the WinGuides Password Generator for security’s sake.
3:47 PM : Creating Enterprise Pool
As with above, relevant screenshots.
Curses! The first error. I just forgot to install the SQL client tools.
4:14 PM : SQL Client Install
4:30 PM : EOD
Unfortunately, that’s where it’s going to have to sit for tonight. Hopefully will be able to finish off the pool by mid-morning tomorrow, barring the type of disasters that happened today.
Office Communications Server, Day 5
May 16th, 2008
8:08 AM : Sufficient Information
I arrived at about 6:30 AM and began gathering the data I would need to facilitate deployment of OCS. I have put together a spreadsheet that has most of the information I’ll need in it. Several IP addresses are missing from the edge servers (not that I would want to post that on a public Web site anyway) and I haven’t looked into certificate requirements for the Enterprise Voice servers. That said, I have enough to start creating entries in DNS for client autoconfiguration and I have enough information to install my first front-end server.
I should note that after I got my family situated last night, I did some more looking into Configuration Manager’s deployment things and I found some other resources that may or may not come in handy. I’ll list them here for future reference or for others’ perusal.
- The guy who did the whirlwind tour of configuring Configuration Manager: http://blogs.technet.com/deploymentguys/default.aspx
- Also related to them: http://www.deploymentforum.com/
- Microsoft Deployment blog: http://blogs.technet.com/msdeployment/
- Desktop Deployment tech center: http://technet.microsoft.com/en-us/desktopdeployment/default.aspx
Most of those links came from a Web cast from a couple of days back which I watched last night: http://msevents.microsoft.com/CUI/WebCastEventDetails.aspx?culture=en-US&EventID=1032373731&CountryCode=US
The bad news is that I at least have to get Microsoft Deployment running in order to deploy some bare-metal servers. The good news is that I know how to work with Microsoft Deployment. It’s Configuration Manager that’s giving me grief.
9:21 AM : Review Complete
Just finished reviewing IP addresses with my boss and have completed filling out my spreadsheet. I would recommend filling out a similar spreadsheet if you are working on deploying OCS. A couple of notes: first, I left our public IPs off the spreadsheet. Second, I still don’t have the certificate details for mediation server or speech server completed. I’ll work on those in more detail when I’m deploying enterprise voice. Here’s the list:
Edit: Removed planning sheet
12:34 PM : Configuring Microsoft Deployment
KMS is now running on the new domain, which facilitates deployment by allowing volume-license operating systems to activate against a local server rather than using a MAK, which activates against Microsoft’s servers. I’m also picking away at producing requirements for my team(s) so that they stay busy, and at getting the vanilla Microsoft Deployment solution accelerator running. Until I have at least a bit done on OCS, I can’t dedicate any more time to Configuration Manager. Microsoft Deployment will at least allow me to push operating systems without actually physically touching the box, since we have an IP KVM. I’m also very used to setting up these types of deployment (I used BDD 2007 quite a bit).
12:53 PM : Added Windows Server 2003 R2 (32-bit) and Windows Server 2008 (64-bit)
I have two operating systems set up now in the Deployment Workbench. I also configured Windows Deployment Services with pretty much the default values (but set it to respond to all client requests, known and unknown). I imported the same Broadcom drivers I originally got from their RIS download, and I’m ready to set up a lab deployment point and create boot images.
3:15 PM : F Lock
That’s really not intended to be a derivative of a curse word, although I almost wish it were: I just spent the last hour of my life feeling even more frustrated because I knew I had Microsoft Deployment configured properly, but I couldn’t get PXE to actually pull down the boot image. It turned out that my F Lock key was on. (Some Microsoft keyboards have an F Lock key that opens up some keyboard shortcuts.) The F12 command was actually being sent as Print. At least I didn’t print 1200 copies of a boot screen.
4:42 PM : WinPE2.1 & Broadcom
Apparently WinPE2 changed the way it enumerates hardware and accordingly has trouble installing/recognizing Broadcom network devices, at least in a 64-bit environment. Thankfully Jeff Huston has a solution that I’m trying right now. At least it didn’t have to do with my F Lock key.
5:47 PM : No Luck
Still no luck. It seems, however, that this was probably the problem with Configuration Manager in the first place. I’m quite certain I had the right drivers imported, but this post gives me hope that it’s just a network driver issue. Maybe if I can find the right driver, I’ll be able to get Configuration Manager running. For now, I need to run and help a friend drywall.
Office Communications Server, Day 4
May 15th, 2008
7:43 AM : Firehose
Wow. I just watched a screencast with the most information I’ve ever seen packed into 32 minutes and 14 seconds. The screencast was actually tremendously informative and would be beneficial for anyone who is working on deploying operating systems. Here’s the link: http://edge.technet.com/Media/System-Center-Configuration-Manager-2007-and-Microsoft-Deployment-Toolkit-Screencast/
I also found another interesting link last night that I forgot to mention. I didn’t use it, but it might be useful to someone else out there: http://myitforum.com/cs2/blogs/cstauffer/archive/2008/02/13/notes-on-getting-pxefilter-vbs-working.aspx
I’m going to try to apply some of what I saw in the screencast at this point.
7:52 AM : Fixed up Windows Deployment Services
I removed the boot images I’d added to WDS and set the PXE Response Policy to not respond. Apparently this is important so that Configuration Manager will respond instead.
7:55 AM : Configuring Client Agent Properties
I am now working on configuring the client agent properties appropriately to reflect customization in titles and subtitles. I entered a network access account on the General tab and customized the text on the Customization tab. For reference, I used:
| Field | Value |
| --- | --- |
| Organization name | Extend Health, Inc. |
| Software updates | Installing approved software updates |
| Software distribution | Installing new applications |
| Operating system deployments | Installing operating system |
8:15 AM : Configuring SAN
Trying to get some SAN volumes straightened out so our deployment files go onto the SAN rather than the local drive.
8:30 AM : Deleting All Packages, Advertisements, Etc
I’m taking Configuration Manager back down to square one as far as what I did last night (creating an OS, task sequence, etc). That means that the steps I record next would be as if Configuration Manager was brand new. The one thing I’m not going to do is recreate the collection I created last night, called PXE Registered Systems. When a new machine attempts to PXE boot, WDS will reply to that PXE boot and (through the integration via the PXE filter) redirect that PXE boot to Configuration Manager. However, Configuration Manager can’t do anything with the machine until it’s registered in the system. The PXE filter helps to register the machine with Configuration Manager so that Configuration Manager can then take over and complete its work.
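To make the flow described above a bit more concrete, here is a toy Python model of it. It is a deliberately simplified sketch, not how WDS or Configuration Manager is actually implemented: the class and method names are hypothetical, and it folds in the one extra condition I ran into the night before, namely that a task sequence has to be advertised to the collection before anything boots.

```python
from dataclasses import dataclass, field


@dataclass
class SiteModel:
    """Toy stand-in for the site: known machines plus advertised collections."""

    collection: str = "PXE Registered Systems"
    known_macs: set = field(default_factory=set)
    advertised_collections: set = field(default_factory=set)

    def pxe_filter(self, mac):
        """Register an unknown machine, the job the PXE filter does in this model."""
        self.known_macs.add(mac.upper())

    def handle_pxe_request(self, mac):
        """Decide what happens when a machine with this MAC address PXE boots."""
        mac = mac.upper()
        if mac not in self.known_macs:
            # Unknown machine: the filter registers it before anything else happens.
            self.pxe_filter(mac)
        if self.collection in self.advertised_collections:
            return "boot image served"
        return "no advertisement: boot request ignored"


site = SiteModel()
print(site.handle_pxe_request("00:11:22:33:44:55"))  # nothing advertised yet
site.advertised_collections.add("PXE Registered Systems")
print(site.handle_pxe_request("00:11:22:33:44:55"))  # now the machine can boot
```

The MAC address is made up. The point of the model is only that registration and advertisement are two separate prerequisites, which is easy to miss when a PXE boot silently fails.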
10:09 AM : Adding Boot Images
I just added both of the default (WinPE2) boot images back into Configuration Manager and am assigning them to both distribution points (our primary distribution point plus the distribution point created by adding the PXE site role).
10:18 AM : Adding Windows Server 2008 64-bit
I set up a nice folder structure for Windows Server and added the operating system image to Configuration Manager. I also assigned the operating system to just the primary distribution point (not the PXE distribution point).
10:28 AM : Added Configuration Manager Client Software Package
Added a software package (from definition) for Configuration Manager Client Upgrade. The “upgrade” should also work for the base installation per the screencast referenced above. To add the software package, I chose Add Software Package From Definition, then chose Configuration Manager Client Upgrade (I later renamed this to Configuration Manager Client Installation). I copied the source files from the Configuration Manager 2007 installation CD into a shared folder structure and pointed the source to that folder. I then renamed the software package and published it to the primary distribution point only.
10:47 AM : Added Custom Backgrounds to Boot Images
I added a couple of nice background images to each of the boot images for aesthetic value.
10:51 AM : Deleting Operating System Image
Apparently I should have been doing this with the screencast – it turns out I was supposed to copy the entire folder for operating system source. I’m wiping out the operating system image, copying the whole source DVD, and recreating the operating system image to the same specs.
10:54 AM : Creating Operating System Install Package
I’m now creating an operating system install package, which is where I need the full source of the operating system. I also had to deploy the install package to the primary distribution point.
11:06 AM : Adding Drivers
I need to add the Broadcom NetXtreme II drivers since I frequently use these for deploying. I commented on this last night, so I won’t belabor the point further. I simultaneously created a driver package and added the drivers to the boot images, and then pushed new versions of the boot images to the distribution points.
5:29 PM : Heading Home
So I dealt with trying to get PXE running successfully for the rest of the day and am ready to give up for a while. I can’t forestall the actual Office Communications Server deployment any longer. I am pretty frustrated. OCS tomorrow, just so I have a break from this.
Office Communications Server Deployment, Day 3
May 14th, 2008
12:35 PM : Reinstall Complete
Yes, you read that right. In addition to everything else I had to deal with this morning (meetings, requirements delivery for some members of my team), I did a full reinstall of Configuration Manager 2007, this time on Windows Server 2003 R2. I had some nagging errors in the logs that just wouldn’t clear up so I attributed it to Windows Server 2008 and started over. The good news is that I was able to get the entire Site Status tree to show up green this time, meaning there are no problems. The weird part was what I had to do to resolve the errors. I think (am not sure) the resolution was to go look at and delete the error messages (which ended two hours previously) and refresh the component. I don’t like that as a resolution, but on another component I went to do the same thing and clearing the error messages and refreshing the component didn’t fix the problem: I actually had to go resolve a problem. The funny part is that it was back to SPNs again. During the reinstall, I just installed SQL Server 2005 SP2 to the same box to ease my pain, and I didn’t update the SPNs.
Whatever happened, it’s entirely green now. I’m hoping to get through a few more Webcasts on operating system deployment this afternoon, but I have 2.5 hours worth of meetings and only 2.5 hours left to go in the day. Might be another late night.
6:58 PM : Catch Up
Just finished grilling. For anyone who cares, it’s not a good idea to grill a still-partially-frozen New York Strip steak. It takes forever, burns the outside 1/4 inch, and goes from rare to well-done in about two minutes, which I missed. I like my steak medium-rare. Not a pleasant meal.
Today was pretty chaotic overall, which doesn’t lend itself well to blogging my way through the deployment experience. It largely failed today because I only had a few minutes here and there to work on deployment between meetings. So, to catch you up to where I am now, let’s run through what I did in those few minute sessions. I mostly toyed with operating system deployment. I watched all of Keith Combs’ Webcast entitled “Deploying Operating Systems with System Center Configuration Manager (Part 1 of 2)”. It was absolutely fantastic as far as detail is concerned, and should be watched even before deploying Configuration Manager – there are a couple of key things to note that he says (permissions on Active Directory’s System node and schema extension).
I also used his Webcast to start setting up the structure for deploying operating systems. One of the first things I did was to add the Broadcom NetXtreme II drivers to Configuration Manager – they are used by some of our desktops and servers, and WinPE2 doesn’t have the driver embedded, meaning that attempts to do a light-touch or zero-touch install on those computers fail. The driver was fairly easy to import and I won’t belabor the point by stepping through what I did. I will note, however, that you have to download the RIS (remote installation services, the name for Microsoft’s deployment solution from years ago) version of the drivers or you’ll just bang your head against the wall trying to figure out how to extract the drivers. Once the drivers were added, I made sure to add them to the appropriate boot images and update the distribution point. I’ll also note at this point that I love the ability to easily embed the drivers here. In the Microsoft Deployment Toolkit I know how to get around, but this approach was much more intuitive.
After adding the NetXtreme drivers, I pulled the file called “install.wim” off of the Windows Server 2008 x64 Datacenter, Enterprise, and Standard volume licensing DVD. A WIM file is a compressed image; Microsoft started using WIM files for deployment with Windows Vista. A WIM file can also store multiple operating system installs. In this case, the install.wim was over 2GB but contained images for both the full and core installs of all three predominant flavors of Windows Server 2008 x64. (Web Server 2008 ships on a different DVD.) When I first added the WIM to Configuration Manager, I added it to the node called Operating System Images. I also created a nice folder structure to make separation of images easier to understand. When I added the WIM, I couldn’t tell whether or not I was going to be able to access all of the images in the WIM. I’ll save you the heart-stopping anxiety by informing you that you’ll be able to access the images once you get to the task sequence portion of deploying operating systems.
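As a rough illustration of why one install.wim can cover so many installs, here is a small Python sketch that enumerates the edition and install-type combinations described above and picks one, the way the task sequence later asks you to. The image labels are my own shorthand, not the names actually stored inside the WIM, and the ordering is not meant to reflect the real image indexes.

```python
from itertools import product

# The three editions and two install types described in this post.
EDITIONS = ["Standard", "Enterprise", "Datacenter"]
INSTALL_TYPES = ["Full", "Core"]


def images_in_wim(editions, install_types):
    """Return one illustrative label per edition/install-type combination."""
    return [
        f"Windows Server 2008 {edition} ({install_type})"
        for edition, install_type in product(editions, install_types)
    ]


def pick_image(images, edition, install_type):
    """Select the single image a task sequence would deploy."""
    wanted = f"Windows Server 2008 {edition} ({install_type})"
    if wanted not in images:
        raise ValueError(f"{wanted} is not in this WIM")
    return wanted


images = images_in_wim(EDITIONS, INSTALL_TYPES)
print(len(images), "images in one file")
print("Deploying:", pick_image(images, "Enterprise", "Full"))
```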
That was actually the next thing I tackled: creating a simple task sequence to deploy the operating system. Let me stress that it was extremely simple: I entered the options to join the machine to the domain and selected the version of Windows Server I wanted to install from the WIM, and that’s about it. My thought at that point was, “I can’t believe how easy this is!”
7:14 PM : Not So Fast
The last thing I did before I left work (this is still technically part of the Catch Up, but warranted a logical division in the flow of the post) was to attempt to install the operating system. We have an IP KVM and 12 blade servers sitting bare metal in our data center, so I used the IP KVM to try a PXE boot off of one of those servers. No dice. The first thing I found out was that I needed to install the PXE service point role. After muddling through the interface to find where to add the role (root/Site Database/Site Management/<site>/Site Settings/Site Systems/<server>), I was able to add the role and set the options pretty much to their defaults. I did set a PXE password which I removed later when I found out that it would cripple zero-touch installs. I tried the PXE boot. No dice.
After a bit more reading, I found out that I also needed the Microsoft Deployment Toolkit 2008, so I downloaded and installed that. The article I was reading (http://technet.microsoft.com/en-us/library/bb978399.aspx) also said that I needed to configure the WDS PXE filter, which I tried to do from the Start menu, but there is some error with the tool. I’ve tried a couple more PXE boots after altering a few settings, but still: no dice.
7:46 PM : Validation Exists for a Reason!
Twenty minutes of the last half hour was spent responding to an e-mail, the other ten spent trying to figure out the problem with the PXE filter. I should have thought of this sooner, but mousing over the red exclamation point revealed the validation error: no Windows Deployment Services. I had installed the Microsoft Deployment Toolkit, which is also necessary, but hadn’t installed WDS (something the Microsoft Deployment Toolkit relies on). It is now installing; I had to add it through the Add/Remove Windows Components wizard because it wasn’t in the Add Role wizard.
7:54 PM : Reboot
Rebooted the server after install and am waiting…
8:06 PM : PXEFilter.vbs
Configured PXEFilter.vbs with the following settings:
sProviderServer = "MGR1"
sSiteCode = "DC1"
sNamespace = "root\sms\site_" & sSiteCode
sUsername = ""
sPassword = ""
sCollection = "DC10000D" ' Corresponds to PXE Registered Systems Collection (a custom collection I created)
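The only setting that is derived rather than typed in is the namespace, so here is a small Python sketch that mirrors that concatenation and sanity-checks the collection ID. It is not part of PXEFilter.vbs. The helper names are hypothetical, and the prefix check rests on my understanding that collection IDs created at a site begin with that site’s code, so treat it as an assumption.

```python
def wmi_namespace(site_code):
    """Mirror the script's concatenation: "root\\sms\\site_" & sSiteCode."""
    return "root\\sms\\site_" + site_code


def collection_belongs_to_site(collection_id, site_code):
    """Assumed rule: a custom collection ID starts with the site code."""
    return collection_id.upper().startswith(site_code.upper())


# Values copied from the settings above.
site_code = "DC1"
collection_id = "DC10000D"

print(wmi_namespace(site_code))
print(collection_belongs_to_site(collection_id, site_code))
```

A mismatch between the site code and the collection ID prefix would be a cheap thing to rule out before digging into WDS itself.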
8:11 PM : No Dice
8:28 PM : Break Time
I think part of my problem is related to Windows Deployment Services, but haven’t triangulated what the exact problem is yet. Have to take a break to get my son to bed.
9:43 PM : Back Again
10:07 PM : Breakthrough
Some success! I have a successful PXE boot. I made a couple of changes to get the PXE boot: I told WDS that it shouldn’t authorize itself with DHCP, recycled the service, and reauthorized it and recycled the service again. I don’t think that had anything to do with it, though. I think the change that had an effect was explicitly advertising the task sequence for installing Windows Server 2008 to the PXE Registered Systems collection, where the PXEFilter.vbs told Configuration Manager to create a machine account.
10:08 PM : Configuration Manager Logo
I see a big background that says “Microsoft System Center Configuration Manager 2007”. I’m guessing that’s a good sign.
10:12 PM : Success in IP KVM
Same thing via the IP KVM with a bare-metal server.
10:18 PM : Done for the Night
So after reboot, both servers I was working with (one virtual server, one real bare-metal server) still don’t have an operating system. I’m guessing that’s a problem with the task sequence, but since I didn’t put much effort at all into the task sequence, I’m not too worried about it. I think it will come up easily from here in the morning. I’m quitting for the evening.