DCM Discovery overview

DCM Discovery

Most Data Centre Migration (DCM) projects have a “Discovery” phase early on. This is where the project collects and stores a detailed set of data about everything that is in scope. Exactly what constitutes a “detailed set of data” is open to some debate. Discovery is vital to the success of a DCM project, but clear objectives and scope need to be defined at the outset.

What data do you need?

Any DCM project must decide what Discovery data it wants to collect and why. This is often not as simple as one might think: almost always there are multiple audiences for Discovery data, each with different requirements. The volume and breadth of discovery data you need to collect will vary with your migration approach. For example, if you are moving everything as-is into new Data Centres using server imaging technology, you probably won’t need a great deal of information about exactly how applications and middleware are configured. On the other hand, if you are substantially re-engineering your applications, for example as part of a cloud migration, detailed application configuration information will be a must-have. Later in this article we look at some of the classes of data you may need to collect and comment on the relative ease or difficulty of collecting each.

What data have you already got?

Many, if not all, large organisations will have one or more CMDBs. At first glance it may be tempting to assume that deploying Discovery tools is unnecessary because all the required data is already held in the CMDB. Whilst this is theoretically possible, I have never found it to be the case in any of the DCM projects I have been involved in. Typically, whilst there is some overlap in requirements, the objectives of a CMDB and a DCM Discovery repository are not exactly the same. In my view the role of an existing CMDB is to complement and validate the DCM-specific discovery exercise.

Where do you put the data (The Discovery repository)?

You need somewhere to put your discovery data. It may be possible to put it in your existing CMDB, but frankly that is unlikely without significantly extending the schema to include all the DCM-related data that the CMDB won’t have out of the box. You may decide to purchase a Discovery product, which will be supplied with a supporting repository and will also be able to provide extensive reports. Having said that, it is surprising how many DCMs elect to run on spreadsheets.

Some consulting organisations are sniffy about using spreadsheets as the discovery repository. I am neutral on that subject, and I have used spreadsheets successfully as the repository on several small to medium sized (100-400 server) migrations. The issues with spreadsheets centre on modelling the data: for example, how many IP addresses do you allow for a host in your sheet? This is a classic data modelling problem that is usually solved in the database world by putting data into first normal form (breaking out any repeating groups): https://en.wikipedia.org/wiki/First_normal_form . On your spreadsheet you will probably have to allow for some fixed number of IP addresses per server (and related info), most of which won’t be used most of the time (and this is not 1NF, by the way). There are other issues with spreadsheets, such as being able to do decent reporting quickly and controlling update access from multiple Discovery team members. Nonetheless they can be used successfully if you put some good process rigour around their use. Putting the master workbook in a repository with check-out/check-in controls is a good start. It is a good idea to direct one or more competent resources to start researching the whole issue of the DCM repository very early on in the project.
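To make the normalisation point concrete, here is a minimal sketch of what a first-normal-form discovery repository might look like, using Python’s built-in sqlite3 module. The table and column names are purely illustrative; a real repository would need far more attributes than this.

    import sqlite3

    # A minimal, illustrative discovery schema: servers and their IP addresses
    # live in separate tables so a host can have any number of addresses
    # (first normal form), rather than reserving fixed "spare" columns per row.
    conn = sqlite3.connect("discovery.db")
    conn.executescript("""
    CREATE TABLE IF NOT EXISTS server (
        server_id   INTEGER PRIMARY KEY,
        hostname    TEXT NOT NULL UNIQUE,
        os_name     TEXT,
        os_version  TEXT,
        environment TEXT       -- e.g. prod / test / dev
    );

    CREATE TABLE IF NOT EXISTS ip_address (
        server_id  INTEGER NOT NULL REFERENCES server(server_id),
        address    TEXT NOT NULL,
        vlan       TEXT,
        nic        TEXT,
        PRIMARY KEY (server_id, address)
    );
    """)

    # Example: one host with three addresses, no empty placeholder columns needed.
    cur = conn.execute(
        "INSERT INTO server (hostname, os_name, os_version) VALUES (?, ?, ?)",
        ("appsrv01", "RHEL", "7.9"),
    )
    server_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO ip_address (server_id, address, vlan, nic) VALUES (?, ?, ?, ?)",
        [
            (server_id, "10.1.1.15", "VLAN100", "eth0"),
            (server_id, "10.1.2.15", "VLAN200", "eth1"),
            (server_id, "192.168.50.5", "MGMT", "ilo"),
        ],
    )
    conn.commit()
    conn.close()

The same split applies whether the repository is a database, a Discovery product or, with more discipline, a set of linked sheets in a workbook.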

 

What data do you need to collect?

As I have mentioned previously, there is no “one size fits all” answer to this question. Primarily it will be driven by the type of migration you are trying to do: for example, a like-for-like migration to a new DC will have quite different requirements from a cloud-based migration with a significant amount of application refactoring. In my view, generally speaking, gathering accurate and useful data becomes more difficult as you move up the stack from infrastructure to applications.

Infrastructure level

At the infrastructure level you will want to know what you have, in some level of detail, at both the physical and virtual layers. Some large-scale DC migrations try to do as much as possible on a “like-for-like” basis at the application server level, so for these types of project it is usually important to get detailed configuration information. If a VM in the source environment has 6 virtual disks you need to record that, rather than just recording the total storage figure. Why? Because there are probably 6 disks for a reason, and you will probably want to re-implement that. On a similar note, many UNIX systems have complex volume management configurations, possibly with so-called “raw” volumes (disks that are not file system formatted) for database products like Oracle ASM and Informix. You need to capture this rather than just the output of a UNIX “df” command (something I have seen in way too many discovery repositories).

On the other hand, you may not be re-implementing, say, your ESX clusters “as-is”; often there will be a new design in the target Data Centre. Even so it is worth capturing as much as you can, but it will usually be less critical than the actual application server configuration. Other elements will include OS type and version, NICs, IP address information and VLANs. Generally speaking, this class of information is the easiest to obtain in detail. With reference to CMDBs, it is better to ask the actual server itself how it is configured than to rely on static information in a CMDB repository which may be out of date (having said that, some CMDBs have agents that periodically poll servers and grab their configuration information).
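To illustrate the point about capturing per-disk detail rather than a single storage total, below is a minimal sketch of a collection script for a Linux host. It assumes lsblk and df are available and that you have some way of running it on (or pushing it out to) each server; it is illustrative only, not a substitute for a proper discovery agent.

    import json
    import subprocess

    def collect_storage_layout():
        """Capture individual block devices and mounted filesystems, not just totals."""
        # lsblk --json lists each disk and partition with its size and type,
        # which preserves the fact that (say) a VM has six separate virtual disks.
        lsblk = json.loads(
            subprocess.run(
                ["lsblk", "--json", "-b", "-o", "NAME,TYPE,SIZE,MOUNTPOINT"],
                capture_output=True, text=True, check=True,
            ).stdout
        )

        # df only shows formatted, mounted filesystems -- useful, but on its own
        # it misses raw volumes used by products such as Oracle ASM, so keep both views.
        df_lines = subprocess.run(
            ["df", "-P", "-B1"], capture_output=True, text=True, check=True
        ).stdout.splitlines()[1:]

        filesystems = []
        for line in df_lines:
            fields = line.split()
            filesystems.append(
                {"filesystem": fields[0], "size_bytes": int(fields[1]), "mount": fields[5]}
            )

        return {"block_devices": lsblk["blockdevices"], "filesystems": filesystems}

    if __name__ == "__main__":
        print(json.dumps(collect_storage_layout(), indent=2))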

Middleware and Package software level

You will want to know what middleware and COTS software is installed on each machine, particularly if you are doing a like-for-like move. Gathering this information is less straightforward than it may seem at first glance. In many cases, if you use OS-native tools to list installed software you end up with a “wood for the trees” situation: the software management utilities list everything, including binary support libraries, hotfixes and so on, so in an extreme case they will list hundreds of items. This information needs processing with some intelligence to tease out the “real” applications. It is also worth noting that some software vendors do not make use of the OS standard software management system and prefer their own home-grown software install process; in this case their products will not appear in the OS list of installed software at all. Probably the most notable example of this is Oracle RDBMS software. Many Discovery products are “aware” (to some degree) of popular middleware and applications and are able to detect their “fingerprints” on servers. So it is fair to say that accurately gathering this data in a useful form for a DCM project is typically more complex than gathering infrastructure data.
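As a crude illustration of teasing the “real” products out of a package list, the sketch below queries rpm on a Red Hat style host and applies a keyword filter. The keyword list is entirely hypothetical and would need building up for your own estate, and, as noted above, products such as Oracle RDBMS that bypass the OS package manager will not appear in this output.

    import subprocess

    # Purely illustrative keyword list -- in practice this is built up per estate
    # and cross-checked against filesystem/process scans for products that never
    # register with the package manager (Oracle RDBMS being the classic example).
    PRODUCT_KEYWORDS = [
        "websphere", "weblogic", "tomcat", "jboss", "mq",
        "db2", "postgres", "mysql", "sap", "informix",
    ]

    def installed_packages():
        """Return (name, version) for every RPM on the host."""
        out = subprocess.run(
            ["rpm", "-qa", "--qf", "%{NAME}\t%{VERSION}-%{RELEASE}\n"],
            capture_output=True, text=True, check=True,
        ).stdout
        return [tuple(line.split("\t")) for line in out.splitlines() if line]

    def likely_products(packages):
        """Crude filter: keep packages whose name matches a product keyword."""
        return [
            (name, version)
            for name, version in packages
            if any(key in name.lower() for key in PRODUCT_KEYWORDS)
        ]

    if __name__ == "__main__":
        pkgs = installed_packages()
        print(f"{len(pkgs)} packages installed; candidate products:")
        for name, version in likely_products(pkgs):
            print(f"  {name} {version}")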

Application level discovery & dependencies

There are many ways of doing a DC migration. Often “time to complete” is a factor, due to things like upcoming building lease expiry, service provider contract renewal and so on. In these sorts of scenarios it is common for the migrations to be at the “server image” level, as this is one of the quickest ways of migrating workloads. In this case you don’t need to know in too much detail how the application works or is configured at a functional level: you are picking it up in its box (either literally or metaphorically) and moving it somewhere else. What you do need to know is what that application talks to, north and south, and using what protocols, so that you can gain an understanding of which servers you may need to move together. For example, you want to know if an application is, say, a 3-tier service with a web front-end, an application server and a back-end database server. It may not be a good idea to have elements of the application running in different DCs during the migration (for example, the app server in one DC and the database server in another). Understanding the dependencies will help you make better decisions on how to structure your migrations. This dependency data also helps with re-engineering the appropriate firewall rules in your new environment.

It is surprisingly difficult to determine what-talks-to-what. Quite a few discovery packages do periodic netstats and analyse who is talking to whom at that point in time. This is not a bad approach, particularly if the sampling is done frequently. However, it is not 100% foolproof and it could miss some infrequent connections. Another technique is to analyse firewall logs for SYN traffic (the start of a TCP conversation). There is quite a lot of work in analysing traffic this way, and it will not catch UDP traffic (which does not use SYN), nor will it capture conversations on the same subnet that do not go through a firewall.
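Below is a very rough sketch of the periodic-sampling approach: it snapshots established TCP connections using ss (the modern netstat replacement on Linux) and builds up a what-talks-to-what map over repeated samples. It assumes Linux hosts with ss available, and it inherits the limitations described above: infrequent connections can be missed between samples and UDP traffic is not captured by this filter.

    import subprocess
    import time
    from collections import Counter

    def sample_connections():
        """One snapshot of established TCP connections as (local_host, peer_host) pairs."""
        out = subprocess.run(
            ["ss", "-tn", "state", "established"],
            capture_output=True, text=True, check=True,
        ).stdout.splitlines()[1:]  # skip the header line
        pairs = []
        for line in out:
            fields = line.split()
            if len(fields) >= 4:
                local, peer = fields[2], fields[3]
                # Strip the port so repeated connections aggregate per host pair.
                pairs.append((local.rsplit(":", 1)[0], peer.rsplit(":", 1)[0]))
        return pairs

    def sample_loop(samples=12, interval_seconds=300):
        """Accumulate a what-talks-to-what map across repeated snapshots."""
        talk_map = Counter()
        for _ in range(samples):
            talk_map.update(sample_connections())
            time.sleep(interval_seconds)
        return talk_map

    if __name__ == "__main__":
        # A single snapshot for demonstration; a real collector would run the
        # loop for days and ship the results back to the discovery repository.
        for (local, peer), count in Counter(sample_connections()).most_common(20):
            print(f"{local} -> {peer}  ({count} connections)")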

If you have more time you may decide to take the opportunity to restructure and rationalise your application estate (aka refactoring). In that case your discovery will need to go deeper inside the application “box”. These days this is particularly relevant for cloud migrations. Your existing applications may make use of databases, message queues and other components; most of these are available as “serverless” components in the cloud, and you may decide to restructure your applications to make use of “xxx-as-a-service” offerings. In this case application discovery will need to be much more in-depth, and you will need to seek out artefacts such as the original application design documentation. It is highly probable that you will need to do some degree of application re-coding if you go the refactoring route.

Rolling your own repository

Some organisations consider developing their own Discovery repository using their in-house development team. A lot of the time this is not a particularly good option unless you intend to do an awful lot of migrations or to develop a product. Why? Well, the time and cost of developing a good DCM repository and the supporting data entry and reporting software are unlikely to be justifiable for a single DCM project. In fact, the time to design and develop is probably more of a killer than the cost.

Discovery Tooling

Frequently I find it is necessary to deploy some additional discovery tooling. This may be paid-for products or open source utilities (I will be penning an article on some popular open source discovery tools in the near future). It may also involve some scope expansion and development of your CMDB collection agents (if your CMDB stack polls servers to determine their configuration).

If you are a third party performing the DCM, which is where most of my experience lies, I often find that there is a lot of resistance to the deployment of Discovery tools. The organisation, quite reasonably, wants to assess whether the tools will have any negative impact on their services before they facilitate mass deployment to their estate. The security folk want to be assured that the tool presents no significant threat, and may also have other concerns, such as how security and access to the data collected will be controlled. These factors, plus the provisioning of any new infrastructure to support the tooling, mean that it can take a surprisingly long time before any significant Discovery data is available for the migration team to analyse.

Corporate IT management need to carefully consider the needs of the DCM project against corporate standards. I remember on one project we had a not half bad Discovery repository that the third-party migration partner had developed over several projects using Microsoft Access. The customer would not allow us to deploy this because their corporate standard banned Access databases. They suggested we redevelop it in Oracle! DCM projects are usually a once in a generation phenomenon for most organisations, and it pays to be pragmatic in terms of what it takes to get them over the line.

Conclusions

Gathering discovery data (and validating what you already have) is a vital part of any DCM project, providing important input to most stages of the migration. Some data is easier to gather than other data. An approach that I have used successfully a number of times is to do a quick “first pass” discovery exercise at the outset. This provides a high-level view of how many servers you have, what OS and OS versions they are running and, ideally, what applications those servers run. This data is useful to the management team for a “rough order of magnitude” sizing of the project and may well form part of the business case. It is also useful to put on the table if you are considering having a third-party organisation help you with the migration. A more detailed discovery pass will still be necessary if the project goes ahead.
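As an illustration only, the sketch below rolls a hypothetical first-pass inventory CSV (columns assumed to be hostname, os, os_version and application) up into the sort of counts that feed a rough order of magnitude sizing.

    import csv
    from collections import Counter

    def summarise_inventory(path):
        """Roll a first-pass inventory up into counts by OS/version and application."""
        os_counts, app_counts = Counter(), Counter()
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                os_counts[f"{row['os']} {row['os_version']}"] += 1
                app_counts[row["application"]] += 1
        return os_counts, app_counts

    if __name__ == "__main__":
        # Assumes a CSV with columns: hostname, os, os_version, application
        os_counts, app_counts = summarise_inventory("first_pass_inventory.csv")
        print("Servers by OS/version:")
        for os_ver, n in os_counts.most_common():
            print(f"  {os_ver}: {n}")
        print("Servers by primary application:")
        for app, n in app_counts.most_common():
            print(f"  {app}: {n}")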