Wednesday, October 31, 2012

Using SQL to Verify Configurations for CCIE-V Lab

It was bound to happen: two separate blog series have collided. Ever since I started down the path of getting the CCIE-V I have been thinking of ways to leverage the
Cisco Unified Communications Manager (CUCM) CLI to speed up the provisioning process. In most cases, I have determined that while I could do some complex provisioning from the CLI it didn't really save much time (or, in many cases, it cost more time).

That isn't to say there aren't some shortcuts available via the CLI. In fact, I think using SQL select queries from the CLI can save time when doing verification. A case in point: this past week I had a problem with RSVP between two sites that resulted from a configuration issue. I used SQL to identify the configuration issue and eventual fix in about 1.5 minutes.
Primer

For those who haven't been following the CUCM SQL query series, a brief primer on using SQL with CUCM is provided in the first blog of this series.

Background

During the validation phase of a particular mock lab I encountered an issue where calls from Site A to Site C were being re-routed via AAR. That would be outstanding if I was testing a WAN congestion scenario. However, that behavior is not so good when you are making a simple station-to-station test call. 
So, the obvious conclusion is I botched up the RSVP configuration somewhere. The question is where? Did I mis-configure the WAN interfaces, dspfarm, or SCCP CCM group in IOS? Or did I botch something up with media resource management, device pool config, or device associations in CUCM?
 
The "Non-Query" Method

The troubleshooting process one applies in this case is as important as the method used to do the validation. Your troubleshooting approach needs to be optimized for speed. Which really means that (a) you need to avoid wasting time looking at unrelated or remotely related configs and (b) you should start by looking at the likely "screw up" points. 

This latter facet of your troubleshooting approach may be different per individual. You may be one of those people that "forget" to "no shut" the dspfarm profile, or maybe you have a habit of forgetting to populate MRGLs, or maybe you have a tendency to put resources from Site C into MRGs/MRGLs for Site A (because, you like lower case device names and "a" looks very similar to "c" through tired eyes driven by a mind which is thinking two steps ahead of your current position. In any case, you should start troubleshooting by looking at the things you have a tendency to miss during your practice labs.

For me, that is forgetting to "no shut" the dspfarm or, in the case of RSVP MTP resources, forgetting to apply the "rsvp" keyword. So, step one in my world is:

SiteA-RTR#sh ip rsvp int    
interface    allocated  i/f max  flow max sub max 
Se0/3/0      0          1158K    1158K    0   
Se0/3/0.2    0          64K      64K      0   
SiteA-RTR# 

SiteA-RTR#sh sccp
SCCP Admin State: UP
Gateway IP Address: 10.3.110.1, Port Number: 2000
IP Precedence: 5
User Masked Codec list: None
Call Manager: 10.3.120.11, Port Number: 2000
                Priority: N/A, Version: 5.0.1, Identifier: 1
Call Manager: 10.3.120.10, Port Number: 2000
                Priority: N/A, Version: 5.0.1, Identifier: 2

Transcoding Oper State: ACTIVE - Cause Code: NONE
Active Call Manager: 10.3.120.11, Port Number: 2000
TCP Link Status: CONNECTED, Profile Identifier: 3
Reported Max Streams: 8, Reported Max OOS Streams: 0
Supported Codec: g711ulaw, Maximum Packetization Period: 30
Supported Codec: g711alaw, Maximum Packetization Period: 30
Supported Codec: g729ar8, Maximum Packetization Period: 60
Supported Codec: g729abr8, Maximum Packetization Period: 60
Supported Codec: g729r8, Maximum Packetization Period: 60
Supported Codec: rfc2833 dtmf, Maximum Packetization Period: 30
Supported Codec: rfc2833 pass-thru, Maximum Packetization Period: 30
Supported Codec: inband-dtmf to rfc2833 conversion, Maximum Packetization Period: 30

MTP Oper State: ACTIVE - Cause Code: NONE
Active Call Manager: 10.3.120.11, Port Number: 2000
TCP Link Status: CONNECTED, Profile Identifier: 2
Reported Max Streams: 200, Reported Max OOS Streams: 0
Supported Codec: g729r8, Maximum Packetization Period: 60
Supported Codec: pass-thru, Maximum Packetization Period: N/A
Supported Codec: rfc2833 dtmf, Maximum Packetization Period: 30
Supported Codec: rfc2833 pass-thru, Maximum Packetization Period: 30
Supported Codec: inband-dtmf to rfc2833 conversion, Maximum Packetization Period: 30
RSVP : ENABLED

I use sh ip rsvp int to get a quick view of how RSVP is provisioned on my Site A router. I also use show sccp to see the active SCCP resources. In this latter command I am looking for the "Oper State:" of ACTIVE for my MTP and the RSVP: ENABLED line. I also check the Supported Codec lines to ensure I have applied the proper codec configurations. 

Based on the output above and knowing what my lab requirements are, my issue is not on Site A. Of course, I need to repeat the commands on Site C to ensure that both sides of the RSVP signaling path are operating from the same sheet of music. 

If you find your issue with the IOS configuration then fix and test. If not then the next step is to check CUCM and this is where time can slip away from you quickly. The high-level flow:
  1. Web to your CUCM publisher node and log on using an account with appropriate permissions. 
  2. Check Media Resource assignment to Media Resource Groups (MRG)
  3. Check MRG assignment to Media Resource Group List (MRGL)
  4. Check MRGL, location, and region assignment to Device Pool
  5. Check Device Pool assignment to phones
That is a lot of clicking and waiting if you ask me. It is really only a couple of minutes but if I can shave any time off of my process, I will.

This Week's Query Queries

Instead of jumping over to the GUI, my tendency is to drop to the CUCM CLI whenever feasible. People following my SQL query series may think I am a loon but when you spend years tinkering in the database, it just becomes second nature. This path is not for everyone and I am not suggesting that people who are uncomfortable with the SQL query language try to use it as a tool when taking their CCIE-V lab. By all means, use what works best for you. For me, SQL is a perfect tool for a problem like the one I ran into last week.
    The Moving Parts

    I usually give an overview of the db tables at this point. Since we are dissecting a specific issue, I'll instead start with the method I would use to start finding my issue. As noted previously, I start with things I am likely to screw up first:
    • MRM Allocation. First, I would check to ensure that I have the right media resource group(s) assigned to the right MRGL.
    • Device Pool Provisioning. Next, I sometimes "fat finger" region, location, or MRGL assignments to device pools.
    • Device Allocation. Finally, I would check that phones are assigned the correct device pool. I don't usually botch this, but it is a good check if nothing else pans out. 
    These queries can get lengthy but if you are fluent with them they aren't anywhere near as clunky as they may appear. At least, that has been my experience. I guess the one exception would be the Device Allocation check. It may be faster to do a device pool search from the phone listing page in CCMAdmin.

    Query 1: Check MRM Provisioning
    You can approach this check in a couple of ways. You can look at MRGL->MRG->Media Resource assignment all in one go, you can focus on MRGL->MRG assignment (and assume you assigned Media Resources to the correct MRG), or you can check the relationships using two separate queries.

    Because I am more inclined to mess up MRGL->MRG allocation, I check these first.

    select mrgl.name as mrgl,mrg.name as mrg 
    from mediaresourcelist mrgl 
     inner join mediaresourcelistmember as mrm on mrm.fkmediaresourcelist=mrgl.pkid 
     inner join mediaresourcegroup mrg on mrm.fkmediaresourcegroup=mrg.pkid
     where mrg.name like '%RSVP%'
    
    The following is sample output from the above query:

    mrgl    mrg              
    ======= ================ 
    SA_MRGL SA_MTP-RSVP_MRG  
    SC_MRGL SC_MTP-RSVP_MRG  
    

    Hmmm. We are good up to this point, so let's check media resource assignment to MRG. (Note: If you want to make this check more generic, which may be good for a final validation process, then ditch the 'where' clause to see all MRGL->MRG associations. Further, I can filter on RSVP in my where clause because that is my naming convention.).

    Let's check those media resource assignments:

    select mrg.name as mrg,d.name as resource 
    from mediaresourcegroup mrg 
      inner join mediaresourcegroupmember mgm on mgm.fkmediaresourcegroup=mrg.pkid 
      inner join device d on mgm.fkdevice=d.pkid 
    The following is sample output from the above query:

    mrg              resource  
    ================ ========= 
    SA_CFB-SW_MRG    CFB_3     
    SA_CFB-SW_MRG    CFB_2     
    SA_MTP-SW_MRG    MTP_2     
    SA_MTP-SW_MRG    MTP_3     
    SA_MOH-Sub_MRG   MOH_3     
    SA_MOH-Pub_MRG   MOH_2     
    SA_MTP-RSVP_MRG  sc-rsvp   
    SC_MTP-RSVP_MRG  sc-rsvp   
    SC_MTP-XCODE_MRG sc-xcoder 
    SA_MTP-XCODE_MRG sa-xcoder 
    

    There it is! Notice that I stuck the "sc-rsvp" device in the SA_MTP-RSV_MRG. What a bone head move! Now, I can go to the MRG and do the ol' MTP shuffle and we should be in business.

    Yes, it is possible to combine the last query with the one before it. However, the resulting query is lengthy and prone to typos. But for completeness, here you go:


    select mrgl.name as mrgl,mrg.name as mrg, d.name as resource 
    from mediaresourcelist mrgl 
      inner join mediaresourcelistmember as mrm on mrm.fkmediaresourcelist=mrgl.pkid 
      inner join mediaresourcegroup mrg on mrm.fkmediaresourcegroup=mrg.pkid 
      inner join mediaresourcegroupmember mgm on mgm.fkmediaresourcegroup=mrg.pkid 
      inner join device d on mgm.fkdevice=d.pkid
    

    The benefit of using a query like the one above is that you can get a single view of your MRM allocation with one command. Probably more handy as a tool in the "final verification" phase of your lab strategy than as a troubleshooting tool.

    Query 2: Check Device Pool Provisioning
    Let's assume that we had to press on with our troubleshooting. The next query I would use is focused on checking our device pool allocation. It is worth noting that your mileage may vary depending on your provisioning approach in the lab. I typically assign MRGLs and locations at the device pool level, so it seems reasonable to create a view from the perspective of the device pool. 

    You can run the query from the CLI using run sql [insert query]:


    select dp.name as dp, mrgl.name as mrgl, loc.name as location, reg.name as region 
    from devicepool dp 
        inner join mediaresourcelist mrgl on dp.fkmediaresourcelist = mrgl.pkid 
        inner join location loc on dp.fklocation=loc.pkid 
        inner join region reg on dp.fkregion=reg.pkid
    


    The above query provides the following output in my sample configuration:

    dp               mrgl    location region   
    ================ ======= ======== ======== 
    SA_Media-Std_DP  SA_MRGL SA_Loc   SA_Reg   
    SA_Media-MoH_DP  SA_MRGL SA_Loc   G711_Reg 
    SA_Media-ITSP_DP SA_MRGL SA_Loc   G729_Reg 
    SA_User-Std_DP   SA_MRGL SA_Loc   SA_Reg   
    SB_User-Std_DP   SB_MRGL SB_Loc   SB_Reg   
    SC_User-Std_DP   SC_MRGL SC_Loc   SC_Reg   
    

    What I like about this "view" of the data is I am able to quickly see how all device pools are provisioned. For our specific issue, we only care about the the device pools assigned to phones at Site A (SA_User-Std_DP) and Site B (SB_User-Std_DP). From the above output, we are provisioned as expected.



    Considerations

    The obvious consideration is "comfort". If you are uncomfortable with using SQL queries from the CLI then this method will not be beneficial. In fact, it will probably mess up your rhythm completely. However, if you are savvy with running SQL from the CLI then there are two benefits. 

    One, it may save some time. How much? In my timed runs anywhere from 30 - 60s. I was purposefully testing my typing speed vs. my clicking speed. If you weren't deliberate in your navigation in CCMAdmin, more time could be spent there. So, I estimate 1-2 minutes of time savings at a minimum. Of course, it depends on how fast you can type (I can type pretty fast) and how many mistakes you make when you type.  

    More time savings may be realized depending on how your brain processes information. Do you like text-based tabular output or do you prefer the typical output in CCMAdmin? It seems like a silly comparison, but I think it is valid. My brain processes plain text in a tabular format faster. I think it has something to do with the fact that there is more data on the screen when using CCMAdmin than what is dumped out from a targeted SQL query. At least, that is my own personal conclusion. Basically, I have found that cranking out the SQL query, processing the output, and determining a course of action is a more efficient flow for me.

    Anyway, in the grand scheme of things, the perceived time savings isn't enough by itself. One typo and you lose your time savings, which isn't that high anyway.

    The second benefit (from my perspective, anyway) is that with approx 1m15s of work I can get a snapshot view of not only the config that is contributing to my problem but of all device pool and MRM allocations. So, it is a pretty powerful "general validation" step.

    The overarching consideration is your approach to troubleshooting. You know yourself better than I ever could and, if you have been practicing, you know what you are most likely to botch up first. So, the order you check things should target your weak spots first. The method you use is secondary.
     



    Thanks for reading. If there is a specific query you are interested in seeing, let me know using the comments!

    4 comments:

    1. Great post,
      I like sql queries but I don't believe remembering these query with a lot of pressure on me LOL

      best regards!

      ReplyDelete
      Replies
      1. I certainly know what you mean and I thought about that when doing the write up. But decided to post it anyway! Thanks for reading!

        -Bill (@ucguerrilla)

        Delete
    2. They should just give you your numbers. :)

      Thanks for your posts here and at Netcraftsmen.

      ReplyDelete
      Replies
      1. Haha. I appreciate you saying so but I definitely prefer to earn my keep! Thanks for following. Glad you find the information useful.

        -Bill (@ucguerrilla)

        Delete