An update over a year later. The equipment that was mounted at the site below was pulled not long after the last post. We now have three 3.65 GHz WiMAX sites deployed with a fourth in the works. I am feeling more comfortable with the software and am still working on getting the same level of monitoring that we have with our other equipment.
We have been doing propagation tests over the past two weeks with mixed results. The results we are seeing don't exactly match the predicted results. In locations with clear line of sight, the estimated signals match up, but in locations with partial or no line of sight, the results are inconsistent. The areas that have the worst results are locations that have a thick grove of trees between the tower and our test location (think tower, 2 miles of field, grove of trees, 2 miles of field, and then our test location).
I suspected that when the vendor ran the propagation report, they used a 60-70ft average tree height in their calculations when we are actually seeing 90-100ft trees. A quick email to the vendor proved me correct, and the propagation study was rerun with a 90ft average tree height. When I got the new results, it became clear that this specific tower location may not be well suited for the 3.65 GHz band due to the topography and trees.
Despite the propagation study they ran, the vendor seems to think there is an issue (either misaligned sectors or other radio issue) and has asked us to test specific coordinates and report back the results. Ball is in their court while we look for a better tower to deploy the equipment.
At the same time we have been doing signal testing, we have also been doing speed tests. We noticed that when we apply speed profiles over 2 Mbps to a radio, the results get very sporadic. When we apply profiles at 8 Mbps and above, we seem to hit a wall averaging about 3.4 Mbps. The radios are capable of 12 Mbps service, so this was a little concerning, to say the least. I called the vendor and talked to a tech who wanted to coordinate some testing in the field.
I drove back to the tower and set up a link 1/10th of a mile out with clear line of sight. The vendor logged into the radio and noticed that the radio was a bit too "hot". A little background... The base stations tell the CPEs what power they should be transmitting at. The function is called Auto Transmit Power Control (ATPC). If the CPE's transmit power is too high, the base station will tell the CPE to turn its power down as far as needed to hit the target signal strength. If the signal is still too hot with the CPE turned all the way down, the error rate of the CPE will go up, resulting in speed issues. The fix is to turn the radio away from the tower to bring the signal level down and reduce the error rate.
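To make the ATPC behavior above a little more concrete, here is a minimal sketch of one adjustment step. The function name, the dBm figures, and the idea of a fixed minimum transmit power are all assumptions for illustration; this is not how the vendor's firmware is actually written.

```python
def atpc_step(rx_dbm, tx_dbm, target_dbm, min_tx_dbm):
    """Return (new_tx_dbm, still_too_hot) for one hypothetical ATPC adjustment.

    rx_dbm     -- level at which the base station currently hears the CPE
    tx_dbm     -- the CPE's current transmit power
    target_dbm -- the level the base station wants to hear
    min_tx_dbm -- the lowest power the CPE can transmit at (assumed limit)
    """
    # How far above the target the base station hears the CPE.
    excess_db = rx_dbm - target_dbm
    if excess_db <= 0:
        return tx_dbm, False
    # Turn the CPE down by the excess, but never below its minimum power.
    new_tx = max(tx_dbm - excess_db, min_tx_dbm)
    # Whatever could not be removed leaves the link too hot.
    remaining_db = excess_db - (tx_dbm - new_tx)
    return new_tx, remaining_db > 0


# A CPE very close to the tower: even at minimum power it is still too hot.
print(atpc_step(rx_dbm=-40, tx_dbm=20, target_dbm=-70, min_tx_dbm=0))
# A CPE further out: turning it down is enough to hit the target.
print(atpc_step(rx_dbm=-60, tx_dbm=20, target_dbm=-70, min_tx_dbm=0))
```

The first case is the one we hit at 1/10th of a mile: the only remaining knob is the antenna gain toward the tower, which is why turning the radio away helps.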
The other issue with speed is due to threading. Our tests up to now have been single-stream speed tests using iperf. The tech suggested that I try using multiple download threads and see if this helps things out. Sure enough, I specified five parallel tests and my download speeds hit target rates up to 7.6 Mbps, but my upload rates seem to hit a wall at around 3.7 Mbps. In the base station, you specify the duplex DL/UL ratio. Since most internet traffic is download heavy, we specified a 65-35% duplex ratio. Since the radio maxes out at 12 Mbps, once you apply the 65-35% ratio and account for TCP/IP overhead, the numbers come out about right.
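The arithmetic behind that last statement can be sketched as follows. It assumes the 12 figure is the total shared between download and upload, and it makes no assumption about how large the TCP/IP overhead is; it only shows how close the measured rates are to the raw split.

```python
RADIO_MAX_MBPS = 12.0              # radio capacity quoted above
DL_SHARE, UL_SHARE = 0.65, 0.35    # duplex ratio set in the base station


def duplex_ceilings(total_mbps, dl_share, ul_share):
    """Split the total capacity into raw download/upload ceilings."""
    return total_mbps * dl_share, total_mbps * ul_share


dl_ceiling, ul_ceiling = duplex_ceilings(RADIO_MAX_MBPS, DL_SHARE, UL_SHARE)

# Rates measured with five parallel iperf streams.
measured_dl, measured_ul = 7.6, 3.7

print(f"download: ceiling {dl_ceiling:.1f}, measured {measured_dl} "
      f"({measured_dl / dl_ceiling:.0%} of ceiling)")
print(f"upload:   ceiling {ul_ceiling:.1f}, measured {measured_ul} "
      f"({measured_ul / ul_ceiling:.0%} of ceiling)")
```

The raw ceilings work out to about 7.8 down and 4.2 up, so the measured 7.6 and 3.7 are in the right neighborhood once protocol overhead is taken off the top.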
After a little packet sniffing, I found the ODU status OID appropriately named "rbOduOperationalStatus". The OID will return an integer of 7 if the unit is up and 1-6 or 8 if the unit is down or in an error condition. So I whipped up a small shell script that will cause Nagios to alarm if one of the radios returns an ODU status other than 7. Since I am using second order diversity, I am only checking channels 1 and 2.
So the script takes two arguments, the radio address and the SNMP read community string. If either channel 1 or channel 2 ODU status is not equal to 7 the script exits with an error code 2 which in turn causes Nagios to report a red alarm. The echo strings will report the status of both channels regardless of alarm state.
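The original plugin was a shell script; the following is a rough Python sketch of the same logic, not the script itself. It assumes the net-snmp `snmpget` command is installed, that the vendor MIB is loaded so the name `rbOduOperationalStatus` resolves, that SNMP v2c is in use, and that the channel number is the instance index. All of those are assumptions that should be checked against the real radio.

```python
import subprocess
import sys

ODU_UP = 7                            # status value meaning the ODU is up
OID_NAME = "rbOduOperationalStatus"   # assumes the vendor MIB is loaded
CHANNELS = (1, 2)                     # second order diversity: two channels


def snmp_get_int(host, community, oid):
    """Fetch one integer via net-snmp's snmpget (value only, numeric enums)."""
    out = subprocess.run(
        ["snmpget", "-v2c", "-c", community, "-Oqve", host, oid],
        capture_output=True, text=True, check=True,
    ).stdout
    return int(out.strip())


def evaluate(statuses):
    """Map {channel: status} to a Nagios (exit_code, message) pair."""
    message = ", ".join(
        f"ch{ch} ODU status={st}" for ch, st in sorted(statuses.items())
    )
    if all(st == ODU_UP for st in statuses.values()):
        return 0, "OK - " + message
    return 2, "CRITICAL - " + message


def main(argv):
    if len(argv) != 3:
        print(f"usage: {argv[0]} <radio-address> <read-community>")
        return 3
    host, community = argv[1], argv[2]
    try:
        statuses = {
            ch: snmp_get_int(host, community, f"{OID_NAME}.{ch}")
            for ch in CHANNELS
        }
    except (subprocess.SubprocessError, OSError, ValueError) as exc:
        print(f"UNKNOWN - SNMP query failed: {exc}")
        return 3
    code, message = evaluate(statuses)
    print(message)   # both channels are reported regardless of alarm state
    return code

# In an actual plugin file, finish with: sys.exit(main(sys.argv))
```

Exit code 2 is what Nagios treats as critical (the red alarm), 0 is OK, and 3 is "unknown", which seemed like the least misleading answer when the SNMP query itself fails.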
I climbed the water tank and did not notice any obvious problems. Since we do not have any spare equipment, I swapped the IF cables between channels and confirmed that the same ODU would not fire up. I pulled the ODU and taped up the cable ends since it's supposed to rain tonight. I sent the ground guy on his way and I made off for the shop.
On the way back to the shop, I decided to call the vendor and attempt to start an RMA. The tech asked if I had connected the ODU to the radio on the ground (which I did not do, duh). So I drove back and, lo and behold, the ODU fired up just fine when connected to the radio on the ground. We had already ruled out lightning arrestors, cables, and the radio itself by swapping the channels, so according to the vendor, there are only two explanations. The first possible explanation is a failing ODU: it's rare for an ODU to fail on initial boot, and if there is an issue, it will typically fail after it "bakes" for 10-15 minutes. The second explanation is a ground loop (please god, no). I hooked the ODU back up to the radio on the ground and let it sit for about half an hour without any change in status. While the radio was "baking", I called back the ground guy and got prepped to climb the water tower again.
While waiting for the ground guy to show up, I looked at the pictures the installers took and noticed the metal to metal contact between the mast, lightning arrestor, and handrail. Long story short, the ground guy showed up, I climbed, fixed the metal to metal contact on the problem sector, and double checked the other sectors for similar problems (which there were on a couple). The first picture below is the original install (prior to weather sealing) and the second is the finished work that I did.
Once I finished up the weather sealing, we fired up the ODU and all seems well. Tomorrow I will work on getting the SNMP OIDs for the ODU status and work up a Nagios plugin so we will have some kind of notification when an ODU goes down.
Got back from vacation and discovered that one of the ODUs is not responding. I had one of the guys drive to the site and swap the channels, which ruled out the IDU. This leaves either the tower or ground lightning arrestors or the ODU itself. Since we are just testing at this point and no customers are installed, I am going to wait until tomorrow to go on-site (got to catch up on my email).
The Airspan radio that wouldn't cooperate with me yesterday turned out to be a known (but forgotten) wiring issue with the tower side lightning arrestor. One pair of the tower side arrestor was wired backwards and the ground arrestor needed to match or it wouldn't work. Interesting side note is that the pair that was wired wrong is not even used. Did I mention that I'm not a big fan of wireless?
Turned out to be a perfect day with temperatures in the 60s and partly cloudy skies. Got to the site a little later than I wanted, but I had enough help on hand to get things knocked out. We assembled the antennas and masts on the ground and then roped them up to the catwalk. The masts are just 1.5" galvanized EMT with caps. We used stainless steel hose clamps to fix the masts to the water tank's handrail. Each sector's antennas needed to be about 4.25ft apart, and it worked out just perfect that the vertical cross members of the handrail were just under 4.5ft apart, which is good enough. It was a long day, but we are up and now talking to the ODUs. We will start site testing tomorrow. Below are pictures of one of the sectors (not weather sealed or grounded yet) and the finished cabinet.
We installed the new cabinet, sank a new ground rod and laid down the new ground wire to the new cabinet, and got the old equipment migrated over. We also installed the GPS unit and have the cabinet fully wired up. If the weather cooperates, we should be in good shape to start installing the tower side equipment tomorrow.
The migration from the old cabinet to the new cabinet was supposed to be an easy transition, but there was not enough room on the wall to have both cabinets mounted. I ended up wiring all new lightning arrestors for the Airspan equipment with new leads to the new cabinet and then disconnected the old cabinet and wired up the Airspan radios to the new arrestors. Things went pretty smoothly with the exception of one sector that did not want to work with the lightning arrestor. If you bypassed the arrestor, the radio came up; if you put the arrestor back in, it would not. I tried multiple arrestors with no change. Tried re-crimping the connections, still no luck. I left the radio bypassed to maintain service and will have to revisit this tomorrow I guess. It is supposed to storm tonight, so my fingers are crossed that the radio doesn't get fried.
I started getting the equipment prepped and ran into a couple of gotchas. The IF Polyphasers (lightning arrestors) we received do not seem to have any mounting hardware, which is odd. This is not a big deal for the tower side as we can just secure the arrestors with tin-wire or zip-ties, but on the ground side, I need to mount them to a backing board. A quick trip to the hardware store yielded some 1in corner braces and I'll use copper ground lugs to wire the units to ground. Since I was working on the lightning arrestors, I decided to go ahead and mount the needed ground lugs on the ground bar.
Another gotcha is going to be the mounting of the new cabinet. While we have installed a couple of these cabinets before, this would be the first unit with an integrated air conditioner. The A/C unit almost doubles the weight of the cabinet, so it will take more than two guys to install the cabinet. To make things more interesting, the space that we have to work in is limited, so getting more than two guys to install the unit would be impossible. We started talking and it was decided that we would remove the A/C unit and then re-install it after the cabinet was mounted. The A/C unit is mounted to a mounting plate that is tap-screwed into the cabinet. The initial plan was to remove the plate, but it was quickly figured out that it would be very hard to install that plate back onto the cabinet after mounting the cabinet. We decided then to remove the unit directly from the plate, which uses 8 hex bolts to mount the A/C unit from inside the cabinet.
It's towards the end of the day and all of the equipment has been programmed and prepped for install. Hopefully the weather holds out next week for the install.
It was decided that we would start installing the BreezeMax gear next week. I spent most of the day getting all the parts and pieces together and we worked up an install plan. While we were installing the new gear, we decided that we would clean up the grounding on the existing installed radios.
Now that we have provisioning figured out, today was mostly spent on figuring out how the equipment is going to be mounted in the cabinet. This sounds like a pretty easy stage in our install, but when you have to factor in cable paths, ease of repair, etc., it becomes a rat's nest very quickly. On top of that, have a couple of your coworkers, each with a different idea of how things should go, and the process starts to get a little frustrating. In the end, I think we came up with an excellent layout that incorporated a little bit of everyone's ideas. The test is when the equipment actually is deployed.
I spent a little bit of the afternoon figuring out how we are going to monitor the customers' signal, error rates, and traffic. With our existing gear, the information is polled from each CPE and BSR. With the BreezeMax gear, you no longer poll the CPE, just the BSR for the CPE information. This gets complicated further when you find out that there is a unique radio identifier assigned to each associated CPE that does not seem to reference anything (CPE MAC address, etc.). So in order to get the radio identifier, you need to snmpwalk a specific OID in the BSR and grep out the username. When you find the correct username, you take the last 5 octets of the OID and that is the radio identifier. So in order to get the RSSI and signal rates, you either have to keep a table of usernames and identifiers, or snmpwalk and grep the identifier out of the BSR.
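The "snmpwalk and grep" step can be sketched in a few lines. This is only an illustration: it assumes the walk was run with numeric OID output (`snmpwalk -On`), and the OID prefix and username in the sample are made up, since the real ones depend on the BreezeMax MIB.

```python
def find_radio_id(walk_lines, username, id_len=5):
    """Return the last `id_len` OID components of the row whose value
    matches `username`, or None if the username is not associated."""
    for line in walk_lines:
        oid, sep, value = line.partition(" = ")
        if not sep:
            continue  # not an "OID = TYPE: value" line
        # Values look like: STRING: "cust-smith"
        text = value.split(":", 1)[-1].strip().strip('"')
        if text == username:
            return ".".join(oid.strip().split(".")[-id_len:])
    return None


# Hypothetical snmpwalk -On output; 99999 is a placeholder enterprise number.
sample_walk = [
    '.1.3.6.1.4.1.99999.1.2.16.231.10.20.30 = STRING: "cust-smith"',
    '.1.3.6.1.4.1.99999.1.2.16.231.10.20.31 = STRING: "cust-jones"',
]

print(find_radio_id(sample_walk, "cust-jones"))
```

Running this once per poll is the "grep it out of the BSR every time" option; caching the result in a small username-to-identifier table is the other option mentioned above, at the cost of having to refresh it if the identifier changes when a CPE re-associates.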
Our reseller has a Windows-based monitoring system based on the open source graphing system called Cricket. It uses rrdtool to collect information and then generates png or jpg graphs in daily, weekly, monthly, and yearly charts. For our existing Airspan and Alvarion equipment, we used the graphing templates to make our own "non-Windows" monitoring system. Since the BreezeMax gear is fairly new, the reseller does not have any templates made yet, so we either have to wait or come up with something on our own.
Fired up the equipment again and spent most of the day automating the provisioning process. This involves nothing more than starting up a packet sniffer, running through the manual provisioning process with the management software, and then looking up the various SNMP OIDs. Using those OIDs, we write a script that sets the various settings and then confirm in the management software that the settings took. When we did this for the Airspan equipment, it took about two weeks to get just about everything we needed. For the BreezeMax gear, it took only a day to figure out how to get a CPE from temporary mode to permanent and then assign the appropriate service profiles to that CPE. Cool stuff.
So now, our installer process will work something like this... Installer goes to customer site and programs the radio with a username. The username is the unique identifier for this CPE (the CPE's MAC address is also unique, but using the username is easier). The installer will then open up a web page (at this point, the installer has limited internet access) that asks for the username being installed and what tower the radio is pointed at, and click submit. The provisioning script will then scan all the base station radios (one for each sector) on that tower and, with the information in the database, set the appropriate service profiles (upload/download speed, VOIP, etc).
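As a rough outline of what the provisioning script has to do once the installer clicks submit, here is a small sketch. Everything in it is a stand-in: the function and parameter names are hypothetical, and the two callables would in practice wrap the SNMP lookups and sets against each base station radio.

```python
def provision(username, tower, sectors_by_tower, plans, find_cpe, apply_profiles):
    """Find which sector on `tower` the CPE associated with and apply
    the service profiles recorded for `username`.

    sectors_by_tower -- {tower name: [base station radio addresses]}
    plans            -- {username: [service profile names]} from the database
    find_cpe         -- callable(sector, username) -> radio id or None
    apply_profiles   -- callable(sector, radio_id, profiles)
    Returns the sector that was provisioned, or None if the CPE was not found.
    """
    profiles = plans[username]
    # One base station radio per sector, so check each until the CPE shows up.
    for sector in sectors_by_tower[tower]:
        radio_id = find_cpe(sector, username)
        if radio_id is not None:
            apply_profiles(sector, radio_id, profiles)
            return sector
    return None
```

Passing the lookup and the set operations in as callables keeps the flow testable on a bench without any radios powered up, which matters given how little time the gear can stay on.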
One thing I did learn was the use of chained set commands with SNMP. To seasoned SNMP programmers, this is not new, but it had me hung up for a while. When setting a service profile for a CPE, the SNMP set string the management software sends consists of 4 OIDs. In the past, I would send an snmpset command for each OID, like:
While this should work, for some reason it does not work when setting the service profiles. After a couple of Google searches, I found that chaining the OIDs together in a single snmpset command worked like it should:
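The exact commands are not reproduced here, so the following is only a sketch of the chaining idea using placeholder OIDs and values. It relies on the net-snmp `snmpset` syntax of repeated `OID TYPE VALUE` triples on one command line; the host, community string, SNMP version, and all four OIDs are made up for illustration.

```python
def build_snmpset(host, community, varbinds, version="2c"):
    """Build one snmpset command line carrying every (oid, type, value)
    triple, so all of them are sent in a single SET request."""
    cmd = ["snmpset", f"-v{version}", "-c", community, host]
    for oid, value_type, value in varbinds:
        cmd += [oid, value_type, str(value)]
    return cmd


# Placeholder OIDs: the real ones come from sniffing the management software.
profile_varbinds = [
    (".1.3.6.1.4.1.99999.3.1.2.1", "s", "cust-smith"),
    (".1.3.6.1.4.1.99999.3.1.3.1", "i", 1),
    (".1.3.6.1.4.1.99999.3.1.4.1", "i", 2),
    (".1.3.6.1.4.1.99999.3.1.5.1", "i", 4),
]

cmd = build_snmpset("192.0.2.10", "private", profile_varbinds)
print(" ".join(cmd))
# To actually send it: subprocess.run(cmd, check=True)
```

One plausible reason the chained form works where four separate sets do not (this is a guess, not something the vendor confirmed) is that an agent applies all the values in a single SET request together, and a table row like a service profile assignment may be rejected if it only ever sees one column at a time.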
So I learned something new today. Now that I have the OIDs figured out, I whipped up a shell script and tested it on a couple of radios with excellent results. Now that the framework is established, I will pass the script over to my colleague for him to write the actual HTML and CGI code. Also, it's hard to keep the equipment on for more than a couple of hours at a time with the ODUs and antennas plugged in. After today's testing, I ended up with one hell of a headache.
We received the equipment today and checked it into inventory. All looks good except for one vital part, the GPS receiver. There is a CAT5 connector on the pack that goes to the GPS antenna. The connector was in pretty banged up shape. The shipping box looked fine, so the damage had to happen when the reseller packed it up. I contacted them and they will send us a new one. At first it looked like we would not be able to fire up the equipment without the GPS, but after some discussion, we decided to try hooking up the GPS and see what happens. After a couple of minutes of wiring, we fired up the equipment and despite the torqued connector, the GPS appears to be working.
The equipment was supposed to ship earlier this week, but was held up for some unknown reason. Got word that the equipment shipped today. If all goes well, we should see the stuff tomorrow (fingers crossed).
We have the test site selected, but still need to have some site prep done. The site we selected is a 150ft water tank that currently has 2.4 GHz and 900 MHz gear installed on the top of the dome. The idea is to mount the 3.65 GHz WiMAX gear on the handrails (approx 125ft) and then compare signal propagation between the different frequencies. Ideally, we would have wanted the 3.65 GHz gear at the same height, but there is not enough separation room.
The horrors of grounding
Up to now (on water tank sites), we basically ground the tower side equipment to the water tank and tie in the ground equipment to an 8 foot copper ground rod. In fact, we have been doing this for the past couple of years with minimal equipment lost due to storm damage. Apparently, this arrangement is considered *BAD*. Let me rewind for a second. During the training, it came up that the WiMAX gear is a lot more sensitive to ground loop/fault issues than our existing equipment. When we started looking at our site, we quickly found out that everything that we had been doing in the past has been wrong. I am not an electrician, I am a "data" guy. We have been actively discussing the grounding issue for 3 days and counting now. I learned that a 150 foot metal structure filled with water and connected to miles of pipe is not grounded. We confirmed this by talking to a water tank engineer and testing with a ground rod and meter. So now we need to either ground the water tank or ground the equipment connected to the tank. Grounding a water tank is way too expensive and, since we are not sure the gear will be permanently installed there, not practical. So now we need to ground the tower side equipment and keep it isolated from the tank (which is ungrounded, I'm confused).
The not-so-funny thing about this ongoing discussion is that I have 15 different theories on grounding and electrical code (some conflicting) from 5 different people (none of whom I believe are certified electricians). If we come up with a solution, I'll post it. If you're reading this and have run into this, please let me know what your solution is.
A couple of us in network operations finished the two day training today. The webinar based training used our equipment that was set up at the reseller's office, so basically we were working on our equipment remotely. I was a little skeptical of this arrangement, but it worked out pretty well. Training mainly focused on the provisioning of the customer premises equipment (CPE), which is quite different from our existing equipment. If anyone is familiar with RPR or DSL provisioning, it's along the same lines. Instead of assigning the CPE an SSID, setting up a MIR (maximum information rate) and/or CIR (committed information rate), and away you go, this system applies service and customer profiles. Also, there is no direct TCP/IP connectivity via wireless to the CPE itself, so this is something else we will need to deal with.
Before anything can happen, we needed to set up various service and customer profiles. These profiles determine upload/download rates, QoS and VLAN settings, etc. This requires a bit more pre-planning than our existing equipment where we could set up MIR/CIR rates on a per radio basis. One way this becomes an issue is a scenario (one that we deal with on a regular basis) where you have a tower that has 40 customers and one customer starts a bunch of bit-torrent downloads that quickly saturate the radio and in the end drag down the other customers. Normally, we just log into the base station radio and turn down the offending customer's speed until they stop. In this new system, we would need to create a service profile, possibly a QoS profile, and a customer profile for each customer.
There are two ways to get a CPE provisioned with the system, manual and RADIUS-based. With the RADIUS setup (the preferred method), the CPE is programmed with a username/password combination and the base station is configured to connect to a RADIUS server. The RADIUS server is then configured with your users and the services they are configured for. So when the CPE associates to the base station using a specific username/password, the base station will check the RADIUS server to see if the login is valid, and if it is, the RADIUS server will spit back what service(s) that customer has.
The manual method is a bit more complicated. The CPEs will, by default, associate to any base station regardless of provider, assuming that the base station is operating wide open. The CPE will associate in "temporary mode" with a default service profile and then the ISP operator will need to manually assign the CPE to "permanent mode" and then assign the appropriate service profiles. So every time a customer is installed, the installers will need to call the operator to have the radio permanently provisioned.
We are still debating over the provisioning method. Before the training, we had pretty much written off the RADIUS method due to internal implementation issues. During the training, when we actually did the manual provisioning, we saw what a PITA it is and started seriously looking at the RADIUS method. Now, we currently have scenarios where more than one service profile needs to be applied to a user (i.e., a VoIP customer will have two profiles where a non-VoIP customer would only have one profile). When we tried to apply multiple service profiles to a CPE via RADIUS, it wouldn't work. We have a ticket open with Alvarion concerning this. If we cannot get the RADIUS issue resolved, our only hope is that we can automate the manual provisioning through SNMP.