Thursday, October 16, 2008

Is Google the next Bell Labs?

If you had asked in the late eighties or early nineties, "Which organization leads research and innovation in computer science (and, for that matter, in most of the science disciplines)?", almost anyone would have replied "Bell Labs". With the merger of Alcatel and Lucent in 2006 and Alcatel-Lucent's subsequent announcement that it would stop funding fundamental science research at Bell Labs, it is no longer the hub of innovation in computer science. The Plan 9 and Inferno distributed OS projects haven't got the traction in the market they ought to have. So where are all the great minds that worked on operating systems and distributed systems at Bell Labs now?

Google for Robert Pike and Ken Thompson and you will know where they are :-) The list of top-notch computer scientists at Google doesn't stop there. Vinton Cerf is also at Google. Is Google redefining research in distributed communications and continuing what Bell Labs left off?

Sunday, August 31, 2008

Java vs Twenty-Twenty

I have always held this opinion: teaching Java as the first programming language to a computer science student, even before teaching the fundamentals of computers and computing, is not going to produce quality developers and computer science professionals. Of late I have started comparing this with how Twenty-Twenty is spoiling cricket.

Why is teaching Java as the first programming language a Twenty-Twenty affair? Simply because when you throw open a gamut of libraries and ask a developer to program his application with them, without teaching the fundamentals of algorithms, why libraries are needed, or how to develop modular and object-oriented systems, he forgets to think about what happens behind the scenes: how those libraries are implemented, in what thread context the run-time invokes a particular interface method, how Java threads are implemented, how they are scheduled, and so on.

Compare this with introducing Twenty-Twenty to young kids who are learning to play cricket. They will start forgetting that the fundamentals of cricket lie in a solid defense, strong temperament, concentration and the will power to last against the odds. A classic case is Yuvraj Singh, who seems all lost when facing quality opposition in Test cricket.

I don't have anything against Java as a programming language. But when engineers claiming to be "expert Java programmers" lack even the basic skills, it becomes very tough to manage a product with them. They increasingly struggle while troubleshooting issues. Even if they go through the Java documentation, they don't understand how things work. It is for this reason that I discourage recruiting so-called "Java experts" without sound fundamentals.

Saturday, August 30, 2008

Why is it still a distant dream to get more Olympic medals for India?

Everyone is happy that India, for the first time in history, got more than 2 medals in an Olympics, and the media are gaga over it. They have even started predicting 10+ medals for India in London 2012. Though I too am confident that India will get there one day, I am skeptical that it will happen within four years. Here are my reasons.

We keep cribbing about the government not spending enough on sporting infrastructure. Where will the money come from when the taxes collected are not sufficient even for our basic needs and defense?

Does our Indian education system instill in our citizens the value of being duty bound towards our country and paying our taxes? Does it give importance to teaching the moral and ethical responsibilities of a citizen at school level? It all starts with the exam system, which gives more importance to marks than to knowledge and hence pushes the student to get them by any means, even unethical ones. A value-less education produces tons of graduates, and our GDP is increasing, but the government's income is still low. How can the government spend on sporting infrastructure when there are hundreds of basic necessities to be fulfilled?

The second malady of our exam-based education system is that it inherently discourages sports. In the mad race to secure a perfect 100 in all subjects in the board exams, who will look at practicing sports?

Unless a change happens at this grass-roots level we can only dream of becoming a sporting superpower. We will produce a few gems here and there, but they will come up despite the system and not because of it. Miles to go .... but let's keep the hope...

Tuesday, July 29, 2008

Evaluating COTS Telecommunication Middleware

The market for Commercial Off The Shelf (COTS) middleware for high availability and management of telecommunication applications, along with modular compute platforms like ATCA and carrier grade operating systems like Carrier Grade Linux (CGL), is gaining momentum. Developers in the telecom industry, whose expertise mainly lies in developing protocol stacks and telecommunications applications, are often clueless about how to evaluate this commercial off the shelf middleware. "Managers" :-) believe that a "Carrier Grade HA middleware" is a magic box which, when integrated with their telecom application, makes it five nines available overnight. But the fact is that COTS middleware is just a platform with which one has to integrate the application. The onus of designing how the application needs to be deployed to achieve high availability still lies with the application developer. These COTS frameworks only provide mechanisms to incorporate those designs. Think of the COTS middleware as an "Operating System". An operating system like Windows or Linux is, as such, of no use to the user. It's the applications built on top of it that the user is using. Similarly, the COTS HA middleware is just a platform.
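To put "five nines" in perspective, here is a small sketch of the arithmetic behind the term. It assumes a 365-day year, and the function name is my own and purely illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a non-leap, 365-day year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    for label, percent in [("three nines", 99.9),
                           ("four nines", 99.99),
                           ("five nines", 99.999)]:
        budget = allowed_downtime_minutes(percent)
        print(f"{label} ({percent}%): {budget:.2f} minutes of downtime per year")
```

Five nines leaves a budget of a little over five minutes of downtime a year, and that budget has to cover failure detection, switchover and upgrades together.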

As a developer who has worked on a carrier grade, SA Forum compliant middleware and has also architected and designed some proprietary HA and management frameworks, I would like to bring out my own list of questions for evaluating any commercial off the shelf middleware.

1. Does the COTS middleware provide a tool to model the application in terms of services, their redundancy policy, etc. (preferably in an SA Forum compliant way)? What redundancy models does it support? 1:1? N+K? N-Way active load balanced?

2. Has any legacy network element been integrated with the middleware? What redundancy model was followed for it? What was the cycle time taken to turn the legacy software into an HA product using your middleware?

3. What operating systems and hardware platforms are supported?

4. What is the measured switchover time from the active machine to the standby machine once a failure is detected by the HA middleware? How is the switchover indicated to the standby application? Does the middleware provide a configurable interface (preferably data driven / script based) for defining the actions to take on receiving a switchover indication?

5. What replication mechanism does the HA middleware follow? Message based redundancy? Checkpointing? Distributed shared memory? Explain its performance in terms of the number of messages replicated / checkpointed per second with an average message size of 64 bytes.

6. Does the middleware provide a management framework? If so, what interface does the framework provide? What database is used for storing managed objects? Is it a proprietary database or a commercial one? What model does the database follow, i.e., is it relational or hierarchical object oriented? How are the managed objects addressed in the database? Also, does the management framework provide an IDE to model the management information, i.e., does it provide a management information modeling tool?

7. How does one integrate an existing application MIB with the management middleware? If the internal managed object database follows a hierarchical object oriented model, how do you map SNMP OIDs (which follow a relational database model) to the managed object instances in the hierarchical database?

8. How are the events on hardware (boards, hot-swap events) mapped to state transitions on the related managed objects?

9. How can one define cross consistency check rules between MO attributes in the information model for management? Does the middleware provide any tool to define it? Also does the management framework handle the backend consistency check logic once the rule is defined through the tool without the need for writing any code?

10. Does the HA middleware provide location transparent communication between the service units defined? If service unit A needs to send a message to service unit B, is it possible to address service unit B without knowing its physical whereabouts? What transport protocol is used? Is the location transparent mechanism a sort of "naming service" over a TCP / UDP transport, or is location transparency supported by the transport protocol itself (as with TIPC)? If it is supported through a "naming service", what messaging latency does the naming service layer add over the transport?

11. Is it possible to model fault recovery policies in the HA middleware? For example if one wants to associate the continuous receipt of critical alarms from say an SS7 card to a service failover, is it possible to do it?

12. How are faults (alarms) from various service units (a service unit could be software, or hardware proxied by software) modeled, and how are they associated with the MO classes in the model? At run time, how is an alarm instance associated with an MO instance?

13. Does the management framework support the delivery of Attribute value change notification (AVCN), state change notification (SCN), object create notification (OCN) and object delete notification (ODN) on the managed objects to the northbound management stations?

14. How is the mapping from alarms to MO state transitions modeled?

15. Is it possible to define northbound interface level access permissions for each attribute while modeling? For example I would like to model certain attributes as RO to SNMP interface but RW to CLI interface. Is this possible?

16. Is the management service realized as a separate process? If so, how are queries to "run time" object attributes that are owned by the call processing services handled? For example, if one defines the current active calls table as an MO class in the model (which is read only to the operator), the MO class can have 1000s of dynamically changing instances at run time. If the operator at some point does a "getbulk" on this table, does the management service issue a query to the call processing service (involving an IPC)? How does the call processing application perform during that time?

17. What are the types of searches supported on the MO tree? Depth first? Breadth first?

18. Is it possible to configure delivery of critical alarms to email addresses and SMS numbers in the management framework? This is not a critical requirement for a management framework used in the network element. Many NMSs (Network Management Systems) that collect management data from the cluster of network elements they manage provide this facility. But having this feature in the network element's management framework is on my wish list.

19. Does the alarm management service persist the current active alarm list? Is the alarm reporting compliant to ITU-T X.733 specification?

20. Has the managed object database replication in a 1:1 configuration been measured for performance? How many bytes can be replicated per second? Is the replication done synchronously or asynchronously?

21. Does the management database provide transaction semantics? That is, if a configuration change needs to be applied to more than one node, it should happen in an "all or none" fashion.

22. Does the middleware provide mechanisms to define software upgrade and firmware upgrade policies, and a smooth integration path with the management model? What HA state transitions are followed while upgrading? How does it ensure a minimum-downtime upgrade?
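To make the "all or none" requirement of question 21 concrete, here is a minimal in-memory sketch of a two-phase (prepare, then commit) configuration change. It is only an illustration under simplifying assumptions, not how any particular middleware does it: the `Node` class, its methods and the `timer` attribute are hypothetical, and the sketch ignores node crashes and network failures during the commit phase.

```python
class Node:
    """Hypothetical stand-in for one node's managed object store."""

    def __init__(self, name, config=None, accept=True):
        self.name = name
        self.config = dict(config or {})
        self.accept = accept   # set to False to simulate a node vetoing the change
        self._staged = None

    def prepare(self, change):
        """Phase 1: stage the change without applying it. Return False to veto."""
        if not self.accept:
            return False
        self._staged = dict(change)
        return True

    def commit(self):
        """Phase 2: apply the staged change."""
        self.config.update(self._staged)
        self._staged = None

    def abort(self):
        """Discard a staged change, leaving the configuration untouched."""
        self._staged = None


def apply_all_or_none(nodes, change):
    """Apply `change` on every node, or on none of them."""
    prepared = []
    for node in nodes:
        if node.prepare(change):
            prepared.append(node)
        else:
            # One veto is enough: roll back everything staged so far.
            for staged_node in prepared:
                staged_node.abort()
            return False
    for node in nodes:
        node.commit()
    return True


if __name__ == "__main__":
    active = Node("active", {"timer": 10})
    standby = Node("standby", {"timer": 10}, accept=False)
    print(apply_all_or_none([active, standby], {"timer": 30}), active.config)
    standby.accept = True
    print(apply_all_or_none([active, standby], {"timer": 30}), active.config)
```

When evaluating a product, the interesting part is exactly what this sketch leaves out: what happens if a node fails after it has agreed to the change but before it has applied it.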

Uuuuh, quite a mouthful of evaluation criteria. If any COTS middleware supports at least 60% of this, let me know. I am interested in evaluating it. I am most concerned about ease of integration, manageability and the performance of the message replication mechanism.
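For the replication performance figures asked for in questions 5 and 20, a rough measurement harness could look like the sketch below. The `replicate` callable is an assumption: it stands in for whatever replication or checkpoint call the middleware under evaluation exposes, and the in-memory list used in the example only shows the shape of the test, not realistic numbers.

```python
import time


def measure_replication_rate(replicate, message_count=100_000, message_size=64):
    """Call `replicate` with fixed-size messages; return (messages/sec, bytes/sec)."""
    payload = bytes(message_size)  # a zero-filled message of the requested size
    start = time.perf_counter()
    for _ in range(message_count):
        replicate(payload)
    # Guard against a zero reading on very coarse clocks.
    elapsed = max(time.perf_counter() - start, 1e-9)
    messages_per_second = message_count / elapsed
    return messages_per_second, messages_per_second * message_size


if __name__ == "__main__":
    standby_copy = []  # hypothetical stand-in for the standby's replica
    rate, throughput = measure_replication_rate(standby_copy.append)
    print(f"{rate:,.0f} messages/sec, {throughput:,.0f} bytes/sec")
```

A real evaluation would run this against the vendor's replication API on the target hardware, with the active and standby on separate machines, and would report synchronous and asynchronous modes separately.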

Wednesday, July 2, 2008

Such a simple life

Last weekend I went to Munnar and visited the Chinnar wildlife sanctuary on the way back to Bangalore. The forest department at Chinnar allows treks into the jungle to spot animals in their natural habitat. But the trek has to be guided by an 'adivasi' - one of the native people of the forest.

We were guided by an adivasi boy "Selvam". During the trek I was chatting with him asking about the kind of animals one can spot and the life style of the tribes. I was amazed at his simplicity and attitude towards life. This one sentence from him summarizes everything:

"Ungala maadhiri padichavinga sondha oora vittu veli ooruku poi pala makkala sandhichu avangala pathi therijukareenga. Enagalukku indha kaadu dhan ellam. Naanga inga kaata pathi therijukarom. Pala thara patta makkala pathi naan inga irundhe theinjukarom, yenna avanga inga varanga"

(Educated people like you leave your native place, go out to where the opportunities are and learn about various cultures by visiting places. For us this jungle is everything. We study this jungle. We learn about other people's cultures by sitting here, because everyone comes here.)

He may be illiterate, but in my opinion he is "educated". The only difference is that his domain is the jungle and its animals. He was quite knowledgeable about the behavior of the animals and when they would cross a particular place.

When the trek finished we had just one word to say to him: Thanks !!!

Saturday, June 21, 2008

Wah!!! What a Customer Service

I want to share my experience with BSNL customer service. Last month I moved house to New Thippasandra and gave a letter asking for my BSNL connection to be transferred to the new address. The letter was given on 26th April, 2008, at the Indiranagar BSNL office (on 80ft road). Initially they told me that the transfer would be done in 3 days. One week passed and there was still no sign of the connection. On 4th May I decided to visit the BSNL office and enquire about the status. The person at customer service (on the 3rd floor) told me that my old connection was yet to be disconnected, and right in front of my eyes she keyed in the commands to place that order in the system only on 4th May (and that too because I visited. God knows when they would have placed the order if I hadn't). I visited them again a couple of days later. This time they told me that my old connection had been disconnected and that, to get a connection at the new address, I had to contact the engineer who deals with the Thippasandra area. I went and met that engineer and told him that I was waiting for my connection. He noted it down on a piece of paper and told me that it would be done shortly. I also gave him my mobile number and asked him to call me before coming home to set up the connection. He noted down my number on the same piece of paper.

Now the great customer service of BSNL starts. The engineer who took down the order went on a week-long vacation. No one other than him knew that he had taken down the order. There is no centralized system to log customer requests. I visited BSNL again a couple of days later. This time another engineer took down my order, again on a piece of paper, but again there was no follow-up. This continued for another week, and by that time I had visited BSNL 4 times; each time someone took down the order but no one executed it. Finally I got fed up and decided to lodge a complaint with the Bangalore East Divisional Engineer. I went to his office the next week, but he was not there when I went. So I met another officer sitting at the front desk of the Divisional Engineer's office. I explained my frustration to him and told him that no one was logging the complaint in a centralized system. He gave a great response, which is the highlight of their customer service:

"NO SIR. THERE IS A CENTRALIZED SYSTEM TO LOG IN. THERE IS A TOLL FREE NUMBER YOU CAN DIAL AND LOGIN YOUR COMPLAINT. THIS NUMBER IS GIVEN IN YOUR TELEPHONE DIRECTORY."

Hearing this I was laughing my guts out. The toll free number can only be dialed if I have a connection. I need not explain beyond this.

(Note: Though I got such a bad response, at least this guy was helpful. I explained the flaw in his statement to him and he immediately called the engineer and ordered him to give the connection. The next day I got it.)

Sunday, June 15, 2008

Operations issues to set right in a startup

It has been 3 years since I left the comforts of a big organization and started working in startup mode, firefighting every day. My learning curve over the past three years has been phenomenal, both technically and operations wise. Here is what I have learned about the operations-related issues a startup company should address.

1. Lab and network infrastructure - Don't underestimate this. A poor network and lab infrastructure will blow up in your face just when you don't want it to. I once came across a unique situation where the servers in the lab suffered hard disk wipe-outs simply because the AC in the lab was not adequate: the temperature in the lab was soaring beyond 35 degrees centigrade, and near the servers it was like a hot tandoori oven. Outsource your infrastructure maintenance to people who know it best rather than fighting it yourself daily.

2. Developers are your primary resources - Value them. Developers' comfort is of utmost importance. Not every startup can provide a cozy office for the developers. But at least see to it that it's not a cramped space. Giving developers laptops (rather than desktops) and the freedom to sit and work wherever they want (a bean bag office environment) will save you cost and also make the developers comfortable.

3. Power - A place like Bangalore needs utmost attention in this regard. Make sure you go to a trusted vendor for the UPS, and also get a consultant to help you set up the power connections and proper earthing.

4. Wiring in conference rooms - This is the place where I hate to have wires floating around here and there. If 10 people sit in the room and all have their laptops connected to the network through wires, imagine the mess when someone gets up and walks around. Get a WiFi connection for your conference room. Also run the cable from your projector to the power socket through sealed cabling. This is another candidate for getting tripped over.

5. Have a concierge - Developers need undisturbed time to think and design. Don't let them worry about paying their utility bills or getting the weekend movie ticket. Have a concierge set up to help them out. It doesn't cost the company much to set up a concierge.

6. Keep your hiring standards high - Never compromise on this, even if you are hiring for a maintenance / support job. If you have made your first release to market and got into support mode, don't think you can afford to lower your recruiting standards to get support staff. After all, they are the ones who are going to fix your bugs. Having a bad developer fix your bugs is asking for disaster. They will ultimately screw up your fundamental design, and the entire code will be nothing but a set of patchwork.