Sky is the limit: 2008

Sunday, November 9, 2008

Interviews!!! - God Save Me

It has been a hectic three weeks of interviewing at my office. We have been hiring heavily for the design and development of some new products in Fixed Mobile Convergence. We have been looking for very good C and C++ programmers with some telecom background and 1-8 years of experience. Personally, I don't put much stress on which protocols people have worked on. If a person is a very good programmer and has good design skills, he can learn any protocol and design scalable solutions for telecom. But sadly the industry around us in Bangalore, especially the bigger services companies, doesn't seem to think so. Here are my observations of people from big companies who claim fancy things in their resumes.

1. Almost all of the resumes that we receive project the candidate as having worked in design, development and deployment. Ask them to design solutions for simple problems and they can't even understand the requirements. I am not talking about big system requirements here. I am talking about small programming problems with some scope for a design challenge.

2. The more experienced a candidate is, the worse his programming skills are. People rejoice in claiming to be "Leaders" / "Managers". It's been a nightmare for a smaller company like ours to find good developers. A small company survives because of the good software it writes and markets, not because of the excellent people managers it has.

3. Almost everyone attaches the word C++ if they know C programming. People think that C++ is just an enhanced syntax of C. Having said this, I am ready to hire a person who is excellent at programming in C and can independently develop software, even if he doesn't know object oriented design principles and programming. A person who can think through programming logic can easily pick up object oriented design concepts.

4. Anyone who has seen a TCP socket program (forget about writing TCP clients / servers) claims he is an expert in TCP / IP. No one cares about how a TCP server is designed for concurrency and scalability, how queue sizes can be optimally designed, etc. What good are subjects like "Operations Research" that one learns in engineering, then?

5. Designation matters more to them than how good they are at solving problems.

6. People who claim to have used a particular tool or framework are only API users. They fail to understand the internal design of those tools. For example, I came across a few resumes that boasted of expertise in ACE (Adaptive Communication Environment). I am not an expert in ACE myself, and neither am I an expert in design patterns. But I at least know some of the design philosophies of ACE. These self-proclaimed ACE experts can't even explain what a factory pattern is.

7. This is especially about Java developers. Somehow our Java textbooks teach it all wrong. Almost everyone who has done only Java programming thinks a "Thread" is a feature of the language, rather than something implemented by the run-time and the operating system underneath.
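
To make the queue sizing remark in point 4 a little more concrete, here is a minimal sketch of the kind of Operations Research calculation I have in mind. It assumes the simplest textbook model, an M/M/1/K queue (Poisson arrivals, exponential service times, a single worker, room for K requests in total), which a real TCP server only approximates. The function names and the sample rates are made up for illustration.

```python
def blocking_probability(arrival_rate, service_rate, capacity):
    """Probability that an arriving request finds an M/M/1/K system full.

    capacity is K, the total number of requests the system can hold
    (the one being served plus those waiting in the queue).
    """
    rho = arrival_rate / service_rate  # offered load
    if rho == 1.0:
        return 1.0 / (capacity + 1)
    return (1.0 - rho) * rho ** capacity / (1.0 - rho ** (capacity + 1))


def smallest_capacity(arrival_rate, service_rate, max_loss, limit=10000):
    """Smallest K whose blocking probability is at most max_loss, or None."""
    for capacity in range(1, limit + 1):
        if blocking_probability(arrival_rate, service_rate, capacity) <= max_loss:
            return capacity
    return None


# Hypothetical numbers: 80 requests/s arriving, 100 requests/s served,
# and at most 1% of requests allowed to be turned away.
print(smallest_capacity(80.0, 100.0, 0.01))
```

The point is not this particular model but the habit: pick a queue size from the expected load and the acceptable loss, rather than from a guess.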

I desire to go into the teaching profession one day and teach software development and design to college grads. Such fruitless interviews are only strengthening that desire. I know we severely lack quality teachers at the undergraduate level who teach good programming skills. But apart from that, what I feel is that we as a society somehow attach greater importance to status and designation than to the effective work one actually does. I would love to teach young grads the importance of learning and problem solving rather than how to get designations.

Thursday, October 16, 2008

Is Google the next Bell Labs?

If you had asked in the late eighties and early nineties, "Which organization leads in research and innovation in computer science (and for that matter most of the science disciplines)?", anyone would have replied "Bell Labs". With the merger of Alcatel and Lucent in 2006 and Alcatel-Lucent subsequently announcing that it would stop funding research in fundamental science at Bell Labs, it is no longer the hub of innovation in computer science. The Plan 9 and Inferno distributed OS projects haven't got the traction in the market that they ought to have. So where are all the great minds that worked on Operating Systems and Distributed Systems at Bell Labs now?

Google for Robert Pike and Ken Thompson and you will know where they are :-) The list of top notch computer scientists at Google doesn't stop there. Vinton Cerf is also at Google. Is Google redefining research in distributed communications and continuing from where Bell Labs left off?

Sunday, August 31, 2008

Java vs Twenty-Twenty

I have always held this opinion: teaching Java as the first programming language to a Computer Science student, even before teaching the fundamentals of computers and computing, is not going to produce quality developers and computer science professionals. Of late I have started comparing this with how 20-20 is spoiling cricket.

Why is teaching Java as the first programming language a Twenty-Twenty affair? Simply because when you throw open a gamut of libraries and ask a developer to use them to program his application, without teaching the fundamentals of algorithms, why libraries are needed, how to develop modular and object oriented systems, etc., he forgets to think about what happens behind the scenes: how those libraries are implemented, in what thread context a particular interface method is invoked by the run-time, how Java threads are implemented, how they are scheduled, and so on.

Compare this with introducing Twenty-Twenty to young kids who are learning to play cricket. They will start forgetting that the fundamentals of cricket lie in a solid defense, strong temperament, concentration and the will power to last against the odds. A classic case is Yuvraj Singh, who seems all lost when facing quality opposition in Test cricket.

I don’t have anything against Java as a programming language. But when I see that engineers claiming to be “expert Java programmers” lack even the basic skills, it becomes very tough to manage a product with them. They struggle more and more while troubleshooting issues. Even if they go through the Java documentation, they don’t understand how it works. It is for this reason that I discourage recruiting the so-called “Java experts” without sound fundamentals.

Saturday, August 30, 2008

Why is it still a distant dream to get more Olympic medals for India?

Everyone is happy that India, for the first time in history, got more than 2 medals in an Olympics, and the media are gaga over it. They have even started predicting 10+ medals for India in London 2012. Though I too am confident that India will get there one day, I am skeptical that it will happen within four years. Here are my reasons.

We keep cribbing about the government not spending enough on sporting infrastructure. Where will the money come from when the taxes collected are not sufficient even for our basic needs and defense?

Does our Indian education system instill in our citizens the value of being duty bound towards our country and paying our taxes? Does it give importance to teaching the moral and ethical responsibilities of a citizen at the school level? It all starts with the exam system, which gives more importance to marks than to knowledge and hence pushes the student to get them by all means, even unethical ones. A valueless education produces tons of graduates, and our GDP is also increasing, but the government's income is still low. How can the government spend on sporting infrastructure when there are hundreds of basic necessities to be fulfilled?

The second malady of our exam based education system is that it inherently discourages sports. In the mad race to secure a perfect 100 in all subjects in the board exams, who will look at practicing sports?

Unless a change happens at this grass roots level, we can only dream of becoming a sporting super power. We will produce a few gems here and there, but they will come up despite the system and not due to it. Miles to go ... but let's keep the hope ...

Tuesday, July 29, 2008

Evaluating COTS Telecommunication Middleware

The market for Commercial Off The Shelf (COTS) middleware for high availability and management of telecommunication applications, along with modular compute platforms like ATCA and carrier grade operating systems like Carrier Grade Linux (CGL), is gaining momentum. Developers working in the telecom industry, whose expertise mainly lies in the development of protocol stacks and telecommunications applications, are often clueless as to how to evaluate this commercial off the shelf middleware. "Managers" :-) believe that a "Carrier Grade HA middleware" is a magic box which, when integrated with their telecom application, makes it five nines available overnight. But the fact is that COTS middleware is just a platform with which one has to integrate one's application. The onus of designing how the application needs to be deployed to achieve high availability still lies with the application developer. These COTS frameworks only provide mechanisms to incorporate those designs. Think of the COTS middleware as an "Operating System". An Operating System like Windows / Linux is, as such, of no use to the user. It's the applications built on top of it that the user is using. Similarly, the COTS HA middleware is just a platform.
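
To put "five nines" in perspective, the arithmetic can be sketched as below. The helper names are mine, and the second function assumes two independent nodes with an instantaneous, always-successful switchover, which is exactly the part the application design has to earn.

```python
def allowed_downtime_minutes(availability, days=365):
    """Downtime budget per year, in minutes, for a given availability."""
    return (1.0 - availability) * days * 24 * 60


def pair_availability(node_availability):
    """Availability of a 1:1 pair under the (optimistic) assumptions that
    the nodes fail independently and switchover takes no time."""
    return 1.0 - (1.0 - node_availability) ** 2


for label, availability in (("three nines", 0.999), ("five nines", 0.99999)):
    minutes = allowed_downtime_minutes(availability)
    print(f"{label}: about {minutes:.2f} minutes of downtime per year")

print(f"ideal 1:1 pair of three-nines nodes: {pair_availability(0.999):.6f}")
```

Five nines leaves a little over five minutes of downtime in a whole year, so every second of real switchover time and every unreplicated piece of state eats into that budget.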

As a developer who has worked on the development of a carrier grade SA Forum compliant middleware and has also architected / designed some proprietary HA and management frameworks, I would like to bring out my own list of questions for evaluating any commercial off the shelf middleware.

1. Does the COTS middleware provide a tool to model the application in terms of services, their redundancy policy, etc. (preferably in an SA Forum compliant way)? What redundancy models does it support? 1:1? N+K? N-Way active load balanced?

2. Has any legacy network element been integrated with the middleware? What redundancy model was followed for it? What was the cycle time taken to turn the legacy software into an HA product using your middleware?

3. What operating systems and hardware platforms are supported?

4. What is the measured switchover time from the active machine to the standby machine once a failure is detected by the HA middleware? How is the switchover indicated to the standby application? Does the middleware provide a configurable interface (preferably data driven / script based) for defining the actions to take on receiving a switchover indication?

5. What is the replication mechanism followed by the HA middleware? Message based redundancy? Checkpointing? Distributed shared memory? Explain its performance in terms of the number of messages replicated / checkpointed per second with an average message size of 64 bytes.

6. Does the middleware provide a management framework? If so, what is the interface provided by the framework? What is the database used for storing managed objects? Is it a proprietary database or a commercial database? What is the model followed by the database, i.e. is it relational or hierarchical object oriented? How are the managed objects addressed in the database? Also, does the management framework provide any IDE to model the management information, i.e. does it provide a management information modeling tool?

7. How does one integrate an existing application MIB with the management middleware? If the internal managed object database follows a hierarchical object oriented model, how do you map SNMP OIDs (which follow a relational database model) to the managed object instances in the hierarchical database?

8. How are the events on hardware (boards, hot-swap events) mapped to state transitions on the related managed objects?

9. How can one define cross consistency check rules between MO attributes in the information model for management? Does the middleware provide any tool to define it? Also does the management framework handle the backend consistency check logic once the rule is defined through the tool without the need for writing any code?

10. Does the HA middleware provide location transparent communication between the service units defined? If service unit A needs to send a message to service unit B, will it be possible to address service unit B without knowing its physical whereabouts? What is the transport protocol used? Is the location transparent mechanism a sort of "naming service" over TCP / UDP transport, or is location transparency supported at the transport protocol itself (TIPC)? If it is supported through a "naming service", what is the messaging latency introduced by the overhead of a naming service layer over the transport?

11. Is it possible to model fault recovery policies in the HA middleware? For example if one wants to associate the continuous receipt of critical alarms from say an SS7 card to a service failover, is it possible to do it?

12. How are faults (alarms) from various service units (service unit could be a software or it could be a hardware proxied by a software) modeled and how are they associated with the MO classes in the model? At run time how is an alarm instance associated with an MO instance?

13. Does the management framework support the delivery of Attribute value change notification (AVCN), state change notification (SCN), object create notification (OCN) and object delete notification (ODN) on the managed objects to the northbound management stations?

14. How is the mapping from alarms to MO state transitions modeled?

15. Is it possible to define northbound interface level access permissions for each attribute while modeling? For example I would like to model certain attributes as RO to SNMP interface but RW to CLI interface. Is this possible?

16. Is the management service realized as a separate process? If so, how are queries to "run time" object attributes that are owned by the call processing services handled? For example, if one defines the current active calls table as an MO class in the model (which is Read Only to the operator), the MO class can have 1000s of instances (dynamically changing) at run time. If the operator at some point does a "getbulk" on this table, does the management service issue a query to the call processing service (involving an IPC)? How does the call processing application perform during such a query?

17. What are the types of searches supported on the MO tree? Depth first? Breadth first?

18. Is it possible to configure delivery of critical alarms to email addresses and SMS numbers in the management framework? This is not a critical requirement for a management framework used in the network element. Many NMS (Network Management Systems) that collect management data from the cluster of network elements they manage provide this facility. But having this feature in the network element's management framework is on my wish list.

19. Does the alarm management service persist the current active alarm list? Is the alarm reporting compliant with the ITU-T X.733 specification?

20. Has the managed object database replication in a 1:1 configuration been measured for performance? How many bytes can be replicated per second? Is the replication done synchronously or asynchronously?

21. Does the management database provide transaction semantics? That is, if a configuration change needs to be applied to more than one node, it should happen in an "all or none" fashion.

22. Does the middleware provide mechanisms to define software upgrade and firmware upgrade policies and a smooth integration path with the management model? What is the HA state transition followed while upgrading? How does it ensure minimum downtime upgrade?
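
As an illustration of what question 21 is asking for, here is a minimal sketch of "all or none" configuration across nodes. The Node class, its validate / apply methods and the attribute names are hypothetical and do not belong to any particular middleware; a real product would also have to cope with nodes failing in the middle of the commit.

```python
class Node:
    """Hypothetical managed node holding a configuration dictionary."""

    def __init__(self, name, config, read_only=()):
        self.name = name
        self.config = dict(config)
        self.read_only = set(read_only)  # attributes this node refuses to change

    def validate(self, change):
        """Phase 1: check the change without applying it."""
        return self.read_only.isdisjoint(change)

    def apply(self, change):
        """Phase 2: apply a change that was validated earlier."""
        self.config.update(change)


def apply_all_or_none(nodes, change):
    """Apply `change` to every node, or leave every node untouched."""
    if not all(node.validate(change) for node in nodes):
        return False
    snapshots = [dict(node.config) for node in nodes]
    try:
        for node in nodes:
            node.apply(change)
    except Exception:
        # Roll every node back to its snapshot if any apply step fails.
        for node, snapshot in zip(nodes, snapshots):
            node.config = snapshot
        return False
    return True


active = Node("active", {"max_calls": 1000})
standby = Node("standby", {"max_calls": 1000}, read_only={"max_calls"})
print(apply_all_or_none([active, standby], {"max_calls": 2000}))
print(active.config, standby.config)
```

Here the standby rejects the change during validation, so the active node is never touched. The question to put to the vendor is whether the middleware gives you this behaviour out of the box or leaves it to application code.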

Uuuuh, quite a mouthful of evaluation criteria. If any COTS middleware supports at least 60% of this, let me know. I am interested in evaluating it. I am most concerned about ease of integration, manageability and the performance of the message replication mechanism.

Wednesday, July 2, 2008

Such a simple life

Last weekend I went to Munnar and visited the Chinnar wildlife sanctuary on the way back to Bangalore. The forest department at Chinnar allows treks into the jungle to spot animals in their natural habitat. But the trek has to be guided by an 'adivasi' - one of the native people of the forest.

We were guided by an adivasi boy, "Selvam". During the trek I chatted with him, asking about the kinds of animals one can spot and the lifestyle of the tribes. I was amazed at his simplicity and attitude towards life. This one sentence from him summarizes everything:

"Ungala maadhiri padichavinga sondha oora vittu veli ooruku poi pala makkala sandhichu avangala pathi therijukareenga. Enagalukku indha kaadu dhan ellam. Naanga inga kaata pathi therijukarom. Pala thara patta makkala pathi naan inga irundhe theinjukarom, yenna avanga inga varanga"

(Educated people like you leave your native place, go out to where the opportunity is and learn about various people by meeting them. For us this jungle is everything. We study this jungle. We learn about other people's cultures by sitting here, because everyone comes here.)

He may be illiterate, but in my opinion he is "educated". The only difference is that his domain is the jungle and its animals. He was quite knowledgeable about the behavior of animals and when they would cross a particular place.

When the trek finished we had just one word to say to him: Thanks !!!

Saturday, June 21, 2008

Wah!!! What a Customer Service

I want to share my experience with BSNL customer service. Last month I shifted house to New Thippasandra and gave a letter asking for my BSNL connection to be transferred to the new address. The letter was given on the 26th of April, 2008, at the Indiranagar BSNL office (on 80 ft road). Initially they told me that the transfer would be done in 3 days. One week passed and there was still no sign of the connection. On 4th May I decided to visit the BSNL office and enquire about the status. The person at customer service (on the 3rd floor) told me that my old connection was yet to be disconnected, and right in front of my eyes she keyed in the commands to place that order in the system, only on 4th May (and that too because I visited; God knows when they would have placed the order if I hadn't). I visited them again a couple of days later. This time they told me that my old connection had been disconnected and that, to get a connection at the new address, I had to contact the engineer who deals with the Thippasandra area. I went and met that engineer and told him that I was waiting for my connection. He noted it down on a piece of paper and told me that it would be done shortly. I also gave him my mobile number and asked him to call me before coming home to give the connection. He noted down my number on the same piece of paper.

Now the great customer service of BSNL starts. The engineer who took down the order went on a week long vacation. No one other than him knew that he had taken down the order. There is no centralized system to log customer requests. I visited BSNL again a couple of days later. This time another engineer took down my order, again on a piece of paper, but again there was no follow up. This continued for another week, and by that time I had visited BSNL 4 times; each time someone took down the order but no one executed it. Finally I got fed up and decided to lodge a complaint with the Bangalore East Divisional Engineer. I went to his office the next week, but he was not there when I went. So I met another officer sitting at the front desk of the Divisional Engineer's office. I explained my frustration to him and told him that no one was logging the complaint in a centralized system. He gave a great response, which is the highlight of their customer service:

"NO SIR. THERE IS A CENTRALIZED SYSTEM TO LOG IN. THERE IS A TOLL FREE NUMBER YOU CAN DIAL AND LOGIN YOUR COMPLAINT. THIS NUMBER IS GIVEN IN YOUR TELEPHONE DIRECTORY."

Hearing this, I laughed my guts out. The toll free number can only be dialed if I have a connection. I need not explain beyond this.

(Note: Though I got such a bad response, at least this guy was helpful. I explained the fault in his statement to him, and he immediately called the engineer and ordered him to give the connection. The next day I got it.)

Sunday, June 15, 2008

Operations issues to set right in a startup

It has been 3 years since I left the comforts of a big organization and started working in startup mode, firefighting every day. My learning curve in the past three years has been phenomenal, both technically and operations wise. Here is what I have learnt about the operations related issues a startup company should address.

1. Lab and network infrastructure - Don't underestimate this. A poor network and lab infrastructure will blow up in your face just when you don't want it to. In my experience I came across a unique situation where the hard disks of the servers in the lab were getting wiped out just because the AC in the lab was not adequate: the temperature in the lab was soaring beyond 35 degrees centigrade, and near the servers it felt like a hot tandoori oven. Outsource your infrastructure maintenance to people who know it best rather than fighting it daily yourself.

2. Developers are your primary resources - Value them. Treat the developers' comfort as being of utmost importance. Not every startup can provide a cozy office for its developers, but at least see to it that it's not a cramped space. Giving developers laptops (rather than desktops) and the freedom to sit and work wherever they want (a bean bag office environment) will save you cost and also make the developers comfortable.

3. Power - A place like Bangalore needs the utmost attention in this regard. Make sure you go to a trusted vendor for the UPS, and also get a consultant to help you set up the power connections and proper earthing.

4. Wiring in conference rooms - This is the place where I hate to have wires floating around here and there. If 10 people sit in the room and all have their laptops connected to the network through wires, imagine the mess when someone gets up and walks around. Get a WiFi connection for your conference room. Also run the cable from your projector to the power socket through sealed cabling. This is another candidate for tripping people.

5. Have a concierge - Developers need undisturbed time to think and design. Let them not worry about paying their utility bills or getting the weekend movie ticket. Have a concierge set up to help them out. It doesn't cost the company much to set up a concierge.

6. Keep your hiring standards high - Never compromise on this, even if you are hiring for a maintenance / support job. If you have made your first release to the market and moved into support mode, don't think you can afford to lower your recruiting standards to get support staff. After all, they are the ones who are going to fix your bugs. Having a bad developer fix your bugs is asking for disaster. They will ultimately screw up your fundamental design, and the entire code will become nothing but a patchwork.

Friday, June 13, 2008

Solaris Kernel SCTP Issue

We recently came across an interesting issue with Solaris kernel SCTP. We observed that if you write an SCTP server program and then initiate thousands of connections from an SCTP client program in a loop, the server is able to establish the SCTP associations. The point to note here is that, to simulate thousands of connections from the client, the client program loops, creating a socket and binding it to the client machine's IP address and a different port number each time. With this setup we were able to create thousands of associations towards the server. But when tested in a live setup where thousands of client machines (each a different machine and hence a different IP address) try to connect to the SCTP server, the Solaris kernel SCTP goes into high CPU utilization of nearly 100%. Interestingly, all the clients were bound to the same port number. If we change the port numbers used by the client machines, the CPU utilization at the server comes down to normal.

My hunch here is that kernel SCTP's hashing mechanism for indexing the hash table of SCTP associations uses only the remote client's port number to generate the hash value, and not the remote port + IP address combo. So if all clients are bound to the same port number, the hashing algo will generate the same hash value for all of them. All the client connections then collide in the same bucket, effectively degrading the lookup to O(n) or O(log n), depending on the collision resolution algo being used.
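
The hunch can be illustrated with a toy hash table. The two bucket functions below are made up for illustration and are not the actual Solaris hash; the addresses, port numbers and table size are likewise assumptions.

```python
from collections import Counter

NBUCKETS = 1024  # assumed table size, for illustration only


def bucket_port_only(ip, port):
    """Hypothetical hash that ignores the remote IP address."""
    return port % NBUCKETS


def bucket_ip_and_port(ip, port):
    """Hypothetical hash that mixes the remote IP address and port."""
    return (ip * 31 + port) % NBUCKETS


def longest_chain(clients, bucket_fn):
    """Length of the longest collision chain for a list of (ip, port) clients."""
    counts = Counter(bucket_fn(ip, port) for ip, port in clients)
    return max(counts.values())


BASE_IP = 0x0A000001  # 10.0.0.1 written as an integer
N = 4096

# Lab test: one client machine, a different source port per association.
lab = [(BASE_IP, 20000 + i) for i in range(N)]
# Live setup: a different machine per association, all bound to one port.
live = [(BASE_IP + i, 5000) for i in range(N)]

for name, clients in (("lab", lab), ("live", live)):
    print(name,
          "port-only:", longest_chain(clients, bucket_port_only),
          "ip+port:", longest_chain(clients, bucket_ip_and_port))
```

In the lab case both functions spread the associations evenly. In the live case the port-only function puts every association in one bucket, so each lookup walks a chain as long as the number of clients, which would match the CPU behaviour we saw.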

Refer to this thread for the complaint.
http://forum.java.sun.com/thread.jspa?threadID=5288890

Saturday, May 31, 2008

Startup City @ Bangalore

Wow!!! was the expression I had when I came out of the Startup City event organized by SmartTechie magazine, the first of its kind in Bangalore. The event was meant to create awareness among technology enthusiasts about various successful startups in Bangalore and to encourage the entrepreneurial spirit in them. It was held at the NIMHANS convention center, Dairy Circle, Bangalore on the 24th of May, 2008. It was a whole day event. There were talks and sessions given by leading entrepreneurs and technology evangelists in the morning. There were also parallel sessions in the morning where folks who had an idea for starting a company could meet VCs one on one and discuss their plans. This, in my opinion, was a great opportunity for many budding entrepreneurs.

Among the talks given in the morning session, the one that impressed me was Lead India winner R.K.Mishra’s talk on his entrepreneurial journey and his subsequent interest in community service and social work. True to his statement in the Lead India finals, “That is why my motto is ‘work more, talk less’”, he really is a man of action. For the rest visit his blog and his website.

The afternoon session was open for us to go and interact with the startups who had put up stalls there. Due to a huge crowd I was not able to spend time with all of them and discuss the technologies they work on. But a few of the stalls I visited impressed me.

OboPay’s concept of using the mobile as a virtual debit card to handle transactions was good. But I am not sure about their revenue model. As per their explanation, they don't charge the user anything for sending the SMS to Obopay’s SMS short code to transfer money from one’s account to the receiver’s account. Their revenue model at this time seems to consist only of entering into partnerships with banks. I am not sure this is a sustainable revenue model, as the banks may not see a value add after some time unless Obopay provides other value added services in the interests of the banks with which they are tied up. Also, I am not clear on how secure their mode of transfer is. They have a few questions to answer. A similar service is provided by mCheck also. Check out their service here. This one looks impressive.

Soliton’s demonstration of its camera that identifies defective parts in a manufacturing pipeline was very impressive. They demonstrated it by letting the audience draw a circle by hand and having the camera detect whether it was a perfect circle or not (mathematically speaking, since there is no exact value for “pi”, there is no perfect circle yet). Being a Coimbatorean, I have great admiration for Soliton, as it was started by a Coimbatorean and has a development and support centre in Coimbatore too.

Though most of the startups were in Web 2.0, what I was interested in was the startups in the telecom domain. I went around and collected some material on Sloka Telecom and Starent Networks. Sloka is working in the WiMAX space, and Starent is working on solutions for 3G services and IMS, and on their unique ground up solution called “Inline services”, which enables operators to define value added services inline with the core network elements. This approach is quite different from some of the out of the “box” (box here means the network element) mechanisms that many companies are still struggling to define for customized services.

And finally, the negatives. The ugly face of Bangalore’s infrastructure showed up here too. There were frequent power cuts during the morning sessions, and the audience was left sitting in the dark auditorium far too often.
