Showing posts with label software development. Show all posts

Sunday, July 06, 2008

5 Steps to Becoming a Terrible Programmer

During your career, you will come across both great and poor programmers. While one may be tempted to group all poor programmers together, there are different levels of poor programmers. It is instructive to know the difference, because generally (especially if you have just joined a new company), you cannot go on a firing rampage, throwing out anybody you deem incompetent. You may have to tolerate some levels of incompetence and know what you can work with.

Here are the 5 levels of incompetence, in increasing order:

  1. No intelligence
  2. No interest
  3. No sense of responsibility
  4. No fear
  5. No self-respect

No Intelligence

A more intelligent developer means more productivity, more creativity and fewer mistakes. The reason why a particular developer is performing poorly may be that they are unable to comprehend or remember things as effectively as a more intelligent person. The population of programmers is, on average, more intelligent than the general population, but you will find some extraordinarily brilliant persons and some not so good.

Sometimes, the problem is that you are expecting something from a person that is beyond their capabilities. So, you have to tone down the challenges a bit. Reduce the multi-tasking. Provide systems that help them remember and keep notes. Establish standards. Use software and/or procedures like code reviews to help them in their work. They may perform adequately in their reduced responsibilities.

No Interest

A person of average intelligence, but with a high degree of interest in software, can be a very useful developer. When a person develops a special interest in a subject, they spend an undue amount of time tinkering with the subject and, by constant reinforcement, can learn the necessary techniques and tools. The only problem with a lot of knowledge and less intelligence is that the person cannot be depended on to find any creative solutions.

But when the person has no interest, you have to depend on external motivation and rewards to get the person to do anything. You may need to allocate separate time and funds for the person to improve their skills, because they hardly spend any time outside the office even thinking about their work. You should never assume that the passage of time has automatically made them more experienced and more capable.

No Sense of Responsibility

An external factor for many career programmers is a sense of obligation towards customers. The customer pays the bills and hence they are entitled to good quality software that performs to the customer's expectations. Or the developer may feel obliged to pull their weight for the team. When the programmer stops caring about the customer, the team or the company, they don't worry about productivity or quality. They just think about the paycheck that they receive without any moral qualms.

At this level, you cannot reward the programmer to do more. Instead, you can only use punishments to enforce quality. Fear of losing one's job could be a factor, especially in a bad economy. However, there are many problems for you as a manager. Until used, the firing option is simply a vague threat. You must have demonstrable differences of the fired persons with respect to the rest of the team. Finally, firing someone can have unpleasant side-effects on the project and the team.

No Fear

The talent-less and irresponsible programmer may still produce something if he/she knows that the work is being monitored and is aware of the consequences of bad quality (such as demotion or unemployment). Some persons may not even care about that. For example, during a labor shortage, some employees may be job-hopping and may not be worried if they lose a job. Or for reasons like nepotism or favoritism, the employee may never have to fear punishment.

At this stage and assuming you cannot fire them, you have nothing left except to beg the person to get something done. Appeal to anything that they may have left in their character. A more practical approach would be to assume that they don't exist, and then hire more people to fill in the gaps. That is pretty expensive, but that may be your only option.

No Self-Respect

You have nothing left. Period. If the person does not even care what others think of him or her, they will just do what they want. If they don't feel ashamed that their poor quality of work reflects upon them in front of their boss or their co-workers, it is impossible to do anything with them.

Final Thoughts

The last three factors (and perhaps even the second one) are a function of age. The older one gets, the more one feels the need for security, belonging and self-worth. It is also more difficult to change one's profession. People develop a greater pride in their work. So being a terrible programmer is not a static situation.


Sunday, June 22, 2008

Web List Functionality

Initial implementations of lists in a web application are generally very simple. As the list grows in size, more functionality is required. Here is an attempt at compiling the major functional requirements of a list. You may not use many of these functions, but it is useful to know what is possible.

Searching

image

There should be an easy way for users to search all fields of the entire list. The simplest implementation would be a single textbox with a Search button. Users may then want to search on specific fields. A straightforward method is to present separate input fields for each searchable field. This however limits the user to searching each field for one particular value. You may have to present additional fields for specifying ranges or conditional operators (like greater than or equal to).

image

For more complex conditions, it may be helpful to present a query language like a simplified version of SQL. Or better still, provide a query builder, where the user can create a complex query and execute it. This is preferable as it eases the learning curve from beginner to intermediate. The query builder should also provide for searching dynamic values such as "today", "current time", etc. and aggregate functions such as "total sales is greater than x".
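As a rough sketch of what a query builder might hand to the server, the conditions can be represented as (field, operator, value) triples, with "today" resolved at execution time. The operator names, the `run_query` function and the sample invoices are all assumptions made for illustration, not part of any particular framework.

```python
import operator
from datetime import date

# Hypothetical operator names mapped to standard comparison functions.
OPERATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "ge": operator.ge,
    "lt": operator.lt,
    "le": operator.le,
    "contains": lambda field_value, wanted: str(wanted).lower() in str(field_value).lower(),
}


def resolve(value, today=None):
    """Replace the dynamic placeholder "today" with an actual date."""
    if value == "today":
        return today or date.today()
    return value


def run_query(records, conditions, today=None):
    """Return the records that satisfy every (field, op, value) condition."""
    return [
        record
        for record in records
        if all(
            OPERATORS[op](record[field], resolve(value, today))
            for field, op, value in conditions
        )
    ]


# Made-up sample data.
invoices = [
    {"customer": "Acme", "total": 120, "due": date(2008, 6, 1)},
    {"customer": "Binford", "total": 80, "due": date(2008, 7, 15)},
    {"customer": "Acme Labs", "total": 300, "due": date(2008, 6, 20)},
]

# "total is at least 100 and due date is before today"
overdue_large = run_query(
    invoices,
    [("total", "ge", 100), ("due", "lt", "today")],
    today=date(2008, 6, 22),  # fixed here so the example is repeatable
)
print([r["customer"] for r in overdue_large])  # ['Acme', 'Acme Labs']
```

Because a query is just a list of triples, saving it under a name is a matter of storing that list, and the dynamic values are re-evaluated each time the saved query runs.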

Users may wish to save their queries for a later time. It should be easy to give them a meaningful name, description and even allow for tagging. Sometimes, the user may want to save the results of the query (a snapshot in time) instead of the query itself. Saving a data set may only be possible if the list size is small. Otherwise it should allow the users to export the information out to a document in Word, PDF, Excel or other formats.

Paging

Paging is more a practical concern than a usability concern. Users may wish to have all records displayed to them, but doing so would be time-prohibitive and involve unnecessary data processing, when the user is interested in just a few records. Paging is generally implemented by links to different pages, though nowadays, it is common to see the "infinite" page, where scrolling down dynamically fetches and displays more records that weren't there initially. Live Image Search and Google Reader (see below) have good implementations of the latter design.

image image

If the list is organized alphabetically or chronologically, being able to access a specific page is a more useful feature than displaying an infinite page. The reverse is true if the list organizes the results by relevance or randomly. For example, you might guess that the middle page of a contact list contains names starting with "K" or "M", but retrieving the 100th page of Yahoo! search results is meaningless. For the former case, provide links or "jump-to-page" functionality at the top and bottom of the list.

Users would also find it useful to control the number of results per page. A global setting is usually appropriate, though it is more helpful to provide a setting in each list screen and remember the last selected value. For the default page size, choose a value that returns a reasonable-sized amount of data without the user having to navigate or scroll too much. Somewhere between 10 and 25 seems like a good guess, but you can make a more educated decision by observing your users and finding out how much they click to a different page or scroll for each page.
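The paging arithmetic itself is small. Here is a minimal sketch, assuming an in-memory list and a hypothetical `paginate` helper; in a real application the slicing would normally be pushed down to the database rather than done in application code.

```python
import math


def paginate(records, page, page_size=25, max_page_size=100):
    """Return one page of records plus the numbers needed to draw page links."""
    # Clamp a user-chosen page size to a sane range.
    page_size = max(1, min(page_size, max_page_size))
    total = len(records)
    total_pages = max(1, math.ceil(total / page_size))
    # Clamp the requested page so "jump to page 999" lands on the last page.
    page = max(1, min(page, total_pages))
    start = (page - 1) * page_size
    return {
        "items": records[start:start + page_size],
        "page": page,
        "page_size": page_size,
        "total": total,
        "total_pages": total_pages,
    }


contacts = [f"contact-{n:03d}" for n in range(1, 96)]  # 95 made-up records
result = paginate(contacts, page=4, page_size=25)
print(result["items"][0], len(result["items"]), result["total_pages"])
# contact-076 20 4
```

Remembering the last selected page size per list screen then only requires storing one number per user and list, and passing it in as `page_size`.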

Displaying the number of results across all pages can be useful. But performance concerns may prevent you from giving an accurate value. In such cases, you can do what Gmail does, give an estimate:

image
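One cheap way to avoid an exact count is to stop counting at a cap and report "more than N" beyond it. This is only an illustrative sketch with a hypothetical `describe_count` function; it is not a description of how Gmail computes its estimate.

```python
def describe_count(matches, cap=1000):
    """Count matching records only up to `cap`, then fall back to a vague label."""
    counted = 0
    for _ in matches:
        counted += 1
        if counted > cap:
            # Stop early: the exact figure is not worth the extra work.
            return f"more than {cap:,} results"
    return f"{counted:,} results"


print(describe_count(range(250)))              # 250 results
print(describe_count(range(5000), cap=1000))   # more than 1,000 results
```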

Sorting

The most common implementation of sorting results is doubling the header column as a sort column. You click on a header column; the data sorts by that field. You click again; it sorts in the reverse order. Display the sort direction as an icon or arrow in the column. The limitation is that you can generally only sort the data by one column. Also, the sorting is not linked to the search functionality, so each time the user issues a search, he or she has to sort the search results again.

image

A better implementation would be to allow the user to specify multiple sorting criteria, so that the user can say, sort the contacts by city and then by last name within each city. This can be part of the search section and so each saved search can have its own sort. You can still retain the column heading sort, but it would be overridden by the search sort condition when the search executes. But after the results are displayed, clicking on the columns would override the search sorting, thus avoiding the need for users to change the search sort condition for a temporary sorting need.
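A minimal sketch of multi-criteria sorting, assuming records are plain dictionaries and using hypothetical helper names. It relies on the fact that Python's sort is stable, so sorting by the least significant field first and the most significant field last gives the combined order.

```python
def sort_records(records, criteria):
    """Sort by several fields.

    `criteria` is a list of (field, descending) pairs, ordered from the most
    significant field to the least significant one.
    """
    result = list(records)
    # Stable sort: apply the least significant key first.
    for field, descending in reversed(criteria):
        result.sort(key=lambda record: record[field], reverse=descending)
    return result


def effective_sort(saved_search_sort, clicked_column=None):
    """A column click after the results are shown overrides the saved sort."""
    if clicked_column is not None:
        return [clicked_column]
    return saved_search_sort


# Made-up contacts.
contacts = [
    {"last": "Smith", "city": "Boston"},
    {"last": "Adams", "city": "Denver"},
    {"last": "Jones", "city": "Boston"},
    {"last": "Brown", "city": "Denver"},
]

saved_sort = [("city", False), ("last", False)]  # by city, then by last name
by_city = sort_records(contacts, effective_sort(saved_sort))
print([(c["city"], c["last"]) for c in by_city])
# [('Boston', 'Jones'), ('Boston', 'Smith'), ('Denver', 'Adams'), ('Denver', 'Brown')]

# Temporary override by clicking the "last" column header.
by_last = sort_records(contacts, effective_sort(saved_sort, ("last", False)))
print([c["last"] for c in by_last])  # ['Adams', 'Brown', 'Jones', 'Smith']
```

The saved search keeps its own `saved_sort`, so the temporary column click never has to modify it.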

There are client-side implementations of sorting data, but they should generally be avoided because, with paging, the client does not receive all the data. Sorting a single page of results gives a different (and incorrect) result from the right method, which is to sort the entire data set first and then display the first page. Another mistake commonly seen in client-side sorting is that numeric values or dates are sorted as if they were strings.
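Both mistakes are easy to reproduce. The short example below uses made-up values to show what goes wrong; the same effect appears in any language that compares the values as text.

```python
# Mistake 1: numeric values compared as strings.
amounts = ["100", "25", "9", "1000"]
as_strings = sorted(amounts)           # character-by-character comparison
as_numbers = sorted(amounts, key=int)  # compared by numeric value
print(as_strings)  # ['100', '1000', '25', '9']
print(as_numbers)  # ['9', '25', '100', '1000']

# Mistake 2: sorting only the page the client happens to have.
all_rows = [42, 7, 19, 3, 88, 1]
page_size = 3
page_then_sort = sorted(all_rows[:page_size])  # sorts one arbitrary page
sort_then_page = sorted(all_rows)[:page_size]  # the real first page
print(page_then_sort)  # [7, 19, 42]
print(sort_then_page)  # [1, 3, 7]
```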

If the user has not sorted the data, it should default to the most meaningful sort. For example, contacts should be sorted alphabetically. Bills should be sorted by the one pending for the longest time.

Layout and Information

Should you have lines for separating the columns? What should be the font size of the columns? Should you display alternating rows in different colors? There are so many different choices that you can make for presenting the list information. Let me point you to Matt Berseth for some examples (and code!). Of course, tables do not have to be your first choice, either. You could present the data in a visual manner, such as graphs or maps. In such visual displays, paging and sorting are less of a concern, but searching (or filtering) remains important.

While designing the layout, you should be concerned about noise. The user is generally interested in only a few records and perhaps only a few columns. Searching solves the first problem. Providing the ability to show and hide columns helps with the second. I think it is better to allow selection of necessary columns when providing search criteria, and then allow a separate override when the search results are displayed.

Summary headers or footers are useful. While the conventional method is to have a summary row, on the web, providing the summary information in the header can avoid scrolling. You can even create a summary that provides information other than an aggregate function of a particular column. Of course, it goes without saying that the summary information must be emphasized using a greater font size or color.

List Management

Users want to do more with their lists. People like list management because it reminds them of a spreadsheet program like Microsoft Excel, which makes it very easy to add and change data. Though a web application has other better features, users want it all. A common need is to apply an action across multiple items in the list, such as Delete or Mark as Read. We see this uniformly implemented using a checkbox column. Users can choose the required items and select an action. While this works effectively, it limits the user to working with the current page of results.
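One way around the current-page limitation is to keep the set of checked record ids separately from the page being displayed, so that selections survive paging. The sketch below assumes a hypothetical `Selection` class and made-up message records.

```python
class Selection:
    """Remembers checked record ids independently of the page being shown."""

    def __init__(self):
        self.ids = set()

    def toggle(self, record_id):
        """Check the record if it is unchecked, and uncheck it otherwise."""
        if record_id in self.ids:
            self.ids.remove(record_id)
        else:
            self.ids.add(record_id)

    def select_page(self, page_records):
        """Check every record on the given page."""
        self.ids.update(record["id"] for record in page_records)

    def clear(self):
        self.ids.clear()


def mark_as_read(records, selection):
    """Apply the action to every selected record; return how many changed."""
    changed = 0
    for record in records:
        if record["id"] in selection.ids and not record["read"]:
            record["read"] = True
            changed += 1
    return changed


messages = [{"id": n, "read": False} for n in range(1, 8)]
page_1, page_2 = messages[:5], messages[5:]

selection = Selection()
selection.toggle(2)            # checked on page 1
selection.select_page(page_2)  # everything on page 2 (ids 6 and 7)
selection.toggle(6)            # unchecked again
print(mark_as_read(messages, selection))  # 2
```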

Many implementations also have a master checkbox (in the header column) that selects or unselects all the checkboxes in the list. However, Gmail differs by offering text links for selecting "All", "None", "Starred", etc. Hotmail has, perhaps, the worst implementation of the checkbox column. First, the checkbox is hidden behind an envelope icon. Second, if you miss the checkbox when clicking, it opens the message, making you lose all the previous checkbox selections you made.

image

With the increase in Ajax-enabled applications, users are demanding more from the list screen, such as the ability to add and edit records right in the grid. This works well when the number of fields in a particular record is relatively few. When a record has more fields, the user is forced to scroll horizontally. Some fields may require more advanced controls making the page code-heavy and unwieldy.

Security

A final piece of the puzzle is security. Some records must be hidden or disabled for some actions for different groups of users. A simple case would be each user owning the records that they have created and others unable to view them, unless the user decides to make them public or share them with select friends. This is a common situation with most social networks.
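The simple ownership case can be sketched as a single visibility check applied before the list is displayed. The field names (`owner`, `public`, `shared_with`) and the sample records are assumptions made for illustration.

```python
def visible_to(record, user):
    """A record is visible to its owner, to everyone if public, or to chosen friends."""
    return (
        record["owner"] == user
        or record["public"]
        or user in record["shared_with"]
    )


def visible_records(records, user):
    """Filter the list down to what this user is allowed to see."""
    return [record for record in records if visible_to(record, user)]


albums = [
    {"title": "Holiday", "owner": "ann", "public": False, "shared_with": {"bob"}},
    {"title": "Work", "owner": "ann", "public": False, "shared_with": set()},
    {"title": "Launch", "owner": "bob", "public": True, "shared_with": set()},
]

print([a["title"] for a in visible_records(albums, "ann")])    # ['Holiday', 'Work', 'Launch']
print([a["title"] for a in visible_records(albums, "bob")])    # ['Holiday', 'Launch']
print([a["title"] for a in visible_records(albums, "carol")])  # ['Launch']
```

In practice the same condition would usually be part of the query itself, so that hidden records never leave the database and the page counts stay correct.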

Advanced security may require an authorization module that allows the system to define user roles and access rights. Building this or re-using an existing framework may take a life of its own especially if security is very fine-grained, because the authorization module is linked to every part of the application. Keeping it simpler can mean easier development.
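At its simplest, such an authorization module is a mapping from roles to permitted actions. The role names, the actions and the `is_allowed` function below are hypothetical; a real framework would add per-record and per-field rules on top of this.

```python
# Hypothetical roles and the list actions each one may perform.
ROLE_PERMISSIONS = {
    "viewer": {"view"},
    "editor": {"view", "edit"},
    "admin": {"view", "edit", "delete"},
}


def is_allowed(user_roles, action):
    """True if any of the user's roles grants the action."""
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in user_roles)


def allowed_actions(user_roles):
    """All actions available to the user, e.g. to decide which buttons to show."""
    actions = set()
    for role in user_roles:
        actions |= ROLE_PERMISSIONS.get(role, set())
    return actions


print(is_allowed(["viewer"], "delete"))                # False
print(is_allowed(["viewer", "editor"], "edit"))        # True
print(sorted(allowed_actions(["viewer", "editor"])))   # ['edit', 'view']
```

Keeping the table this coarse is one way of "keeping it simpler": every list screen asks the same question, and the answers live in one place.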

Security is a large and complex domain. This is one area that you will have to pay increasing attention to as your application gains more users. Managing the authorization framework, managing the needs of the end users and continuing to build your application can be quite demanding. So plan well for it.


Sunday, June 15, 2008

Pair Programming and Code Reviews

One simple definition of pair programming would be continuous code review. Instead of doing code reviews at definite intervals, one programmer continuously monitors the code written by his/her pair programmer. One could have yet other code reviewers in addition to the pair programmers, but in general, the principle stands.

A single programmer producing code without any code reviews would be the most efficient scenario, if it didn't conflict with the reality that programmers make mistakes. Those mistakes are uncovered during testing and the programmer is forced to spend additional time fixing them. The effort and time involved in fixing a bug found in testing is greater than that involved in correcting it before the code was sent for testing.

One reason is the time taken to perform the release processes again after the bug is fixed. Second, the process of fixing the bug requires the programmer to spend time re-understanding the code and debugging. Thus, combining programming with code reviews increases the efficiency of the programmer.

Pair programming, considering that it is constant code review, should increase the correctness of work produced by a programmer. At the same time, it keeps one programmer off producing code. So does the efficiency of producing correct code balance the inefficiency of keeping one person idle? There are studies done to find how efficient pair programming is, but the jury is still out. However, I think a few ideas may be useful in deciding what works in your situation.

First of all, there are many development activities that lend themselves well to working in pairs or small teams. Design, at a higher architectural level or lower algorithmic level, is one such activity. Working together can help team members complement each other, providing ideas and correcting errors as they occur. I have often found that working on a relational database design in a team can be very productive.

On the other hand, some programming activities can be done in isolation. For example, if you have a well-defined HTML prototype, re-creating that in a web programming language (by a reasonably competent developer) does not require much oversight. Or, if your application requires a lot of boilerplate code and you are not using code generators, you could have developers work on it independently.

Another consideration is that pair programming can avoid distractions and help you make the most of the available time. This is one reason why study groups are so popular - each individual can maximize the time spent in productive activity. Assuming that pair programming makes more time available for development, it may be more efficient than a simple comparison with the work done by a single programmer alone would imply.


Monday, June 09, 2008

Project Scope and Bug Density

One important consequence of bug reports is that they not only affect your schedule, but also conflict with your priorities. Release of product features is negotiable to some extent. A critical bug seldom is. You have to postpone much of your planned work until the bug fix is released. That step requires you to perform the release management steps again, so your quality staff members are also involved and they cannot pursue other planned activities.

The other important problem with bugs is that every bug lowers the confidence of the stakeholders in both the product and the creators of the product. The bug does not have to be serious to reduce confidence. It can be a minor bug that is irritating or confusing or careless. Lack of confidence reduces the effort that users are willing to put in to learn the system.

There are many ways to improve quality in your project and reduce the bug count, such as different forms of testing, code reviews, etc. But there is a certain limit to the quality that you can achieve within the normal constraints of a project (available time and budget, quality of technical resources, etc.). Your limit may be higher or lower than that of other projects, but it exists, and beyond it you cannot improve the level of quality by performing more quality activities because the law of diminishing returns kicks in.

Assuming that you are operating at your highest quality level, the number of bugs in your product is proportional to the size of your product. The more features and code it has, the greater the number of potential bugs, assuming that the bug-to-feature ratio stays constant. That is why the suggestion by 37signals to build less seems very attractive. 37signals has a more general (and sometimes controversial) philosophy than just reducing bugs, but let us focus on bugs for now.

It does not seem practical at first glance to reduce product size. After all, customers demand features. More functionality sells more products. And nothing increases product scope more dramatically than a rush of customers leaving you for other products. But if we cannot reduce product features, what can we do?

One answer is to tightly integrate features instead of distributing them throughout the application. For example, instead of building different screens for displaying records for data entry purposes and for reporting, merge the two screens. Provide more information on a single screen instead of breaking it up into multiple tabs. Avoid wizard screens and make all your screens easily learnable for the beginner user.

So reduce the project size to project feature ratio. This requires more work because you have to create a more visually compelling and intuitive user design. And your code will have to be modified to make it more streamlined and able to implement more complex needs. But in the end, having an overall smaller code base and project size earns its reward in higher quality, greater consumer confidence and better project planning.


Monday, June 02, 2008

To Learn C or Not

The past few weeks, I have been listening to the podcasts of Joel Spolsky and Jeff Atwood at stackoverflow.com. The discussions are perhaps not as informative as a Hanselminutes podcast, but they are definitely entertaining and cover many topics related to software development. Atwood has posted some of the discussed topics on Coding Horror. One of the subjects that caught my attention (and others as well) was the fact that Jeff did not know the C language and didn't feel that it was necessary, while Joel holds the opposite view.

This debate is different from most language fights. Most of them are about which programming language you must use. Here, no one is suggesting that you use C to replace your current programming language. The C advocates recommend that you learn C to gain a more fundamental idea of the low-level issues of programming, something that current languages isolate developers from. Since my first programming language was C, my first instinct was to agree with Joel, and then I realized several problems in my reasoning.

The first mistake was to forget the important place that C had in my learning curve. When I moved to C++, I was only learning the object-oriented features provided on top of C. The basic language structure (except for the OO constructs) remained the same. Moving to Java from C++ once again required only learning a few additional features and unlearning some of the C++ features. Thus, with every other language (C#, VB, Python, etc.), the learning process was always about trying to relate my existing knowledge to the feature set in the new language.

For a person who started with C, it is difficult to imagine how someone else could have learnt anything without the initial C knowledge. But if that person had started with Visual Basic, they would have learnt many fundamental aspects of programming (such as logical structure of programs, debugging, etc.) through that language. And that person, when moving to other languages, would use their VB knowledge to understand and learn them.

The difference here is like mother tongues. For example, one person is a native English speaker and another is a French or German citizen who learnt English as a second language. But regardless, both can understand and speak English fluently, despite the fact that one of them "thinks" in another language.

But what about the advanced stuff? What about pointers and dynamic memory allocation? I think a fallacy here is to compare an advanced C programmer and a naive VB or Java programmer. If we look at the skills of an advanced programmer in any language, they would have the same level of understanding complex concepts, algorithms and data structures. The point being, you have to look at more than the programming language knowledge of a developer to understand his or her true worth.

I think it is instructive to look at the reasons why C was displaced in the first place by other languages. C has many peculiarities that get in the way of writing business logic, such as different sizes of data types on different operating systems, null-terminated strings, etc. While knowing these intricacies gives you a better idea of what goes on under the hood, it does not help you solve your business problems any faster. Secondly, they can even hinder learning. For example, a language that has automatic garbage collection may make it easier to learn data structures or algorithms without the C-specific pointer and memory handling.

A C programmer may have to unlearn some lessons when they move on to higher-level languages. The advanced C programmer has a tendency to write tight, clever code, sometimes at the cost of making it incomprehensible for anyone else. The lesser C programmers ignore concepts of data encapsulation and tend to create classes that are simply a bunch of related functions that keep shuffling data around.

It is also a mistake to assume that just because a programming student has learnt C, that they have somehow achieved a state of greater understanding. It is possible to write non-trivial large programs in C while staying away from complex constructs (in fact, it may make for less debugging even if it introduces inefficiency). A bad programmer can also make bad mistakes with pointers and memory allocation, and have a program that runs correctly for many different test cases and for long periods, just because it didn't hit the error conditions.

The final fallacy is a generational problem. The older generation always feels that the younger generation have had it easy so far, and because of their inexperience with failure, they are on the verge of making some colossal errors. The idea goes like this, "We suffered. We learnt the hard way. And that is why we are successful today." Hence, there must be something wrong with the success of the new generation. Maybe they are just lucky. This argument, of course, is not unique to programming.

But perhaps, it is possible that maybe you don't have to fail before you succeed. Maybe you don't need to go through all the hardships and sacrifices suffered by the people before you. The new generation can start programming in their simplistic programming language and still create works of wonder without ever knowing one hardware term. Maybe the rules have changed.

At some level, I think people understand this. But there is a real sense of nostalgia that believes that the new generation will never get to experience what they had. The feelings of achievement on solving a complex problem. The respect and love for the programming profession. The constant quest for learning. The acquisition of minor trivia. The knowledge that makes inside jokes funny. But this sense of loss ignores the fact that different people can reach the same goals by different journeys.

A real programmer is always on the lookout for improvement. Some of them, who started with a different language, may choose a path where they go back and learn C, maybe even contribute to projects that require C knowledge. Others may choose a different path where they learn about new languages or techniques. In the end, I think what matters is not what a programmer knows, but what and how quickly he or she can learn, and how well he or she can contribute to solving real problems.


Tuesday, February 12, 2008

The Different Aspects of Intelligence

When companies look to hire people, they always want to hire the most intelligent people they can find or afford. Many companies try to recruit from the top universities and colleges. Others snatch away the top performers in other companies. But what exactly do we mean by intelligence? And how can you leverage intelligent people for business success?

Intelligence is frequently stereotyped in caricatures such as the absent-minded professor, the mad scientist, the black hat hacker, and so forth. Such exaggerated images fail to convey how intelligence is a combination of many different aspects. Here are some characteristics I can think of:

  1. Ability to grasp information from data: Two people may receive the same data, but one of them is able to comprehend it better and derive meaning from it. This is not necessarily a function of past knowledge, but an ability to recognize patterns in the data and derive conclusions. A person with this ability is best suited to work in analytical situations, being able to process huge amounts of information and make sense of it.

  2. Ability to remember information: This is perhaps a misunderstood aspect of intelligence, and usually negatively associated with exam cramming. However, you can notice differences there too. Some people are able to cram more information than others. Some are able to remember relevant information from long ago. Memory is an essential part of the brain function. A person who is able to operate with large volumes of relevant information easily accessible while working can be highly efficient. Such a person is well suited for technical work, for example, a software developer who is very familiar with the language APIs.

  3. Ability to juggle multiple things: A good soccer player can run down the field with the ball, and also simultaneously remember exactly when and where to pass the ball, because while running, he can visualize where the other players are. Or think of how an aircraft controller works. Many people break down when confronted with multiple things at the same time. The person with the ability to multi-task revels in such situations.

  4. Ability to concentrate on one thing: This seems the opposite of the previous point. But in some cases, the ability to juggle also requires the ability to tune out certain things. The soccer player in the previous example tunes out the thousands of cheering fans, his personal life and whatever happened 5 minutes ago as he runs. The aircraft controller shuts out all the other distractions in the room. By only focusing on the essential, the entire brainpower is devoted to the main task at hand.

  5. Ability to apply solutions to problems: Most people associate intelligence with problem-solving, but it is not the solution per se that displays intelligence; it is the process. Intelligence is the ability to match and apply strategies to solving problems. All strategies have to be learnt, but some people do better than others at understanding when to use them. For example, most accountants are familiar with the various strategies to minimize taxes, but some are just better at actually doing it.

  6. Ability to devise new solutions: When faced with a unique problem, a person with this ability can come up with new ways of solving it. For example, Leibniz and Newton (independently) invented calculus to solve their mathematical (and physics) challenges. The key difference from the previous point is that a person without this skill will get stuck when faced with new challenges, even though the strategies they already know could be used to devise a new solution.

  7. Ability to imagine: This goes beyond logical ability to devise solutions. People with this skill think unconventionally. Their ideas do not come out of some combination of putting existing ideas together or improving them. People working in creative fields are good examples of this. Another example is the discovery of the ring structure of benzene. Without such people, innovation would always be incremental and human progress would not be where it is.

In many cases, companies measure only a few of these aspects. Some of these are difficult to measure in a typical interview. For example, asking people to solve puzzles in an interview may only be measuring their ability to remember information, if they already know the answer. But I still think it is worthwhile to think about ways to understand where a person stands with regard to these traits.

For example, you are hiring a developer. If you look at the previous abilities, here is what they mean for that role:

  1. Ability to grasp information from data: Understand specifications and other written documents.
  2. Ability to remember information: Remember programming standards and libraries, and also important information conveyed through conversations.
  3. Ability to juggle multiple things: Work in multiple modules at the same time.
  4. Ability to concentrate on one thing: Spend vast amounts of uninterrupted time building programs, thus increasing quality.
  5. Ability to apply solutions to problems: Know when to use which best practices and design patterns.
  6. Ability to devise new solutions: Create new programming modules that solve common problems faced by other developers.
  7. Ability to imagine: Design better. Debug better.

And this goes for other roles in the company.



Saturday, January 12, 2008

Replacing the HTML drop-down control

The standard HTML drop-down control is convenient for allowing users to easily select from a set of choices. It is preferable to using multiple radio buttons for selection because it helps reduce screen real-estate needs. It is not only a data entry field, but is also frequently used for providing a set of user actions or screen navigation.

However, it has a few shortcomings that make it inadequate for many purposes:

  1. As the number of choices increases, the drop-down control becomes more difficult to use. The user has to spend more time and effort scrolling to find the right choice. Ordering the values alphabetically only helps to some extent. When the number of records increases into the thousands, it presents technical difficulties, such as performance issues. 

  2. Although the user can type a few letters to bring the focus to a particular selection, those letters are not displayed on the screen. If the user wants to cancel their input, they cannot hit the backspace key, because the browser would go to the previous page. Instead the user has to wait for 1-2 seconds and start typing again from the start. Also, not having a regular text input prevents the application from using user input to filter the records in the drop-down instead of simply navigating to the record.

  3. The drop-down control cannot easily display a columnar format for multiple fields (say Serial Number, Product Name and Vendor) for a single record. You could resort to pseudo-formatting by appending the fields, padding them with spaces and using a fixed width font, but that would conflict with the fonts on other fields. Even if you manage to show multiple values on a single choice, the user cannot sort them as they could normally by clicking on the header column.

  4. Although users can manage drop-down records in an administrative section, they frequently request the ability to do so in any screen where those records appear. Users find it inconvenient to abandon their work on a form because a drop-down value was missing, and then go to another screen to enter the record.

    The standard drop-down control offers no easy way of adding a new value. One way around this is to provide a link to a modal window that would allow the user to add, update and delete records in the table referenced by the drop-down. However, those changes must then show up back in the drop-down control choices in the parent form.

  5. Sometimes, users need to "expire" choices in the drop-down control so that they are not available for selection in future data entry. An example would be "East Germany" in a country selection. However, when the user wants to edit a prior record, the country drop-down no longer has the previous value and cannot be used to display it in the form. This happens even if the user has no intention of editing that field.

  6. This last one is more of a browser issue: In some versions of Internet Explorer, the drop-down control prevents DIV elements from being overlaid on it. If you had a floating layer, the drop-down would show through the layer, resulting in an ugly and unusable display. Developers have to come up with various code workarounds to solve this problem.

This is not an exhaustive list, but let me stop here. I think there is a real need for programmers to stop using the default HTML drop-down control. The better option would be to design a combo-like control that has the following features:

  1. Text box: This can be used to filter the values in the drop-down or directly enter the value. For example, a product id can be entered using a barcode scanner instead of scrolling through a long list. The text box could be used to display "expired" choices that may not be available in the drop-down list.

  2. Drop-down list: The display could be configured in different ways - a simple list, a hierarchical list, a sortable table, etc. The values themselves could be directly edited or deleted in the display itself, and new records added. The display could be a static pre-loaded list (if there are only a few records), an asynchronously loaded list (to reduce start-up delays) or an incomplete list that displays a few records and more on request. Searching can bring the right records to the top of the incomplete list for easier selection.
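
To make the filtering and "expired" behaviour a little more concrete, here is a minimal sketch of the logic that could sit behind such a control. It is written in Python only to keep it short; a real control would live in the browser. The `Choice` structure, the `visible_choices` function and the sample countries are all invented for illustration, and the matching rule (case-insensitive substring, prefix matches first) is just one reasonable assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Choice:
    """One row in the drop-down list (hypothetical structure)."""
    value: str
    label: str
    expired: bool = False


def visible_choices(choices, typed_text="", current_value=None, limit=None):
    """Return the choices to show for the text currently in the text box.

    - Expired choices are hidden, except the one already stored on the
      record being edited, so old records still display correctly.
    - Matching is a case-insensitive substring test on the label.
    - Labels that start with the typed text are listed first.
    - `limit` caps the result, giving the "incomplete list" behaviour.
    """
    needle = typed_text.strip().lower()
    matches = [
        c for c in choices
        if (not c.expired or c.value == current_value)
        and needle in c.label.lower()
    ]
    # Stable sort: prefix matches first, original order otherwise.
    matches.sort(key=lambda c: not c.label.lower().startswith(needle))
    return matches if limit is None else matches[:limit]


countries = [
    Choice("DE", "Germany"),
    Choice("DD", "East Germany", expired=True),
    Choice("GE", "Georgia"),
    Choice("TL", "East Timor"),
]

# New record: the expired choice is not offered.
print([c.label for c in visible_choices(countries, "ger")])
# Editing an old record that already stores "DD": the value still shows.
print([c.label for c in visible_choices(countries, "ger", current_value="DD")])
```

Keeping this logic separate from the rendering means the same function can feed a simple list, a sortable table or an asynchronously loaded list.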

This is not the same as the standard Windows combo-box control as the drop-down list is more configurable in terms of display and event handling. It also goes without saying that keyboard and mouse interaction with this combo control should try to retain the efficiencies of the simpler drop-down control. These include keyboard events such as ENTER, ESC, TAB, SHIFT-TAB, arrow keys, etc. and mouse events like scrolling, clicking (inside and outside the control), etc.

One key design principle when creating this control is that it should be able to entirely displace all instances of the standard drop-down control. This is required for the sake of consistency and uniformity in the user interface, to avoid user confusion.


Thursday, December 27, 2007

Code Size in Software Projects

In a recent post, Steve Yegge talked about a software game he had written and how the code size (500,000 lines of code) had become too big for him to manage. He had originally written the application in Java, and has now decided to rewrite it for Rhino (JavaScript on the JVM) to reduce the code size to around 150,000 lines.

For managers, the topic of code size has many implications for software projects in the areas of software quality, resource allocation and effort. Let us look at these:

  1. Code is written to create product functionality. However, writing code requires resources and is expensive. The less code that can be written to accomplish the same functionality, the better in terms of cost. A concise language is better than a verbose one.
  2. The more lines written by a developer, the greater the potential for bugs. This is particularly true of boiler-plate code where repetition can introduce typos and other mistakes.
  3. Fewer lines of code do not always mean less time for overall development. Sometimes, smaller code can be difficult to read and debug, resulting in greater costs for testing, debugging and maintenance.
  4. If software developers can reuse code or functionality in tested libraries, they can save a lot of development time. The best choice is the language framework itself, followed closely by open source libraries which have business-friendly licenses.
  5. Software developers are most efficient in their own languages and tools. Although another language may be more concise, they may take more time to effectively use it, or make mistakes, resulting in greater costs.

The ideal goal for a technology choice for a software project would be a language known to all developers on the team, and which has the best reusable libraries and expressive syntax. The reduced code size and development effort translates into tangible benefits (less cost, less time, greater quality).

To be most effective, the decision should be made at the start of the project. The project should be staffed with the best people (available to you) in that technology. You should purchase the best tools that you can afford for working with the specific technology.

Code reviews, when done right, are very effective in identifying copy-and-paste or inefficient code. The people participating in the code reviews can suggest different ideas for making the code better, such as redesigning a class. By sharing their ideas, the developers become increasingly knowledgeable in code reduction techniques and their work automatically improves.

Now, what happens if you never paid attention to the size of your code base for a long period, or worse, you have just been handed a large existing project? Before you proceed, the first thing to understand is that your primary goal is not making the code size smaller or trying to understand the code base. Your foremost objective is to enhance the functionality of the application.

With that in mind, the first question is: How much of the code base do you need to be familiar with to enhance the functionality? If you don't need to make any changes to some portions of the code, you could work as if their source code never existed, and just link to them.

Secondly, do you completely know the existing functionality and dependencies of the code you will be modifying? Usually, the existing code base will have some convoluted code written for bug fixes and change requests. Sometimes, a particular line of code may affect other modules.

In this situation, refactoring code makes it much easier to make changes to the code without affecting functionality. Refactoring may result in a larger number of lines, but a significant portion of them can be isolated away and never looked at again.

So, when you inherit a large code base, your objective should be to treat as much of the code base as a black box, never to be tinkered with. This will reduce the code that you need to learn, understand and modify. And most importantly, your goal of enhancing the application functionality will be met.
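
One way to picture the black-box approach is a thin wrapper that becomes the only place where new code touches the inherited code. The sketch below is hypothetical from top to bottom: the legacy function, its cryptic parameters and the `Invoicing` wrapper are invented to illustrate the idea, not taken from any real project.

```python
def legacy_compute_invoice_total(items, cust_type, flag1, flag2):
    """Pretend this is old, convoluted code that nobody wants to read."""
    total = 0
    for qty, price in items:
        total += qty * price
    if cust_type == 2 and flag1:
        total = total * 0.9
    return round(total, 2)


class Invoicing:
    """Thin wrapper: the only place where new code calls the legacy routine."""

    def total(self, items, is_wholesale=False):
        # The cryptic legacy arguments are pinned down here, once.
        return legacy_compute_invoice_total(
            items,
            cust_type=2 if is_wholesale else 1,
            flag1=True,
            flag2=False,
        )


def total_with_shipping(invoicing, items, shipping, is_wholesale=False):
    """New functionality, written against the wrapper only."""
    return round(invoicing.total(items, is_wholesale) + shipping, 2)


order = [(2, 10.0), (1, 5.0)]  # (quantity, unit price) pairs
print(total_with_shipping(Invoicing(), order, shipping=4.5))
print(total_with_shipping(Invoicing(), order, shipping=4.5, is_wholesale=True))
```

The legacy routine is never edited. If it later turns out to cause real problems, only the wrapper needs to know that its implementation was replaced.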

Still, developers will continue to worry about the huge code size that has now been isolated, usually citing performance, memory needs and maintainability. In several instances, there is no actual evidence of a negative impact on the user; it is only an assumption by the developer.

However, if any part of the isolated code base does affect performance, and is causing maintenance issues, then its status should be upgraded to "working source code" and it becomes a candidate for refactoring (or in extreme cases, rewriting).

So, what about Yegge? I think he is making a huge mistake by committing vast amounts of his productive time to rewriting hundreds of thousands of lines of code. He could be adding more functionality to the application by spending significantly less time to understand those pieces. He could release the application as open source and start work on creating some other application.

A good developer has a penchant for order and organization. Unfortunately, it can be taken to excess at the cost of useful work. Keeping your house clean is a good thing. It just doesn't make sense to tear it down when you cannot remember where you kept your photo albums.


Saturday, December 15, 2007

DataModel and ViewModel

Recently in a conversation with a friend, he described a problem with an algorithm he was writing for displaying a calendar with appointment data. His original algorithm worked fine, but when faced with appointment conflicts because of multi-booking, he had to employ a 2-pass algorithm over the data so that he could display the information properly in the calendar view. Let us explore this problem further (please also visit the references).

A well-structured architecture separates the presentation of information from the storage of information. The MVC model is the most commonly-used architectural framework with a view representing information display and the model representing the data source. The advantage of this approach is that you can have multiple views for the same data. For example, you can represent sales data in a tabular format, as a bar diagram, or expose it as XML data for consumption by other applications.

However, one model cannot adequately serve the needs of different views without additional overhead. For example, while a view that displays the data as a table can operate with just rows and columns of data, a bar chart view needs additional information (such as number of values and the data value range) to display the data properly.
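
A small sketch may help show the difference. The table view below consumes the rows as they are, while the bar chart view needs a summary (count and value range) before it can size a single bar. The `chart_metadata` function and the sales figures are made up for illustration.

```python
def chart_metadata(values):
    """Summarise a series so a bar chart can scale its axes before drawing."""
    values = list(values)
    if not values:
        return {"count": 0, "low": None, "high": None}
    return {"count": len(values), "low": min(values), "high": max(values)}


# Made-up monthly sales figures, purely for illustration.
sales = [("Jan", 120), ("Feb", 95), ("Mar", 143)]

# A table view can print the rows as they come.
for month, amount in sales:
    print(f"{month:<5}{amount:>6}")

# A bar chart view needs the summary first, so it can size every bar
# relative to the largest value.
meta = chart_metadata(amount for _, amount in sales)
for month, amount in sales:
    bar = "#" * (amount * 30 // meta["high"])
    print(f"{month:<5}{bar}")
```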

Some of this is additional metadata about the information that the model could take responsibility for. However, it introduces a complication: For example, if metadata information is calculated as part of a GetAllRecords() operation, it introduces an overhead that simple views may not desire. A different scenario is when you have no control over changing the model. This may be the case where you are working with specialized storage and the vendor provides you an API. Without access to the source code, you cannot change the model interface.

Consider a view that must work with a rigid model. One strategy is to implement a 1-pass algorithm that calculates metadata and changes the view on the fly. Changing the view may be appropriate in some situations (a map that displays cities one by one), but not appropriate, or even practical, in others (a map that changes zoom sizes as it renders its data).

A 2-pass algorithm is better, because it gives you the capability to calculate metadata and then render the view appropriately. However, this re-introduces the original complexity of making the view responsible for handling data.

Hence, you introduce the ViewModel, a model that is meant for use by a specific view. The ViewModel is a bridge between the View and the DataModel (which was our original Model). The DataModel is responsible for data storage and isolates the rest of the application from the implementation details of data access and manipulation. The ViewModel provides a model that the specific view can use to render the presentation more easily.

Now, the performance issue: Clearly, there is some amount of data replication and duplicate data processing. But how much duplication is going on? Most views operate with a subset of the entire data. A calendar displays (usually) one person's appointments over a small date range. A histogram is a summary of the data points, not the entire data set. However, if your View is processing a large volume of data and has performance issues, you may want to rethink many parts of the entire architecture itself, not just the Model.

Thus, the ViewModel converts the DataModel into a smaller data set that can be conveniently processed by the view. Since the ViewModel is associated with a specific View, its events, methods and properties can be changed to suit the demands of the View. All the time, the original Model remains intact to serve the common needs of all views.
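
Here is a minimal sketch of how this could look for the calendar problem. It is not my friend's algorithm: the class names are invented, times are plain minutes since midnight, and conflicts are resolved with a simple greedy column assignment that I chose only because it is short. The point is the division of labour. The DataModel returns records, the ViewModel makes the first pass and works out the layout metadata, and the view makes the second pass and only formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Appointment:
    title: str
    start: int  # minutes since midnight, for simplicity
    end: int


class AppointmentDataModel:
    """Stand-in for the DataModel; it only stores and returns records."""

    def __init__(self, appointments):
        self._appointments = list(appointments)

    def get_appointments(self, range_start, range_end):
        return [a for a in self._appointments
                if a.start < range_end and a.end > range_start]


class CalendarViewModel:
    """Prepares appointments for one specific calendar view.

    Pass 1 happens here: each appointment gets a column so that
    overlapping (multi-booked) appointments sit side by side, and the
    total column count is known before the view draws anything.
    """

    def __init__(self, data_model, range_start=0, range_end=24 * 60):
        records = sorted(data_model.get_appointments(range_start, range_end),
                         key=lambda a: (a.start, a.end))
        column_ends = []      # end time of the last appointment in each column
        self.placements = []  # (appointment, column index) pairs
        for appt in records:
            for col, busy_until in enumerate(column_ends):
                if busy_until <= appt.start:   # this column is free again
                    column_ends[col] = appt.end
                    break
            else:
                col = len(column_ends)         # every column is busy: add one
                column_ends.append(appt.end)
            self.placements.append((appt, col))
        self.column_count = len(column_ends)


def render(view_model):
    """Pass 2: the view only formats what the ViewModel prepared."""
    return [f"col {col + 1}/{view_model.column_count}: {appt.title}"
            for appt, col in view_model.placements]


model = AppointmentDataModel([
    Appointment("Standup", 540, 570),
    Appointment("Dentist", 560, 620),   # overlaps Standup
    Appointment("Review", 570, 630),    # starts as Standup ends
])
vm = CalendarViewModel(model)
print("\n".join(render(vm)))
```

The DataModel knows nothing about columns, so a list view or an XML export can keep using it unchanged.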

References:



Sunday, December 02, 2007

The Joy of Setting Up a Java Framework

Joy? Actually, no. Unfortunately, Java is becoming a victim of too much innovation. Anyone doing web development using a Java-based framework is faced with an abundance of choices and very little guidance and help from the innovators. Here, take a look at these lists: Apache projects, web frameworks, persistence solutions, web servers, IDEs, etc. What tools and projects would you choose for your next development?

Here are the challenges:

  1. Many projects have no published timelines for compliance with the latest standards or emerging technologies.
  2. They rely on external solutions to provide key functionality such as object-relational mapping.
  3. They have a very cumbersome installation process that involves tinkering with a multitude of XML files.
  4. Documentation is non-existent, elementary or plain wrong. For some reason, the developers are interested in only writing books, not free online documentation.
  5. If you are interested in going bald, try combining multiple components to accomplish something useful.

How would you feel if you were given the following steps for defusing a bomb over the phone?

First, cut the green wire.
Now, cut the blue wire.
Sorry, before cutting the blue wire, you should cut the red wire.

That is how installation works. When you follow all the steps and try to run your application, you encounter an error. You spend a tremendous amount of time walking through and verifying all the steps again. It doesn't work. Then you read the next chapter and it says, "Hey, by the way, you should also do this." Wonderful, thank you.

A few years back, I remember seeing a website that had a single installation that bundled all the popular Java and open source products and promised to get it all working. I couldn't find that site, but I think that is a business idea that must be revived with respect to a Java framework. Here is what I think it should look like:

  1. It goes without saying that all the components should be open source. But there are various types of open source licenses. Hence you may have multiple versions: "Free for any commercial use without restriction", "Free as long as you give away the source code", etc. Yes, that means not forcing people to figure out the difference between GPL, LGPL, Apache and BSD licenses.
  2. Again, obviously, there must be different installations for different operating systems. My point in stating this is that each operating system has its own preferred way of installation. If it is a Windows operating system, provide an installer, not a ZIP file.
  3. Any installation configuration must be done through a visual interface both at and after installation. The user should not even know how that information is stored.
  4. The installation must package a visual IDE. Otherwise it is just a development kit. Eclipse is a popular choice.
  5. The installation must package a lightweight J2EE server. When you create a new web application, it will run in that server when you press the Run button. No questions asked.
  6. The installation must bundle the latest database drivers for different databases.
  7. It must contain searchable user-friendly help and meaningful tutorials that are available locally. It should ideally have the ability to download and index new articles from selected web sites.
  8. And the killer feature: It will offer its own choice of frameworks by combining the more popular frameworks to provide end-to-end functionality.

Let me explain the last point with an example. Struts is an MVC framework. You could have a more powerful framework by adding a Java persistence component. So an installation process could let you choose "MVC - yes or no?", and "JPA - yes or no?". As you choose each one, it could offer you choices of the most popular solutions available.

When any of the underlying components change, a patch build should be available to download and install silently. The vendor should do the hard work of making sure that the new version of the component does not affect backward compatibility.

Doing something of this sort is pretty expensive, but not impossible. It would be similar in scope to creating a Linux distribution and certifying it with various hardware devices. Someone interested in pursuing this business idea has to hire testers, technical writers, and usability experts, in addition to developers familiar with different operating systems and environments.

What I have outlined is simply applying the Visual Studio concept to Java. Unfortunately, the companies that have the resources to do this, like IBM or Oracle, are intent on plugging their own Java solutions in the server and IDE space. Maybe someone else will come along. Till then, less joy and more pain.


Communicating Software Estimates

In my last post, I had written about the conflict over software estimates between the people running the business and the people writing the software. I mentioned that most projects are likely to be underestimated in the first place and further pressure from the business side only makes things worse.

How can engineers get business people to understand this? Let us first look at some of the dynamics between the two sides:

  1. Business people generally have more power in the company than purely technical people. They have the power to hire, fire, outsource, buy, etc. This power equation is a conscious or sub-conscious aspect of every communication.
  2. Business people are used to trying to get the best deal. They would like to get the most features in the product for the least cost at the earliest. Before an estimate is provided, they may not actually need all those features, but after that, they believe that they are entitled to what they were promised.
  3. Business people seldom have any idea of the complexity of changes. This sounds really obvious, but software developers don’t understand this. They are confused at the irrationality of changes requested when they are already behind on the schedule.

A new developer eager to please business people starts off by giving or agreeing to relatively aggressive estimates. After the project blows through deadlines on its own or with the help of a few "change requests", the developer becomes more circumspect. However, the answer here is not over-estimation (charitably called "adding a buffer"). Contrary to what many people think, over-estimating a task is not that easy in practice. Here is why:

  1. Over-estimation cannot be so excessive that it belies common sense. For example, normally you cannot say that you need 30 days to change the layout of an existing report.
  2. Over-estimation should not arouse suspicions of deliberate over-estimation. Otherwise, the other party will haggle and negotiate down the estimate, sometimes even below what is acceptable.
  3. If the over-estimation is accepted, one has to show signs of being busy. Otherwise, the next estimate negotiation could bring up that issue.
  4. And most importantly, over-estimation is over-estimating what you know today. It does not account for misunderstanding requirements or change requests.

So what does an engineer do? The right answer, in my opinion, is to start with the statement, "I want to help you, but I don’t know how long it will take without learning more. Can you help me understand the problem better?"

This statement does two things. First, it puts you on the same side as the business people, with both of you looking at the problem. Being on the same side is important for putting thoughts of self-benefit, negotiation and compromise away. Secondly, it allocates responsibility to the business people to help you get to the right answer. It establishes a quid pro quo: You need an estimate; you provide the knowledge for the estimate.

The next step should be to arrive at an understanding of how long it will take to understand the project enough to provide an estimate. This could be a single meeting or it could be a few weeks, depending both on the complexity of the project and the current knowledge of the estimator. These sessions will attempt to reduce the mystery around the estimates.

What I am suggesting seems suspiciously like a waterfall model, but that is not my intention. A fixed point estimate in a waterfall model is a loss for both sides because it does not deal with the reality of change management. Any project lasting more than a few weeks will have to undergo change because of changing business needs. A fixed estimate forces the technical team to reject valid business needs or, worse, accept them within the current estimate.

An iterative model that accommodates changing business requirements is a better option. But even in that situation, business people need estimates in one form or another (Sprint backlog, dynamic project plans, etc.) to plan their activities. By collaborating with them to understand before estimating, you both win.

A final question remains: What do you do about the person who says, "I don't care what you think it takes. I must get this done by date 'x'. Otherwise the following bad things will happen to the company/you/me."

After many years in the field, here is my opinion: Even if a negative outcome could affect you, it is not a problem you have created. You do not have to feel obligated to solve it if there is no solution. Even if you put in your best effort, it is very unlikely that you will hit the deadline and more likely that you will be blamed in some way. There is nothing in it for you.

I realize that personal circumstances (such as not losing your job just then) may prevent someone from taking such a stand. But absolving yourself of the responsibility and preparing for the worst can relieve you of the mental stress involved in death marches. You are trying your best. It is not going to happen. Let it go...


Wednesday, November 21, 2007

Software Estimation and the Business-Technical Conflict

People often talk about the 5 day workweek, but in reality, such a division of time and work is only applicable to hourly-paid workers. Financial people divide time into quarters of the year. Marketing folks divide time into advertising campaigns and trade show events. The technical crowd divides everything into projects. The IT support crew divides everything into periods of relative calm punctuated by hectic fire-fighting.

Most companies are dominated by people who work on the business side (sales, marketing, finance, etc.). And like it or not, their milestones tend to become the milestones of other people in the company. For example, if the marketing vice-president has an exposition coming up, that event acts as a point in time around which product deliverables get done. If the finance head needs to meet certain targets before the date of filing results, newer versions have to be released and sold before then.

This causes significant conflict between the business and technical sides in the company. The delivery dates to which the technical team can commit and the order in which they want to work on the deliverables have no relation to the business deadlines. Unfortunately, neither side knows the language of the other. The businesspeople don't understand the technical difficulties involved. The technical people don't know how to express their concerns in an understandable manner or how to negotiate deadlines well.

Often, the business people accuse the technical team of padding the estimates. In my experience, this is very ironic because

  • Software developers and teams are more likely to underestimate the complexity of building an application, especially during the initial rounds. This is because they have been given "high-level" product features and have not understood the implementation details.

  • Software development tends to run over budget and over time. The accusation would make sense if the business folks could point to any project that did the opposite. You cannot win this argument though - someone always points to Parkinson's law.

  • Experienced engineers with less management experience are likely to over-estimate their capability of building the application. When formal estimation techniques are not used, they will try to relate the project to some other work they have done and provide an estimate.

Business people are more persuasive and more aggressive at getting what they want. If they talk loud enough, even the most confident engineer has self-doubts and tries to figure out how much he can bend the estimate. For example, he thinks, "If this deliverable takes 2 months, I could probably bring it down to 6 weeks by using tool "X" and maybe working a few extra hours every week."

The problem is that an under-estimated project is further under-estimated, resulting in a Death March. The tragic aspect is that everyone knows that the goals are not realistic, but they are so committed to the project that they put in extraordinary effort. In the end, all they get for their hard work is blame and the bitterness of failure. Sometimes, people can be burnt for life after going through such an episode.

My point is that if you are a business person with little understanding of technology or project management, do not presume to know better. You must trust your technical people when they give you estimates. If you don't like the estimates, it does not mean that they are wrong, just that they are inconvenient to you. If you over-promised someone, go back and re-negotiate. It is better to lose face in front of someone than to submit your team to weeks of hell.

In a future article, I will suggest ways for technical people to avoid committing to difficult or impossible deadlines.


Saturday, November 17, 2007

A Theory of Simplicity

Simplicity is one of those goals that everyone talks about, but few achieve. When designing applications, simplicity is supposedly a paramount concern, yet many applications never achieve that state. Very often, we see simple applications that are very basic in terms of functionality. Or, we have highly functional applications that are very complex for end users.

I believe a reason for this situation is that people consider only 2 stages of simplicity - "simple" and "complex", while in fact, there are 3 possible stages as follows:

Simplistic -> Complex -> Simple

The arrows designate the passage of time in designing and creating such applications.

When you design an application to meet a particular need, you create a "simplistic" application that caters to that particular need. It does not meet any other need, but it is easy to use. This is a good strategy for designing version 1.0 of an application. You can focus on the most important need for your customers and create an application that meets their needs, without making it cumbersome to use. In other words, do not try to make everyone happy, just your core customers.

Once your application is in production, users will clamor for more features that may or may not be directly related to your core product objectives. For example, they may ask for export functionality of certain data to another application. To stay in business, you must try to meet at least some of those demands; otherwise you will lose your users to your competition. As you add more features, the complexity of your application keeps increasing. As you add more clutter, your software becomes harder for new users, who find it complex and may be lost as customers.

The challenge, therefore, is to add new features without sacrificing ease-of-use of the application. One strategy is to let the application become cluttered, refactor the user interface, become cluttered again, repeat. Another, probably better strategy is to carefully understand user interface implications before adding new features. The latter is not always an option when you are faced with competitive pressures.

In the real world, we find different companies coming out with applications in each phase, and making the older products obsolete. For example, first-generation simplistic cell phones just made phone calls. Then complex phones came out with several functions and accompanying complex user interfaces. And now you have the iPhone and other newer generation phones.

Many people confuse "simplistic" and "simple". If your software doesn't do much, it can be very easy for end users. It can also be very easy to develop. But your end users will get frustrated when the application does not grow to meet their needs. And, your application will be easy pickings for your competition to copy.

Thus you have a paradox: You have to continuously add complex behavior into the software while reducing (or at least not increasing) the complexity it presents to the user. That is true simplicity.


Thursday, November 01, 2007

Complex Requirements

Of the many differences that separate a simple software project and a complex one, this one is the most critical: A complex project has requirements that do not fit into one human brain. Many people understand this concept vaguely, but never understand it deeply or consciously. So let us analyze what this means.

A complex project has data and functional relationships that are many orders of magnitude more numerous than those of a simple project. This is not necessarily related to the number of features that an application has. Nor is it necessarily related to the number of screens, reports, tables or lines of code. For example, a Garbage-In, Garbage-Out application with 5 input screens and 1000 canned reports may be less complex than one with 20 input screens and 30 dynamic reports, even though the former is a larger application.

The complexity comes from how interdependent the various modules of the application are. Most of us are familiar with the concepts of coupling and cohesion in software design. But those concepts are also relevant to requirements management. Any requirement has the potential to affect the behavior of another requirement in the system. The more they actually affect each other, the higher the coupling between requirements in your proposed system.

Let us take a billing system. Every transaction affects AR, AP and associated financial statements. Now, let us add the requirement of managing a clerical error. Can the transaction be canceled? How long after the transaction can it be canceled? What happens if the error is discovered after a quarterly reporting period? What do you do about past reports? Introducing one seemingly simple requirement has affected many modules within the application.
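The ripple effect in the billing example can be sketched as a small graph walk. This is only an illustration: the requirement names and the `AFFECTS` map below are hypothetical, invented to mirror the example, and a real system would have far more edges.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical map: each requirement -> the requirements it directly affects.
AFFECTS: Dict[str, List[str]] = {
    "cancel_transaction": ["accounts_receivable", "accounts_payable"],
    "accounts_receivable": ["financial_statements"],
    "accounts_payable": ["financial_statements"],
    "financial_statements": ["quarterly_reports"],
    "quarterly_reports": [],
}


def impacted(start: str, affects: Dict[str, List[str]]) -> Set[str]:
    """Return every requirement reachable from `start`, directly or indirectly."""
    seen: Set[str] = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in affects.get(current, []):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen


print(sorted(impacted("cancel_transaction", AFFECTS)))
```

The size of the returned set is a rough measure of coupling: the more requirements a single change reaches, the more of the system someone has to re-examine.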

Let us take an opposite example. When Google added a new Presentation module to their Google Docs, it hardly affected the rest of the system, except a name here and a link there. Although the Presentation module is very complex in itself, it did not affect the functionality of the rest of the system in any significant way. Now, Google can continue adding new modules and it won't have a problem until it attempts to do what Microsoft Office does: Embed one document type in another document.

In a complex project, it is very difficult to understand all the complex rules involved. Consider 3 dependent events, the order of which is unknown. We have to consider 6 different permutations of the events. 4 events result in 24 permutations. 5 events result in 120 permutations. In a real system, the different transactions affecting the system result in an incredibly high number of possible permutations.
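The number of orderings of n distinct events is n factorial (n!), which is easy to check with a few lines of Python. The helper name `orderings` is just an illustrative choice.

```python
from math import factorial


def orderings(n: int) -> int:
    """Number of distinct orderings (permutations) of n distinct events."""
    return factorial(n)


# Print the count for a handful of event counts to show how fast it grows.
for n in range(3, 9):
    print(f"{n} events: {orderings(n)} possible orderings")
```

Each additional event multiplies the count by the new total, which is why the number of cases outruns anyone's ability to enumerate them by hand.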

With a simple level of complexity, it is easy to keep all the system requirements in your head. Many small projects operate that way, successfully. As the complexity increases, you have to rely on external methods like some form of documenting requirements. Take your pick - software requirements documents, wiki pages, user manuals, test cases, etc. The primary symptom of non-documented requirements is the increasing number of regression bugs. If you ever have to tell someone, "This was working before. Now you broke it", then it usually means that the other person does not know what the requirement is.

With complex requirements, you cannot rely on documented requirements alone. The reason is that since requirements affect one another, the person documenting them has to continuously cross-link and/or duplicate requirements so that it is evident to the developer what is happening. For example, in our billing system example, when we document "canceling a transaction", we have to link to many other sections in the requirements. This is practically difficult and usually error-prone.

One strategy is to reduce permutations by enforcing some constraints in real life. For example, many billing systems do not allow cancellations. Instead, you have to put another entry - a negative entry. This means that functionality used to handle regular entries can be re-used, reducing the complexity. However, this may be problematic, because it creates greater hurdles for end users. Also, when your rules are defined by government regulations, you don't have much leeway in changing them.
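A minimal sketch of the negative-entry idea, assuming a simple append-only ledger. The `Entry` and `Ledger` classes and their fields are hypothetical, not taken from any particular billing product.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    entry_id: int
    account: str
    amount: Decimal
    reverses: Optional[int] = None  # id of the entry this one offsets, if any


class Ledger:
    """Append-only ledger: entries are never edited or deleted."""

    def __init__(self) -> None:
        self._entries: List[Entry] = []

    def post(self, account: str, amount: Decimal,
             reverses: Optional[int] = None) -> Entry:
        # Ids are assigned sequentially, starting at 1.
        entry = Entry(len(self._entries) + 1, account, amount, reverses)
        self._entries.append(entry)
        return entry

    def reverse(self, entry_id: int) -> Entry:
        # A "cancellation" is just another regular entry with the sign flipped.
        original = self._entries[entry_id - 1]
        return self.post(original.account, -original.amount, reverses=entry_id)

    def balance(self, account: str) -> Decimal:
        return sum((e.amount for e in self._entries if e.account == account),
                   Decimal("0"))


ledger = Ledger()
sale = ledger.post("AR", Decimal("250.00"))
ledger.reverse(sale.entry_id)
print(ledger.balance("AR"))
```

Because `reverse` goes through the same `post` path as any other entry, whatever reporting and audit logic already handles regular entries also handles corrections, and past entries stay untouched.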

When you are building a system from scratch, another strategy is to build incrementally. Build a core that is less complex. Make it stable. Add a complex requirement. Stabilize the system. Repeat until you are done. Sometimes, you realize that a requirement is so incredibly complex that you simply cannot get your arms around it and it is consuming too much of your time. The best thing you can do is abandon that requirement after negotiating with your customers or doing more market research.

If your complex system already exists and you are adding new requirements, my heart goes out to you. It is a thankless job. 99% of the time, proper documentation is missing. Customers do not appreciate how much time you have to spend understanding the application. The benefit of any change is heavily outweighed by the outrage and anger that follow if you break something, purely because you didn't know about it.

That brings me to the final word. Don't be gung-ho about complex requirements. They are the meanest of all - they have no pity for people mouthing buzzwords or the latest technological fad or management double-speak. Tread with caution. Accept your human limitations. Try to reduce complexity at every step. Analyze every requirement carefully. Maybe, just maybe you may succeed.


Saturday, September 29, 2007

Preparing for a Software Career - Part Two

Some time back, I wrote an article on how to prepare for a software career. Today, there was an interesting comment on that post stating that the advice was generic and could apply to any field of study. Also, since most people invest a lot of their time and skills in one particular field, they do need specific advice.

I agree on the point of original selection. People could waste precious years of their lives learning the wrong things and finding themselves at a dead-end one day. So what should a computer science student do?

Here are the possible career options after you finish doing a computer science or computer applications course:

  1. Find a job at a start-up, small or medium-sized company or a large corporation.
  2. Start your own company.
  3. Go into further research and possibly become a teacher.

In this article, I will only discuss the first option, primarily because that is the most common choice and secondly because I have little knowledge of the other two options.

If you are looking for a job, the important thing is to maximize your options for getting a job and, to a less immediate extent, maximize your future opportunities after getting the job. Here are the things you need to care about:

  1. Size of the job market: If you want to choose between two subjects, find the relative market demand for such knowledge. You can make an educated guess by looking at the postings on various job sites and the advertisements in newspapers. You can also look at the size and reach of online communities in such subjects. For example, Java has a greater market than Perl and hence is a better choice in that respect, regardless of the actual merits of either language.

  2. Simplicity of the subject: The easier the subject, the greater the supply of talent in the market. And the lower the chances of you landing a job, or getting a good salary. For example, knowing web design (HTML, CSS, etc.) was very much in demand a few years back, but since it was easy to learn and tools became more powerful, it is perhaps not a good career option today.

  3. Popularity trend of the subject: Some subjects have a large market, but they are declining. For example, Perl had a good time in the 1990s, but other open source languages are gaining traction over it. It is also important to recognize temporary upsurges in popularity. For example, Lisp experienced a resurgence because of Paul Graham's writings, but we don't know if that is a permanent thing. Also, the use of a language by a highly successful company, as with Python at Google, can give a language momentum.

  4. Is the subject a point-of-no-return? Choose a subject that gives you more options in the future. For example, learning a C++-like language (like Java, C#, etc.) can give you more flexibility compared to other languages. Similarly, other choices that can affect you are your program specialization, your project work, your grades in specific courses, etc.

  5. Can you learn it well? If you have never mastered the basics of a subject, say mathematics, then it is better to stay away from such a course. Otherwise, you will constantly struggle to learn it well. You may end up with bad grades. This could possibly have a ripple effect on the rest of your career.

There is also the question about your personal preferences. You may like one subject or language better than another. It is perfectly fine to make your choice based on that, if that will keep you happy regardless of the outcome in terms of getting a job. In my experience, it doesn't. The worst thing you can do to yourself is to marry yourself to some liking or principle and then face the specter of unemployment.

A problem with making any choice is that you are doing it based on your present understanding of the job market. It is likely that employer preferences may change in the years while you are studying. Academic curriculum changes are slow. So it is important to do your own learning to stay on top of current market happenings.

At this point, I want to return to the points mentioned in my earlier article. Having made your choice, you have to work extremely hard to learn the subject well. Secondly, learn how to communicate and interact well in a professional setting. The knowledge and the people skills will make a significant difference in your ability to get a job.

Remember that when you are applying for a job advertised for candidates straight out of college, your advantage over your competition is:

  • How much you know: Your knowledge.
  • How much you prove you know: Your ability to communicate your knowledge, talents and intelligence. No interviewer can scan your brain. You have to prove yourself to him/her.
  • How much you can use what you know: Any projects you have done or knowledge outside what has been taught to you.
  • How much you will get along: Your attitude and demeanor during the interview.

Once you get the job, the best way you can improve your career is to continuously learn more and adapt yourself to changing business conditions and work circumstances. In fact, improving yourself is the only thing you have control over. So better get started on that.


Wednesday, September 26, 2007

Writing Software Requirements for Developers

Writing requirements is difficult. Most, if not all, requirements documents are generally incomplete to varying degrees. When developers start coding, they realize that certain conditions are not addressed and get stuck. At this point, someone has to step in and fill the gaps.

Achieving 100% complete requirements is an impossible and highly expensive goal, not least because business needs keep changing. However, there is such a thing as inadequate requirements. When requirements are not enough, developers have to spend time to find the missing pieces. And if they resort to assumptions, the result is costly re-work.

One way for an analyst to create better quality requirements is to continuously think of what the developer would want to know. Requirements document what the customer wants to build. But unless the analyst anticipates questions that will arise during development and gets them answered, such documents will keep developers on edge.

Let me elaborate upon this. Let us take any business entity in the application such as an employee or department record. Here are some of the questions that would need to be answered:

  1. Operations: What can users do with this record - Add, View, Edit, Update & Delete? What are the restrictions on such operations? For example, the customer may not want users to delete employee records at all. A clerk cannot edit the Point-of-Sale information, but can only add a correction. A mechanic may want to do a "search" for an inventory part, but probably only needs to see a list of car models.

  2. Security: Who can perform each operation? For example, data entry operators can only do certain operations. Only certain managers can see aggregate reports. No one in the Chicago branch should be able to see the financial reports of the Houston branch and vice versa. A student can only see the names of other students in their class, but the school administrator can also see their Social Security Numbers.

  3. Logging and Audit: Should the application track and record various operations with each record and the user who did the operation? Does the application need to maintain and compare different versions of the record? Should the application have the capability to turn off this audit capability when required? Should there be diagnostic capability to understand the health of the application with respect to data integrity and performance?

  4. Dependencies: How does an operation on one entity affect the remainder of the application? For example, when a sale is made to a new customer, what other operations (such as future marketing) should be activated? Am I prevented from deleting an employee record if I have written some notes on that employee? Or should the notes be deleted automatically?

  5. Locking and Concurrency:  Do multiple users need access to the same record at the same time? What kind of operations can they perform simultaneously? Do we need to prevent access in any way? Will that create any deadlocks? Will users lose data in this process? Thinking about such considerations is very important in applications having shared data.

  6. Ownership and Sharing: Does each record belong to a particular user? Can that user share that information with other users? Can the other users edit that information? Can they share that information further with yet other people? Can a user prevent others from spamming him/her by sharing unnecessary information?

  7. Workflow: What is the lifecycle of each record? For example, a document record may move from one user to another seeking approval. A bug may move from a tester to a developer asking for action and return back to the tester requesting closure. Who can act upon that record at each stage in its lifecycle? What are the possible next stages from each stage?

  8. Data Formats: Can the user export the data to Excel, Word, CSV or some other format? Can they import it back from those formats? Does the application need to provide access to the data through APIs or web services? Can power users directly interact with the data using reporting tools, bypassing the application and its controls?

  9. Visual Formats: Do end users like a form style of dat