ABC of Software Development
- Maintainable code should have the following characteristics:
- Code should be readable
- Code should be understandable
- Extendable
- Testable
- Reliability and performance
1. Programming Methodologies
I started my career as a programmer writing assembly code, which is probably how software development itself started. Writing assembly code is a tiresome process that takes too much manpower to finish even simple tasks, and complex tasks are almost impossible to implement in it. This is why (middle-level and high-level) programming languages were invented. Today, assembly language programming is generally considered firmware development, while programming in any high-level or middle-level language is considered software development.
1.1 Structured Programming
When software programming first came into existence, most of the tasks to be programmed were very simple and small. The whole focus was on getting functional code (which was really a wonderful achievement in those days). With time, programming tasks became more and more complex. The code for such tasks also became huge, and it became increasingly difficult to manage such code bases. This is when the concepts of modular and structured programming came into existence. Under structured programming, the complete task is divided into smaller subtasks (which in turn can be further subdivided into simple functions, and so on). Programmers start code development with basic program structures (functions, modules or procedures), which are called from higher-level functions. Structured programming essentially consists of data structures (global or local) and a set of program structures that manipulate these data elements. The main application logic uses the program structures to manipulate the data structures (hence the word "structured"). A certain state of the data elements is considered the output of the program. Structured programming can be achieved with programming languages that support specific programming structures (e.g. loops, conditional execution, input/output handling). "C" is the most popular structured programming language.
Structured programming makes the entire application modular (data structures and functions are the modules). This improves the clarity and quality of the code. The basic modules can also be reused (in the same application or in another one), which reduces development time.
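The decomposition described above can be sketched in a few lines. This is only an illustration, written in Python for brevity; the task (summarising a comma-separated list of scores) and all function names are assumptions made for the example, not something taken from a real code base.

```python
def parse_scores(text):
    """Subtask 1: turn a comma-separated string into a list of numbers."""
    return [float(item) for item in text.split(",") if item.strip()]


def mean(values):
    """Subtask 2: compute the arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def format_report(values):
    """Subtask 3: build a one-line report from the data."""
    return f"count={len(values)} mean={mean(values):.2f}"


def summarize(text):
    """Higher-level function: the whole task expressed as calls to subtasks."""
    return format_report(parse_scores(text))
```

Each small function can be read, tested and reused on its own, and summarize() only states the order in which the subtasks are applied.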
1.2 Object Oriented Programming
Structured programming was a major step towards improving the quality and clarity of code. It produced code that was easy to manage and easy to modify. However, as programs became larger and larger, even structured programming started to exhibit problems. The structured programming model consists of data elements (data structures) and functions (program structures) that manipulate those data elements, but there is very little control over which function may modify which data. Looking at the code gives almost no information about which data elements are modified by a given function, or which functions modify a given data element. When the code becomes very large, it becomes very difficult to manage and modify (especially for a programmer who is new to the codebase). Another major issue is extendability. Consider, as an example, a function that computes the area of a triangle and takes integer input parameters. If you later want the input parameters to be floating point, you will either have to make a major modification across the code base or write an entirely new function (for floating-point inputs). Adding a new parameter to a function is even more difficult: again it needs a major modification of the code base or an altogether new function. This is where object oriented programming comes to help.
Under structured programming the emphasis is on the logic that manipulates the data. In OOP the emphasis is on defining the data objects that are to be manipulated. The first step is to identify all the objects you want to manipulate and define their relationships with each other (this step is called data modelling). Once all the objects have been identified, they are generalized into classes of objects. A class also defines the methods that can manipulate its data objects. The important benefits of OOP are data encapsulation, inheritance, abstraction and polymorphism. Java, C++ and Smalltalk are the most popular object oriented languages.
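As a rough illustration of these four ideas, the triangle-area example can be recast with classes. This is a minimal sketch in Python; the class names Shape, Triangle and Circle and the helper total_area() are hypothetical and chosen only for this example.

```python
import math
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstraction: callers rely only on area(), not on how it is computed."""

    @abstractmethod
    def area(self):
        ...


class Triangle(Shape):  # Inheritance: a Triangle is a kind of Shape.
    def __init__(self, base, height):
        # Encapsulation: the data lives next to the methods that use it.
        self._base = base
        self._height = height

    def area(self):
        return 0.5 * self._base * self._height


class Circle(Shape):
    def __init__(self, radius):
        self._radius = radius

    def area(self):
        return math.pi * self._radius ** 2


def total_area(shapes):
    # Polymorphism: the same call works for every Shape subclass.
    return sum(shape.area() for shape in shapes)
```

Adding a new kind of shape means adding one new class; total_area() and the existing classes stay untouched.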
1.3 Aspect Oriented Programming
Separation of concerns entails breaking down a program into distinct parts that overlap in functionality as little as possible. All programming methodologies, including procedural programming and object-oriented programming, support some separation and encapsulation of concerns (a concern being any area of interest or focus) into single entities. For example, procedures, packages, classes and methods all help programmers encapsulate concerns into single entities. But some concerns defy these forms of encapsulation. Software engineers call these crosscutting concerns, because they cut across many modules in a program. The programming paradigms of aspect-oriented programming (AOP) and aspect-oriented software development (AOSD) attempt to aid programmers in the separation of concerns, specifically crosscutting concerns, as an advance in modularization. AOP does so primarily through language changes, while AOSD uses a combination of language, environment and method.
Since crosscutting concerns are spread over different modules, they do not get properly encapsulated in modules of their own. AOP attempts to solve this problem by allowing the programmer to express crosscutting concerns in stand-alone modules called aspects. An aspect contains advice (code joined to specified points in the program) and inter-type declarations (structural members added to other classes). The advice-related component of an aspect-oriented language defines a join point model (JPM). A JPM defines three things:
join points (the points in a running program where advice can be applied), pointcuts (a way to select a set of join points) and advice (the code to run at the selected join points).
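Python has no built-in aspect-oriented facilities, but a decorator gives a rough feel for the idea. In the sketch below, which is an approximation and not a real AOP framework, logging is the crosscutting concern, the wrapper is the advice, calls to the decorated functions act as join points, and choosing where to apply the decorator stands in for a pointcut. The names logged, call_log, deposit and withdraw are made up for the example.

```python
import functools

call_log = []  # hypothetical in-memory sink standing in for a real logger


def logged(func):
    """Advice: code that runs around each call of the wrapped function."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_log.append(f"enter {func.__name__}")
        try:
            return func(*args, **kwargs)
        finally:
            call_log.append(f"exit {func.__name__}")

    return wrapper


@logged  # choosing where to apply the advice plays the role of a pointcut
def deposit(balance, amount):
    return balance + amount


@logged
def withdraw(balance, amount):
    return balance - amount
```

The business functions contain no logging code at all; the concern lives in one place and can be changed or removed without editing them.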
1.4 Component Oriented Programming
Component Oriented Programming (COP) is a new dimension in the field of software development. It involves the use of binary components (instead of codebases) to form a system. Some of the distinct benefits of COP are faster time to market, more robust and highly scalable applications, and lower development and long-term maintenance costs. COP views software development as a collection of independent binary components plus a component-enabling technology (such as CORBA, COM, J2EE or .NET) that is used to plumb these components together. Components (under COP) can be compared to objects (under OOP), but components have much lower coupling than objects. For example, if you modify a class under OOP, you need to recompile the files related to that class and then relink the entire application. Under COP relinking is not required: any component can be modified (or replaced) independently. This can be done even while the application is running (provided that the particular component is not in use). Component development is done independently of application development (and any available language can be used). When new requirements need to be implemented, new components can be developed (independently of the existing components) and later replace the existing ones.
2. Characteristics of Quality Code
Quality code should be reliable, maintainable, extendable and testable.
2.1 Reliable
Reliable code should produce the same output every time for the same input conditions. In case of errors it should exit gracefully after reporting (or logging) the error conditions. "Graceful exit" means that all environment changes are saved prior to exit (instead of a sudden crash).
2.2 Maintainable
Maintainable software should be easy to read, easy to understand, and easy to modify for any new user (not only for the original programmer).
2.3 Extendable
Extendable code allows features to be added with minimum cost and in minimum time. Extendable code needs clear interfaces, and the dependencies between different modules should be nil (or minimal).
2.4 Testable
All error conditions must be handled gracefully. There should be a provision to log the code execution (at different levels of detail). There should also be a provision for code profiling (which can be turned off when not needed).
3. Why Quality
Code should be reliable because sudden crashes are very annoying to the user. A sudden crash might cause a lot of (unsaved) work to be lost, which is rarely acceptable to users. In most organizations, most programmers (in fact all of them, at some time or other) deal with code written by other programmers. If code is difficult to read, programmers have to spend a lot more time maintaining (managing or modifying) it. If code is difficult to understand, programmers find it difficult to fix bugs or to add new features. Maintainable code thus improves the overall efficiency of software development. Code reusability also improves the efficiency of software development, hence the initial design of the code should take care of extendability. More testing time adds to the overall product cost of software. Testable code helps to keep the overall product cost as low as possible.
4. Coding Principles to Achieve Quality
4.1 Maintainable
4.1.1 Readable
4.1.1.1 Indentation
Unindented code is difficult to read and difficult to understand. Proper indentation makes code easy to read. A few spaces (2 or 3) should be used for each indentation level. Using more spaces might lead to very wide lines (discussed under Line Widths).
4.1.1.2 Small Functions
Function size should be kept as small as possible. Small functions are easy to read and easy to understand. A bigger function is also likely to contain many independent pieces of functionality, which reduces the reusability of the logic. Ideally a function should fit on your monitor screen.
4.1.1.3 No Tabs
Software developed by you is likely to be used and maintained by different programmers in different environments. Not all editors handle tabs properly, so if you use tabs your code is likely to be unreadable in some environments. Hence you should use spaces instead of tabs.
4.1.1.4 Line Widths
It is generally good practice to keep the line width below eighty characters, because wider lines do not fit on most PC screens. Wider lines mean that the reader has to scroll left and right (which unnecessarily consumes time). If a line exceeds 80 characters, the text should be continued on the next line.
4.1.1.5 Files and Directory Structure
All the related functions should be contained in a file. All the related files should be contained in a directory. This kind of modularization makes the codebase more manageable.
4.1.2 Understandable
4.1.2.1 Naming Conventions
Meaningful variable names and function names should be used in the code. For example, a function which multiplies two matrices can be named "MatrixMul()" instead of "MyFunction()" (or any other meaningless name). A variable which keeps the frame count can be named "FrameCount". Also, some standard prefix (e.g. Hungarian Notation) should be used with variable names to imply their data types.
4.1.2.2 Comments
The function header must specify the input and output parameters and the actions performed by the function. Appropriate comments should be used within functions wherever necessary.
4.1.2.3 KISS
Keep It Simple, Sir: in most cases the simplest solution is the best solution. Use simple logic to implement a given task instead of complex logic that is difficult to comprehend.
4.2 Extendable
4.2.1 Platform Independent
Write the code in a platform-independent way. Code should be reusable on different platforms (different processors, different operating systems). Take care of architecture-specific constraints such as endianness and word size. Also, use uniform data types (along with OS-specific header files).
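One way to keep binary data independent of endianness and word size is to fix both explicitly when the data is written and read. The sketch below uses Python's standard struct module; the frame header with two 32-bit fields is a hypothetical format invented for the example.

```python
import struct


def encode_frame_header(frame_count, payload_size):
    """Pack two unsigned 32-bit integers in big-endian order.

    The ">" prefix fixes the byte order and selects standard sizes, so "I"
    is always 4 bytes and the layout does not depend on the host CPU.
    """
    return struct.pack(">II", frame_count, payload_size)


def decode_frame_header(data):
    """Inverse of encode_frame_header(); returns (frame_count, payload_size)."""
    return struct.unpack(">II", data)
```

A file or network message written this way can be read back on a machine with a different byte order or native word size.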
4.2.2 Zero Coupling
Divide the entire application into independent modules. Each module should be independent of every other module. Thus, any requirement change or feature addition can be handled by modifying only the affected modules (no changes are needed in other modules).
4.2.3 Avoid Duplication
Duplication increases the probability of errors when the duplicated code is modified at a later stage (it is likely that you will make the modification in some places but miss it in others). Do not copy and paste. If the same code needs to be used in different places, use functions or macros. Avoid redundant comments (they amount to duplication). For example, a comment should only specify what a function does, not how it does it (because if someone changes the programming logic, they are likely to miss updating the comments).
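A small sketch of the same idea: instead of copying a range check into every place that needs it, the rule is written once and called from each place. The function names and limits below are hypothetical.

```python
def clamp(value, low, high):
    """Limit value to the range [low, high]; the rule is written only here."""
    return max(low, min(value, high))


def set_volume(level):
    return clamp(level, 0, 100)


def set_brightness(level):
    return clamp(level, 0, 255)
```

If the clamping rule ever changes, only clamp() has to be modified, so no copy of the logic can be missed.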
4.3 Testable
4.3.1 Allow Logging
Different log options should be provided (detailed log, minimum log etc).
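As an illustration, Python's standard logging module already provides such levels. The sketch below assumes two options, a detailed log and a minimum log; the function and logger names are hypothetical.

```python
import logging


def make_logger(name, detailed=False):
    """Return a logger whose verbosity is chosen by the caller."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG if detailed else logging.WARNING)
    return logger


def process_frame(logger, frame_id):
    logger.debug("processing frame %d", frame_id)  # detailed log only
    if frame_id < 0:
        logger.warning("negative frame id %d", frame_id)  # minimum log too
        return False
    return True
```

The calling code does not change between the two modes; only the level passed to the logger decides how much is recorded.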
4.3.2 Use Assertions
Any condition which should never occur should have a check and error handling.
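A minimal sketch of the distinction, using a hypothetical average_frame_size() function that assumes an integer frame count: invalid input gets ordinary error handling, while a condition that should be impossible is guarded by an assertion.

```python
def average_frame_size(total_bytes, frame_count):
    # Bad input can legitimately happen, so it gets normal error handling.
    if total_bytes < 0 or frame_count <= 0:
        raise ValueError("total_bytes must be >= 0 and frame_count > 0")
    average = total_bytes / frame_count
    # With an integer frame_count this can never be false; if it is,
    # the code itself is wrong and the assertion flags it immediately.
    assert 0 <= average <= total_bytes, "average out of range"
    return average
```

Note that Python skips assert statements when run with the -O option, so assertions should guard against programming mistakes and not replace checks on external input.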
4.4 Reliability and Performance
As such, there are no coding guidelines that improve the reliability and performance of code. However, as a programmer you should develop a mindset which helps you to write reliable and well-performing code. While writing code, think of all possible input conditions and use cases. Make sure that your code handles all valid conditions and returns an error on invalid conditions. While implementing any logic, choose the simple solution (it is likely to be the most optimal).
5. Software Development Checklist
Make sure that you have the following things in place before you start developing the code for any project.
5.1 Coding Guidelines
If you are a team leader, make sure that everyone in your team has received, understood and agreed to the coding guidelines. Arrange periodic code reviews to ensure that developers are adhering to the coding guidelines. Periodic reviews are necessary because changes at the end stage of a project are costlier and tougher to manage. If you are a team member, make sure that you receive the coding guidelines before you write the first line of code. Respect the coding guidelines and adhere to them. If you do not agree with some guidelines, raise this with the management and call for a discussion (rather than just ignoring them while writing the code).
5.2 Source Code Control
Source code control is the most basic fundamental of software development, but a large number of organizations and development teams do not follow it (or do not follow it effectively). If you are a team leader, make sure that everyone in your team understands the importance of source code control and is comfortable with the configuration management tool used in your organization. Make sure that they check in their modifications periodically (instead of checking in the code once in a while). Ideally, code should be checked in after each major modification or feature addition. Make sure that any new check-in has been fully tested. If you are a developer, make sure that you follow the source code control guidelines given to you. If you have not received any guidelines, ask for them.
5.3 Test Setup
While developing the code, also think about the test cases. Make sure that you test for all possible (valid or invalid) conditions during unit testing. If a test setup is not ready, do not go ahead with code development. Bugs caught during the early stages of software development are easy to fix. If you postpone testing until the completion of the product, it is very likely that you will spend too much time fixing bugs. Any fix can create new bugs, which will again be time consuming.
5.4 Automation Tools
These days there are a lot of code generation, test pattern generation, test validation and build tools which can automate much of the development and test process. Good knowledge and effective use of these tools ensures lower development times.
6. Managing Software Projects
Following the principles mentioned below can greatly improve the efficiency of a project.
6.1 No Broken Windows
The "Broken Windows" theory was first put forward by "George L. Kelling" and "James Q. Wilson" in an article in "The Atlantic Monthly" in March 1982. Later, "George L. Kelling" and "Catherine Coles" published a book discussing petty crimes and strategies to contain or eliminate them. For example, consider a building with a few broken windows. If the windows are not repaired, the tendency is for vandals to break a few more windows. Eventually, they may even break into the building, and if it's unoccupied, perhaps become squatters or light fires inside. A successful strategy for preventing vandalism, therefore, is to fix the problems when they are small. A similar theory holds good for software development. At times, some programmers add or modify sections of the code base in a manner which violates the coding guidelines. If such sections are not fixed immediately, there will be a tendency amongst other programmers (who see the faulty code) to violate the coding guidelines as well. Very soon, the codebase will become difficult to manage. Therefore, any broken windows in the codebase should be fixed as soon as they are noticed. "Andrew Hunt" and "David Thomas" discuss this phenomenon in detail in their book "The Pragmatic Programmer".
6.2 Accountability
It is necessary that someone in the team is assigned accountability for each task (small or big). The responsibilities of each individual working on the project should be clearly defined and shared across the team. This is likely to increase co-operation within the team.
6.3 Team Composition
Team composition is very important to the success of a project. As a first step, one should make a note of all the technologies and tools which will be used during the project. There should be at least one expert on each technology and tool within the team. In the absence of this expertise, the team is likely to end up wasting considerable time on petty issues. The team should have a mix of experienced and fresh people, and they should be paired in such a way that fresh people learn from experienced people. The contribution of experienced and expert developers should be formally acknowledged.
7. Principles of Software Testing
Following the principles of software testing mentioned below can greatly improve efficiency. The result is a quality product at less cost and in less time.
7.1 Verification Vs Validation
The primary aim of testing is to verify that the given product complies with the product specifications provided by the customer. This process is called verification. Another important part (which is generally missed during the primary stage of project planning) is validation. Validation is the process of checking whether the product specifications are correct: is this what the customer finally wants? In most cases, the customer keeps updating the specification (until the last moment of project delivery). Any specification change at a later stage of product development is tough and costly to implement. Good validation (based on prior experience and common sense) can help to stabilise the product specifications in the early stage of the project.
7.2 Communication
Good communication between testers, developers and customers is a crucial aspect of efficient testing. Most of the time there is a big gap between the testing team and the development team, and the testing team is completely isolated from the customers. Testers, developers and the customer should have an open discussion during the validation phase. After validation, the tester should come up with a test plan document, which should be reviewed by developers and customers. These days, the specifications for a project never get frozen. It takes a long time for specification changes proposed by customers to reach the testing team. Developers keep discussing the urgency of these changes with customers (to figure out whether they can be skipped). The testing team is generally not involved in such discussions and is unaware of the proposed changes. Any specification change might greatly impact the design of the test setup. Therefore, information about any proposed change should be passed to the testers without delay. This helps testers to prepare the test setup in advance (if the changes are not accepted, the setup can simply be scrapped).
7.3 Hierarchical Testing
For any complex system it is inefficient to do the testing only at system level. To develop the product effectively, hierarchical testing is suggested. System level testing needs to be complemented by unit testing and integration testing. Code coverage analysis and memory leak checks test the overall quality of the software.
7.3.1 Unit Testing
A unit is the smallest collection of code which can be (usefully) tested. Unit testing is primarily focused on the implementation: does the code implement what the designer intended? Do all the special cases work? Are all the errors detected? Unit testing should be done by programmers (rather than independent testers). Unit testing must be complemented with some minimum documentation (about which test cases were run and with what results). The documentation must be reviewable.
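The three questions above map naturally onto three test cases. The sketch below uses Python's standard unittest module; triangle_area() is a hypothetical unit under test and run_tests() is a small helper written for this example.

```python
import unittest


def triangle_area(base, height):
    """Unit under test (hypothetical): area of a triangle."""
    if base < 0 or height < 0:
        raise ValueError("base and height must be non-negative")
    return 0.5 * base * height


class TriangleAreaTest(unittest.TestCase):
    def test_typical_case(self):
        # Does the code implement what the designer intended?
        self.assertEqual(triangle_area(3, 4), 6.0)

    def test_special_case_zero_height(self):
        # Do the special cases work?
        self.assertEqual(triangle_area(3, 0), 0.0)

    def test_error_is_detected(self):
        # Are the errors detected?
        with self.assertRaises(ValueError):
            triangle_area(-1, 4)


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TriangleAreaTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The test names and comments double as the minimum documentation of which cases were checked, and the result object records how many tests ran and whether they passed.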
7.3.2 Integration Testing
Integration testing is a logical extension of unit testing. In its simplest form, two units that have already been tested are combined into a component and the interface between them is tested. A component, in this sense, refers to an integrated aggregate of more than one unit. In a realistic scenario, many units are combined into components, which are in turn aggregated into even larger parts of the program. The idea is to test combinations of pieces and eventually expand the process to test your modules with those of other groups. Eventually all the modules making up a process are tested together. Beyond that, if the program is composed of more than one process, they should be tested in pairs rather than all at once. Integration Testing exposes defects in the interfaces and interaction between integrated components (modules).
7.3.3 System Testing
System Level Testing tests an integrated system to verify that it meets its requirements, which can sometimes be sub-divided into:
7.3.4 Code Coverage
Code coverage is inherently a white box testing activity. The target software is built with special options or libraries and/or run under a special environment such that every function that is exercised (executed) in the program(s) is mapped back to the function points in the source code. This process allows developers and quality assurance personnel to look for parts of a system that are rarely or never accessed under normal conditions (error handling and the like) and helps reassure test engineers that the most important conditions (function points) have been tested.
Test engineers can look at code coverage test results to help them devise test cases and input or configuration sets that will increase the code coverage over vital functions. Two common forms of code coverage used by testers are statement (or line) coverage, and path (or edge) coverage. Line coverage reports on the execution footprint of testing in terms of which lines of code were executed to complete the test. Edge coverage reports which branches, or code decision points were executed to complete the test. They both report a coverage metric, measured as a percentage.
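The difference between line coverage and edge coverage can be seen with a toy tracer. The sketch below uses Python's sys.settrace hook to record which lines of a function ran; classify() and executed_lines() are hypothetical names, and a real project would normally rely on a dedicated coverage tool instead of code like this.

```python
import sys


def classify(value):
    label = "small"        # line offset 1
    if value > 10:         # line offset 2
        label = "large"    # line offset 3
    return label           # line offset 4


def executed_lines(func, *args):
    """Return the line offsets (relative to the def line) that func executed."""
    lines = set()
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is code and event == "line":
            lines.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return lines
```

Calling classify(20) alone executes every line of the function, so line coverage is complete, yet the edge that skips the body of the if statement is never taken. A second test such as classify(5) is needed to cover that edge.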
7.3.5 Memory Leaks
In computer science, a memory leak is a particular kind of unintentional memory consumption by a computer program where the program fails to release memory when it is no longer needed. A memory leak can diminish the performance of the computer by reducing the amount of available memory. Eventually, in the worst case, too much of the available memory may become allocated and all or part of the system or device stops working correctly, or the application fails. Memory leaks are among the most difficult bugs to detect. Generally one needs to use memory leak detection tools to unearth the memory leaks in any code. Purify and Valgrind are some of the commonly used tools. The Mozilla team has also developed a number of tools for memory analysis.